
DNA Calorimetric Force Spectroscopy at Single Base Pair Resolution

Authors: Paolo Rissone, Marc Rico-Pasto, Steve Smith, Felix Ritort

Abstract: DNA hybridization is a fundamental reaction with wide-ranging applications in biotechnology. The nearest-neighbor (NN) model provides the most reliable description of the energetics of duplex formation. Most DNA thermodynamics studies have been done in melting experiments in bulk, of limited resolution due to ensemble averaging. In contrast, single-molecule methods have reached the maturity to derive DNA thermodynamics with unprecedented accuracy. We combine single-DNA mechanical unzipping experiments using a temperature jump optical trap with machine learning methods and derive the temperature-dependent DNA energy parameters of the NN model. In particular, we measure the previously unknown ten heat-capacity change parameters $\Delta C_p$, relevant for thermodynamical predictions throughout the DNA stability range. Calorimetric force spectroscopy establishes a groundbreaking methodology to accurately study nucleic acids, from chemically modified DNA to RNA and DNA/RNA hybrid structures.

Submitted 29 April, 2024; originally announced April 2024.

Comments: Main: 23 pages, 4 figures, 1 table SI: 13 pages, 7 figures, 5 tables

arXiv:2311.00136 [pdf, other]

Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data

Authors: Antonis Antoniades, Yiyi Yu, Joseph Canzano, William Wang, Spencer LaVere Smith

Abstract: State-of-the-art systems neuroscience experiments yield large-scale multimodal data, and these data sets require new tools for analysis. Inspired by the success of large pretrained models in vision and language domains, we reframe the analysis of large-scale, cellular-resolution neuronal spiking data into an autoregressive spatiotemporal generation problem. Neuroformer is a multimodal, multitask generative pretrained transformer (GPT) model that is specifically designed to handle the intricacies of data in systems neuroscience. It scales linearly with feature size, can process an arbitrary number of modalities, and is adaptable to downstream tasks, such as predicting behavior. We first trained Neuroformer on simulated datasets, and found that it both accurately predicted simulated neuronal circuit activity, and also intrinsically inferred the underlying neural circuit connectivity, including direction. When pretrained to decode neural responses, the model predicted the behavior of a mouse with only few-shot fine-tuning, suggesting that the model begins learning how to do so directly from the neural representations themselves, without any explicit supervision. We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model's ability to associate behavioral and neural representations in an unsupervised manner. These findings show that Neuroformer can analyze neural datasets and their emergent properties, informing the development of models and hypotheses associated with the brain.

Submitted 15 March, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

Comments: 9 pages for main paper. 22 pages in total. 13 figures, 1 table

arXiv:2305.18804 [pdf]

Neural correlates of cognitive ability and visuo-motor speed: validation of IDoCT on UK Biobank Data

Authors: Valentina Giunchiglia, Sharon Curtis, Stephen Smith, Naomi Allen, Adam Hampshire

Abstract: Automated online and App-based cognitive assessment tasks are becoming increasingly popular in large-scale cohorts and biobanks due to advantages in affordability, scalability and repeatability. However, the summary scores that such tasks generate typically conflate the cognitive processes that are the intended focus of assessment with basic visuomotor speeds, testing device latencies and speed-accuracy tradeoffs. This lack of precision presents a fundamental limitation when studying brain-behaviour associations. Previously, we developed a novel modelling approach that leverages continuous performance recordings from large-cohort studies to achieve an iterative decomposition of cognitive tasks (IDoCT), which outputs data-driven estimates of cognitive abilities, and device and visuomotor latencies, whilst recalibrating trial-difficulty scales. Here, we further validate the IDoCT approach with UK BioBank imaging data. First, we examine whether IDoCT can improve ability distributions and trial-difficulty scales from an adaptive picture-vocabulary task (PVT). Then, we confirm that the resultant visuomotor and cognitive estimates associate more robustly with age and education than the original PVT scores. Finally, we conduct a multimodal brain-wide association study with free-text analysis to test whether the brain regions that predict the IDoCT estimates have the expected differential relationships with visuomotor vs. language and memory labels within the broader imaging literature.
Our results support the view that the rich performance timecourses recorded during computerised cognitive assessments can be leveraged with modelling frameworks like IDoCT to provide estimates of human cognitive abilities that have superior distributions, re-test reliabilities and brain-wide associations.

Submitted 30 May, 2023; originally announced May 2023.

arXiv:2206.01338 [pdf, other]

Biologically-plausible backpropagation through arbitrary timespans via local neuromodulators

Authors: Yuhan Helena Liu, Stephen Smith, Stefan Mihalas, Eric Shea-Brown, Uygar Sümbül

Abstract: The spectacular successes of recurrent neural network models where key parameters are adjusted via backpropagation-based gradient descent have inspired much thought as to how biological neuronal networks might solve the corresponding synaptic credit assignment problem. There is so far little agreement, however, as to how biological networks could implement the necessary backpropagation through time, given widely recognized constraints of biological synaptic network signaling architectures. Here, we propose that extra-synaptic diffusion of local neuromodulators such as neuropeptides may afford an effective mode of backpropagation lying within the bounds of biological plausibility. Going beyond existing temporal truncation-based gradient approximations, our approximate gradient-based update rule, ModProp, propagates credit information through arbitrary time steps. ModProp suggests that modulatory signals can act on receiving cells by convolving their eligibility traces via causal, time-invariant and synapse-type-specific filter taps. Our mathematical analysis of ModProp learning, together with simulation results on benchmark temporal tasks, demonstrate the advantage of ModProp over existing biologically-plausible temporal credit assignment rules. These results suggest a potential neuronal mechanism for signaling credit information related to recurrent interactions over a longer time horizon. Finally, we derive an in-silico implementation of ModProp that could serve as a low-complexity and causal alternative to backpropagation through time.

Submitted 13 January, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022 Camera Ready

arXiv:2109.12190 [pdf]

Human genetic admixture through the lens of population genomics

Authors: Shyamalika Gopalan, Samuel Patillo Smith, Katharine Korunes, Iman Hamid, Sohini Ramachandran, Amy Goldberg

Abstract: Over the last fifty years, geneticists have made great strides in understanding how our species' evolutionary history gave rise to current patterns of human genetic diversity classically summarized by Lewontin in his 1972 paper, 'The Apportionment of Human Diversity'. One evolutionary process that requires special attention in both population genetics and statistical genetics is admixture: gene flow between two or more previously separated source populations to form a new admixed population. The admixture process introduces unique patterns of genetic variation within and between populations, which in turn influences the inference of demographic histories, identification of genetic targets of selection, and prediction of phenotypes. In this review, we highlight recent studies and methodological advances that have leveraged genomic signatures of admixture to gain insights into human history, natural selection, and complex trait architecture. We also outline some challenges for admixture population genetics, including limitations of applying methods designed for single-ancestry populations to the study of admixed populations.

Submitted 11 February, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

arXiv:2010.00308 [pdf]

The NonHuman Primate Neuroimaging & Neuroanatomy Project

Authors: Takuya Hayashi, Yujie Hou, Matthew F Glasser, Joonas A Autio, Kenneth Knoblauch, Miho Inoue-Murayama, Tim Coalson, Essa Yacoub, Stephen Smith, Henry Kennedy, David C Van Essen

Abstract: Multi-modal neuroimaging projects are advancing our understanding of human brain architecture, function, and connectivity using high-quality non-invasive data from many subjects. However, ground truth validation of connectivity using invasive tracers is not feasible in humans. Our NonHuman Primate Neuroimaging & Neuroanatomy Project (NHP_NNP) is an international effort (6 laboratories in 5 countries) to: (i) acquire and analyze high-quality multi-modal brain imaging data of macaque and marmoset monkeys using protocols and methods adapted from the HCP; (ii) acquire quantitative invasive tract-tracing data for cortical and subcortical projections to cortical areas; and (iii) map the distributions of different brain cell types with immunocytochemical stains to better define brain areal boundaries. We are acquiring high-resolution structural, functional, and diffusion MRI data together with behavioral measures from over 100 individual macaques and marmosets in order to generate non-invasive measures of brain architecture such as myelin and cortical thickness maps, as well as functional and diffusion tractography-based connectomes. We are using classical and next-generation anatomical tracers to generate quantitative connectivity maps based on brain-wide counting of labeled cortical and subcortical neurons, providing ground truth measures of connectivity. Advanced statistical modeling techniques address the consistency of both kinds of data across individuals, allowing comparison of tracer-based and non-invasive MRI-based connectivity measures.
We aim to develop improved cortical and subcortical areal atlases by combining histological and imaging methods. Finally, we are collecting genetic and sociality-associated behavioral data in all animals in an effort to understand how genetic variation shapes the connectome and behavior.

Submitted 10 January, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2004.07975 [pdf, other]

New Light on Cortical Neuropeptides and Synaptic Network Plasticity

Authors: Stephen J. Smith, Michael Hawrylycz, Jean Rossier, Uygar Sümbül

Abstract: Neuropeptides, members of a large and evolutionarily ancient family of proteinaceous cell-cell signaling molecules, are widely recognized as extremely potent regulators of brain function and behavior. At the cellular level, neuropeptides are known to act mainly via modulation of ion channel and synapse function, but functional impacts emerging at the level of complex cortical synaptic networks have resisted mechanistic analysis. New findings from single-cell RNA-seq transcriptomics now illuminate intricate patterns of cortical neuropeptide signaling gene expression and new tools now offer powerful molecular access to cortical neuropeptide signaling. Here we highlight some of these new findings and tools, focusing especially on prospects for experimental and theoretical exploration of peptidergic and synaptic network interactions underlying cortical function and plasticity.

Submitted 16 April, 2020; originally announced April 2020.

Comments: 22 pages, 4 figures, 1 table, to be published in Current Opinion in Neurobiology

arXiv:1911.03439 [pdf]

doi 10.1007/978-3-031-09282-4_17

Towards Monitoring Parkinson's Disease Following Drug Treatment: CGP Classification of rs-MRI Data

Authors: Amir Dehsarvi, Jennifer Kay South Palomares, Stephen Leslie Smith

Abstract: Background and Objective: It is commonly accepted that accurate monitoring of neurodegenerative diseases is crucial for effective disease management and delivery of medication and treatment. This research develops automatic clinical monitoring techniques for PD, following treatment, using the novel application of EAs. Specifically, the research question addressed was: Can accurate monitoring of PD be achieved using EAs on rs-fMRI data for patients prescribed Modafinil (typically prescribed for PD patients to relieve physical fatigue)? Methods: This research develops novel clinical monitoring tools using data from a controlled experiment where participants were administered Modafinil versus placebo, examining the novel application of EAs to both map and predict the functional connectivity in participants using rs-fMRI data. Specifically, CGP was used to classify DCM analysis and timeseries data. Results were validated with two other commonly used classification methods (ANN and SVM) and via k-fold cross-validation. Results: Findings revealed a maximum accuracy of 74.57% for CGP. Furthermore, CGP provided comparable performance accuracy relative to ANN and SVM. Nevertheless, EAs enable us to decode the classifier, in terms of understanding the data inputs that are used, more easily than in ANN and SVM. Conclusions: These findings underscore the applicability of both DCM analyses for classification and CGP as a novel classification technique for brain imaging data with medical implications for medication monitoring.
Furthermore, classification of fMRI data for research typically involves statistical modelling techniques being often hypothesis driven, whereas EAs use data-driven explanatory modelling methods resulting in numerous benefits. DCM analysis is novel for classification and advantageous as it provides information on the causal links between different brain regions.

Submitted 6 November, 2019; originally announced November 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1910.05378

arXiv:1911.00526 [pdf]

Automated Assignment of Backbone Resonances Using Residual Dipolar Couplings Acquired from a Protein with Known Structure

Authors: P. Shealy, R. Mukhopadhyay, S. Smith, H. Valafar

Abstract: Resonance assignment is a critical first step in the investigation of protein structures using NMR spectroscopy. The development of assignment methods that require less experimental data is possible with prior knowledge of the macromolecular structure. Automated methods of performing the task of resonance assignment can significantly reduce the financial cost and time requirement for protein structure determination. Such methods can also be beneficial in validating a protein's solution state structure. Here we present a new approach to the assignment problem. Our approach uses only RDC data to assign backbone resonances. It provides simultaneous order tensor estimation and assignment. Our approach compares independent order tensor estimates to determine when the correct order tensor has been found. We demonstrate the algorithm's viability using simulated data from the protein domain 1A1Z.

Submitted 1 November, 2019; originally announced November 2019.

Comments: BioComp 2008, 7 pages

arXiv:1910.05378 [pdf]

Classification of Resting-State fMRI using Evolutionary Algorithms: Towards a Brain Imaging Biomarker for Parkinson's Disease

Authors: Amir Dehsarvi, Stephen L. Smith

Abstract: Accurate early diagnosis and monitoring of neurodegenerative conditions is essential for effective disease management and delivery of medication and treatment. This research develops automatic methods for detecting brain imaging preclinical biomarkers for Parkinson's disease (PD) by considering the novel application of evolutionary algorithms. A fundamental novel element of this work is the use of evolutionary algorithms to both map and predict the functional connectivity in patients using resting state functional MRI data taken from the PPMI to identify PD progression biomarkers. Specifically, Cartesian Genetic Programming was used to classify DCM data as well as time-series data. The findings were validated using two other commonly used classification methods (Artificial Neural Networks and Support Vector Machines) and by employing k-fold cross-validation. Across DCM and time-series analyses, findings revealed maximum accuracies of 75.21% for early stage (prodromal) PD patients versus healthy controls, 85.87% for PD patients versus prodromal PD patients, and 92.09% for PD patients versus healthy controls. Prodromal PD patients were classified from healthy controls with high accuracy - this is notable and represents the key finding of this research since current methods of diagnosing prodromal PD have both low reliability and low accuracy. Furthermore, Cartesian Genetic Programming provided comparable performance accuracy relative to ANN and SVM.
Evolutionary algorithms enable us to decode the classifier in terms of understanding the data inputs that are used, more easily than in ANN and SVM. Hence, these findings underscore the relevance of both DCM analyses for classification and CGP as a novel classification tool for brain imaging data with medical implications for disease diagnosis, particularly in early and asymptomatic stages.

Submitted 11 October, 2019; originally announced October 2019.

arXiv:1908.05214 [pdf, other]

Stability Analysis of a Bulk-Surface Reaction Model for Membrane-Protein Clustering

Authors: L. M. Stolerman, M. Getz, S. G. Llewellyn Smith, M. Holst, P. Rangamani

Abstract: Protein aggregation on the plasma membrane (PM) is of critical importance to many cellular processes such as cell adhesion, endocytosis, fibrillar conformation, and vesicle transport. Lateral diffusion of protein aggregates or clusters on the surface of the PM plays an important role in governing their heterogeneous surface distribution. However, the stability behavior of the surface distribution of protein aggregates remains poorly understood. Therefore, understanding the spatial patterns that can emerge on the PM solely through protein-protein interaction, lateral diffusion, and feedback is an important step towards a complete description of the mechanisms behind protein clustering on the cell surface. In this work, we investigate the pattern formation of a reaction-diffusion model that describes the dynamics of a system of ligand-receptor complexes. The purely diffusive ligand in the cytosol can bind receptors in the PM, and the resultant ligand-receptor complexes not only diffuse laterally but can also form clusters resulting in different oligomers. Finally, the largest oligomers recruit ligands from the cytosol in a positive feedback. From a methodological viewpoint, we provide theoretical estimates for diffusion-driven instabilities of the protein aggregates based on the Turing mechanism. Our main result is a threshold phenomenon, in which a sufficiently high recruitment of ligands promotes the input of new monomeric components and consequently drives the formation of a single-patch spatially heterogeneous steady-state.

Submitted 14 August, 2019; originally announced August 2019.

Comments: 30 pages, 13 figures

arXiv:1804.02835 [pdf, other]

A Community-Developed Open-Source Computational Ecosystem for Big Neuro Data

Authors: Randal Burns, Eric Perlman, Alex Baden, William Gray Roncal, Ben Falk, Vikram Chandrashekhar, Forrest Collman, Sharmishtaa Seshamani, Jesse Patsolic, Kunal Lillaney, Michael Kazhdan, Robert Hider Jr., Derek Pryor, Jordan Matelsky, Timothy Gion, Priya Manavalan, Brock Wester, Mark Chevillet, Eric T. Trautman, Khaled Khairy, Eric Bridgeford, Dean M. Kleissas, Daniel J. Tward, Ailey K. Crow, Matthew A. Wright , et al. (5 additional authors not shown)

Abstract: Big imaging data is becoming more prominent in brain sciences across spatiotemporal scales and phylogenies. We have developed a computational ecosystem that enables storage, visualization, and analysis of these data in the cloud, thus far spanning 20+ publications and 100+ terabytes including nanoscale ultrastructure, microscale synaptogenetic diversity, and mesoscale whole brain connectivity, making NeuroData the largest and most diverse open repository of brain data.

Submitted 9 April, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

arXiv:1803.07886 [pdf, other]

Beyond activator-inhibitor networks: the generalised Turing mechanism

Authors: Stephen Smith, Neil Dalchau

Abstract: The Turing patterning mechanism is believed to underlie the formation of repetitive structures in development, such as zebrafish stripes and mammalian digits, but it has proved difficult to isolate the specific biochemical species responsible for pattern formation. Meanwhile, synthetic biologists have designed Turing systems for implementation in cell colonies, but none have yet led to visible patterns in the laboratory. In both cases, the relationship between underlying chemistry and emergent biology remains mysterious. To help resolve the mystery, this article asks the question: what kinds of biochemical systems can generate Turing patterns? We find general conditions for Turing pattern inception -- the ability to generate unstable patterns from random noise -- which may lead to the ultimate formation of stable patterns, depending on biochemical non-linearities. We find that a wide variety of systems can generate stable Turing patterns, including several which are currently unknown, such as two-species systems composed of two self-activators, and systems composed of a short-range inhibitor and a long-range activator. We furthermore find that systems which are widely believed to generate stable patterns may in fact only generate unstable patterns, which ultimately converge to spatially-homogeneous concentrations. Our results suggest that a much wider variety of systems than is commonly believed could be responsible for observed patterns in development, or could be good candidates for synthetic patterning networks.

Submitted 21 March, 2018; originally announced March 2018.

arXiv:1712.01481 [pdf]

Estimated Incidence of Ophthalmic Conditions Associated with Optic Nerve Disease in Middle Tennessee

Authors: Shikha Chaganti, Katrina M. Nelson, Robert Harrigan, Kunal P. Nabar, Naresh Nandakumar, Tara Goecks, Seth A. Smith, Bennett A. Landman, Louise A. Mawn

Abstract: Aims. The objective of this paper is to determine the incidence of ophthalmic disease potentially leading to optic nerve disease in Middle Tennessee. Methods. We use a retrospective population-based incidence study design focusing on the population of middle Tennessee and its nearby suburbs (N=3 397 515). The electronic medical records for all patients evaluated or treated at a large tertiary care hospital and clinics with an initial diagnosis of a disease either affecting the optic nerve, or potentially associated with optic nerve disease, between 2007 and 2014 were retrieved and analyzed. Results. 18 291 patients (10 808 F) with 18 779 incidence events were identified with an age range of 0-101 years from the query of the Vanderbilt BioVU. Estimated age-adjusted incidence per 100 000 population per year was 198.4/145.1 (glaucoma F/M), 14.4/11.4 (intrinsic optic nerve disease F/M), 10.6/5.8 (optic nerve edema F/M), 6.1/6.6 (orbital inflammation F/M), and 23.7/6.7 (thyroid disease F/M). Glaucoma incidence was strongly correlated with age with the incidence sharply increasing after age 40. Optic nerve edema incidence peaked in the 25-34-year-old females. African American population has increased likelihood of glaucoma, orbital inflammation, and thyroid disease. Conclusions. Mapping the incidence of pathologies of the optic nerve is essential to the understanding of the relative likelihood of these conditions and impacts upon public health.
We find incidence of optic nerve diseases strongly varies by gender, age, and race which have not been previously studied using a unified framework or within a single metropolitan population.

Submitted 5 December, 2017; originally announced December 2017.

arXiv:1611.05479 [pdf, other]

doi 10.1371/journal.pcbi.1005493

Probabilistic Fluorescence-Based Synapse Detection

Authors: Anish K. Simhal, Cecilia Aguerrebere, Forrest Collman, Joshua T. Vogelstein, Kristina D. Micheva, Richard J. Weinberg, Stephen J. Smith, Guillermo Sapiro

Abstract: Brain function results from communication between neurons connected by complex synaptic networks. Synapses are themselves highly complex and diverse signaling machines, containing protein products of hundreds of different genes, some in hundreds of copies, arranged in a precise lattice at each individual synapse. Synapses are fundamental not only to synaptic network function but also to network development, adaptation, and memory. In addition, abnormalities of synapse numbers or molecular components are implicated in most mental and neurological disorders. Despite their obvious importance, mammalian synapse populations have so far resisted detailed quantitative study. In human brains and most animal nervous systems, synapses are very small and very densely packed: there are approximately 1 billion synapses per cubic millimeter of human cortex. This volumetric density poses very substantial challenges to proteometric analysis at the critical level of the individual synapse. The present work describes new probabilistic image analysis methods for single-synapse analysis of synapse populations in both animal and human brains.

Submitted 16 November, 2016; originally announced November 2016.

Comments: Currently awaiting peer review

arXiv:1601.03064 [pdf, other]

doi 10.1103/PhysRevE.93.052135

The breakdown of the reaction-diffusion master equation with non-elementary rates

Authors: Stephen Smith, Ramon Grima

Abstract: The chemical master equation (CME) is the exact mathematical formulation of chemical reactions occurring in a dilute and well-mixed volume. The reaction-diffusion master equation (RDME) is a stochastic description of reaction-diffusion processes on a spatial lattice, assuming well-mixing only on the length scale of the lattice. It is clear that, for the sake of consistency, the solution of the RDME of a chemical system should converge to the solution of the CME of the same system in the limit of fast diffusion: indeed, this has been tacitly assumed in most literature concerning the RDME. We show that, in the limit of fast diffusion, the RDME indeed converges to a master equation, but not necessarily the CME. We introduce a class of propensity functions, such that if the RDME has propensities exclusively of this class then the RDME converges to the CME of the same system; while if the RDME has propensities not in this class then convergence is not guaranteed. These are revealed to be elementary and non-elementary propensities respectively. We also show that independent of the type of propensity, the RDME converges to the CME in the simultaneous limit of fast diffusion and large volumes. We illustrate our results with some simple example systems, and argue that the RDME cannot be an accurate description of systems with non-elementary rates.

Submitted 11 January, 2016; originally announced January 2016.

Comments: 8 pages, 3 figures

arXiv:1510.03172 [pdf, ps, other]

doi 10.1063/1.4936394

Model reduction for stochastic chemical systems with abundant species

Authors: Stephen Smith, Claudia Cianci, Ramon Grima

Abstract: Biochemical processes typically involve many chemical species, some in abundance and some in low molecule numbers. Here we first identify the rate constant limits under which the concentrations of a given set of species will tend to infinity (the abundant species) while the concentrations of all other species remain constant (the non-abundant species). Subsequently we prove that in this limit, the fluctuations in the molecule numbers of non-abundant species are accurately described by a hybrid stochastic description consisting of a chemical master equation coupled to deterministic rate equations. This is a reduced description when compared to the conventional chemical master equation which describes the fluctuations in both abundant and non-abundant species. We show that the reduced master equation can be solved exactly for a number of biochemical networks involving gene expression and enzyme catalysis, whose conventional chemical master equation description is analytically impenetrable. We use the linear noise approximation to obtain approximate expressions for the difference between the variance of fluctuations in the non-abundant species as predicted by the hybrid approach and by the conventional chemical master equation. Furthermore we show that surprisingly, irrespective of any separation in the mean molecule numbers of various species, the conventional and hybrid master equations exactly agree for a class of chemical systems.

Submitted 12 October, 2015; originally announced October 2015.

Comments: 25 pages, 11 figures

arXiv:1405.7316 [pdf, other]

doi 10.1371/journal.pone.0108095

High-resolution transcriptome analysis with long-read RNA sequencing

Authors: Hyunghoon Cho, Joe Davis, Xin Li, Kevin S. Smith, Alexis Battle, Stephen B. Montgomery

Abstract: RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2x75 bp and 2x262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.

Submitted 28 May, 2014; originally announced May 2014.

Comments: 29 pages, 8 figures, 11 supplementary figures

arXiv:1306.3543 [pdf, other]

The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

Authors: Randal Burns, William Gray Roncal, Dean Kleissas, Kunal Lillaney, Priya Manavalan, Eric Perlman, Daniel R. Berger, Davi D. Bock, Kwanghun Chung, Logan Grosenick, Narayanan Kasthuri, Nicholas C. Weiler, Karl Deisseroth, Michael Kazhdan, Jeff Lichtman, R. Clay Reid, Stephen J. Smith, Alexander S. Szalay, Joshua T. Vogelstein, R. Jacob Vogelstein

Abstract: We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes---neural connectivity maps of the brain---using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at http://openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems---reads to parallel disk arrays and writes to solid-state storage---to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.

Submitted 18 June, 2013; v1 submitted 14 June, 2013; originally announced June 2013.

Comments: 11 pages, 13 figures

arXiv:0707.0662 [pdf]

doi 10.1529/biophysj.106.094243

Force unfolding kinetics of RNA using optical tweezers. II. Modeling experiments

Authors: M. Manosas, J. -D. Wen, P. T. X. Li, S. B. Smith, C. Bustamante, I. Tinoco, Jr., F. Ritort

Abstract: By exerting mechanical force it is possible to unfold/refold RNA molecules one at a time. In a small range of forces, an RNA molecule can hop between the folded and the unfolded state with force-dependent kinetic rates. Here, we introduce a mesoscopic model to analyze the hopping kinetics of RNA hairpins in an optical tweezers setup. The model includes different elements of the experimental setup (beads, handles and RNA sequence) and limitations of the instrument (time lag of the force-feedback mechanism and finite bandwidth of data acquisition). We investigated the influence of the instrument on the measured hopping rates. Results from the model are in good agreement with the experiments reported in the companion article (1). The comparison between theory and experiments allowed us to infer the values of the intrinsic molecular rates of the RNA hairpin alone and to search for the optimal experimental conditions to do the measurements. We conclude that long handles and soft laser traps represent the best conditions to extract rate estimates that are closest to the intrinsic molecular rates. The methodology and rationale presented here can be applied to other experimental setups and other molecules.

Submitted 4 July, 2007; originally announced July 2007.

Comments: PDF file, 32 pages including 9 figures plus supplementary material

Journal ref: Biophysical Journal, 92 (2007) 3010-3021

arXiv:0707.0580 [pdf]

doi 10.1529/biophysj.106.094052

Force unfolding kinetics of RNA using optical tweezers. I. Effects of experimental variables on measured results

Authors: J. -D. Wen, M. Manosas, P. T. X. Li, S. B. Smith, C. Bustamante, F. Ritort, I. Tinoco Jr

Abstract: Experimental variables of optical tweezers instrumentation that affect RNA folding/unfolding kinetics were investigated. A model RNA hairpin, P5ab, was attached to two micron-sized beads through hybrid RNA/DNA handles; one bead was trapped by dual-beam lasers and the other was held by a micropipette. Several experimental variables were changed while measuring the unfolding/refolding kinetics, including handle lengths, trap stiffness, and modes of force applied to the molecule. In constant-force mode where the tension applied to the RNA was maintained through feedback control, the measured rate coefficients varied within 40% when the handle lengths were changed by 10 fold (1.1 to 10.2 Kbp); they increased by two- to three-fold when the trap stiffness was lowered to one third (from 0.1 to 0.035 pN/nm). In the passive mode, without feedback control and where the force applied to the RNA varied in response to the end-to-end distance change of the tether, the RNA hopped between a high-force folded-state and a low-force unfolded-state. In this mode, the rates increased up to two-fold with longer handles or softer traps. Overall, the measured rates remained within the same order of magnitude over the wide range of conditions studied. In the companion paper (1), we analyze how the measured kinetics parameters differ from the intrinsic molecular rates of the RNA, and thus how to obtain the molecular rates.

Submitted 4 July, 2007; originally announced July 2007.

Comments: PDF file, 30 pages, 7 figures

Journal ref: Biophysical Journal, 92 (2007) 2996-3009

arXiv:cond-mat/0605737 [pdf, ps, other]

doi 10.1103/PhysRevLett.96.118301

Condensation transition in DNA-polyaminoamide dendrimer fibers studied using optical tweezers

Authors: F. Ritort, S. Mihardja, S. B. Smith, C. Bustamante

Abstract: When mixed together, DNA and polyaminoamide (PAMAM) dendrimers form fibers that condense into a compact structure. We use optical tweezers to pull condensed fibers and investigate the decondensation transition by measuring force-extension curves (FECs). A characteristic plateau force (around 10 pN) and hysteresis between the pulling and relaxation cycles are observed for different dendrimer sizes, indicating the existence of a first-order transition between two phases (condensed and extended) of the fiber. The fact that we can reproduce the same FECs in the absence of additional dendrimers in the buffer medium indicates that dendrimers remain irreversibly bound to the DNA backbone. Upon salt variation FECs change noticeably confirming that electrostatic forces drive the condensation transition. Finally, we propose a simple model for the decondensing transition that qualitatively reproduces the FECs and which is confirmed by AFM images.

Submitted 30 May, 2006; originally announced May 2006.

Comments: Latex version, 4 pages+3 color figures

Journal ref: Phys. Rev. Lett. 96, 118301 (2006)

arXiv:astro-ph/0308311 [pdf, ps, other]

doi 10.1016/j.icarus.2004.04.009

Transport of Ionizing Radiation in Terrestrial-like Exoplanet Atmospheres

Authors: David S. Smith, John Scalo, J. Craig Wheeler

Abstract: (Abridged) The propagation of ionizing radiation through model atmospheres of terrestrial-like exoplanets is studied for a large range of column densities and incident photon energies using a Monte Carlo code we have developed to treat Compton scattering and photoabsorption. Incident spectra from parent star flares, supernovae, and gamma-ray bursts are modeled and compared to energetic particles in importance. We find that terrestrial-like exoplanets with atmospheres thinner than about 100 g cm^-2 transmit and reprocess a significant fraction of incident gamma-rays, producing a characteristic, flat surficial spectrum. Thick atmospheres (>~ 100 g cm^-2) efficiently block even gamma-rays, but nearly all incident energy is redistributed into diffuse UV and visible aurora-like emission, increasing the effective atmospheric transmission by many orders of magnitude. Depending on the presence of molecular UV absorbers and atmospheric thickness, up to 10% of the incident energy can reach the surface as UV reemission. For the Earth, between 2 x 10^-3 and 4 x 10^-2 of the incident flux reaches the ground in the biologically effective 200--320 nm range, depending on O_2/O_3 shielding. Finally, we suggest that transient atmospheric ionization layers can be frequently created at low altitudes. We conclude that these events can produce frequent fluctuations in atmospheric ionization levels and surficial UV fluxes on terrestrial-like planets.

Submitted 2 June, 2004; v1 submitted 18 August, 2003; originally announced August 2003.

Comments: 59 pages, 15 figures; in press in Icarus; minor edits, no results changed

Journal ref: Smith, D.S., and Scalo, J.M. (2004) Icarus, 171, 229

arXiv:astro-ph/0307543 [pdf]

doi 10.1023/B:ORIG.0000043120.28077.c9

Importance of Biologically Active Aurora-like Ultraviolet Emission: Stochastic Irradiation of Earth and Mars by Flares and Explosions

Authors: David S. Smith, John Scalo, J. Craig Wheeler

Abstract: (Abridged) We show that sizeable fractions of incident ionizing radiation from stochastic astrophysical sources can be redistributed to biologically and chemically important UV wavelengths, a significant fraction of which can reach the surface. This redistribution is mediated by secondary electrons, resulting from Compton scattering and X-ray photoabsorption, with energies low enough to excite atmospheric molecules and atoms, resulting in a rich aurora-like spectrum. We calculate the fraction of energy redistributed into biologically and chemically important wavelength regions for spectra characteristic of stellar flares and supernovae using a Monte-Carlo transport code written for this problem and then estimate the fraction of this energy that is transmitted from the atmospheric altitudes of redistribution to the surface for a few illustrative cases. Redistributed fractions are found to be of order 1%, even in the presence of an ozone shield. This result implies that planetary organisms will be subject to mutationally significant, if intermittent, fluences of UV-B and harder radiation even in the presence of a narrow-band UV shield like ozone. We also calculate the surficial transmitted fraction of ionizing radiation and redistributed ultraviolet radiation for two illustrative evolving Mars atmospheres whose initial surface pressures were 1 bar.
Our results suggest that coding organisms on planets orbiting low-mass stars (and on the early Earth) may evolve very differently than on contemporary Earth, with diversity and evolutionary rate controlled by a stochastically varying mutation rate and frequent hypermutation episodes.

Submitted 30 July, 2003; originally announced July 2003.

Comments: 21 pages, 2 figures, accepted for publication in Origins of Life and Evolution of the Biosphere

Journal ref: Origins of life and evolution of the biosphere October 2004, Volume 34, Issue 5, pp 513-532

Showing 1–24 of 24 results for author: Smith, S