-
Machine learning that predicts well may not learn the correct physical descriptions of glassy systems
Authors:
Arabind Swain,
Sean Alexander Ridout,
Ilya Nemenman
Abstract:
The complexity of glasses makes it challenging to explain their dynamics. Machine Learning (ML) has emerged as a promising pathway for understanding glassy dynamics by linking their structural features to rearrangement dynamics. Support Vector Machine (SVM) was one of the first methods used to detect such correlations. Specifically, a certain output of SVMs trained to predict dynamics from structure, the distance from the separating hyperplane, was interpreted as being linearly related to the activation energy for the rearrangement. By numerical analysis of toy models, we explore under which conditions it is possible to infer the energy barrier to rearrangements from the distance to the separating hyperplane. We observe that such successful inference is possible only under very restricted conditions. Typical tests, such as the apparent Arrhenius dependence of the probability of rearrangement on the inferred energy and the temperature, or high cross-validation accuracy do not guarantee success. We propose practical approaches for measuring the quality of the energy inference and for modifying the inferred model to improve the inference, which should be usable in the context of realistic datasets.
Submitted 1 March, 2024;
originally announced March 2024.
-
Learning force laws in many-body systems
Authors:
Wentao Yu,
Eslam Abdelaleem,
Ilya Nemenman,
Justin C. Burton
Abstract:
Scientific laws describing natural systems may be more complex than our intuition can handle, and thus how we discover laws must change. Machine learning (ML) models can analyze large quantities of data, but their structure should match the underlying physical constraints to provide useful insight. Here we demonstrate an ML approach that incorporates such physical intuition to infer force laws in dusty plasma experiments. Trained on 3D particle trajectories, the model accounts for inherent symmetries and non-identical particles, accurately learns the effective non-reciprocal forces between particles, and extracts each particle's mass and charge. The model's accuracy (R^2 > 0.99) points to new physics in dusty plasma beyond the resolution of current theories and demonstrates how ML-powered approaches can guide new routes of scientific discovery in many-body systems.
Submitted 8 October, 2023;
originally announced October 2023.
-
Simultaneous Dimensionality Reduction: A Data Efficient Approach for Multimodal Representations Learning
Authors:
Eslam Abdelaleem,
Ahmed Roman,
K. Michael Martini,
Ilya Nemenman
Abstract:
We explore two primary classes of approaches to dimensionality reduction (DR): Independent Dimensionality Reduction (IDR) and Simultaneous Dimensionality Reduction (SDR). In IDR methods, of which Principal Components Analysis is a paradigmatic example, each modality is compressed independently, striving to retain as much variation within each modality as possible. In contrast, in SDR, one simultaneously compresses the modalities to maximize the covariation between the reduced descriptions while paying less attention to how much individual variation is preserved. Paradigmatic examples include Partial Least Squares and Canonical Correlations Analysis. Even though these DR methods are a staple of statistics, their relative accuracy and data set size requirements are poorly understood. We introduce a generative linear model to synthesize multimodal data with known variance and covariance structures to examine these questions. We assess the accuracy of the reconstruction of the covariance structures as a function of the number of samples, signal-to-noise ratio, and the number of varying and covarying signals in the data. Using numerical experiments, we demonstrate that linear SDR methods consistently outperform linear IDR methods and yield higher-quality, more succinct reduced-dimensional representations with smaller datasets. Remarkably, regularized CCA can identify low-dimensional weak covarying structures even when the number of samples is much smaller than the dimensionality of the data, which is a regime challenging for all dimensionality reduction methods. Our work corroborates and explains previous observations in the literature that SDR can be more effective in detecting covariation patterns in data. These findings suggest that SDR should be preferred to IDR in real-world data analysis when detecting covariation is more important than preserving variation.
Submitted 6 February, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses
Authors:
Eslam Abdelaleem,
Ilya Nemenman,
K. Michael Martini
Abstract:
Variational dimensionality reduction methods are known for their high accuracy, generative abilities, and robustness. We introduce a framework to unify many existing variational methods and design new ones. The framework is based on an interpretation of the multivariate information bottleneck, in which an encoder graph, specifying what information to compress, is traded off against a decoder graph, specifying a generative model. Using this framework, we rederive existing dimensionality reduction methods, including the deep variational information bottleneck and variational auto-encoders. The framework naturally introduces a trade-off parameter extending the deep variational CCA (DVCCA) family of algorithms to beta-DVCCA. We derive a new method, the deep variational symmetric informational bottleneck (DVSIB), which simultaneously compresses two variables to preserve information between their compressed representations. We implement these algorithms and evaluate their ability to produce shared low-dimensional latent spaces on the Noisy MNIST dataset. We show that algorithms that are better matched to the structure of the data (in our case, beta-DVCCA and DVSIB) produce better latent spaces as measured by classification accuracy, dimensionality of the latent variables, and sample efficiency. We believe that this framework can be used to unify other multi-view representation learning algorithms and to derive and implement novel problem-specific loss functions.
Submitted 6 February, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Extrinsic vs Intrinsic Criticality in Systems with Many Components
Authors:
Vudtiwat Ngampruetikorn,
Ilya Nemenman,
David J. Schwab
Abstract:
Biological systems with many components often exhibit seemingly critical behaviors, characterized by atypically large correlated fluctuations. Yet the underlying causes remain unclear. Here we define and examine two types of criticality. Intrinsic criticality arises from interactions within the system which are fine-tuned to a critical point. Extrinsic criticality, in contrast, emerges without fine tuning when observable degrees of freedom are coupled to unobserved fluctuating variables. We unify both types of criticality using the language of learning and information theory. We show that critical correlations, intrinsic or extrinsic, lead to diverging mutual information between two halves of the system, and are a feature of learning problems, in which the unobserved fluctuations are inferred from the observable degrees of freedom. We argue that extrinsic criticality is equivalent to standard inference, whereas intrinsic criticality describes fractional learning, in which the amount to be learned depends on the system size. We show further that both types of criticality are on the same continuum, connected by a smooth crossover. In addition, we investigate the observability of Zipf's law, a power-law rank-frequency distribution often used as an empirical signature of criticality. We find that Zipf's law is a robust feature of extrinsic criticality but can be nontrivial to observe for some intrinsically critical systems, including critical mean-field models. We further demonstrate that models with global dynamics, such as oscillatory models, can produce observable Zipf's law without relying on either external fluctuations or fine tuning. Our findings suggest that while possible in theory, fine tuning is not the only, nor the most likely, explanation for the apparent ubiquity of criticality in biological systems with many components.
Submitted 25 September, 2023;
originally announced September 2023.
-
Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck
Authors:
K. Michael Martini,
Ilya Nemenman
Abstract:
The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.
Submitted 2 February, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Bayesian estimation of the Kullback-Leibler divergence for categorical systems using mixtures of Dirichlet priors
Authors:
Francesco Camaglia,
Ilya Nemenman,
Thierry Mora,
Aleksandra M. Walczak
Abstract:
In many applications in biology, engineering and economics, identifying similarities and differences between distributions of data from complex processes requires comparing finite categorical samples of discrete counts. Statistical divergences quantify the difference between two distributions. However, their estimation is very difficult and empirical methods often fail, especially when the samples are small. We develop a Bayesian estimator of the Kullback-Leibler divergence between two probability distributions that makes use of a mixture of Dirichlet priors on the distributions being compared. We study the properties of the estimator on two examples: probabilities drawn from Dirichlet distributions, and random strings of letters drawn from Markov chains. We extend the approach to the squared Hellinger divergence. Both estimators outperform other estimation techniques, with better results for data with a large number of categories and for higher values of divergences.
Submitted 9 July, 2023;
originally announced July 2023.
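To illustrate why empirical estimation of the Kullback-Leibler divergence can fail for small samples, here is a minimal sketch of the naive plug-in estimator, which replaces probabilities with observed frequencies. It is not the Bayesian estimator of the paper; the function name `plugin_kl` and the toy samples are assumptions made for illustration only.

```python
import math
from collections import Counter


def plugin_kl(samples_p, samples_q):
    """Naive plug-in estimate of D_KL(p || q) in nats.

    Probabilities are replaced by observed frequencies. This is an
    illustrative baseline, not the Bayesian estimator from the paper.
    """
    counts_p = Counter(samples_p)
    counts_q = Counter(samples_q)
    n_p, n_q = len(samples_p), len(samples_q)
    total = 0.0
    for state, c_p in counts_p.items():
        c_q = counts_q.get(state, 0)
        if c_q == 0:
            # A state seen under p but never under q makes the
            # plug-in estimate diverge, however small its true weight.
            return math.inf
        f_p, f_q = c_p / n_p, c_q / n_q
        total += f_p * math.log(f_p / f_q)
    return total


# Hypothetical small samples over the categories "a", "b", "c".
print(plugin_kl(list("aaab"), list("abab")))  # finite estimate
print(plugin_kl(list("abc"), list("ab")))     # diverges: "c" unseen under q
```

With few samples, an unobserved category in the second sample is common, so the plug-in value is often infinite or strongly biased, which is the kind of failure that motivates estimators with explicit priors.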
-
Inferring Local Structure from Pairwise Correlations
Authors:
Mahajabin Rahman,
Ilya Nemenman
Abstract:
To construct models of large, multivariate complex systems, such as those in biology, one needs to constrain which variables are allowed to interact. This can be viewed as detecting "local" structures among the variables. In the context of a simple toy model of 2D natural and synthetic images, we show that pairwise correlations between the variables -- even when severely undersampled -- provide enough information to recover local relations, including the dimensionality of the data, and to reconstruct the arrangement of pixels in fully scrambled images. This proves to be successful even though higher-order interaction structures are present in our data. We build intuition behind the success, which we hope might contribute to modeling complex, multivariate systems and to explaining the success of modern attention-based machine learning approaches.
Submitted 17 October, 2023; v1 submitted 7 May, 2023;
originally announced May 2023.
-
Neural criticality from effective latent variables
Authors:
Mia C. Morrell,
Ilya Nemenman,
Audrey J. Sederberg
Abstract:
Observations of power laws in neural activity data have raised the intriguing notion that brains may operate in a critical state. One example of this critical state is "avalanche criticality," which has been observed in various systems, including cultured neurons, zebrafish, rodent cortex, and human EEG. More recently, power laws were also observed in neural populations in the mouse under an activity coarse-graining procedure, and they were explained as a consequence of the neural activity being coupled to multiple latent dynamical variables. An intriguing possibility is that avalanche criticality emerges due to a similar mechanism. Here, we determine the conditions under which latent dynamical variables give rise to avalanche criticality. We find that populations coupled to multiple latent variables produce critical behavior across a broader parameter range than those coupled to a single, quasi-static latent variable, but in both cases, avalanche criticality is observed without fine-tuning of model parameters. We identify two regimes of avalanches, both critical but differing in the amount of information carried about the latent variable. Our results suggest that avalanche criticality arises in neural systems in which activity is effectively modeled as a population driven by a few dynamical variables and these variables can be inferred from the population activity.
Submitted 13 October, 2023; v1 submitted 2 January, 2023;
originally announced January 2023.
-
Intrinsic Motivation in Dynamical Control Systems
Authors:
Stas Tiomkin,
Ilya Nemenman,
Daniel Polani,
Naftali Tishby
Abstract:
Biological systems often choose actions without an explicit reward signal, a phenomenon known as intrinsic motivation. The computational principles underlying this behavior remain poorly understood. In this study, we investigate an information-theoretic approach to intrinsic motivation, based on maximizing an agent's empowerment (the mutual information between its past actions and future states). We show that this approach generalizes previous attempts to formalize intrinsic motivation, and we provide a computationally efficient algorithm for computing the necessary quantities. We test our approach on several benchmark control problems, and we explain its success in guiding intrinsically motivated behaviors by relating our information-theoretic control function to fundamental properties of the dynamical system representing the combined agent-environment system. This opens the door for designing practical artificial, intrinsically motivated controllers and for linking animal behaviors to their dynamical properties.
Submitted 29 December, 2022;
originally announced January 2023.
-
Generative random latent features models and statistics of natural images
Authors:
Philipp Fleig,
Ilya Nemenman
Abstract:
Complex, multivariable systems are often analyzed by grouping their constituent units into components, sometimes referred to as latent features, which afford physical or biological interpretation. However, a priori many different types of latent features and data decompositions can be defined, and one typically uses a trial-and-error approach to determine a decomposition that is natural to the system and its data. It is highly desirable to develop a principled understanding of which decomposition is appropriate for a given data set. In this work, we take a step in this direction and argue that sample-sample correlations in the data carry important information to this effect. For this we construct a generative random latent feature matrix model of large data based on linear mixing of latent features. A key ingredient of our model is that we allow for statistical dependence between the mixing coefficients and argue that the model captures characteristic properties found in many types of natural data. Latent dimensionality and correlation patterns of the data are controlled by only two model parameters. The model's data patterns include (overlapping) clusters, sparse mixing, and constrained (non-negative) mixing. We describe the characteristic correlation and eigenvalue distributions of each pattern. Finally, we fit the model on correlation data from natural images and find a near perfect match with the sparse mixing regime of our model. This finding is in line with the well-known sparse coding structure in natural scene images and provides information about the appropriate data decomposition, namely a sparse coding scheme. We believe that our work will deliver similar insights for diverse data of biological systems.
Submitted 13 June, 2024; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Low probability states, data statistics, and entropy estimation
Authors:
Damián G. Hernández,
Ahmed Roman,
Ilya Nemenman
Abstract:
A fundamental problem in the analysis of complex systems is getting a reliable estimate of the entropy of their probability distributions over the state space. This is difficult because unsampled states can contribute substantially to the entropy, while they do not contribute to the Maximum Likelihood estimator of entropy, which replaces probabilities by the observed frequencies. Bayesian estimators overcome this obstacle by introducing a model of the low-probability tail of the probability distribution. Which statistical features of the observed data determine the model of the tail, and hence the output of such estimators, remains unclear. Here we show that well-known entropy estimators for probability distributions on discrete state spaces model the structure of the low-probability tail based largely on a few statistics of the data: the sample size, the Maximum Likelihood estimate, the number of coincidences among the samples, and the dispersion of the coincidences. We derive approximate analytical entropy estimators for undersampled distributions based on these statistics, and we use the results to propose an intuitive understanding of how the Bayesian entropy estimators work.
Submitted 3 July, 2022;
originally announced July 2022.
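Two of the statistics named in this abstract can be sketched in a few lines: the Maximum Likelihood (plug-in) entropy, which replaces probabilities by observed frequencies, and the number of coincidences. The sketch below assumes one common definition of coincidences (sample size minus the number of distinct observed states); the function names and the toy sample are hypothetical, and this is not the approximate estimator derived in the paper.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Maximum Likelihood (plug-in) entropy estimate in nats.

    Each probability is replaced by its observed frequency, so states
    that were never sampled contribute nothing to the sum.
    """
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def coincidences(samples):
    """Number of coincidences, assumed here to mean the sample size
    minus the number of distinct states observed."""
    return len(samples) - len(set(samples))


# Hypothetical sample of six draws from a discrete state space.
samples = ["a", "b", "a", "c", "a", "b"]
print(plugin_entropy(samples))
print(coincidences(samples))
```

Because `plugin_entropy` only sums over observed states, it cannot exceed the logarithm of the number of distinct states seen, which is why it tends to underestimate the entropy of undersampled distributions.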
-
Multi-dimensional structure of C. elegans thermal learning
Authors:
Ahmed Roman,
Konstantine Palanski,
Ilya Nemenman,
William S Ryu
Abstract:
Quantitative models of associative learning that explain the behavior of real animals with high precision have turned out to be very difficult to construct. We do this in the context of the dynamics of the thermal preference of C. elegans. For this, we quantify C. elegans thermotaxis in response to various conditioning parameters, genetic perturbations, and operant behavior using a fast, high-throughput microfluidic droplet assay. We then model this data comprehensively, within a new, biologically interpretable, multi-modal framework. We discover that the dynamics of thermal preference are described by two independent contributions and require a model with at least four dynamical variables. One pathway positively associates the experienced temperature independently of food, and the other negatively associates to the temperature when food is absent.
Submitted 1 June, 2022;
originally announced June 2022.
-
Ballistic deposition with memory: a new universality class of surface growth with a new scaling law
Authors:
Ahmed Roman,
Ruomin Zhu,
Ilya Nemenman
Abstract:
Motivated by recent experimental studies in microbiology, we suggest a modification of the classic ballistic deposition model of surface growth, where the memory of a deposition at a site induces more depositions at that site or its neighbors. By studying the statistics of surfaces in this model, we obtain three independent critical exponents: the growth exponent $β = 5/4$, the roughening exponent $α = 2$, and the new (size) exponent $γ = 1/2$. The model requires a modification to the Family-Vicsek scaling, resulting in the dynamical exponent $z = \frac{α+γ}{β} = 2$. This modified scaling collapses the surface width vs time curves for various lattice sizes. This is a previously unobserved universality class of surface growth that could describe surface properties of a wide range of natural systems.
Submitted 22 February, 2022;
originally announced February 2022.
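As a quick arithmetic check, the exponent values quoted in this abstract are mutually consistent with the stated dynamical exponent. Substituting the growth, roughening, and size exponents into the modified scaling relation gives:

$$
z = \frac{\alpha + \gamma}{\beta} = \frac{2 + \tfrac{1}{2}}{\tfrac{5}{4}} = \frac{5/2}{5/4} = 2 .
$$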
-
Statistical properties of large data sets with linear latent features
Authors:
Philipp Fleig,
Ilya Nemenman
Abstract:
Analytical understanding of how low-dimensional latent features reveal themselves in large-dimensional data is still lacking. We study this by defining a linear latent feature model with additive noise constructed from probabilistic matrices, and analytically and numerically computing the statistical distributions of pairwise correlations and eigenvalues of the correlation matrix. This allows us to resolve the latent feature structure across a wide range of data regimes set by the number of recorded variables, observations, latent features and the signal-to-noise ratio. We find a characteristic imprint of latent features in the distribution of correlations and eigenvalues and provide an analytic estimate for the boundary between signal and noise even in the absence of a clear spectral gap.
Submitted 8 November, 2021;
originally announced November 2021.
-
Inferring phenomenological models of first passage processes
Authors:
Catalina Rivera,
David Hofmann,
Ilya Nemenman
Abstract:
Biochemical processes in cells are governed by complex networks of many chemical species interacting stochastically in diverse ways and on different time scales. Constructing microscopically accurate models of such networks is often infeasible. Instead, here we propose a systematic framework for building phenomenological models of such networks from experimental data, focusing on accurately approximating the time it takes to complete the process, the First Passage (FP) time. Our phenomenological models are mixtures of Gamma distributions, which have a natural biophysical interpretation. The complexity of the models is adapted automatically to account for the amount of available data and its temporal resolution. The framework can be used for predicting the behavior of various FP systems under varying external conditions. To demonstrate the utility of the approach, we build models for the distribution of inter-spike intervals of a morphologically complex neuron, a Purkinje cell, from experimental and simulated data. We demonstrate that the developed models can not only fit the data but also make nontrivial predictions. We demonstrate that our coarse-grained models provide constraints on more mechanistically accurate models of the involved phenomena.
Submitted 11 August, 2020;
originally announced August 2020.
-
Latent dynamical variables produce signatures of spatiotemporal criticality in large biological systems
Authors:
Mia C. Morrell,
Audrey J. Sederberg,
Ilya Nemenman
Abstract:
Understanding the activity of large populations of neurons is difficult due to the combinatorial complexity of possible cell-cell interactions. To reduce the complexity, coarse-graining had been previously applied to experimental neural recordings, which showed over two decades of scaling in free energy, activity variance, eigenvalue spectra, and correlation time, hinting that the mouse hippocampus operates in a critical regime. We model the experiment by simulating conditionally independent binary neurons coupled to a small number of long-timescale stochastic fields and then replicating the coarse-graining procedure and analysis. This reproduces the experimentally-observed scalings, suggesting that they may arise from coupling the neural population activity to latent dynamic stimuli. Further, parameter sweeps for our model suggest that emergence of scaling requires most of the cells in a population to couple to the latent stimuli, predicting that even the celebrated place cells must also respond to non-place stimuli.
Submitted 10 August, 2020;
originally announced August 2020.
-
Unsupervised Bayesian Ising Approximation for revealing the neural dictionary in songbirds
Authors:
Damián G. Hernández,
Samuel J. Sober,
Ilya Nemenman
Abstract:
The problem of deciphering how low-level patterns (action potentials in the brain, amino acids in a protein, etc.) drive high-level biological features (sensorimotor behavior, enzymatic function) represents the central challenge of quantitative biology. The lack of general methods for doing so from the size of datasets that can be collected experimentally severely limits our understanding of the biological world. For example, in neuroscience, some sensory and motor codes have been shown to consist of precisely timed multi-spike patterns. However, the combinatorial complexity of such pattern codes has precluded the development of methods for their comprehensive analysis. Thus, just as it is hard to predict a protein's function based on its sequence, we still do not understand how to accurately predict an organism's behavior based on neural activity. Here we derive a method for solving this class of problems. We demonstrate its utility in an application to neural data, detecting precisely timed spike patterns that code for specific motor behaviors in a songbird vocal system. Our method detects such codewords with an arbitrary number of spikes, does so from small data sets, and accounts for dependencies in occurrences of codewords. Detecting such dictionaries of important spike patterns, rather than merely identifying the timescale on which such patterns exist, as in some prior approaches, opens the door for understanding fine motor control and the neural bases of sensorimotor learning in animals. For example, for the first time, we identify differences in encoding motor exploration versus typical behavior. Crucially, our method can be used not only for analysis of neural systems, but also for understanding the structure of correlations in other biological and nonbiological datasets.
Submitted 19 November, 2019;
originally announced November 2019.
-
Precise Spatial Memory in Local Random Networks
Authors:
Joseph L. Natale,
H. George E. Hentschel,
Ilya Nemenman
Abstract:
Self-sustained, elevated neuronal activity persisting on time scales of ten seconds or longer is thought to be vital for aspects of working memory, including brain representations of real space. Continuous-attractor neural networks, one of the most well-known modeling frameworks for persistent activity, have been able to model crucial aspects of such spatial memory. These models tend to require highly structured or regular synaptic architectures. In contrast, we elaborate a geometrically-embedded model with a local but otherwise random connectivity profile which, combined with a global regulation of the mean firing rate, produces localized, finely spaced discrete attractors that effectively span a 2D manifold. We demonstrate how the set of attracting states can reliably encode a representation of the spatial locations at which the system receives external input, thereby accomplishing spatial memory via attractor dynamics without synaptic fine-tuning or regular structure. We measure the network's storage capacity and find that the statistics of retrievable positions are also equivalent to a full tiling of the plane, something hitherto achievable only with (approximately) translationally invariant synapses, and which may be of interest in modeling such biological phenomena as visuospatial working memory in two dimensions.
Submitted 15 November, 2019;
originally announced November 2019.
-
Thermodynamic Computing
Authors:
Tom Conte,
Erik DeBenedictis,
Natesh Ganesh,
Todd Hylton,
John Paul Strachan,
R. Stanley Williams,
Alexander Alemi,
Lee Altenberg,
Gavin Crooks,
James Crutchfield,
Lidia del Rio,
Josh Deutsch,
Michael DeWeese,
Khari Douglas,
Massimiliano Esposito,
Michael Frank,
Robert Fry,
Peter Harsha,
Mark Hill,
Christopher Kello,
Jeff Krichmar,
Suhas Kumar,
Shih-Chii Liu,
Seth Lloyd,
Matteo Marsili
, et al. (14 additional authors not shown)
Abstract:
The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems; this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties (device scaling, software complexity, adaptability, energy consumption, and fabrication economics) indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable. We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.
Submitted 14 November, 2019; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Randomly connected networks generate emergent selectivity and predict decoding properties of large populations of neurons
Authors:
Audrey J. Sederberg,
Ilya Nemenman
Abstract:
Advances in neural recording methods enable sampling from populations of thousands of neurons during the performance of behavioral tasks, raising the question of how recorded activity relates to the theoretical models of computations underlying performance. In the context of decision making in rodents, patterns of functional connectivity between choice-selective cortical neurons, as well as broadly distributed choice information in both excitatory and inhibitory populations, were recently reported [1]. The straightforward interpretation of these data suggests a mechanism relying on specific patterns of anatomical connectivity to achieve selective pools of inhibitory as well as excitatory neurons. We investigate an alternative mechanism for the emergence of these experimental observations using a computational approach. We find that a randomly connected network of excitatory and inhibitory neurons generates single-cell selectivity, patterns of pairwise correlations, and indistinguishable excitatory and inhibitory readout weight distributions, as observed in recorded neural populations. Further, we make the readily verifiable experimental predictions that, for this type of evidence accumulation task, there are no anatomically defined sub-populations of neurons representing choice, and that choice preference of a particular neuron changes with the details of the task. This work suggests that distributed stimulus selectivity and patterns of functional organization in population codes could be emergent properties of randomly connected networks.
Submitted 22 September, 2019;
originally announced September 2019.
-
Physical limit to concentration sensing in a changing environment
Authors:
Thierry Mora,
Ilya Nemenman
Abstract:
Cells adapt to changing environments by sensing ligand concentrations using specific receptors. The accuracy of sensing is ultimately limited by the finite number of ligand molecules bound by receptors. Previously derived physical limits to sensing accuracy have assumed that the concentration was constant and ignored its temporal fluctuations. We formulate the problem of concentration sensing in a strongly fluctuating environment as a non-linear field-theoretic problem, for which we find an excellent approximate Gaussian solution. We derive a new physical bound on the relative error in concentration $c$ which scales as $\delta c/c \sim (Dac\tau)^{-1/4}$ with ligand diffusivity $D$, receptor cross-section $a$, and characteristic fluctuation time scale $\tau$, in stark contrast with the usual Berg and Purcell bound $\delta c/c \sim (DacT)^{-1/2}$ for a perfect receptor sensing concentration during time $T$. We show how the bound can be achieved by a simple biochemical network downstream of the receptor that adapts the kinetics of signaling as a function of the square root of the sensed concentration.
Submitted 12 August, 2019;
originally announced August 2019.
-
Universal properties of concentration sensing in large ligand-receptor networks
Authors:
Vijay Singh,
Ilya Nemenman
Abstract:
Cells estimate concentrations of chemical ligands in their environment using a limited set of receptors. Recent work has shown that the temporal sequence of binding and unbinding events on just a single receptor can be used to estimate the concentrations of multiple ligands. Here, for a network of many ligands and many receptors, we show that such temporal sequences can be used to estimate the concentration of a few times as many ligand species as there are receptors. Crucially, we show that the spectrum of the inverse covariance matrix of these estimates has several universal properties, which we trace to properties of Vandermonde matrices. We argue that this can be used by cells in realistic biochemical decoding networks.
Submitted 20 June, 2019;
originally announced June 2019.
-
Estimation of mutual information for real-valued data with error bars and controlled bias
Authors:
Caroline M. Holmes,
Ilya Nemenman
Abstract:
Estimation of mutual information between (multidimensional) real-valued variables is used in analysis of complex systems, biological systems, and recently also quantum systems. This estimation is a hard problem, and universally good estimators provably do not exist. Kraskov et al. (PRE, 2004) introduced a successful mutual information estimation approach based on the statistics of distances between neighboring data points, which empirically works for a wide class of underlying probability distributions. Here we improve this estimator by (i) expanding its range of applicability, and by providing (ii) a self-consistent way of verifying the absence of bias, (iii) a method for estimation of its variance, and (iv) a criterion for choosing the values of the free parameter of the estimator. We demonstrate the performance of our estimator on synthetic data sets, as well as on neurophysiological and systems biology data sets.
Submitted 21 March, 2019;
originally announced March 2019.
-
Receptor crosstalk improves concentration sensing of multiple ligands
Authors:
Martin Carballo-Pacheco,
Jonathan Desponds,
Tatyana Gavrilchenko,
Andreas Mayer,
Roshan Prizak,
Gautam Reddy,
Ilya Nemenman,
Thierry Mora
Abstract:
Cells need to reliably sense external ligand concentrations to achieve various biological functions such as chemotaxis or signaling. The molecular recognition of ligands by surface receptors is degenerate in many systems leading to crosstalk between different receptors. Crosstalk is often thought of as a deviation from optimal specific recognition, as the binding of non-cognate ligands can interfere with the detection of the receptor's cognate ligand, possibly leading to a false triggering of a downstream signaling pathway. Here we quantify the optimal precision of sensing the concentrations of multiple ligands by a collection of promiscuous receptors. We demonstrate that crosstalk can improve precision in concentration sensing and discrimination tasks. To achieve superior precision, the additional information about ligand concentrations contained in short binding events of the non-cognate ligand should be exploited. We present a proofreading scheme to realize an approximate estimation of multiple ligand concentrations that reaches a precision close to the derived optimal bounds. Our results help rationalize the observed ubiquity of receptor crosstalk in molecular sensing.
Submitted 10 October, 2018;
originally announced October 2018.
-
Automated, predictive, and interpretable inference of C. elegans escape dynamics
Authors:
Bryan C. Daniels,
William S. Ryu,
Ilya Nemenman
Abstract:
The roundworm C. elegans exhibits robust escape behavior in response to rapidly rising temperature. The behavior lasts for a few seconds, shows history dependence, involves both sensory and motor systems, and is too complicated to model mechanistically using currently available knowledge. Instead we model the process phenomenologically, and we use the Sir Isaac dynamical inference platform to infer the model in a fully automated fashion directly from experimental data. The inferred model requires incorporation of an unobserved dynamical variable, and is biologically interpretable. The model makes accurate predictions about the dynamics of the worm behavior, and it can be used to characterize the functional logic of the dynamical system underlying the escape response. This work illustrates the power of modern artificial intelligence to aid in discovery of accurate and interpretable models of complex natural systems.
Submitted 25 September, 2018;
originally announced September 2018.
-
Increased adaptability to rapid environmental change can more than make up for the two-fold cost of males
Authors:
Caroline M. Holmes,
Ilya Nemenman,
Daniel B. Weissman
Abstract:
The famous "two-fold cost of sex" is really the cost of anisogamy -- why should females mate with males who do not contribute resources to offspring, rather than isogamous partners who contribute equally? In typical anisogamous populations, a single very fit male can have an enormous number of offspring, far larger than is possible for any female or isogamous individual. If the sexual selection on males aligns with the natural selection on females, anisogamy thus allows much more rapid adaptation via super-successful males. We show via simulations that this effect can be sufficient to overcome the two-fold cost and maintain anisogamy against isogamy in populations adapting to environmental change. The key quantity is the variance in male fitness -- if this exceeds what is possible in an isogamous population, anisogamous populations can win out in direct competition by adapting faster.
Submitted 6 June, 2018;
originally announced June 2018.
-
Chance, long tails, and inference: a non-Gaussian, Bayesian theory of vocal learning in songbirds
Authors:
Baohua Zhou,
David Hofmann,
Itai Pinkoviezky,
Samuel J. Sober,
Ilya Nemenman
Abstract:
Traditional theories of sensorimotor learning posit that animals use sensory error signals to find the optimal motor command in the face of Gaussian sensory and motor noise. However, most such theories cannot explain common behavioral observations, for example that smaller sensory errors are more readily corrected than larger errors and that large abrupt (but not gradually introduced) errors lead to weak learning. Here we propose a new theory of sensorimotor learning that explains these observations. The theory posits that the animal learns an entire probability distribution of motor commands rather than trying to arrive at a single optimal command, and that learning arises via Bayesian inference when new sensory information becomes available. We test this theory using data from a songbird, the Bengalese finch, that is adapting the pitch (fundamental frequency) of its song following perturbations of auditory feedback using miniature headphones. We observe the distribution of the sung pitches to have long, non-Gaussian tails, which, within our theory, explains the observed dynamics of learning. Further, the theory makes surprising predictions about the dynamics of the shape of the pitch distribution, which we confirm experimentally.
Submitted 23 July, 2017;
originally announced July 2017.
-
Reverse-engineering biological networks from large data sets
Authors:
Joseph L. Natale,
David Hofmann,
Damian G. Hernández,
Ilya Nemenman
Abstract:
Much of contemporary systems biology owes its success to the abstraction of a network, the idea that diverse kinds of molecular, cellular, and organismal species and interactions can be modeled as relational nodes and edges in a graph of dependencies. Since the advent of high-throughput data acquisition technologies in fields such as genomics, metabolomics, and neuroscience, the automated inference and reconstruction of such interaction networks directly from large sets of activation data, commonly known as reverse-engineering, has become a routine procedure. Whereas early attempts at network reverse-engineering focused predominantly on producing maps of system architectures with minimal predictive modeling, reconstructions now play instrumental roles in answering questions about the statistics and dynamics of the underlying systems they represent. Many of these predictions have clinical relevance, suggesting novel paradigms for drug discovery and disease treatment. While other reviews focus predominantly on the details and effectiveness of individual network inference algorithms, here we examine the emerging field as a whole. We first summarize several key application areas in which inferred networks have made successful predictions. We then outline the two major classes of reverse-engineering methodologies, emphasizing that the type of prediction that one aims to make dictates the algorithms one should employ. We conclude by discussing whether recent breakthroughs justify the computational costs of large-scale reverse-engineering sufficiently to admit it as a mainstay in the quantitative analysis of living systems.
Submitted 24 May, 2017; v1 submitted 17 May, 2017;
originally announced May 2017.
-
Luria-Delbruck, revisited: The classic experiment does not rule out Lamarckian evolution
Authors:
Caroline Holmes,
Mahan Ghafari,
Abbas Anzar,
Varun Saravanan,
Ilya Nemenman
Abstract:
We re-examined data from the classic Luria-Delbruck fluctuation experiment, which is often credited with establishing a Darwinian basis for evolution. We argue that, for the Lamarckian model of evolution to be ruled out by the experiment, the experiment must favor pure Darwinian evolution over both the Lamarckian model and a model that allows both Darwinian and Lamarckian mechanisms. Analysis of the combined model was not performed in the original 1943 paper. The Luria-Delbruck paper also did not consider the possibility of neither model fitting the experiment. Using Bayesian model selection, we find that the Luria-Delbruck experiment, indeed, favors the Darwinian evolution over purely Lamarckian. However, our analysis does not rule out the combined model, and hence cannot rule out Lamarckian contributions to the evolutionary dynamics.
Submitted 19 January, 2017;
originally announced January 2017.
-
Single variant bottleneck in the early dynamics of H. influenzae bacteremia in neonatal rats questions the theory of independent action
Authors:
Xinxian Shao,
Bruce R. Levin,
Ilya Nemenman
Abstract:
There is an abundance of information about the genetic basis, physiological and molecular mechanisms of bacterial pathogenesis. In contrast, relatively little is known about population dynamic processes, by which bacteria colonize hosts and invade tissues and cells and thereby cause disease. In an article published in 1978, Moxon and Murphy presented evidence that, when inoculated intranasally with a mixture of streptomycin-sensitive and streptomycin-resistant (Sm$^S$ and Sm$^R$) and otherwise isogenic strains of Haemophilus influenzae type b (Hib), neonatal rats develop a bacteremic infection that often is dominated by only one strain, Sm$^S$ or Sm$^R$. After ruling out other possibilities through years of related experiments, the field seems to have settled on a plausible explanation for this phenomenon: the first bacterium to invade the host activates the host immune response that "shuts the door" on the second invading strain. To explore this hypothesis in a necessarily quantitative way, we modeled this process with a set of mixed stochastic and deterministic differential equations. Our analysis of the properties of this model with realistic parameters suggests that this hypothesis cannot explain the experimental results of Moxon and Murphy, and in particular the observed relationship between the frequency of different types of blood infections (bacteremias) and the inoculum size. We propose modifications to the model that come closer to explaining these data. However, the modified and better fitting model contradicts the common theory of independent action of individual bacteria in establishing infections. We discuss the implications of these results.
Submitted 15 September, 2016;
originally announced September 2016.
-
Motor control by precisely timed spike patterns
Authors:
Kyle H. Srivastava,
Caroline M. Holmes,
Michiel Vellema,
Andrea Pack,
Coen P. H. Elemans,
Ilya Nemenman,
Samuel J. Sober
Abstract:
A fundamental problem in neuroscience is to understand how sequences of action potentials ("spikes") encode information about sensory signals and motor outputs. Although traditional theories of neural coding assume that information is conveyed by the total number of spikes fired (spike rate), recent studies of sensory and motor activity have shown that far more information is carried by the millisecond-scale timing patterns of action potentials (spike timing). However, it is unknown whether or how subtle differences in spike timing drive differences in perception or behavior, leaving it unclear whether the information carried by spike timing actually plays a causal role in brain function. Here we demonstrate how a precise spike timing code is read out downstream by the muscles to control behavior. We provide both correlative and causal evidence to show that the nervous system uses millisecond-scale variations in the timing of spikes within multi-spike patterns to regulate a relatively simple behavior - respiration in the Bengalese finch, a songbird. These findings suggest that a fundamental assumption of current theories of motor coding requires revision, and that significant improvements in applications, such as neural prosthetic devices, can be achieved by using precise spike timing information.
Submitted 30 May, 2016; v1 submitted 29 May, 2016;
originally announced May 2016.
-
Growth of bacteria in 3-d colonies
Authors:
Xinxian Shao,
Andrew Mugler,
Justin Kim,
Ha Jun Jeong,
Bruce Levin,
Ilya Nemenman
Abstract:
The dynamics of growth of bacterial populations has been extensively studied for planktonic cells in well-agitated liquid culture, in which all cells have equal access to nutrients. In the real world, bacteria are more likely to live in physically structured habitats as colonies, within which individual cells vary in their access to nutrients. The dynamics of bacterial growth in such conditions is poorly understood, and, unlike that for liquid culture, there is no standard, broadly used mathematical model for bacterial populations growing in colonies in three dimensions (3-d). By extending the classic Monod model of resource-limited population growth to allow for spatial heterogeneity in the bacterial access to nutrients, we develop a 3-d model of colonies, in which bacteria consume diffusing nutrients in their vicinity. By following the changes in density of E. coli in liquid and embedded in glucose-limited soft agar, we evaluate the fit of this model to experimental data. The model accounts for the experimentally observed presence of a sub-exponential, diffusion-limited growth regime in colonies, which is absent in liquid cultures. The model predicts and our experiments confirm that, as a consequence of inter-colony competition for the diffusing nutrients and of cell death, there is a non-monotonic relationship between total number of colonies within the habitat and the total number of individual cells in all of these colonies. This combined theoretical-experimental study reveals that, within 3-d colonies, E. coli cells are loosely packed, and colonies produce about 2.5 times as many cells as the liquid culture from the same amount of nutrients. Our model provides a baseline description of bacterial growth in 3-d, deviations from which can be used to identify phenotypic heterogeneities and inter-cellular interactions that further contribute to the structure of bacterial communities.
Submitted 3 May, 2016;
originally announced May 2016.
-
Extrinsic and intrinsic correlations in molecular information transmission
Authors:
Vijay Singh,
Martin Tchernookov,
Ilya Nemenman
Abstract:
Cells measure concentrations of external ligands by capturing ligand molecules with cell surface receptors. The numbers of molecules captured by different receptors co-vary because they depend on the same extrinsic ligand fluctuations. However, these numbers also counter-vary due to the intrinsic stochasticity of chemical processes because a single molecule randomly captured by a receptor cannot be captured by another. Such structure of receptor correlations is generally believed to lead to an increase in information about the external signal compared to the case of independent receptors. We analyze a solvable model of two molecular receptors and show that, contrary to this widespread expectation, the correlations have a small and negative effect on the information about the ligand concentration. Further, we show that measurements that average over multiple receptors are almost as informative as those that track the states of every individual one.
Submitted 26 February, 2016;
originally announced February 2016.
-
Stereotypical escape behavior in Caenorhabditis elegans allows quantification of nociceptive stimuli levels
Authors:
Kawai Leung,
Aylia Mohammadi,
William S. Ryu,
Ilya Nemenman
Abstract:
Experiments on pain with human subjects are difficult, subjective, and ethically constrained. Since the molecular mechanisms of pain transduction are reasonably conserved among different species, these problems are partially solved by the use of animal models. However, animals cannot easily communicate to us their own pain levels. Thus progress depends crucially on our ability to quantitatively and objectively infer the perceived level of noxious stimuli from the behavior of animals. Here we develop a quantitative model to infer the perceived level of thermal nociception from the stereotyped nociceptive response of individual nematodes Caenorhabditis elegans stimulated by an IR laser. The model provides a method for quantification of analgesic effects of chemical stimuli or genetic mutations in C. elegans. We test the nociception of ibuprofen-treated worms and a TRPV (transient receptor potential) mutant, and we show that the perception of thermal nociception for the ibuprofen-treated worms is lower than that of the wild type. At the same time, our model shows that the mutant changes the worm's behavior beyond affecting nociception. Finally, we determine the stimulus level that best distinguishes the analgesic effects and the minimum number of worms that allow for a statistically significant identification of these effects.
Submitted 18 January, 2016;
originally announced January 2016.
-
Role of spatial averaging in multicellular gradient sensing
Authors:
Tyler Smith,
Sean Fancher,
Andre Levchenko,
Ilya Nemenman,
Andrew Mugler
Abstract:
Gradient sensing underlies important biological processes including morphogenesis, polarization, and cell migration. The precision of gradient sensing increases with the length of a detector (a cell or group of cells) in the gradient direction, since a longer detector spans a larger range of concentration values. Intuition from analyses of concentration sensing suggests that precision should also increase with detector length in the direction transverse to the gradient, since then spatial averaging should reduce the noise. However, here we show that, unlike for concentration sensing, the precision of gradient sensing decreases with transverse length for the simplest gradient sensing model, local excitation--global inhibition (LEGI). The reason is that gradient sensing ultimately relies on a subtraction of measured concentration values. While spatial averaging indeed reduces the noise in these measurements, which increases precision, it also reduces the covariance between the measurements, which results in the net decrease in precision. We demonstrate how a recently introduced gradient sensing mechanism, regional excitation--global inhibition (REGI), overcomes this effect and recovers the benefit of transverse averaging. Using a REGI-based model, we compute the optimal two- and three-dimensional detector shapes, and argue that they are consistent with the shapes of naturally occurring gradient-sensing cell populations.
Submitted 28 December, 2015;
originally announced December 2015.
-
Cell-cell communication enhances the capacity of cell ensembles to sense shallow gradients during morphogenesis
Authors:
David Ellison,
Andrew Mugler,
Matthew Brennan,
Sung Hoon Lee,
Robert Huebner,
Eliah Shamir,
Laura A. Woo,
Joseph Kim,
Patrick Amar,
Ilya Nemenman,
Andrew J. Ewald,
Andre Levchenko
Abstract:
Collective cell responses to exogenous cues depend on cell-cell interactions. In principle, these can result in enhanced sensitivity to weak and noisy stimuli. However, this has not yet been shown experimentally, and little is known about how multicellular signal processing modulates single cell sensitivity to extracellular signaling inputs, including those guiding complex changes in the tissue form and function. Here we explored whether cell-cell communication can enhance the ability of cell ensembles to sense and respond to weak gradients of chemotactic cues. Using a combination of experiments with mammary epithelial cells and mathematical modeling, we find that multicellular sensing enables detection of and response to shallow Epidermal Growth Factor (EGF) gradients that are undetectable by single cells. However, the advantage of this type of gradient sensing is limited by the noisiness of the signaling relay, necessary to integrate spatially distributed ligand concentration information. We calculate the fundamental sensory limits imposed by this communication noise and combine them with the experimental data to estimate the effective size of multicellular sensory groups involved in gradient sensing. Functional experiments strongly implicated intercellular communication through gap junctions and calcium release from intracellular stores as mediators of collective gradient sensing. The resulting integrative analysis provides a framework for understanding the advantages and limitations of sensory information processing by relays of chemically coupled cells.
Submitted 19 August, 2015;
originally announced August 2015.
-
Time to Quantify Falsifiability
Authors:
Ilya Nemenman
Abstract:
Here we argue that the notion of falsifiability, a key concept in defining a valid scientific theory, can be quantified using Bayesian Model Selection, which is a standard tool in modern statistics. This relates falsifiability to the quantitative version of the Occam's razor, and allows transforming some long-running arguments about validity of certain scientific theories from philosophical discussions to mathematical calculations. This is a Letter to the editor.
Submitted 2 June, 2015;
originally announced June 2015.
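As a toy illustration of how Bayesian Model Selection penalizes flexible, hard-to-falsify models, the sketch below compares two models of a coin-flip sequence. This is a standard textbook example and is not taken from the Letter; the function names and the choice of a uniform prior are assumptions made for illustration only.

```python
from math import comb, log


def log_evidence_fair(n, k):
    """Log probability of one specific sequence of n flips under p = 0.5.

    This rigid model has no free parameters, so k does not matter.
    """
    return n * log(0.5)


def log_evidence_flexible(n, k):
    """Log marginal likelihood of one specific sequence with k heads.

    The bias p is unknown with a uniform prior on [0, 1] (an assumption).
    The integral of p^k (1-p)^(n-k) over p equals 1 / ((n + 1) * C(n, k)).
    """
    return -log((n + 1) * comb(n, k))


def log_bayes_factor(n, k):
    """Positive values favor the rigid (fair) model, negative the flexible one."""
    return log_evidence_fair(n, k) - log_evidence_flexible(n, k)


if __name__ == "__main__":
    for heads in (50, 80):
        print(heads, round(log_bayes_factor(100, heads), 2))
```

When the data are compatible with the rigid model (50 heads in 100 flips), it receives higher evidence than the flexible one, because the flexible model spreads its prior probability over many outcomes. When the data contradict it (80 heads), the rigid model is strongly disfavored, which is the sense in which it is the more falsifiable of the two.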
-
Accurate sensing of multiple ligands with a single receptor
Authors:
Vijay Singh,
Ilya Nemenman
Abstract:
Cells use surface receptors to estimate the concentration of external ligands. Limits on the accuracy of such estimations have been well studied for pairs of ligand and receptor species. However, the environment typically contains many ligands, which can bind to the same receptors with different affinities, resulting in cross-talk. In traditional rate models, such cross-talk prevents accurate inference of individual ligand concentrations. In contrast, here we show that knowing the precise timing sequence of stochastic binding and unbinding events allows one receptor to provide information about multiple ligands simultaneously and with a high accuracy. We argue that such high-accuracy estimation of multiple concentrations can be realized by the familiar kinetic proofreading mechanism.
Submitted 31 May, 2015;
originally announced June 2015.
-
Limits to the precision of gradient sensing with spatial communication and temporal integration
Authors:
Andrew Mugler,
Andre Levchenko,
Ilya Nemenman
Abstract:
Gradient sensing requires at least two measurements at different points in space. These measurements must then be communicated to a common location to be compared, which is unavoidably noisy. While much is known about the limits of measurement precision by cells, the limits placed by the communication are not understood. Motivated by recent experiments, we derive the fundamental limits to the precision of gradient sensing in a multicellular system, accounting for communication and temporal integration. The gradient is estimated by comparing a "local" and a "global" molecular reporter of the external concentration, where the global reporter is exchanged between neighboring cells. Using the fluctuation-dissipation framework, we find, in contrast to the case when communication is ignored, that precision saturates with the number of cells independently of the measurement time duration, since communication establishes a maximum lengthscale over which sensory information can be reliably conveyed. Surprisingly, we also find that precision is improved if the local reporter is exchanged between cells as well, albeit more slowly than the global reporter. The reason is that while exchange of the local reporter weakens the comparison, it decreases the measurement noise. We term such a model "regional excitation--global inhibition" (REGI). Our results demonstrate that fundamental sensing limits are necessarily sharpened when the need to communicate information is taken into account.
Submitted 16 May, 2015;
originally announced May 2015.
-
On the sufficiency of pairwise interactions in maximum entropy models of biological networks
Authors:
Lina Merchan,
Ilya Nemenman
Abstract:
Biological information processing networks consist of many components, which are coupled by an even larger number of complex multivariate interactions. However, analyses of data sets from fields as diverse as neuroscience, molecular biology, and behavior have reported that observed statistics of states of some biological networks can be approximated well by maximum entropy models with only pairwise interactions among the components. Based on simulations of random Ising spin networks with $p$-spin ($p>2$) interactions, here we argue that this reduction in complexity can be thought of as a natural property of densely interacting networks in certain regimes, and not necessarily as a special property of living systems. By connecting our analysis to the theory of random constraint satisfaction problems, we suggest a reason for why some biological systems may operate in this regime.
Submitted 11 May, 2015;
originally announced May 2015.
-
Of fishes and birthdays: Efficient estimation of polymer configurational entropies
Authors:
Ilya Nemenman,
Michael E. Wall,
Charlie E. Strauss
Abstract:
We present an algorithm to estimate the configurational entropy $S$ of a polymer. The algorithm uses the statistics of coincidences among random samples of configurations and is related to the catch-tag-release method for estimation of population sizes, and to the classic "birthday paradox". Bias in the entropy estimation is decreased by grouping configurations in nearly equiprobable partitions based on their energies, and estimating entropies separately within each partition. Whereas most entropy estimation algorithms require $N\sim 2^{S}$ samples to achieve small bias, our approach typically needs only $N\sim \sqrt{2^{S}}$. Thus the algorithm can be applied to estimate protein free energies with increased accuracy and decreased computational cost.
Submitted 9 February, 2015;
originally announced February 2015.
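The coincidence-counting idea in the abstract above can be illustrated with a minimal sketch. It assumes all configurations are equiprobable, which is the simplest case; the energy-based partitioning described in the abstract is not implemented here. The function name `coincidence_entropy_bits` and the sample sizes are illustrative choices, not taken from the paper.

```python
import math
import random
from collections import Counter


def coincidence_entropy_bits(samples):
    """Estimate entropy (in bits) from pairwise coincidences among samples.

    Assumes a uniform distribution over K states: the expected number of
    coinciding pairs among N samples is N(N-1)/2 divided by K, so K is
    estimated as (number of pairs) / (number of coincidences).
    """
    n = len(samples)
    counts = Counter(samples)
    # A state seen c times contributes c*(c-1)/2 coinciding pairs.
    coincidences = sum(c * (c - 1) // 2 for c in counts.values())
    if coincidences == 0:
        raise ValueError("no coincidences observed; draw more samples")
    pairs = n * (n - 1) / 2
    return math.log2(pairs / coincidences)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    true_bits = 16
    # N = 4096 is a small multiple of sqrt(2**16) = 256, far below 2**16.
    samples = [rng.randrange(2 ** true_bits) for _ in range(4096)]
    print(coincidence_entropy_bits(samples))
```

With these settings the estimate should land close to 16 bits even though only a small fraction of the states is ever sampled, which is the birthday-paradox scaling $N\sim \sqrt{2^{S}}$ that the abstract refers to.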
-
Efficient inference of parsimonious phenomenological models of cellular dynamics using S-systems and alternating regression
Authors:
Bryan C. Daniels,
Ilya Nemenman
Abstract:
The nonlinearity of dynamics in systems biology makes it hard to infer them from experimental data. Simple linear models are computationally efficient, but cannot incorporate these important nonlinearities. An adaptive method based on the S-system formalism, which is a sensible representation of nonlinear mass-action kinetics typically found in cellular dynamics, maintains the efficiency of linear regression. We combine this approach with adaptive model selection to obtain efficient and parsimonious representations of cellular dynamics. The approach is tested by inferring the dynamics of yeast glycolysis from simulated data. With little computing time, it produces dynamical models with high predictive power and with structural complexity adapted to the difficulty of the inference problem.
Submitted 16 June, 2014;
originally announced June 2014.
-
Automated adaptive inference of coarse-grained dynamical models in systems biology
Authors:
Bryan C. Daniels,
Ilya Nemenman
Abstract:
Cellular regulatory dynamics is driven by large and intricate networks of interactions at the molecular scale, whose sheer size obfuscates understanding. In light of limited experimental data, many parameters of such dynamics are unknown, and thus models built on the detailed, mechanistic viewpoint overfit and are not predictive. At the other extreme, simple ad hoc models of complex processes often miss defining features of the underlying systems. Here we propose an approach that instead constructs phenomenological, coarse-grained models of network dynamics that automatically adapt their complexity to the amount of available data. Such adaptive models lead to accurate predictions even when microscopic details of the studied systems are unknown due to insufficient data. The approach is computationally tractable, even for a relatively large number of dynamical variables, allowing its software realization, named Sir Isaac, to make successful predictions even when important dynamic variables are unobserved. For example, it matches the known phase space structure for simulated planetary motion data, avoids overfitting in a complex biological signaling system, and produces accurate predictions for a yeast glycolysis model with only tens of data points and over half of the interacting species unobserved.
Submitted 24 April, 2014;
originally announced April 2014.
-
Millisecond-scale motor encoding in a cortical vocal area
Authors:
Claire Tang,
Diala Chehayeb,
Kyle Srivastava,
Ilya Nemenman,
Samuel Sober
Abstract:
Studies of motor control have almost universally examined firing rates to investigate how the brain shapes behavior. In principle, however, neurons could encode information through the precise temporal patterning of their spike trains as well as (or instead of) through their firing rates. Although the importance of spike timing has been demonstrated in sensory systems, it is largely unknown whether timing differences in motor areas could affect behavior. We tested the hypothesis that significant information about trial-by-trial variations in behavior is represented by spike timing in the songbird vocal motor system. We found that premotor neurons convey information via spike timing far more often than via spike rate and that the amount of information conveyed at the millisecond timescale greatly exceeds the information available from spike counts. These results demonstrate that information can be represented by spike timing in motor circuits and suggest that timing variations evoke differences in behavior.
Submitted 14 April, 2014; v1 submitted 2 April, 2014;
originally announced April 2014.
-
Director Field Model of the Primary Visual Cortex for Contour Detection
Authors:
Vijay Singh,
Martin Tchernookov,
Rebecca Butterfield,
Ilya Nemenman
Abstract:
We aim to build the simplest possible model capable of detecting long, noisy contours in a cluttered visual scene. For this, we model the neural dynamics in the primate primary visual cortex in terms of a continuous director field that describes the average rate and the average orientational preference of active neurons at a particular point in the cortex. We then use a linear-nonlinear dynamical model with long range connectivity patterns to enforce long-range statistical context present in the analyzed images. The resulting model has substantially fewer degrees of freedom than traditional models, and yet it can distinguish large contiguous objects from the background clutter by suppressing the clutter and by filling-in occluded elements of object contours. This results in high-precision, high-recall detection of large objects in cluttered scenes. Parenthetically, our model has a direct correspondence with the Landau - de Gennes theory of nematic liquid crystal in two dimensions.
Submitted 18 October, 2014; v1 submitted 4 October, 2013;
originally announced October 2013.
-
Zipf's law and criticality in multivariate data without fine-tuning
Authors:
David J. Schwab,
Ilya Nemenman,
Pankaj Mehta
Abstract:
The joint probability distribution of many degrees of freedom in biological systems, such as firing patterns in neural networks or antibody sequence composition in zebrafish, often follows Zipf's law, where a power law is observed on a rank-frequency plot. This behavior has recently been shown to imply that these systems reside near a unique critical point where the extensive parts of the entropy and energy are exactly equal. Here we show analytically, and via numerical simulations, that Zipf-like probability distributions arise naturally if there is an unobserved variable (or variables) that affects the system, e.g., for neural networks, an input stimulus that causes individual neurons in the network to fire at time-varying rates. In statistics and machine learning, these models are called latent-variable or mixture models. Our model shows that no fine-tuning is required, i.e., Zipf's law arises generically without tuning parameters to a point, and gives insight into the ubiquity of Zipf's law in a wide range of systems.
Submitted 18 June, 2014; v1 submitted 1 October, 2013;
originally announced October 2013.
-
Predictive information in a nonequilibrium critical model
Authors:
Martin Tchernookov,
Ilya Nemenman
Abstract:
We propose predictive information, that is, information between a long past of duration T and the entire infinitely long future of a time series, as a universal order parameter to study phase transitions in physical systems. It can be used, in particular, to study nonequilibrium transitions and other exotic transitions, where a simpler order parameter cannot be identified using traditional symmetry arguments. As an example, we calculate predictive information for a stochastic nonequilibrium dynamics problem that forms an absorbing state under a continuous change of a parameter. The information at the transition point diverges as log(T), and a smooth crossover to constant away from the transition is observed.
Submitted 17 December, 2012;
originally announced December 2012.
-
Large number of receptors may reduce cellular response time variation
Authors:
Xiang Cheng,
Lina Merchan,
Martin Tchernookov,
Ilya Nemenman
Abstract:
Cells often have tens of thousands of receptors, even though only a few activated receptors can trigger full cellular responses. Reasons for the overabundance of receptors remain unclear. We suggest that, in certain conditions, the large number of receptors results in a competition among receptors to be the first to activate the cell. The competition decreases the variability of the time to cellular activation, and hence results in a more synchronous activation of cells. We argue that, in simple models, this variability reduction does not necessarily interfere with the receptor specificity to ligands achieved by the kinetic proofreading mechanism. Thus cells can be activated accurately in time and specifically to certain signals. We predict the minimum number of receptors needed to reduce the coefficient of variation for the time to activation following binding of a specific ligand. Further, we predict the maximum number of receptors so that the kinetic proofreading mechanism still can improve the specificity of the activation. These predictions fall in line with experimentally reported receptor numbers for multiple systems.
Submitted 11 December, 2012; v1 submitted 5 December, 2012;
originally announced December 2012.
-
Population-expression models of immune response
Authors:
Sean P Stromberg,
Rustom Antia,
Ilya Nemenman
Abstract:
The immune response to a pathogen has two basic features. The first is the expansion of a few pathogen-specific cells to form a population large enough to control the pathogen. The second is the process of differentiation of cells from an initial naive phenotype to an effector phenotype which controls the pathogen, and subsequently to a memory phenotype that is maintained and responsible for long-term protection. The expansion and the differentiation have been considered largely independently. Changes in cell populations are typically described using ecologically based ordinary differential equation models. In contrast, differentiation of single cells is studied within systems biology and is frequently modeled by considering changes in gene and protein expression in individual cells. Recent advances in experimental systems biology make available for the first time data to allow the coupling of population and high dimensional expression data of immune cells during infections. Here we describe and develop population-expression models which integrate these two processes into systems biology on the multicellular level. When translated into mathematical equations, these models result in non-conservative, non-local advection-diffusion equations. We describe situations where the population-expression approach can make correct inference from data while previous modeling approaches based on common simplifying assumptions would fail. We also explore how model reduction techniques can be used to build population-expression models, minimizing the complexity of the model while keeping the essential features of the system. While we consider problems in immunology in this paper, we expect population-expression models to be more broadly applicable.
Submitted 8 December, 2012; v1 submitted 17 September, 2012;
originally announced September 2012.