-
$Γ$-VAE: Curvature regularized variational autoencoders for uncovering emergent low dimensional geometric structure in high dimensional data
Authors:
Jason Z. Kim,
Nicolas Perrin-Gilbert,
Erkan Narmanli,
Paul Klein,
Christopher R. Myers,
Itai Cohen,
Joshua J. Waterfall,
James P. Sethna
Abstract:
Natural systems with emergent behaviors often organize along low-dimensional subsets of high-dimensional spaces. For example, despite the tens of thousands of genes in the human genome, the principled study of genomics is fruitful because biological processes rely on coordinated organization that results in lower dimensional phenotypes. To uncover this organization, many nonlinear dimensionality reduction techniques have successfully embedded high-dimensional data into low-dimensional spaces by preserving local similarities between data points. However, the nonlinearities in these methods allow for too much curvature to preserve general trends across multiple non-neighboring data clusters, thereby limiting their interpretability and generalizability to out-of-distribution data. Here, we address both of these limitations by regularizing the curvature of manifolds generated by variational autoencoders, a process we coin ``$Γ$-VAE''. We demonstrate its utility using two example data sets: bulk RNA-seq from The Cancer Genome Atlas (TCGA) and the Genotype Tissue Expression (GTEx); and single cell RNA-seq from a lineage tracing experiment in hematopoietic stem cell differentiation. We find that the resulting regularized manifolds identify mesoscale structure associated with different cancer cell types, and accurately re-embed tissues from completely unseen, out-of-distribution cancers as if they were originally trained on them. Finally, we show that preserving long-range relationships to differentiated cells separates undifferentiated cells -- which have not yet specialized -- according to their eventual fate. Broadly, we anticipate that regularizing the curvature of generative models will enable more consistent, predictive, and generalizable models in any high-dimensional system with emergent low-dimensional behavior.
Submitted 1 March, 2024;
originally announced March 2024.
-
A scaling theory of armed conflict avalanches
Authors:
Edward D. Lee,
Bryan C. Daniels,
Christopher R. Myers,
David C. Krakauer,
Jessica C. Flack
Abstract:
Armed conflict data display scaling and universal dynamics in both social and physical properties like fatalities and geographic extent. We propose a randomly branching, armed-conflict model that relates multiple properties to one another in a way consistent with data. The model incorporates a fractal lattice on which conflict spreads, uniform dynamics driving conflict growth, and regional virulence that modulates local conflict intensity. The quantitative constraints on scaling and universal dynamics we use to develop our minimal model serve more generally as a set of constraints for other models for armed conflict dynamics. We show how this approach akin to thermodynamics imparts mechanistic intuition and unifies multiple conflict properties, giving insight into causation, prediction, and intervention timing.
Submitted 29 April, 2020;
originally announced April 2020.
-
Emergent regularities and scaling in armed conflict data
Authors:
Edward D. Lee,
Bryan C. Daniels,
Christopher R. Myers,
David C. Krakauer,
Jessica C. Flack
Abstract:
Armed conflict exhibits regularities beyond known power law distributions of fatalities and duration over varying culture and geography. We systematically cluster conflict reports from a database of $10^5$ events from Africa spanning 20 years into conflict avalanches. Conflict profiles collapse over a range of scales. Duration, diameter, extent, fatalities, and report totals satisfy mutually consistent scaling relations captured with a model combining geographic spread and local conflict-site growth. The emergence of such social scaling laws hints at principles guiding conflict evolution.
Submitted 29 April, 2020; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Overshoot during phenotypic switching of cancer cell populations
Authors:
Alessandro L. Sellerio,
Emilio Ciusani,
Noa Bossel Ben-Moshe,
Stefania Coco,
Andrea Piccinini,
Christopher R. Myers,
James P. Sethna,
Costanza Giampietro,
Stefano Zapperi,
Caterina A. M. La Porta
Abstract:
The dynamics of tumor cell populations is hotly debated: do populations derive hierarchically from a subpopulation of cancer stem cells (CSCs), or are stochastic transitions that mutate differentiated cancer cells to CSCs important? Here we argue that regulation must also be important. We sort human melanoma cells using three distinct cancer stem cell (CSC) markers - CXCR6, CD271 and ABCG2 - and observe that the fraction of non-CSC-marked cells first overshoots to a higher level and then returns to the level of unsorted cells. This clearly indicates that the CSC population is homeostatically regulated. Combining experimental measurements with theoretical modeling and numerical simulations, we show that the population dynamics of cancer cells is associated with a complex miRNA network regulating the Wnt and PI3K pathways. Hence phenotypic switching is not stochastic, but is tightly regulated by the balance between positive and negative cells in the population. Reducing the fraction of CSCs below a threshold triggers massive phenotypic switching, suggesting that a therapeutic strategy based on CSC eradication is unlikely to succeed.
Submitted 27 October, 2015;
originally announced October 2015.
-
You Can Run, You Can Hide: The Epidemiology and Statistical Mechanics of Zombies
Authors:
Alexander A. Alemi,
Matthew Bierbaum,
Christopher R. Myers,
James P. Sethna
Abstract:
We use a popular fictional disease, zombies, in order to introduce techniques used in modern epidemiology modelling, and ideas and techniques used in the numerical study of critical phenomena. We consider variants of zombie models, from fully connected continuous time dynamics to a full scale exact stochastic dynamic simulation of a zombie outbreak on the continental United States. Along the way, we offer a closed form analytical expression for the fully connected differential equation, and demonstrate that the single person per site two dimensional square lattice version of zombies lies in the percolation universality class. We end with a quantitative study of the full scale US outbreak, including the average susceptibility of different geographical regions.
Submitted 4 June, 2015; v1 submitted 3 March, 2015;
originally announced March 2015.
-
Driven synchronization in random networks of oscillators
Authors:
Jason Hindes,
Christopher R. Myers
Abstract:
Synchronization is a universal phenomenon found in many non-equilibrium systems. Much recent interest in this area has overlapped with the study of complex networks, where a major focus is determining how a system's connectivity patterns affect the types of behavior that it can produce. Thus far, modeling efforts have focused on the tendency of networks of oscillators to mutually synchronize themselves, with less emphasis on the effects of external driving. In this work we discuss the interplay between mutual and driven synchronization in networks of phase oscillators of the Kuramoto type, and explore how the structure and emergence of such states depends on the underlying network topology for simple random networks with a given degree distribution. We find a variety of interesting dynamical behaviors, including bifurcations and bistability patterns that are qualitatively different for heterogeneous and homogeneous networks, and which are separated by a Takens-Bogdanov-Cusp singularity in the parameter region where the coupling strength between oscillators is weak. Our analysis is connected to the underlying dynamics of oscillator clusters for important states and transitions.
Submitted 15 July, 2015; v1 submitted 28 February, 2015;
originally announced March 2015.
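The interplay between mutual and driven synchronization described in the abstract above can be illustrated with a deliberately simplified sketch. This is not the authors' model: it assumes all-to-all coupling instead of a random network with a given degree distribution, Gaussian natural frequencies, a driving term of the form F sin(Ωt − θ), and forward-Euler integration. The function name and all parameter values are hypothetical.

```python
import numpy as np

def kuramoto_order(K, F=0.0, drive_freq=0.0, n=200, dt=0.05, steps=4000, seed=0):
    """Time-averaged order parameter r of a driven, all-to-all Kuramoto model.

    Assumed dynamics (illustrative only):
        dtheta_i/dt = omega_i + K * r * sin(psi - theta_i)
                      + F * sin(drive_freq * t - theta_i)
    where r * exp(i * psi) is the mean of exp(i * theta_j).
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal(n)              # natural frequencies
    theta = rng.uniform(0.0, 2.0 * np.pi, n)    # random initial phases
    r_hist = []
    for s in range(steps):
        z = np.exp(1j * theta).mean()           # complex order parameter
        r, psi = np.abs(z), np.angle(z)
        mutual = K * r * np.sin(psi - theta)    # mean-field form of the coupling
        driven = F * np.sin(drive_freq * s * dt - theta)
        theta = theta + dt * (omega + mutual + driven)   # forward Euler step
        if s >= steps // 2:                     # discard the transient
            r_hist.append(r)
    return float(np.mean(r_hist))

r_free = kuramoto_order(K=0.0)            # no coupling, no drive
r_mutual = kuramoto_order(K=4.0)          # mutual synchronization only
r_driven = kuramoto_order(K=0.0, F=3.0)   # external drive only
print(r_free, r_mutual, r_driven)
```

In this sketch, either strong mutual coupling or a strong external drive produces a large order parameter, while the uncoupled, undriven population stays incoherent. The network-dependent bifurcation structure discussed in the abstract is outside its scope.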
-
An integrated quantum photonic sensor based on Hong-Ou-Mandel interference
Authors:
Sahar Basiri-Esfahani,
Casey R. Myers,
Ardalan Armin,
Joshua Combes,
Gerard J. Milburn
Abstract:
Photonic-crystal-based integrated optical systems have been used for a broad range of sensing applications with great success. This has been motivated by several advantages such as high sensitivity, miniaturization, remote sensing, selectivity and stability. Many photonic crystal sensors have been proposed with various fabrication designs that result in improved optical properties. In parallel, integrated optical systems are being pursued as a platform for photonic quantum information processing using linear optics and Fock states. Here we propose a novel integrated Fock state optical sensor architecture that can be used for force, refractive index and possibly local temperature detection. In this scheme, two coupled cavities behave as an "effective beam splitter". The sensor works based on fourth order interference (the Hong-Ou-Mandel effect) and requires a sequence of single photon pulses and consequently has low pulse power. Changes in the parameter to be measured induce variations in the effective beam splitter reflectivity and result in changes to the visibility of interference. We demonstrate this generic scheme in coupled L3 photonic crystal cavities as an example and find that this system, which only relies on photon coincidence detection and does not need any spectral resolution, can estimate forces as small as $10^{-7}$ Newtons and can measure one part per million change in refractive index using a very low input power of $10^{-10}$W. Thus linear optical quantum photonic architectures can achieve comparable sensor performance to semiclassical devices.
Submitted 10 June, 2015; v1 submitted 12 February, 2015;
originally announced February 2015.
-
Sloppiness and Emergent Theories in Physics, Biology, and Beyond
Authors:
Mark K. Transtrum,
Benjamin Machta,
Kevin Brown,
Bryan C. Daniels,
Christopher R. Myers,
James P. Sethna
Abstract:
Large scale models of physical phenomena demand the development of new statistical and computational tools in order to be effective. Many such models are `sloppy', i.e., exhibit behavior controlled by a relatively small number of parameter combinations. We review an information theoretic framework for analyzing sloppy models. This formalism is based on the Fisher Information Matrix, which we interpret as a Riemannian metric on a parameterized space of models. Distance in this space is a measure of how distinguishable two models are based on their predictions. Sloppy model manifolds are bounded with a hierarchy of widths and extrinsic curvatures. We show how the manifold boundary approximation can extract the simple, hidden theory from complicated sloppy models. We attribute the success of simple effective models in physics as likewise emerging from complicated processes exhibiting a low effective dimensionality. We discuss the ramifications and consequences of sloppy models for biochemistry and science more generally. We suggest that the reason our complex world is understandable is due to the same fundamental reason: simple theories of macroscopic behavior are hidden inside complicated microscopic processes.
Submitted 30 January, 2015;
originally announced January 2015.
-
Outbreak statistics and scaling laws for externally driven epidemics
Authors:
Sarabjeet Singh,
Christopher R. Myers
Abstract:
Power-law scalings are ubiquitous to physical phenomena undergoing a continuous phase transition. The classic Susceptible-Infectious-Recovered (SIR) model of epidemics is one such example where the scaling behavior near a critical point has been studied extensively. In this system the distribution of outbreak sizes scales as $P(n) \sim n^{-3/2}$ at the critical point as the system size $N$ becomes infinite. The finite-size scaling laws for the outbreak size and duration are also well understood and characterized. In this work, we report scaling laws for a model with SIR structure coupled with a constant force of infection per susceptible, akin to a `reservoir forcing'. We find that the statistics of outbreaks in this system are fundamentally different than those in a simple SIR model. Instead of fixed exponents, all scaling laws exhibit tunable exponents parameterized by the dimensionless rate of external forcing. As the external driving rate approaches a critical value, the scale of the average outbreak size converges to that of the maximal size, and above the critical point, the scaling laws bifurcate into two regimes. Whereas a simple SIR process can only exhibit outbreaks of size $\mathcal{O}(N^{1/3})$ and $\mathcal{O}(N)$ depending on whether the system is at or above the epidemic threshold, a driven SIR process can exhibit a richer spectrum of outbreak sizes that scale as $\mathcal{O}(N^{\xi})$ where $\xi \in (0,1] \setminus \{2/3\}$ and $\mathcal{O}((N/\log N)^{2/3})$ at the multi-critical point.
Submitted 30 December, 2013;
originally announced January 2014.
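The $P(n) \sim n^{-3/2}$ scaling quoted in the abstract above for the undriven critical SIR model can be checked with a short calculation. The sketch below covers only that baseline case, not the driven model the paper studies. It assumes the infinite-population limit with a single initial case, where each event is an infection or a recovery with probability 1/2, so the final size follows $P(n) = C_{n-1} / 2^{2n-1}$ with $C_k$ the Catalan numbers. The function names are hypothetical.

```python
import math

def critical_outbreak_pmf(n):
    """P(final outbreak size = n) for a critical SIR outbreak seeded by one case.

    Assumes the infinite-population limit, where the number of infectious
    individuals performs a symmetric random walk absorbed at zero, giving
    P(n) = binom(2n-2, n-1) / n / 2**(2n-1).
    """
    # Evaluate in log space so that large n does not overflow.
    log_p = (math.lgamma(2 * n - 1) - 2.0 * math.lgamma(n)
             - math.log(n) - (2 * n - 1) * math.log(2.0))
    return math.exp(log_p)

def local_exponent(n):
    """Log-log slope of P between n and 2n."""
    ratio = critical_outbreak_pmf(2 * n) / critical_outbreak_pmf(n)
    return math.log(ratio) / math.log(2.0)

for n in (10, 100, 1000):
    print(n, local_exponent(n))
```

The local slope approaches -3/2 as n grows, consistent with the exponent stated in the abstract. The tunable exponents of the driven model are not reproduced here.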
-
Epidemic fronts in complex networks with metapopulation structure
Authors:
Jason Hindes,
Sarabjeet Singh,
Christopher R. Myers,
David J. Schneider
Abstract:
Infection dynamics have been studied extensively on complex networks, yielding insight into the effects of heterogeneity in contact patterns on disease spread. Somewhat separately, metapopulations have provided a paradigm for modeling systems with spatially extended and "patchy" organization. In this paper we expand on the use of multitype networks for combining these paradigms, such that simple contagion models can include complexity in the agent interactions and multiscale structure. We first present a generalization of the Volz-Miller mean-field approximation for Susceptible-Infected-Recovered (SIR) dynamics on multitype networks. We then use this technique to study the special case of epidemic fronts propagating on a one-dimensional lattice of interconnected networks - representing a simple chain of coupled population centers - as a necessary first step in understanding how macro-scale disease spread depends on micro-scale topology. Using the formalism of front propagation into unstable states, we derive the effective transport coefficients of the linear spreading: asymptotic speed, characteristic wavelength, and diffusion coefficient for the leading edge of the pulled fronts, and analyze their dependence on the underlying graph structure. We also derive the epidemic threshold for the system and study the front profile for various network configurations. To our knowledge, this is the first such application of front propagation concepts to random network models.
Submitted 20 April, 2013; v1 submitted 15 April, 2013;
originally announced April 2013.
-
Variational method for estimating the rate of convergence of Markov Chain Monte Carlo algorithms
Authors:
Fergal P. Casey,
Joshua J. Waterfall,
Ryan N. Gutenkunst,
Christopher R. Myers,
James P. Sethna
Abstract:
We demonstrate the use of a variational method to determine a quantitative lower bound on the rate of convergence of Markov Chain Monte Carlo (MCMC) algorithms as a function of the target density and proposal density. The bound relies on approximating the second largest eigenvalue in the spectrum of the MCMC operator using a variational principle and the approach is applicable to problems with continuous state spaces. We apply the method to one dimensional examples with Gaussian and quartic target densities, and we contrast the performance of the Random Walk Metropolis-Hastings (RWMH) algorithm with a ``smart'' variant that incorporates gradient information into the trial moves. We find that the variational method agrees quite closely with numerical simulations. We also see that the smart MCMC algorithm often fails to converge geometrically in the tails of the target density except in the simplest case we examine, and even then care must be taken to choose the appropriate scaling of the deterministic and random parts of the proposed moves. Again, this calls into question the utility of smart MCMC in more complex problems. Finally, we apply the same method to approximate the rate of convergence in multidimensional Gaussian problems with and without importance sampling. There we demonstrate the necessity of importance sampling for target densities which depend on variables with a wide range of scales.
Submitted 16 July, 2008; v1 submitted 31 August, 2006;
originally announced September 2006.
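For readers unfamiliar with the Random Walk Metropolis-Hastings (RWMH) algorithm named in the abstract above, a minimal sketch on a one-dimensional Gaussian target follows. It illustrates only the sampler itself, not the variational bound on its convergence rate. The function names, step size, chain length, and seed are assumptions chosen for illustration.

```python
import numpy as np

def rwmh(log_target, x0, step, n_steps, seed=0):
    """Random Walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = x0
    logp = log_target(x)
    samples = np.empty(n_steps)
    accepted = 0
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal()   # symmetric trial move
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
            accepted += 1
        samples[i] = x                                # repeats x on rejection
    return samples, accepted / n_steps

def log_gaussian(x):
    """Unnormalized log density of a standard normal target."""
    return -0.5 * x * x

samples, acc_rate = rwmh(log_gaussian, x0=0.0, step=2.4, n_steps=20000)
print(samples.mean(), samples.var(), acc_rate)
```

The proposal width `step` sets the trade-off between acceptance rate and move size. The convergence rate studied in the abstract depends on this kind of choice.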
-
The sloppy model universality class and the Vandermonde matrix
Authors:
Joshua J. Waterfall,
Fergal P. Casey,
Ryan N. Gutenkunst,
Kevin S. Brown,
Christopher R. Myers,
Piet W. Brouwer,
Veit Elser,
James P. Sethna
Abstract:
In a variety of contexts, physicists study complex, nonlinear models with many unknown or tunable parameters to explain experimental data. We explain why such systems so often are sloppy; the system behavior depends only on a few `stiff' combinations of the parameters and is unchanged as other `sloppy' parameter combinations vary by orders of magnitude. We contrast examples of sloppy models (from systems biology, variational quantum Monte Carlo, and common data fitting) with systems which are not sloppy (multidimensional linear regression, random matrix ensembles). We observe that the eigenvalue spectra for the sensitivity of sloppy models have a striking, characteristic form, with a density of logarithms of eigenvalues which is roughly constant over a large range. We suggest that the common features of sloppy models indicate that they may belong to a common universality class. In particular, we motivate focusing on a Vandermonde ensemble of multiparameter nonlinear models and show in one limit that they exhibit the universal features of sloppy models.
Submitted 6 October, 2006; v1 submitted 15 May, 2006;
originally announced May 2006.
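The characteristic sloppy spectrum described in the abstract above, with eigenvalues spread over many decades, can be seen in a small numerical sketch. It assumes a polynomial model $y(t) = \sum_k \theta_k t^k$ sampled on $[0, 1]$, whose Jacobian with respect to the parameters is a Vandermonde matrix. The function name, parameter count, and sample grid are hypothetical choices, not the ensemble analyzed in the paper.

```python
import numpy as np

def sloppy_spectrum(n_params=8, n_points=50):
    """Eigenvalues of J^T J for a monomial model on [0, 1], largest first."""
    t = np.linspace(0.0, 1.0, n_points)
    # Jacobian of y(t) = sum_k theta_k * t**k: J[i, k] = t_i**k (Vandermonde).
    J = np.vander(t, N=n_params, increasing=True)
    # Squared singular values of J are the eigenvalues of J^T J; using the
    # SVD avoids forming the badly conditioned product explicitly.
    singular_values = np.linalg.svd(J, compute_uv=False)
    return singular_values ** 2

eigs = sloppy_spectrum()
log_eigs = np.log10(eigs)
print(np.round(log_eigs, 1))
print("decades spanned:", log_eigs[0] - log_eigs[-1])
```

The few largest eigenvalues correspond to the `stiff' parameter combinations and the smallest to the `sloppy' ones. Plotting `log_eigs` against index is a quick way to inspect how evenly the eigenvalues are spaced on a logarithmic scale.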