Search | arXiv e-print repository

doi 10.1103/PhysRevE.107.024116

Bézier interpolation improves the inference of dynamical models from data

Abstract: Many dynamical systems, from quantum many-body systems to evolving populations to financial markets, are described by stochastic processes. Parameters characterizing such processes can often be inferred using information integrated over stochastic paths. However, estimating time-integrated quantities from real data with limited time resolution is challenging. Here, we propose a framework for accurately estimating time-integrated quantities using Bézier interpolation. We applied our approach to two dynamical inference problems: determining fitness parameters for evolving populations and inferring forces driving Ornstein-Uhlenbeck processes. We found that Bézier interpolation reduces the estimation bias for both dynamical inference problems. This improvement was especially noticeable for data sets with limited time resolution. Our method could be broadly applied to improve accuracy for other dynamical inference problems using finitely sampled data.

Submitted 6 October, 2022; v1 submitted 22 September, 2022; originally announced September 2022.

Comments: 7 pages, 7 figures

arXiv:2102.01521 [pdf]

doi 10.1128/mSystems.00095-21

Pathogenesis, Symptomatology, and Transmission of SARS-CoV-2 through Analysis of Viral Genomics and Structure

Authors: Halie M. Rando, Adam L. MacLean, Alexandra J. Lee, Ronan Lordan, Sandipan Ray, Vikas Bansal, Ashwin N. Skelly, Elizabeth Sell, John J. Dziak, Lamonica Shinholster, Lucy D'Agostino McGowan, Marouen Ben Guebila, Nils Wellhausen, Sergey Knyazev, Simina M. Boca, Stephen Capone, Yanjun Qi, YoSon Park, Yuchen Sun, David Mai, Joel D. Boerckel, Christian Brueffer, James Brian Byrd, Jeremy P. Kamil, Jinhui Wang, et al. (9 additional authors not shown)

Abstract: The novel coronavirus SARS-CoV-2, which emerged in late 2019, has since spread around the world and infected hundreds of millions of people with coronavirus disease 2019 (COVID-19). While this viral species was unknown prior to January 2020, its similarity to other coronaviruses that infect humans has allowed for rapid insight into the mechanisms that it uses to infect human hosts, as well as the ways in which the human immune system can respond. Here, we contextualize SARS-CoV-2 among other coronaviruses and identify what is known and what can be inferred about its behavior once inside a human host. Because the genomic content of coronaviruses, which specifies the virus's structure, is highly conserved, early genomic analysis provided a significant head start in predicting viral pathogenesis and in understanding potential differences among variants. The pathogenesis of the virus offers insights into symptomatology, transmission, and individual susceptibility. Additionally, prior research into interactions between the human immune system and coronaviruses has identified how these viruses can evade the immune system's protective mechanisms. We also explore systems-level research into the regulatory and proteomic effects of SARS-CoV-2 infection and the immune response. Understanding the structure and behavior of the virus serves to contextualize the many facets of the COVID-19 pandemic and can influence efforts to control the virus and treat the disease.

Submitted 3 December, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

arXiv:1907.12793 [pdf, other]

doi 10.1103/PhysRevE.101.012309

Inference of compressed Potts graphical models

Authors: Francesca Rizzato, Alice Coucke, Eleonora de Leonardis, J. P. Barton, Jérôme Tubiana, Remi Monasson, Simona Cocco

Abstract: We consider the problem of inferring a graphical Potts model on a population of variables, with a non-uniform number of Potts colors (symbols) across variables. This inverse Potts problem generally involves the inference of a large number of parameters, often larger than the number of available data, and, hence, requires the introduction of regularization. We study here a double regularization scheme, in which the number of colors available to each variable is reduced, and interaction networks are made sparse. To achieve this color compression scheme, only Potts states with large empirical frequency (exceeding some threshold) are explicitly modeled on each site, while the others are grouped into a single state. We benchmark the performance of this mixed regularization approach with two inference algorithms, the Adaptive Cluster Expansion (ACE) and the PseudoLikelihood Maximization (PLM), on synthetic data obtained by sampling disordered Potts models on Erdős-Rényi random graphs. We show in particular that color compression does not affect the quality of reconstruction of the parameters corresponding to high-frequency symbols, while drastically reducing the number of the other parameters and thus the computational time. Our procedure is also applied to multi-sequence alignments of protein families, with similar results.

Submitted 3 January, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

Journal ref: Phys. Rev. E 101, 012309 (2020)

arXiv:1508.01469 [pdf, other]

doi 10.1103/PhysRevE.93.022412

Identification of drug resistance mutations in HIV from constraints on natural evolution

Authors: Thomas C. Butler, John P. Barton, Mehran Kardar, Arup K. Chakraborty

Abstract: Human immunodeficiency virus (HIV) evolves with extraordinary rapidity. However, its evolution is constrained by interactions between mutations in its fitness landscape. Here we show that an Ising model describing these interactions, inferred from sequence data obtained prior to the use of antiretroviral drugs, can be used to identify clinically significant sites of resistance mutations. Successful predictions of the resistance sites indicate progress in the development of successful models of real viral evolution at the single residue level, and suggest that our approach may be applied to help design new therapies that are less prone to failure even where resistance data is not yet available.

Submitted 6 August, 2015; originally announced August 2015.

Comments: 5 pages, 3 figures

Journal ref: Phys. Rev. E 93, 022412 (2016)

arXiv:1412.8065 [pdf, other]

Remarks on the energy costs of insulators in enzymatic cascades

Authors: John P. Barton, Eduardo D. Sontag

Abstract: The connection between optimal biological function and energy use, measured for example by the rate of metabolite consumption, is a current topic of interest in the systems biology literature which has been explored in several different contexts. In [J. P. Barton and E. D. Sontag, Biophys. J. 104, 6 (2013)], we related the metabolic cost of enzymatic futile cycles with their capacity to act as insulators which facilitate modular interconnections in biochemical networks. There we analyzed a simple model system in which a signal molecule regulates the transcription of one or more target proteins by interacting with their promoters. In this note, we consider the case of a protein with an active and an inactive form, and whose activation is controlled by the signal molecule. As in the original case, higher rates of energy consumption are required for better insulator performance.

Submitted 27 December, 2014; originally announced December 2014.

Comments: 10 pages, 4 figures

arXiv:1405.0233 [pdf, other]

doi 10.1103/PhysRevE.90.012132

Large Pseudo-Counts and $L_2$-Norm Penalties Are Necessary for the Mean-Field Inference of Ising and Potts Models

Authors: J. P. Barton, S. Cocco, E. De Leonardis, R. Monasson

Abstract: Mean field (MF) approximation offers a simple, fast way to infer direct interactions between elements in a network of correlated variables, a common, computationally challenging problem with practical applications in fields ranging from physics and biology to the social sciences. However, MF methods achieve their best performance with strong regularization, well beyond Bayesian expectations, an empirical fact that is poorly understood. In this work, we study the influence of pseudo-count and $L_2$-norm regularization schemes on the quality of inferred Ising or Potts interaction networks from correlation data within the MF approximation. We argue, based on the analysis of small systems, that the optimal value of the regularization strength remains finite even if the sampling noise tends to zero, in order to correct for systematic biases introduced by the MF approximation. Our claim is corroborated by extensive numerical studies of diverse model systems and by the analytical study of the $m$-component spin model, for large but finite $m$. Additionally we find that pseudo-count regularization is robust against sampling noise, and often outperforms $L_2$-norm regularization, particularly when the underlying network of interactions is strongly heterogeneous. Much better performances are generally obtained for the Ising model than for the Potts model, for which only couplings incoming onto medium-frequency symbols are reliably inferred.

Submitted 1 May, 2014; originally announced May 2014.

Comments: 25 pages, 17 figures

Journal ref: Phys Rev E 90 (2014) 012132

arXiv:1306.2029 [pdf, ps, other]

doi 10.1103/PhysRevE.88.062705

Spin models inferred from patient data faithfully describe HIV fitness landscapes and enable rational vaccine design

Authors: Karthik Shekhar, Claire F. Ruberman, Andrew L. Ferguson, John P. Barton, Mehran Kardar, Arup K. Chakraborty

Abstract: Mutational escape from vaccine-induced immune responses has thwarted the development of a successful vaccine against AIDS, whose causative agent is HIV, a highly mutable virus. Knowing the virus' fitness as a function of its proteomic sequence can enable rational design of potent vaccines, as this information can focus vaccine-induced immune responses to target mutational vulnerabilities of the virus. Spin models have been proposed as a means to infer intrinsic fitness landscapes of HIV proteins from patient-derived viral protein sequences. These sequences are the product of non-equilibrium viral evolution driven by patient-specific immune responses, and are subject to phylogenetic constraints. How can such sequence data allow inference of intrinsic fitness landscapes? We combined computer simulations and variational theory à la Feynman to show that, in most circumstances, spin models inferred from patient-derived viral sequences reflect the correct rank order of the fitness of mutant viral strains. Our findings are relevant for diverse viruses.

Submitted 9 June, 2013; originally announced June 2013.

Comments: 10 pages, 4 figures and supplementary methods file

Showing 1–7 of 7 results for author: Barton, J P