-
Model-based assessment of sampling protocols for infectious disease genomic surveillance
Authors:
Sebastian Contreras,
Karen Y. Oróstica,
Anamaria Daza-Sanchez,
Joel Wagner,
Philipp Dönges,
David Medina-Ortiz,
Matias Jara,
Ricardo Verdugo,
Carlos Conca,
Viola Priesemann,
Álvaro Olivera-Nappa
Abstract:
Genomic surveillance of infectious diseases allows monitoring circulating and emerging variants and quantifying their epidemic potential. However, due to the high costs associated with genomic sequencing, only a limited number of samples can be analysed. Thus, it is critical to understand how sampling impacts the information generated. Here, we combine a compartmental model for the spread of COVID-19 (distinguishing several SARS-CoV-2 variants) with different sampling strategies to assess their impact on genomic surveillance. In particular, we compare adaptive sampling, i.e., dynamically reallocating resources between screening at points of entry and inside communities, and constant sampling, i.e., assigning fixed resources to the two locations. We show that adaptive sampling uncovers new variants up to five weeks earlier than constant sampling, significantly reducing detection delays and estimation errors. This advantage is most prominent at low sequencing rates. Although increasing the sequencing rate has a similar effect, the marginal benefits of doing so may not always justify the associated costs. Consequently, it is convenient for countries with comparatively few resources to operate at lower sequencing rates, thereby profiting the most from adaptive sampling. Finally, our methodology can be readily adapted to study undersampling in other dynamical systems.
Submitted 19 January, 2023;
originally announced January 2023.
-
Mutational signatures and transmissibility of SARS-CoV-2 Gamma and Lambda variants
Authors:
Karen Y. Oróstica,
Sebastian Contreras,
Sebastian B. Mohr,
Jonas Dehning,
Simon Bauer,
David Medina-Ortiz,
Emil N. Iftekhar,
Karen Mujica,
Paulo C. Covarrubias,
Soledad Ulloa,
Andrés E. Castillo,
Ricardo A. Verdugo,
Jorge Fernández,
Álvaro Olivera-Nappa,
Viola Priesemann
Abstract:
The emergence of SARS-CoV-2 variants of concern endangers the long-term control of COVID-19, especially in countries with limited genomic surveillance. In this work, we explored genomic drivers of contagion in Chile. We sequenced 3443 SARS-CoV-2 genomes collected between January and July 2021, where the Gamma (P.1), Lambda (C.37), Alpha (B.1.1.7), B.1.1.348, and B.1.1 lineages were predominant. Using a Bayesian model tailored for limited genomic surveillance, we found that the Lambda and Gamma variants' reproduction numbers were about 5% and 16% larger than Alpha's, respectively. We observed an overabundance of mutations in the Spike gene, strongly correlated with the variant's transmissibility. Furthermore, the variants' mutational signatures featured a breakpoint concurrent with the beginning of vaccination (mostly CoronaVac, an inactivated virus vaccine), indicating an additional putative selective pressure. Thus, our work provides a reliable method for quantifying novel variants' transmissibility under subsampling (such as the newly reported Delta, B.1.617.2) and highlights the importance of continuous genomic surveillance.
Submitted 23 August, 2021;
originally announced August 2021.
-
Relaxing restrictions at the pace of vaccination increases freedom and guards against further COVID-19 waves
Authors:
Simon Bauer,
Sebastian Contreras,
Jonas Dehning,
Matthias Linden,
Emil Iftekhar,
Sebastian B. Mohr,
Álvaro Olivera-Nappa,
Viola Priesemann
Abstract:
Mass vaccination offers a promising exit strategy for the COVID-19 pandemic. However, as vaccination progresses, demands to lift restrictions increase, despite most of the population remaining susceptible. Using our age-stratified SEIRD-ICU compartmental model and curated epidemiological and vaccination data, we quantified the rate (relative to vaccination progress) at which countries can lift non-pharmaceutical interventions without overwhelming their healthcare systems. We analyzed scenarios ranging from immediately lifting restrictions (accepting high mortality and morbidity) to reducing case numbers to a level where test-trace-and-isolate (TTI) programs efficiently compensate for local spreading events. In general, the age-dependent vaccination roll-out implies a transient decrease of more than ten years in the average age of ICU patients and deceased. The pace of vaccination determines the speed of lifting restrictions. Taking the European Union (EU) as an example case, all considered scenarios allow for steadily increasing contacts starting in May 2021 and relaxing most restrictions by autumn 2021. Throughout summer 2021, only mild contact restrictions will remain necessary. However, only high vaccine uptake can prevent further severe waves. Across EU countries, seroprevalence impacts the long-term success of vaccination campaigns more strongly than age demographics. In addition, we highlight the need for preventive measures to reduce contagion in school settings throughout the year 2021, where children might be drivers of contagion because they remain susceptible...
Submitted 15 July, 2021; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Peptipedia: a comprehensive database for peptide research supported by Assembled predictive models and Data Mining approaches
Authors:
Cristofer Quiroz,
Yasna Barrera Saavedra,
Benjamín Armijo-Galdames,
Juan Amado-Hinojosa,
Álvaro Olivera-Nappa,
Anamaria Sanchez-Daza,
David Medina-Ortiz
Abstract:
Motivation: Peptides have attracted attention in this century due to their remarkable therapeutic properties. Computational tools are being developed to take advantage of existing information, encapsulating knowledge and making it available in a simple way for general public use. However, these are property-specific redundant data systems, and usually do not display the data in a clear way. In some cases, information download is not even possible. This data needs to be available in a simple form for drug design and other biotechnological applications.
Results: We developed Peptipedia, a user-friendly database and web application to search, characterise and analyse peptide sequences. Our tool integrates the information from thirty previously reported databases, making it the largest repository of peptides with recorded activities so far. In addition, we implemented a variety of services to increase our tool's usability. The significant differences between our tool and existing alternatives make it a substantial contribution to the development of biotechnological and bioengineering applications for peptides.
Availability: Peptipedia is available for non-commercial use as open-access software, licensed under the GNU General Public License, version 3.0 (GPLv3). The web platform is publicly available at pesb2.cl/peptipedia. Both the source code and sample datasets are available in the GitHub repository https://github.com/CristoferQ/PeptideDatabase.
Contact: [email protected], [email protected]
Submitted 28 January, 2021;
originally announced January 2021.
-
On the heterogeneous spread of COVID-19 in Chile
Authors:
Danton Freire-Flores,
Nyna Llanovarced-Kawles,
Anamaria Sanchez-Daza,
Álvaro Olivera-Nappa
Abstract:
Non-pharmaceutical interventions (NPIs) have played a crucial role in controlling the spread of COVID-19. Nevertheless, NPI efficacy varies enormously between and within countries, mainly because of population and behavioural heterogeneity. In this work, we adapted a multi-group SEIRA model to study the spreading dynamics of COVID-19 in Chile, representing geographically separated regions of the country by different groups. We used national mobilization statistics to estimate the connectivity between regions and data from governmental repositories to obtain COVID-19 spreading and death rates in each region. We then assessed the effectiveness of different NPIs by studying the temporal evolution of the reproduction number Rt. Analyzing data-driven and model-based estimates of Rt, we found a strong coupling of different regions, highlighting the necessity of organized and coordinated actions to control the spread of SARS-CoV-2. Finally, we evaluated different scenarios to forecast the evolution of COVID-19 in the most densely populated regions, finding that the early lifting of restrictions will probably lead to new outbreaks.
Submitted 25 June, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Combination of digital signal processing and assembled predictive models facilitates the rational design of proteins
Authors:
David Medina-Ortiz,
Sebastian Contreras,
Juan Amado-Hinojosa,
Jorge Torres-Almonacid,
Juan A. Asenjo,
Marcelo Navarrete,
Álvaro Olivera-Nappa
Abstract:
Predicting the effect of mutations in proteins is one of the most critical challenges in protein engineering; by knowing the effect that a substitution of one (or several) residues in the protein's sequence has on its overall properties, one could design a variant with a desirable function. New strategies and methodologies to create predictive models are continually being developed. However, those that claim to be general often do not reach adequate performance, and those that aim at a particular task improve their predictive performance at the cost of the method's generality. Moreover, these approaches typically require a particular decision on how to encode the amino acid sequence, without an explicit methodological agreement on how to do so. To address these issues, in this work, we applied clustering, embedding, and dimensionality reduction techniques to the AAIndex database to select meaningful combinations of physicochemical properties for the encoding stage. We then used the chosen set of properties to obtain several encodings of the same sequence and subsequently applied the Fast Fourier Transform (FFT) to them. We performed an exploratory stage of Machine-Learning models in the frequency space, using different algorithms and hyperparameters. Finally, we selected the best performing predictive models in each set of properties and created an assembled model. We extensively tested the proposed methodology on different datasets and demonstrated that the generated assembled model achieved notably better performance metrics than those models based on a single encoding and, in most cases, better than those previously reported. The proposed method is available as a Python library for non-commercial use under the GNU General Public License (GPLv3).
Submitted 7 October, 2020;
originally announced October 2020.
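The encode-then-FFT step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' library: it assumes a single property (the Kyte-Doolittle hydropathy scale, chosen here only for illustration, whereas the paper selects property groups from AAIndex), zero-padding to a power of two, and hypothetical function names (`encode`, `fft`, `spectrum`). The FFT is a plain recursive radix-2 implementation so that the example needs only the standard library.

```python
import cmath

# Kyte-Doolittle hydropathy values; an illustrative stand-in for the
# AAIndex-derived property groups mentioned in the abstract (assumption).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, scale=HYDROPATHY):
    """Map each residue to its numeric property value."""
    return [scale[aa] for aa in sequence.upper()]


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrum(sequence, length=None):
    """Encode a sequence, zero-pad it, and return the scaled magnitude spectrum."""
    signal = encode(sequence)
    # Pad to a common power-of-two length so sequences of different sizes
    # yield feature vectors of the same dimension.
    size = next_power_of_two(max(length or 0, len(signal)))
    padded = signal + [0.0] * (size - len(signal))
    coeffs = fft(padded)
    # Real-valued input gives a symmetric spectrum, so keep the first half.
    return [abs(c) / size for c in coeffs[: size // 2]]


features = spectrum("MKTAYIAKQR", length=16)
```

Passing the same `length` for every sequence gives fixed-size vectors that a downstream model could consume; repeating the call with other property scales would give the several encodings of one sequence that the abstract mentions.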
-
Statistically-based methodology for revealing real contagion trends and correcting delay-induced errors in the assessment of COVID-19 pandemic
Authors:
Sebastián Contreras,
Juan Pablo Biron-Lattes,
H. Andrés Villavicencio,
David Medina-Ortiz,
Nyna Llanovarced-Kawles,
Álvaro Olivera-Nappa
Abstract:
The COVID-19 pandemic has reshaped our world on a timescale much shorter than what we can understand. Particularities of SARS-CoV-2, such as its persistence on surfaces and the lack of a curative treatment or vaccine against COVID-19, have pushed authorities to apply restrictive policies to control its spread. As data drove most of the decisions made in this global contingency, their quality is a critical variable for decision-making actors, and they should therefore be carefully curated. In this work, we analyze the sources of error in typically reported epidemiological variables and usual tests used for diagnosis, and their impact on our understanding of COVID-19 spreading dynamics. We address the existence of different delays in the report of new cases, induced by the incubation time of the virus and testing-diagnosis time gaps, and other error sources related to the sensitivity/specificity of the tests used to diagnose COVID-19. Using a statistically-based algorithm, we perform a temporal reclassification of cases to avoid delay-induced errors, building up new epidemiologic curves centered on the day when the contagion effectively occurred. We also statistically enhance the robustness behind the discharge/recovery clinical criteria in the absence of a direct test, which is typically the case of non-first world countries, where the limited testing capabilities are fully dedicated to the evaluation of new cases. Finally, we applied our methodology to assess the evolution of the pandemic in Chile through the Effective Reproduction Number $R_t$, identifying different moments in which data was misleading governmental actions. In doing so, we aim to raise public awareness of the need for proper data reporting and processing protocols for epidemiological modelling and predictions.
Submitted 24 June, 2020; v1 submitted 25 May, 2020;
originally announced May 2020.
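The idea of temporally reclassifying cases, as described in the abstract above, can be illustrated with a minimal sketch. The abstract does not specify the algorithm, so this is not the authors' method: it simply assumes that the delay between contagion and report follows a known discrete distribution and moves each day's reported cases back accordingly. The function name `shift_to_infection_dates`, the parameter `delay_pmf`, and all numeric values are hypothetical.

```python
def shift_to_infection_dates(reported, delay_pmf):
    """Redistribute cases reported on day t to day t - d with weight delay_pmf[d].

    reported  : list of daily reported case counts
    delay_pmf : delay_pmf[d] is the assumed probability that a case is
                reported d days after contagion (should sum to 1)
    Cases whose inferred contagion day falls before day 0 are dropped.
    """
    infections = [0.0] * len(reported)
    for t, cases in enumerate(reported):
        for d, p in enumerate(delay_pmf):
            if t - d >= 0:
                infections[t - d] += cases * p
    return infections


# Made-up example: all 10 cases are reported on day 2, and the assumed delay
# is 1 or 2 days with equal probability.
reported = [0, 0, 10, 0]
delay_pmf = [0.0, 0.5, 0.5]
curve = shift_to_infection_dates(reported, delay_pmf)
```

In this toy case the ten reported cases are split evenly between days 0 and 1, which shows how a reporting delay moves the apparent peak later than the actual contagion events. A realistic delay distribution would have to combine incubation and testing-diagnosis gaps, which the abstract names as the sources of delay.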
-
A multi-group SEIRA model for the spread of COVID-19 among heterogeneous populations
Authors:
Sebastian Contreras,
H. Andres Villavicencio,
David Medina-Ortiz,
Juan Pablo Biron-Lattes,
Alvaro Olivera-Nappa
Abstract:
The outbreak and propagation of COVID-19 have posed a considerable challenge to modern society. In particular, the different restrictive actions taken by governments to prevent the spread of the virus have changed the way humans interact and conceive interaction. Due to geographical, behavioral, or economic factors, different sub-groups among a population are more (or less) likely to interact, and thus to spread/acquire the virus. In this work, we present a general multi-group SEIRA model for representing the spread of COVID-19 among a heterogeneous population and test it in a numerical case study. By highlighting its applicability and the ease with which its general formulation can be adapted to particular studies, we expect our model to lead us to a better understanding of the evolution of this pandemic and to better public-health policies to control it.
Submitted 28 April, 2020;
originally announced April 2020.
-
Cell cycle and protein complex dynamics in discovering signaling pathways
Authors:
Daniel Inostroza,
Cecilia Hernández,
Diego Seco,
Gonzalo Navarro,
Alvaro Olivera-Nappa
Abstract:
Signaling pathways are responsible for the regulation of cell processes, such as monitoring the external environment, transmitting information across membranes, and making cell fate decisions. Given the increasing amount of biological data available and the recent discoveries showing that many diseases are related to the disruption of cellular signal transduction cascades, in silico discovery of signaling pathways in cell biology has become an active research topic in recent years. However, reconstruction of signaling pathways remains a challenge mainly because of the need for systematic approaches for predicting causal relationships, like edge direction and activation/inhibition among interacting proteins in the signal flow. We propose an approach for predicting signaling pathways that integrates protein interactions, gene expression, phenotypes, and protein complex information. Our method first finds candidate pathways using a directed-edge-based algorithm and then defines a graph model to include causal activation relationships among proteins in candidate pathways, using cell cycle gene expression and phenotypes to infer consistent pathways in yeast. Then, we incorporate protein complex coverage information for deciding on the final predicted signaling pathways. We show that our approach improves the predictive results of the state of the art using different ranking metrics.
Submitted 6 April, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.