-
Synaptic effects on the intermittent synchronization of gamma rhythms
Authors:
Quynh-Anh Nguyen,
Leonid L Rubchinsky
Abstract:
Synchronization of neural activity in the gamma frequency band is associated with various cognitive phenomena. Abnormalities of gamma synchronization may underlie symptoms of several neurological and psychiatric disorders such as schizophrenia and autism spectrum disorder. Properties of neural oscillations in the gamma band depend critically on the synaptic properties of the underlying circuits. This study explores how synaptic properties in pyramidal-interneuronal circuits affect not only the average synchronization strength but also the fine temporal patterning of neural synchrony. If two signals show only moderate synchrony strength, it may be possible to consider these dynamics as alternating between synchronized and desynchronized states. We use a model of connected circuits that produces pyramidal-interneuronal gamma (PING) oscillations to explore the temporal patterning of synchronized and desynchronized intervals. Changes in synaptic strength may alter the temporal patterning of synchronized dynamics (even if the average synchrony strength is not changed). Larger values of local synaptic connections promote longer desynchronization durations, while larger values of long-range synaptic connections promote shorter desynchronization durations. Furthermore, we show that circuits with different temporal patterning of synchronization may have different sensitivity to synaptic input. Thus, the alterations of synaptic strength may mediate physiological properties of neural circuits not only through change in the average synchrony level of gamma oscillations, but also through change in how synchrony is patterned in time over very short time scales.
Submitted 30 June, 2024;
originally announced July 2024.
-
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Authors:
Quan Nguyen,
Adji Bousso Dieng
Abstract:
Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data. In this paper, we extend the Vendi scores -- a family of interpretable similarity-based diversity metrics -- to account for quality. We then leverage these quality-weighted Vendi scores to tackle experimental design problems across various applications, including drug discovery, materials discovery, and reinforcement learning. We found that quality-weighted Vendi scores allow us to construct policies for experimental design that flexibly balance quality and diversity, and ultimately assemble rich and diverse sets of high-performing data points. Our algorithms led to a 70%-170% increase in the number of effective discoveries compared to baselines.
Submitted 3 May, 2024;
originally announced May 2024.
-
Leak Proof CMap; a framework for training and evaluation of cell line agnostic L1000 similarity methods
Authors:
Steven Shave,
Richard Kasprowicz,
Abdullah M. Athar,
Denise Vlachou,
Neil O. Carragher,
Cuong Q. Nguyen
Abstract:
The Connectivity Map (CMap) is a large publicly available database of cellular transcriptomic responses to chemical and genetic perturbations built using a standardized acquisition protocol known as the L1000 technique. Databases such as CMap provide an exciting opportunity to enrich drug discovery efforts, providing a 'known' phenotypic landscape to explore and enabling the development of state of the art techniques for enhanced information extraction and better informed decisions. Whilst multiple methods for measuring phenotypic similarity and interrogating profiles have been developed, the field is severely lacking standardized benchmarks using appropriate data splitting for training and unbiased evaluation of machine learning methods. To address this, we have developed 'Leak Proof CMap' and exemplified its application to a set of common transcriptomic and generic phenotypic similarity methods along with an exemplar triplet loss-based method. Benchmarking in three critical performance areas (compactness, distinctness, and uniqueness) is conducted using carefully crafted data splits ensuring no similar cell lines or treatments with shared or closely matching responses or mechanisms of action are present in training, validation, or test sets. This enables testing of models with unseen samples akin to exploring treatments with novel modes of action in novel patient derived cell lines. With a carefully crafted benchmark and data splitting regime in place, the tooling now exists to create performant phenotypic similarity methods for use in personalized medicine (novel cell lines) and to better augment high throughput phenotypic screening technologies with the L1000 transcriptomic technology.
Submitted 29 April, 2024;
originally announced April 2024.
-
Segmenting mechanically heterogeneous domains via unsupervised learning
Authors:
Quan Nguyen,
Emma Lejeune
Abstract:
From biological organs to soft robotics, highly deformable materials are essential components of natural and engineered systems. These highly deformable materials can have heterogeneous material properties, and can experience heterogeneous deformations with or without underlying material heterogeneity. Many recent works have established that computational modeling approaches are well suited for understanding and predicting the consequences of material heterogeneity and for interpreting observed heterogeneous strain fields. In particular, there has been significant work towards developing inverse analysis approaches that can convert observed kinematic quantities (e.g., displacement, strain) to material properties and mechanical state. Despite the success of these approaches, they are not necessarily generalizable and often rely on tight control and knowledge of boundary conditions. Here, we will build on the recent advances (and ubiquity) of machine learning approaches to explore alternative approaches to detect patterns in heterogeneous material properties and mechanical behavior. Specifically, we will explore unsupervised learning approaches to clustering and ensemble clustering to identify heterogeneous regions. Overall, we find that these approaches are effective, yet limited in their abilities. Through this initial exploration (where all data and code are published alongside this manuscript), we set the stage for future studies that more specifically adapt these methods to mechanical data.
Submitted 29 August, 2023;
originally announced August 2023.
-
Measuring unequal distribution of pandemic severity across census years, variants of concern and interventions
Authors:
Quang Dang Nguyen,
Sheryl L. Chang,
Christina M. Jamerlan,
Mikhail Prokopenko
Abstract:
Diverse and complex intervention policies deployed over the last years have shown varied effectiveness in controlling the COVID-19 pandemic. However, a systematic analysis and modelling of the combined effects of different viral lineages and complex intervention policies remains a challenge. Using large-scale agent-based modelling and a high-resolution computational simulation matching census-based demographics of Australia, we carried out a systematic comparative analysis of several COVID-19 pandemic scenarios. The scenarios covered two most recent Australian census years (2016 and 2021), three variants of concern (ancestral, Delta and Omicron), and five representative intervention policies. In addition, we introduced pandemic Lorenz curves measuring an unequal distribution of the pandemic severity across local areas. We quantified nonlinear effects of population heterogeneity on the pandemic severity, highlighting that (i) the population growth amplifies pandemic peaks, (ii) the changes in population size amplify the peak incidence more than the changes in density, and (iii) the pandemic severity is distributed unequally across local areas. We also examined and delineated the effects of urbanisation on the incidence bimodality, distinguishing between urban and regional pandemic waves. Finally, we quantified and examined the impact of school closures, complemented by partial interventions, and identified the conditions when inclusion of school closures may decisively control the transmission. Our results suggest that (a) public health response to long-lasting pandemics must be frequently reviewed and adapted to demographic changes, (b) in order to control recurrent waves, mass-vaccination rollouts need to be complemented by partial NPIs, and (c) healthcare and vaccination resources need to be prioritised towards the localities and regions with high population growth and/or high density.
Submitted 26 June, 2023;
originally announced June 2023.
-
Molecule-Morphology Contrastive Pretraining for Transferable Molecular Representation
Authors:
Cuong Q. Nguyen,
Dante Pertusi,
Kim M. Branson
Abstract:
Image-based profiling techniques have become increasingly popular over the past decade for their applications in target identification, mechanism-of-action inference, and assay development. These techniques have generated large datasets of cellular morphologies, which are typically used to investigate the effects of small molecule perturbagens. In this work, we extend the impact of such dataset to improving quantitative structure-activity relationship (QSAR) models by introducing Molecule-Morphology Contrastive Pretraining (MoCoP), a framework for learning multi-modal representation of molecular graphs and cellular morphologies. We scale MoCoP to approximately 100K molecules and 600K morphological profiles using data from the JUMP-CP Consortium and show that MoCoP consistently improves performances of graph neural networks (GNNs) on molecular property prediction tasks in ChEMBL20 across all dataset sizes. The pretrained GNNs are also evaluated on internal GSK pharmacokinetic data and show an average improvement of 2.6% and 6.3% in AUPRC for full and low data regimes, respectively. Our findings suggest that integrating cellular morphologies with molecular graphs using MoCoP can significantly improve the performance of QSAR models, ultimately expanding the deep learning toolbox available for QSAR applications.
Submitted 26 June, 2023; v1 submitted 26 April, 2023;
originally announced May 2023.
-
Challenges and perspectives in computational deconvolution of genomics data
Authors:
Lana X. Garmire,
Yijun Li,
Qianhui Huang,
Chuan Xu,
Sarah Teichmann,
Naftali Kaminski,
Matteo Pellegrini,
Quan Nguyen,
Andrew E. Teschendorff
Abstract:
Deciphering cell type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach for estimating cell type abundances from a variety of omics data. Despite significant methodological progress in computational deconvolution in recent years, challenges are still outstanding. Here we list four significant challenges related to computational deconvolution: the quality of the reference data, generation of ground truth data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions of computational methodologies and strategies to promote rigorous benchmarking.
Submitted 2 September, 2023; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Persistence of the Omicron variant of SARS-CoV-2 in Australia: The impact of fluctuating social distancing
Authors:
Sheryl L. Chang,
Quang Dang Nguyen,
Alexandra Martiniuk,
Vitali Sintchenko,
Tania C. Sorrell,
Mikhail Prokopenko
Abstract:
We modelled emergence and spread of the Omicron variant of SARS-CoV-2 in Australia between December 2021 and June 2022. This pandemic stage exhibited a diverse epidemiological profile with emergence of co-circulating sub-lineages of Omicron, further complicated by differences in social distancing behaviour which varied over time. Our study delineated distinct phases of the Omicron-associated pandemic stage, and retrospectively quantified the adoption of social distancing measures, fluctuating over different time periods in response to the observable incidence dynamics. We also modelled the corresponding disease burden, in terms of hospitalisations, intensive care unit occupancy, and mortality. Supported by good agreement between simulated and actual health data, our study revealed that the nonlinear dynamics observed in the daily incidence and disease burden were determined not only by introduction of sub-lineages of Omicron, but also by the fluctuating adoption of social distancing measures. Our high-resolution model can be used in design and evaluation of public health interventions during future crises.
Submitted 3 April, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
A general framework for optimising cost-effectiveness of pandemic response under partial intervention measures
Authors:
Quang Dang Nguyen,
Mikhail Prokopenko
Abstract:
The COVID-19 pandemic created enormous public health and socioeconomic challenges. The health effects of vaccination and non-pharmaceutical interventions (NPIs) were often contrasted with significant social and economic costs. We describe a general framework aimed to derive adaptive cost-effective interventions, adequate for both recent and emerging pandemic threats. We also quantify the net health benefits and propose a reinforcement learning approach to optimise adaptive NPIs. The approach utilises an agent-based model simulating pandemic responses in Australia, and accounts for a heterogeneous population with variable levels of compliance fluctuating over time and across individuals. Our analysis shows that a significant net health benefit may be attained by adaptive NPIs formed by partial social distancing measures, coupled with moderate levels of the society's willingness to pay for health gains (health losses averted). We demonstrate that a socially acceptable balance between health effects and incurred economic costs is achievable over a long term, despite possible early setbacks.
Submitted 20 November, 2022; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Epigenomic language models powered by Cerebras
Authors:
Meredith V. Trotter,
Cuong Q. Nguyen,
Stephen Young,
Rob T. Woodruff,
Kim M. Branson
Abstract:
Large scale self-supervised pre-training of Transformer language models has advanced the field of Natural Language Processing and shown promise in cross-application to the biological 'languages' of proteins and DNA. Learning effective representations of DNA sequences using large genomic sequence corpuses may accelerate the development of models of gene regulation and function through transfer learning. However, to accurately model cell type-specific gene regulation and function, it is necessary to consider not only the information contained in DNA nucleotide sequences, which is mostly invariant between cell types, but also how the local chemical and structural 'epigenetic state' of chromosomes varies between cell types. Here, we introduce a Bidirectional Encoder Representations from Transformers (BERT) model that learns representations based on both DNA sequence and paired epigenetic state inputs, which we call Epigenomic BERT (or EBERT). We pre-train EBERT with a masked language model objective across the entire human genome and across 127 cell types. Training this complex model with a previously prohibitively large dataset was made possible for the first time by a partnership with Cerebras Systems, whose CS-1 system powered all pre-training experiments. We show EBERT's transfer learning potential by demonstrating strong performance on a cell type-specific transcription factor binding prediction task. Our fine-tuned model exceeds state of the art performance on 4 of 13 evaluation datasets from ENCODE-DREAM benchmarks and earns an overall rank of 3rd on the challenge leaderboard. We explore how the inclusion of epigenetic data and task specific feature augmentation impact transfer learning performance.
Submitted 14 December, 2021;
originally announced December 2021.
-
Temporal patterns of synchrony in a pyramidal-interneuron gamma (PING) network
Authors:
Quynh-Anh Nguyen,
Leonid L Rubchinsky
Abstract:
Synchronization in neural systems plays an important role in many brain functions. Synchronization in the gamma frequency band (30-100 Hz) is involved in a variety of cognitive phenomena; abnormalities of gamma synchronization are found in schizophrenia and autism spectrum disorder. Frequently, the strength of synchronization is not very high and is intermittent even on short time scales (a few cycles of oscillations). That is, the network exhibits intervals of synchronization followed by intervals of desynchronization. Neural circuit dynamics may show different distributions of desynchronization durations even if the synchronization strength is fixed. In this study, we use a conductance-based neural network exhibiting pyramidal-interneuron gamma (PING) rhythm to study the temporal patterning of synchronized neural oscillations. We found that changes in the synaptic strength (as well as changes in the membrane kinetics) can alter the temporal patterning of synchrony. Moreover, we found that the changes in the temporal pattern of synchrony may be independent of the changes in the average synchrony strength. Even though the temporal patterning may vary, there is a tendency for dynamics with short (although potentially numerous) desynchronizations, similar to what was observed in experimental studies of neural activity synchronization in the brain. Recent studies suggested that short desynchronization dynamics may facilitate the formation and the break-up of transient neural assemblies. Thus, the results of this study suggest that changes of synaptic strength may alter the temporal patterning of the gamma synchronization so as to make the neural networks more efficient in the formation of neural assemblies and the facilitation of cognitive phenomena.
Submitted 5 April, 2021;
originally announced April 2021.
-
COVID-19 Risk Estimation using a Time-varying SIR-model
Authors:
Mehrdad Kiamari,
Gowri Ramachandran,
Quynh Nguyen,
Eva Pereira,
Jeanne Holm,
Bhaskar Krishnamachari
Abstract:
Policy-makers require data-driven tools to assess the spread of COVID-19 and inform the public of their risk of infection on an ongoing basis. We propose a rigorous hybrid model-and-data-driven approach to risk scoring based on a time-varying SIR epidemic model that ultimately yields a simplified color-coded risk level for each community. The risk score $Γ_t$ that we propose is proportional to the probability of someone currently healthy getting infected in the next 24 hours. We show how this risk score can be estimated using another useful metric of infection spread, $R_t$, the time-varying average reproduction number which indicates the average number of individuals an infected person would infect in turn. The proposed approach also allows for quantification of uncertainty in the estimates of $R_t$ and $Γ_t$ in the form of confidence intervals. Code and data from our effort have been open-sourced and are being applied to assess and communicate the risk of infection in the City and County of Los Angeles.
Submitted 11 August, 2020;
originally announced August 2020.
-
Retrosynthetic reaction prediction using neural sequence-to-sequence models
Authors:
Bowen Liu,
Bharath Ramsundar,
Prasad Kawthekar,
Jade Shi,
Joseph Gomes,
Quang Luu Nguyen,
Stephen Ho,
Jack Sloane,
Paul Wender,
Vijay Pande
Abstract:
We describe a fully data driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture that consists of two recurrent neural networks, which has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.
Submitted 6 June, 2017;
originally announced June 2017.
-
Innovative in silico approaches to address avian flu using grid technology
Authors:
V. Vincent Breton,
A. L. Da Costa,
P. De Vlieger,
L. Maigne,
D. Sarramia,
Y. -M. Kim,
D. Kim,
H. Q. Nguyen,
T. Solomonides,
Y. -T. Wu,
T. N. Hai
Abstract:
The recent years have seen the emergence of diseases which have spread very quickly all around the world either through human travels like SARS or animal migration like avian flu. Among the biggest challenges raised by infectious emerging diseases, one is related to the constant mutation of the viruses which turns them into continuously moving targets for drug and vaccine discovery. Another challenge is related to the early detection and surveillance of the diseases as new cases can appear just anywhere due to the globalization of exchanges and the circulation of people and animals around the earth, as recently demonstrated by the avian flu epidemics. For 3 years now, a collaboration of teams in Europe and Asia has been exploring some innovative in silico approaches to better tackle avian flu taking advantage of the very large computing resources available on international grid infrastructures. Grids were used to study the impact of mutations on the effectiveness of existing drugs against H5N1 and to find potentially new leads active on mutated strains. Grids allow also the integration of distributed data in a completely secured way. The paper presents how we are currently exploring how to integrate the existing data sources towards a global surveillance network for molecular epidemiology.
Submitted 23 December, 2008;
originally announced December 2008.