Search | arXiv e-print repository

SynerGPT: In-Context Learning for Personalized Drug Synergy Prediction and Drug Design

Authors: Carl Edwards, Aakanksha Naik, Tushar Khot, Martin Burke, Heng Ji, Tom Hope

Abstract: Predicting synergistic drug combinations can help accelerate discovery of cancer treatments, particularly therapies personalized to a patient's specific tumor via biopsied cells. In this paper, we propose a novel setting and models for in-context drug synergy learning. We are given a small "personalized dataset" of 10-20 drug synergy relationships in the context of specific cancer cell targets. Ou… ▽ More Predicting synergistic drug combinations can help accelerate discovery of cancer treatments, particularly therapies personalized to a patient's specific tumor via biopsied cells. In this paper, we propose a novel setting and models for in-context drug synergy learning. We are given a small "personalized dataset" of 10-20 drug synergy relationships in the context of specific cancer cell targets. Our goal is to predict additional drug synergy relationships in that context. Inspired by recent work that pre-trains a GPT language model (LM) to "in-context learn" common function classes, we devise novel pre-training schemes that enable a GPT model to in-context learn "drug synergy functions". Our model -- which does not use any textual corpora, molecular fingerprints, protein interaction or any other domain-specific knowledge -- is able to achieve competitive results. We further integrate our in-context approach with a genetic algorithm to optimize model prompts and select synergy candidates to test after conducting a patient biopsy. Finally, we explore a novel task of inverse drug design which can potentially enable the design of drugs that synergize specifically to target a given patient's "personalized dataset". Our findings can potentially have an important impact on precision cancer medicine, and also raise intriguing questions on non-textual pre-training for LMs. △ Less

Submitted 24 October, 2023; v1 submitted 19 June, 2023; originally announced July 2023.

arXiv:2110.07117 [pdf, other]

doi 10.1098/rsta.2021.0300

FAIR Data Pipeline: provenance-driven data management for traceable scientific workflows

Authors: Sonia Natalie Mitchell, Andrew Lahiff, Nathan Cummings, Jonathan Hollocombe, Bram Boskamp, Ryan Field, Dennis Reddyhoff, Kristian Zarebski, Antony Wilson, Bruno Viola, Martin Burke, Blair Archibald, Paul Bessell, Richard Blackwell, Lisa A Boden, Alys Brett, Sam Brett, Ruth Dundas, Jessica Enright, Alejandra N. Gonzalez-Beltran, Claire Harris, Ian Hinder, Christopher David Hughes, Martin Knight, Vino Mano , et al. (13 additional authors not shown)

Abstract: Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily da… ▽ More Modern epidemiological analyses to understand and combat the spread of disease depend critically on access to, and use of, data. Rapidly evolving data, such as data streams changing during a disease outbreak, are particularly challenging. Data management is further complicated by data being imprecisely identified when used. Public trust in policy decisions resulting from such analyses is easily damaged and is often low, with cynicism arising where claims of "following the science" are made without accompanying evidence. Tracing the provenance of such decisions back through open software to primary data would clarify this evidence, enhancing the transparency of the decision-making process. Here, we demonstrate a Findable, Accessible, Interoperable and Reusable (FAIR) data pipeline developed during the COVID-19 pandemic that allows easy annotation of data as they are consumed by analyses, while tracing the provenance of scientific outputs back through the analytical source code to data sources. Such a tool provides a mechanism for the public, and fellow scientists, to better assess the trust that should be placed in scientific evidence, while allowing scientists to support policy-makers in openly justifying their decisions. We believe that tools such as this should be promoted for use across all areas of policy-facing research. △ Less

Submitted 4 May, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

arXiv:2109.09888 [pdf, other]

Chemical-Reaction-Aware Molecule Representation Learning

Authors: Hongwei Wang, Weijiang Li, Xiaomeng **, Kyunghyun Cho, Heng Ji, Jiawei Han, Martin D. Burke

Abstract: Molecule representation learning (MRL) methods aim to embed molecules into a real vector space. However, existing SMILES-based (Simplified Molecular-Input Line-Entry System) or GNN-based (Graph Neural Networks) MRL methods either take SMILES strings as input that have difficulty in encoding molecule structure information, or over-emphasize the importance of GNN architectures but neglect their gene… ▽ More Molecule representation learning (MRL) methods aim to embed molecules into a real vector space. However, existing SMILES-based (Simplified Molecular-Input Line-Entry System) or GNN-based (Graph Neural Networks) MRL methods either take SMILES strings as input that have difficulty in encoding molecule structure information, or over-emphasize the importance of GNN architectures but neglect their generalization ability. Here we propose using chemical reactions to assist learning molecule representation. The key idea of our approach is to preserve the equivalence of molecules with respect to chemical reactions in the embedding space, i.e., forcing the sum of reactant embeddings and the sum of product embeddings to be equal for each chemical equation. This constraint is proven effective to 1) keep the embedding space well-organized and 2) improve the generalization ability of molecule embeddings. Moreover, our model can use any GNN as the molecule encoder and is thus agnostic to GNN architectures. Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks, e.g., 17.4% absolute Hit@1 gain in chemical reaction prediction, 2.3% absolute AUC gain in molecule property prediction, and 18.5% relative RMSE gain in graph-edit-distance prediction, respectively, over the best baseline method. The code is available at https://github.com/hwwang55/MolR. △ Less

Submitted 22 September, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

arXiv:q-bio/0309017 [pdf]

doi 10.1073/pnas.1936025100

Differential hydrophobicity drives self-assembly in Huntington's disease

Authors: Martin G. Burke, Rudiger Woscholski, Sophia N. Yaliraki

Abstract: Identifying the driving forces and the mechanism of association of huntingtin-exon1, a close marker for the progress of Huntington's disease, is an important prerequisite towards finding potential drug targets, and ultimately a cure. We introduce here a modelling framework based on a key analogy of the physico-chemical properties of the exon1 fragment to block copolymers. We use a systematic mes… ▽ More Identifying the driving forces and the mechanism of association of huntingtin-exon1, a close marker for the progress of Huntington's disease, is an important prerequisite towards finding potential drug targets, and ultimately a cure. We introduce here a modelling framework based on a key analogy of the physico-chemical properties of the exon1 fragment to block copolymers. We use a systematic mesoscale methodology, based on Dissipative Particle Dynamics, which is capable of overcoming kinetic barriers, thus capturing the dynamics of significantly larger systems over longer times than considered before. Our results reveal that the relative hydrophobicity of the poly-glutamine block as compared to the rest of the (proline-based) exon1 fragment, ignored to date, constitutes a major factor in the initiation of the self-assembly process. We find that the assembly is governed by both the concentration of exon1 and the length of the poly-glutamine stretch, with a low length threshold for association even at the lowest volume fractions we considered. Moreover, this self-association occurs irrespective of whether the glutamine stretch is in random coil or hairpin configuration, leading to spherical or cylindrical assemblies, respectively. We discuss the implications of these results for reinterpretation of existing research within this context, including that the routes towards aggregation of exon1 may be distinct to those of the widely studied homopolymeric poly-glutamine peptides. △ Less

Submitted 26 September, 2003; originally announced September 2003.

Showing 1–4 of 4 results for author: Burke, M