-
Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery
Authors:
Hoang Thanh Lam,
Marco Luca Sbodio,
Marcos Martínez Galindo,
Mykhaylo Zayats,
Raúl Fernández-Díaz,
Víctor Valls,
Gabriele Picco,
Cesar Berrospi Ramis,
Vanessa López
Abstract:
Recent research on predicting the binding affinity between drug molecules and proteins use representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about e…
▽ More
Recent research on predicting the binding affinity between drug molecules and proteins use representations learned, through unsupervised learning techniques, from large databases of molecule SMILES and protein sequences. While these representations have significantly enhanced the predictions, they are usually based on a limited set of modalities, and they do not exploit available knowledge about existing relations among molecules and proteins. In this study, we demonstrate that by incorporating knowledge graphs from diverse sources and modalities into the sequences or SMILES representation, we can further enrich the representation and achieve state-of-the-art results for drug-target binding affinity prediction in the established Therapeutic Data Commons (TDC) benchmarks. We release a set of multimodal knowledge graphs, integrating data from seven public data sources, and containing over 30 million triples. Our intention is to foster additional research to explore how multimodal knowledge enhanced protein/molecule embeddings can improve prediction tasks, including prediction of binding affinity. We also release some pretrained models learned from our multimodal knowledge graphs, along with source code for running standard benchmark tasks for prediction of biding affinity.
△ Less
Submitted 19 October, 2023; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Envisioning a Human-AI collaborative system to transform policies into decision models
Authors:
Vanessa Lopez,
Gabriele Picco,
Inge Vejsbjerg,
Thanh Lam Hoang,
Yufang Hou,
Marco Luca Sbodio,
John Segrave-Daly,
Denisa Moga,
Sean Swords,
Miao Wei,
Eoin Carroll
Abstract:
Regulations govern many aspects of citizens' daily lives. Governments and businesses routinely automate these in the form of coded rules (e.g., to check a citizen's eligibility for specific benefits). However, the path to automation is long and challenging. To address this, recent global initiatives for digital government, proposing to simultaneously express policy in natural language for human co…
▽ More
Regulations govern many aspects of citizens' daily lives. Governments and businesses routinely automate these in the form of coded rules (e.g., to check a citizen's eligibility for specific benefits). However, the path to automation is long and challenging. To address this, recent global initiatives for digital government, proposing to simultaneously express policy in natural language for human consumption as well as computationally amenable rules or code, are gathering broad public-sector interest. We introduce the problem of semi-automatically building decision models from eligibility policies for social services, and present an initial emerging approach to shorten the route from policy documents to executable, interpretable and standardised decision models using AI, NLP and Knowledge Graphs. Despite the many open domain challenges, in this position paper we explore the enormous potential of AI to assist government agencies and policy experts in scaling the production of both human-readable and machine executable policy rules, while improving transparency, interpretability, traceability and accountability of the decision making.
△ Less
Submitted 1 November, 2022;
originally announced December 2022.
-
Neural Unification for Logic Reasoning over Natural Language
Authors:
Gabriele Picco,
Hoang Thanh Lam,
Marco Luca Sbodio,
Vanessa Lopez Garcia
Abstract:
Automated Theorem Proving (ATP) deals with the development of computer programs being able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exists several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as (Clark et al., 2020), have proposed trans…
▽ More
Automated Theorem Proving (ATP) deals with the development of computer programs being able to show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules). There exists several successful ATPs where conjectures and axioms are formally provided (e.g. formalised as First Order Logic formulas). Recent approaches, such as (Clark et al., 2020), have proposed transformer-based architectures for deriving conjectures given axioms expressed in natural language (English). The conjecture is verified through a binary text classifier, where the transformers model is trained to predict the truth value of a conjecture given the axioms. The RuleTaker approach of (Clark et al., 2020) achieves appealing results both in terms of accuracy and in the ability to generalize, showing that when the model is trained with deep enough queries (at least 3 inference steps), the transformers are able to correctly answer the majority of queries (97.6%) that require up to 5 inference steps. In this work we propose a new architecture, namely the Neural Unifier, and a relative training procedure, which achieves state-of-the-art results in term of generalisation, showing that mimicking a well-known inference procedure, the backward chaining, it is possible to answer deep queries even when the model is trained only on shallow ones. The approach is demonstrated in experiments using a diverse set of benchmark data.
△ Less
Submitted 17 September, 2021;
originally announced September 2021.