
Neurosymbolic AI for Reasoning on Biomedical Knowledge Graphs

Authors: Lauren Nicole DeLong, Ramon Fernández Mir, Zonglin Ji, Fiona Niamh Coulter Smith, Jacques D. Fleuriot

Abstract: Biomedical datasets are often modeled as knowledge graphs (KGs) because they capture the multi-relational, heterogeneous, and dynamic natures of biomedical systems. KG completion (KGC) can, therefore, help researchers make predictions to inform tasks like drug repositioning. While previous approaches for KGC were either rule-based or embedding-based, hybrid approaches based on neurosymbolic artificial intelligence are becoming more popular. Many of these methods possess unique characteristics which make them even better suited toward biomedical challenges. Here, we survey such approaches with an emphasis on their utilities and prospective benefits for biomedicine.

Submitted 17 July, 2023; originally announced July 2023.

Comments: Proceedings of the 40th International Conference on Machine Learning: Workshop on Knowledge and Logical Reasoning in the Era of Data-driven Learning (https://klr-icml2023.github.io/schedule.html). PMLR 202, 2023. Condensed, workshop-ready version of previous survey, arXiv:2302.07200, which is under review. 13 pages (9 content, 4 references), 3 figures, 1 table

arXiv:2304.00994 [pdf, other]

Machine-Learned Premise Selection for Lean

Authors: Bartosz Piotrowski, Ramon Fernández Mir, Edward Ayers

Abstract: We introduce a machine-learning-based tool for the Lean proof assistant that suggests relevant premises for theorems being proved by a user. The design principles for the tool are (1) tight integration with the proof assistant, (2) ease of use and installation, (3) a lightweight and fast approach. For this purpose, we designed a custom version of the random forest model, trained in an online fashion. It is implemented directly in Lean, which was possible thanks to the rich and efficient metaprogramming features of Lean 4. The random forest is trained on data extracted from mathlib -- Lean's mathematics library. We experiment with various options for producing training features and labels. The advice from a trained model is accessible to the user via the suggest_premises tactic which can be called in an editor while constructing a proof interactively.

Submitted 14 June, 2023; v1 submitted 17 March, 2023; originally announced April 2023.

arXiv:2302.07200 [pdf, ps, other]

Neurosymbolic AI for Reasoning over Knowledge Graphs: A Survey

Authors: Lauren Nicole DeLong, Ramon Fernández Mir, Jacques D. Fleuriot

Abstract: Neurosymbolic AI is an increasingly active area of research that combines symbolic reasoning methods with deep learning to leverage their complementary benefits. As knowledge graphs are becoming a popular way to represent heterogeneous and multi-relational data, methods for reasoning on graph structures have attempted to follow this neurosymbolic paradigm. Traditionally, such approaches have utilized either rule-based inference or generated representative numerical embeddings from which patterns could be extracted. However, several recent studies have attempted to bridge this dichotomy to generate models that facilitate interpretability, maintain competitive performance, and integrate expert knowledge. Therefore, we survey methods that perform neurosymbolic reasoning tasks on knowledge graphs and propose a novel taxonomy by which we can classify them. Specifically, we propose three major categories: (1) logically-informed embedding approaches, (2) embedding approaches with logical constraints, and (3) rule learning approaches. Alongside the taxonomy, we provide a tabular overview of the approaches and links to their source code, if available, for more direct comparison. Finally, we discuss the unique characteristics and limitations of these methods, then propose several prospective directions toward which this field of research could evolve.

Submitted 16 May, 2024; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: 21 pages, 6 figures, 2 tables, currently under review. Corresponding GitHub page here: https://github.com/NeSymGraphs. Revised in February 2024 according to major revisions, again in May 2024 according to minor revisions

arXiv:2301.09347 [pdf, other]

Verified reductions for optimization

Authors: Alexander Bentkamp, Ramon Fernández Mir, Jeremy Avigad

Abstract: Numerical and symbolic methods for optimization are used extensively in engineering, industry, and finance. Various methods are used to reduce problems of interest to ones that are amenable to solution by such software. We develop a framework for designing and applying such reductions, using the Lean programming language and interactive proof assistant. Formal verification makes the process more reliable, and the availability of an interactive framework and ambient mathematical library provides a robust environment for constructing the reductions and reasoning about them.

Submitted 22 February, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

MSC Class: 90C25; 68V15

Showing 1–4 of 4 results for author: Mir, R F