
Showing 1–8 of 8 results for author: Tossou, P

  1. arXiv:2311.00862

    physics.chem-ph cs.LG

    Role of Structural and Conformational Diversity for Machine Learning Potentials

    Authors: Nikhil Shenoy, Prudencio Tossou, Emmanuel Noutahi, Hadrien Mary, Dominique Beaini, Jiarui Ding

    Abstract: In the field of Machine Learning Interatomic Potentials (MLIPs), understanding the intricate relationship between data biases, specifically conformational and structural diversity, and model generalization is critical in improving the quality of Quantum Mechanics (QM) data generation efforts. We investigate these dynamics through two distinct experiments: a fixed budget one, where the dataset size…

    Submitted 30 October, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS 2023 AI4D3 and AI4S workshops

  2. arXiv:2310.10773

    cs.LG q-bio.BM

    Gotta be SAFE: A New Framework for Molecular Design

    Authors: Emmanuel Noutahi, Cristian Gabellini, Michael Craig, Jonathan S. C Lim, Prudencio Tossou

    Abstract: Traditional molecular string representations, such as SMILES, often pose challenges for AI-driven molecular design due to their non-sequential depiction of molecular substructures. To address this issue, we introduce Sequential Attachment-based Fragment Embedding (SAFE), a novel line notation for chemical structures. SAFE reimagines SMILES strings as an unordered sequence of interconnected fragmen…

    Submitted 10 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Code, data and models available at: https://github.com/datamol-io/safe/

  3. arXiv:2310.04292

    cs.LG

    Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

    Authors: Dominique Beaini, Shenyang Huang, Joao Alex Cunha, Zhiyi Li, Gabriela Moisescu-Pareja, Oleksandr Dymov, Samuel Maddrell-Mander, Callum McLean, Frederik Wenkel, Luis Müller, Jama Hussein Mohamud, Ali Parviz, Michael Craig, Michał Koziarski, Jiarui Lu, Zhaocheng Zhu, Cristian Gabellini, Kerstin Klaser, Josef Dean, Cas Wognum, Maciej Sypetkowski, Guillaume Rabusseau, Reihaneh Rabbany, Jian Tang, Christopher Morris, et al. (10 additional authors not shown)

    Abstract: Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, where datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by…

    Submitted 18 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  4. arXiv:2110.04126

    cs.LG cs.AI q-bio.BM

    3D Infomax improves GNNs for Molecular Property Prediction

    Authors: Hannes Stärk, Dominique Beaini, Gabriele Corso, Prudencio Tossou, Christian Dallago, Stephan Günnemann, Pietro Liò

    Abstract: Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their performance for many molecular tasks. However, this information is infeasible to compute at the scale required by several real-world applications. We propose pre-training a model to reason about the ge…

    Submitted 4 June, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: 39th International Conference on Machine Learning (ICML 2022). Also accepted at NeurIPS 2021 ML4PH, AI4S, and SSL workshops and as oral at ELLIS ML4Molecules. 24 pages, 7 figures, 18 tables

    Journal ref: 39th International Conference on Machine Learning (ICML 2022)

  5. arXiv:2106.03893

    cs.LG

    Rethinking Graph Transformers with Spectral Attention

    Authors: Devin Kreuzer, Dominique Beaini, William L. Hamilton, Vincent Létourneau, Prudencio Tossou

    Abstract: In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the $\textit{Spectral Attention Network}$ (SAN), which uses a learned positional encoding (LPE) that can take advantage of the full Laplacian s…

    Submitted 27 October, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted in Proceedings of NeurIPS 2021

  6. arXiv:2005.07852

    stat.ML cs.LG math.DG q-bio.GN

    Geodesics in fibered latent spaces: A geometric approach to learning correspondences between conditions

    Authors: Tariq Daouda, Reda Chhaibi, Prudencio Tossou, Alexandra-Chloé Villani

    Abstract: This work introduces a geometric framework and a novel network architecture for creating correspondences between samples of different conditions. Under this formalism, the latent space is a fiber bundle stratified into a base space encoding conditions, and a fiber space encoding the variations within conditions. Furthermore, this latent space is endowed with a natural pull-back metric. The corresp…

    Submitted 27 December, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: 36 pages, many figures. v1: Preliminary version. v2: Minor ref fix. v3: Submitted version with enhanced presentation

  7. arXiv:1905.12131

    cs.LG stat.ML

    Adaptive Deep Kernel Learning

    Authors: Prudencio Tossou, Basile Dura, Francois Laviolette, Mario Marchand, Alexandre Lacoste

    Abstract: Deep kernel learning provides an elegant and principled framework for combining the structural properties of deep learning algorithms with the flexibility of kernel methods. By means of a deep neural network, we learn a parametrized kernel operator that can be combined with a differentiable kernel algorithm during inference. While previous work within this framework has focused on learning a singl…

    Submitted 11 December, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

  8. arXiv:1905.11577

    cs.LG q-bio.BM stat.ML

    Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling

    Authors: Emmanuel Noutahi, Dominique Beaini, Julien Horwood, Sébastien Giguère, Prudencio Tossou

    Abstract: Recent work in graph neural networks (GNNs) has led to improvements in molecular activity and property prediction tasks. Unfortunately, GNNs often fail to capture the relative importance of interactions between molecular substructures, in part due to the absence of efficient intermediate pooling steps. To address these issues, we propose LaPool (Laplacian Pooling), a novel, data-driven, and interp…

    Submitted 2 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: 11 pages, with Appendices