Showing 1–5 of 5 results for author: Sanchez-Lengeling, B

  1. arXiv:2212.01574  [pdf, other]

    cs.CE cs.AI

    Calibration and generalizability of probabilistic models on low-data chemical datasets with DIONYSUS

    Authors: Gary Tom, Riley J. Hickman, Aniket Zinzuwadia, Afshan Mohajeri, Benjamin Sanchez-Lengeling, Alan Aspuru-Guzik

    Abstract: Deep learning models that leverage large datasets are often the state of the art for modelling molecular properties. When the datasets are smaller (< 2000 molecules), it is not clear that deep learning approaches are the right modelling tool. In this work we perform an extensive study of the calibration and generalizability of probabilistic machine learning models on small chemical datasets. Using…

    Submitted 6 December, 2022; v1 submitted 3 December, 2022; originally announced December 2022.

    Comments: 15+4 pages, 9+3 figures. Fix author name typo in article and metadata

  2. arXiv:1910.10685  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules

    Authors: Benjamin Sanchez-Lengeling, Jennifer N. Wei, Brian K. Lee, Richard C. Gerkin, Alán Aspuru-Guzik, Alexander B. Wiltschko

    Abstract: Predicting the relationship between a molecule's structure and its odor remains a difficult, decades-old task. This problem, termed quantitative structure-odor relationship (QSOR) modeling, is an important challenge in chemistry, impacting human nutrition, manufacture of synthetic fragrance, the environment, and sensory neuroscience. We propose the use of graph neural networks for QSOR, and show t…

    Submitted 25 October, 2019; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: 18 pages, 13 figures

  3. arXiv:1811.12823  [pdf, other]

    cs.LG cs.AI cs.DB stat.ML

    Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

    Authors: Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alan Aspuru-Guzik, Alex Zhavoronkov

    Abstract: Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is unclear how to compare an…

    Submitted 28 October, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

  4. arXiv:1705.10843  [pdf, other]

    stat.ML cs.LG

    Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models

    Authors: Gabriel Lima Guimaraes, Benjamin Sanchez-Lengeling, Carlos Outeiral, Pedro Luis Cunha Farias, Alán Aspuru-Guzik

    Abstract: In unsupervised data generation tasks, besides the generation of a sample based on previous observations, one would often like to give hints to the model in order to bias the generation towards desirable metrics. We propose a method that combines Generative Adversarial Networks (GANs) and reinforcement learning (RL) in order to accomplish exactly that. While RL biases the data generation process t…

    Submitted 6 February, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

    Comments: 10 pages, 7 figures

  5. arXiv:1610.02415  [pdf, other]

    cs.LG physics.chem-ph

    Automatic chemical design using a data-driven continuous representation of molecules

    Authors: Rafael Gómez-Bombarelli, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, Alán Aspuru-Guzik

    Abstract: We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an enc…

    Submitted 5 December, 2017; v1 submitted 7 October, 2016; originally announced October 2016.

    Comments: 26 pages, 8 figures