Map** the Space of Chemical Reactions Using Attention-Based Neural Networks
Authors:
Philippe Schwaller,
Daniel Probst,
Alain C. Vaucher,
Vishnu H. Nair,
David Kreutter,
Teodoro Laino,
Jean-Louis Reymond
Abstract:
Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires the identification of the corresponding reaction class template via annotation of the number of mole…
▽ More
Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires the identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction center, and the distinction between reactants and reagents. This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions. Our best model reaches a classification accuracy of 98.2%. We also show that the learned representations can be used as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints. The insights into chemical reaction space enabled by our learned fingerprints are illustrated by an interactive reaction atlas providing visual clustering and similarity searching.
△ Less
Submitted 9 December, 2020;
originally announced December 2020.
Predicting retrosynthetic pathways using a combined linguistic model and hyper-graph exploration strategy
Authors:
Philippe Schwaller,
Riccardo Petraglia,
Valerio Zullo,
Vishnu H Nair,
Rico Andreas Haeuselmann,
Riccardo Pisoni,
Costas Bekas,
Anna Iuliano,
Teodoro Laino
Abstract:
We present an extension of our Molecular Transformer architecture combined with a hyper-graph exploration strategy for automatic retrosynthesis route planning without human intervention. The single-step retrosynthetic model sets a new state of the art for predicting reactants as well as reagents, solvents and catalysts for each retrosynthetic step. We introduce new metrics (coverage, class diversi…
▽ More
We present an extension of our Molecular Transformer architecture combined with a hyper-graph exploration strategy for automatic retrosynthesis route planning without human intervention. The single-step retrosynthetic model sets a new state of the art for predicting reactants as well as reagents, solvents and catalysts for each retrosynthetic step. We introduce new metrics (coverage, class diversity, round-trip accuracy and Jensen-Shannon divergence) to evaluate the single-step retrosynthetic models, using the forward prediction and a reaction classification model always based on the transformer architecture. The hypergraph is constructed on the fly, and the nodes are filtered and further expanded based on a Bayesian-like probability. We critically assessed the end-to-end framework with several retrosynthesis examples from literature and academic exams. Overall, the frameworks has a very good performance with few weaknesses due to the bias induced during the training process. The use of the newly introduced metrics opens up the possibility to optimize entire retrosynthetic frameworks through focusing on the performance of the single-step model only.
△ Less
Submitted 17 October, 2019;
originally announced October 2019.