Utilizing Reinforcement Learning for de novo Drug Design
Authors:
Hampus Gummesson Svensson,
Christian Tyrchan,
Ola Engkvist,
Morteza Haghir Chehreghani
Abstract:
Deep learning-based approaches for generating novel drug molecules with specific properties have gained a lot of interest in the last few years. Recent studies have demonstrated promising performance for string-based generation of novel molecules utilizing reinforcement learning. In this paper, we develop a unified framework for using reinforcement learning for de novo drug design, wherein we syst…
▽ More
Deep learning-based approaches for generating novel drug molecules with specific properties have gained a lot of interest in the last few years. Recent studies have demonstrated promising performance for string-based generation of novel molecules utilizing reinforcement learning. In this paper, we develop a unified framework for using reinforcement learning for de novo drug design, wherein we systematically study various on- and off-policy reinforcement learning algorithms and replay buffers to learn an RNN-based policy to generate novel molecules predicted to be active against the dopamine receptor DRD2. Our findings suggest that it is advantageous to use at least both top-scoring and low-scoring molecules for updating the policy when structural diversity is essential. Using all generated molecules at an iteration seems to enhance performance stability for on-policy algorithms. In addition, when replaying high, intermediate, and low-scoring molecules, off-policy algorithms display the potential of improving the structural diversity and number of active molecules generated, but possibly at the cost of a longer exploration phase. Our work provides an open-source framework enabling researchers to investigate various reinforcement learning methods for de novo drug design.
△ Less
Submitted 30 January, 2024; v1 submitted 30 March, 2023;
originally announced March 2023.
Autonomous Drug Design with Multi-Armed Bandits
Authors:
Hampus Gummesson Svensson,
Esben Jannik Bjerrum,
Christian Tyrchan,
Ola Engkvist,
Morteza Haghir Chehreghani
Abstract:
Recent developments in artificial intelligence and automation support a new drug design paradigm: autonomous drug design. Under this paradigm, generative models can provide suggestions on thousands of molecules with specific properties, and automated laboratories can potentially make, test and analyze molecules with minimal human supervision. However, since still only a limited number of molecules…
▽ More
Recent developments in artificial intelligence and automation support a new drug design paradigm: autonomous drug design. Under this paradigm, generative models can provide suggestions on thousands of molecules with specific properties, and automated laboratories can potentially make, test and analyze molecules with minimal human supervision. However, since still only a limited number of molecules can be synthesized and tested, an obvious challenge is how to efficiently select among provided suggestions in a closed-loop system. We formulate this task as a stochastic multi-armed bandit problem with multiple plays, volatile arms and similarity information. To solve this task, we adapt previous work on multi-armed bandits to this setting, and compare our solution with random sampling, greedy selection and decaying-epsilon-greedy selection strategies. According to our simulation results, our approach has the potential to perform better exploration and exploitation of the chemical space for autonomous drug design.
△ Less
Submitted 20 January, 2023; v1 submitted 4 July, 2022;
originally announced July 2022.