Scoreformer: A Surrogate Model For Large-Scale Prediction of Docking Scores
Authors:
Álvaro Ciudad,
Adrián Morales-Pastor,
Laura Malo,
Isaac Filella-Mercè,
Victor Guallar,
Alexis Molina
Abstract:
In this study, we present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores, thereby optimizing high-throughput virtual screening (HTVS) in drug discovery. The architecture integrates Principal Neighborhood Aggregation (PNA) and Learnable Random Walk Positional Encodings (LRWPE), enhancing the model's ability to understand complex molecular struct…
▽ More
In this study, we present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores, thereby optimizing high-throughput virtual screening (HTVS) in drug discovery. The architecture integrates Principal Neighborhood Aggregation (PNA) and Learnable Random Walk Positional Encodings (LRWPE), enhancing the model's ability to understand complex molecular structures and their relationship with their respective docking scores. This approach significantly surpasses traditional HTVS methods and recent Graph Neural Network (GNN) models in both recovery and efficiency due to a wider coverage of the chemical space and enhanced performance. Our results demonstrate that ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models. We evaluated ScoreFormer across multiple datasets under various conditions, confirming its robustness and reliability in identifying potential drug candidates rapidly.
△ Less
Submitted 25 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
Optimizing Drug Design by Merging Generative AI With Active Learning Frameworks
Authors:
Isaac Filella-Merce,
Alexis Molina,
Marek Orzechowski,
Lucía Díaz,
Yang Ming Zhu,
Julia Vilalta Mor,
Laura Malo,
Ajay S Yekkirala,
Soumya Ray,
Victor Guallar
Abstract:
Traditional drug discovery programs are being transformed by the advent of machine learning methods. Among these, Generative AI methods (GM) have gained attention due to their ability to design new molecules and enhance specific properties of existing ones. However, current GM methods have limitations, such as low affinity towards the target, unknown ADME/PK properties, or the lack of synthetic tr…
▽ More
Traditional drug discovery programs are being transformed by the advent of machine learning methods. Among these, Generative AI methods (GM) have gained attention due to their ability to design new molecules and enhance specific properties of existing ones. However, current GM methods have limitations, such as low affinity towards the target, unknown ADME/PK properties, or the lack of synthetic tractability. To improve the applicability domain of GM methods, we have developed a workflow based on a variational autoencoder coupled with active learning steps. The designed GM workflow iteratively learns from molecular metrics, including drug likeliness, synthesizability, similarity, and docking scores. In addition, we also included a hierarchical set of criteria based on advanced molecular modeling simulations during a final selection step. We tested our GM workflow on two model systems, CDK2 and KRAS. In both cases, our model generated chemically viable molecules with a high predicted affinity toward the targets. Particularly, the proportion of high-affinity molecules inferred by our GM workflow was significantly greater than that in the training data. Notably, we also uncovered novel scaffolds significantly dissimilar to those known for each target. These results highlight the potential of our GM workflow to explore novel chemical space for specific targets, thereby opening up new possibilities for drug discovery endeavors.
△ Less
Submitted 4 May, 2023;
originally announced May 2023.