-
NCIDiff: Non-covalent Interaction-generative Diffusion Model for Improving Reliability of 3D Molecule Generation Inside Protein Pocket
Authors:
Joongwon Lee,
Wonho Zhung,
Woo Youn Kim
Abstract:
Advances in deep generative modeling have changed the paradigm of drug discovery. Among such approaches, target-aware methods that exploit the 3D structures of protein pockets have drawn attention for generating ligand molecules together with plausible binding modes. While docking scores superficially assess the quality of generated ligands, closer inspection of the binding structures reveals inconsistencies in the local interactions between a pocket and the generated ligands. Here, we address the issue by explicitly generating non-covalent interactions (NCIs), which are universal patterns throughout protein-ligand complexes. Our proposed model, NCIDiff, simultaneously denoises the NCI types of protein-ligand edges along with the 3D graph of a ligand molecule during sampling. With this NCI-generating strategy, our model generates ligands with more reliable NCIs, notably outperforming the baseline diffusion-based models. We also adopt inpainting techniques on NCIs to further improve the quality of the generated molecules. Finally, we showcase the applicability of NCIDiff to real-world drug design tasks with specialized objectives by guiding the generation process with desired NCI patterns.
Submitted 27 May, 2024;
originally announced May 2024.
-
Diffusion-based Generative AI for Exploring Transition States from 2D Molecular Graphs
Authors:
Seonghwan Kim,
Jeheon Woo,
Woo Youn Kim
Abstract:
The exploration of transition state (TS) geometries is crucial for elucidating chemical reaction mechanisms and modeling their kinetics. Recently, machine learning (ML) models have shown remarkable performance in predicting TS geometries. However, they require 3D conformations of reactants and products, often with appropriate orientations, as input, which demands substantial effort and computational cost. Here, we propose a generative approach based on the stochastic diffusion method, namely TSDiff, for predicting TS geometries from 2D molecular graphs alone. TSDiff outperformed existing ML models that use 3D geometries in terms of both accuracy and efficiency. Moreover, it can sample diverse TS conformations because it learned the distribution of TS geometries for diverse reactions during training. As a result, TSDiff was able to find more favorable reaction pathways with lower barrier heights than those in the reference database. These results demonstrate the potential of TSDiff for efficient and reliable TS exploration.
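The abstract does not spell out the diffusion formulation used by TSDiff, so the following is only a generic illustration of the forward (noising) half of a DDPM-style diffusion process applied to atomic coordinates. The linear schedule, its default values, and the function names `linear_beta_schedule`, `cumulative_alphas`, and `noise_coordinates` are assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the default values are illustrative."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s <= t} (1 - beta_s) for every step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_coordinates(coords, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, I)."""
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(scale_signal * c + scale_noise * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Toy three-atom geometry (arbitrary numbers, not a real transition state).
toy_coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.6, 1.0, 0.0)]
slightly_noised = noise_coordinates(toy_coords, alpha_bars[0], rng)
fully_noised = noise_coordinates(toy_coords, alpha_bars[-1], rng)
```

A generative model of this family is trained to reverse such a noising process, so that sampling starts from near-pure noise and ends at a plausible geometry. How TSDiff conditions that reverse process on the 2D molecular graph is not described in the abstract and is not shown here.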
Submitted 12 October, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
GeoTMI: Predicting quantum chemical property with easy-to-obtain geometry via positional denoising
Authors:
Hyeonsu Kim,
Jeheon Woo,
Seonghwan Kim,
Seokhyun Moon,
Jun Hyeong Kim,
Woo Youn Kim
Abstract:
Because quantum chemical properties depend on molecular geometry, graph neural networks (GNNs) using 3D geometric information have achieved high prediction accuracy in many tasks. However, they often require 3D geometries obtained from high-level quantum mechanical calculations, which are practically infeasible, limiting their applicability to real-world problems. To tackle this, we propose a new training framework, GeoTMI, that employs a denoising process to predict properties accurately from easy-to-obtain geometries (corrupted versions of correct geometries, such as those obtained from low-level calculations). Our starting point was the idea that the correct geometry is the best description of the target property. Hence, to incorporate information about the correct geometry, GeoTMI aims to maximize the mutual information between three variables: the correct geometry, the corrupted geometry, and the property. GeoTMI also explicitly updates the corrupted input to approach the correct geometry as it passes through the GNN layers, contributing to more effective denoising. We investigated the performance of the proposed method using 3D GNNs for three prediction tasks: molecular properties, a chemical reaction property, and relaxed energy in a heterogeneous catalytic system. Our results showed consistent improvements in accuracy across these tasks, demonstrating the effectiveness and robustness of GeoTMI.
Submitted 14 December, 2023; v1 submitted 28 March, 2023;
originally announced April 2023.
-
Molecular Generative Model Based On Adversarially Regularized Autoencoder
Authors:
Seung Hwan Hong,
Jaechang Lim,
Seongok Ryu,
Woo Youn Kim
Abstract:
Deep generative models are attracting great attention as a promising new approach to molecular design. All models reported so far are based on either a variational autoencoder (VAE) or a generative adversarial network (GAN). Here we propose a new type of model based on an adversarially regularized autoencoder (ARAE). It uses latent variables like a VAE, but the distribution of the latent variables is obtained by adversarial training as in a GAN. This design is intended to avoid both the inappropriate approximation of the posterior distribution in a VAE and the difficulty of handling discrete variables in a GAN. Our benchmark study showed that ARAE indeed outperformed conventional models in terms of validity, uniqueness, and novelty per generated molecule. We also demonstrated successful conditional generation of drug-like molecules with ARAE under both single-property and multiple-property control. As a potential real-world application, we generated EGFR inhibitors that share the scaffolds of known active molecules while simultaneously satisfying drug-like conditions.
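The abstract names validity, uniqueness, and novelty but does not give their formulas. The sketch below uses one common convention (validity over all generated molecules, uniqueness over valid ones, novelty over unique ones); the paper's exact normalization may differ, and the abstract's phrase "per generated molecule" suggests it might. The function `generation_metrics` and the stand-in checker `toy_is_valid` are hypothetical: a real pipeline would use a cheminformatics parser and canonicalize the strings before comparing them.

```python
def generation_metrics(generated, training_set, is_valid):
    """Compute validity, uniqueness, and novelty under one common convention.

    validity   = valid molecules / generated molecules
    uniqueness = distinct valid molecules / valid molecules
    novelty    = distinct valid molecules absent from training / distinct valid molecules
    """
    valid = [mol for mol in generated if is_valid(mol)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles):
    """Stand-in check: non-empty with balanced parentheses. NOT a chemical validity test."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and len(smiles) > 0


# Toy data: one duplicate, one malformed string, one molecule seen in training.
generated = ["CCO", "CCO", "CC(C)O", "CC(C", "c1ccccc1"]
training = {"CCO"}
metrics = generation_metrics(generated, training, toy_is_valid)
```

With this toy input, four of the five strings pass the stand-in check, three of those four are distinct, and two of those three are absent from the training set.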
Submitted 12 November, 2019;
originally announced December 2019.
-
Uncertainty quantification of molecular property prediction using Bayesian neural network models
Authors:
Seongok Ryu,
Yongchan Kwon,
Woo Youn Kim
Abstract:
In chemistry, deep neural network models have been increasingly utilized in a variety of applications such as molecular property prediction, novel molecule design, and chemical reaction planning. Despite the rapid increase in the use of state-of-the-art models and algorithms, deep neural network models often produce poor predictions in real applications because model performance is highly dependent on the quality of the training data. In the field of molecular analysis, data are mostly obtained from either complicated chemical experiments or approximate mathematical equations, so the quality of the data may be questionable. In this paper, we quantify the uncertainty of predictions using Bayesian neural networks in molecular property prediction. We estimate both model-driven and data-driven uncertainties, demonstrating the usefulness of uncertainty quantification as both a quality checker and a confidence indicator in three experiments. Our results show that uncertainty quantification is necessary for more reliable molecular applications and that Bayesian neural network models can be a practical approach.
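The abstract does not state how the two uncertainties are computed. A widely used decomposition for Bayesian neural networks, shown here as an assumption rather than as the paper's procedure, draws several stochastic forward passes that each return a predicted mean and a predicted noise variance. The variance of the means is then read as the model-driven (epistemic) part and the average predicted variance as the data-driven (aleatoric) part. The function name `decompose_uncertainty` and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(sampled_means, sampled_variances):
    """Split predictive variance into epistemic and aleatoric parts.

    sampled_means[i] and sampled_variances[i] are the mean and noise variance
    predicted by the i-th stochastic forward pass for one input molecule.
    """
    if not sampled_means or len(sampled_means) != len(sampled_variances):
        raise ValueError("need equally many means and variances, at least one each")
    prediction = fmean(sampled_means)     # final point prediction
    epistemic = pvariance(sampled_means)  # spread across model samples
    aleatoric = fmean(sampled_variances)  # average predicted data noise
    return prediction, epistemic, aleatoric


# Toy values standing in for four stochastic forward passes.
means = [1.0, 1.2, 0.8, 1.0]
variances = [0.10, 0.12, 0.08, 0.10]
prediction, epistemic, aleatoric = decompose_uncertainty(means, variances)
total_variance = epistemic + aleatoric
```

Under this reading, a large epistemic term points to inputs the model has seen too little of, while a large aleatoric term points to noisy labels, which fits the abstract's description of uncertainty as both a confidence indicator and a data quality checker.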
Submitted 18 November, 2018;
originally announced May 2019.
-
Importance of local exact exchange potential in hybrid functionals for accurate excited states
Authors:
Jaewook Kim,
Kwangwoo Hong,
Sang-Yeon Hwang,
Seongok Ryu,
Sunghwan Choi,
Woo Youn Kim
Abstract:
Density functional theory has been an essential analysis tool for both theoretical and experimental chemists since accurate hybrid functionals were developed. Here we propose a local hybrid method derived from the optimized effective potential (OEP) method and compare its distinct features with those of conventional nonlocal hybrids based on the Hartree-Fock (HF) exchange operator. Both are formally exact for ground states and thus show similar accuracy for atomization energies and reaction barrier heights. For excited states, the local version yields virtual orbitals with N-electron character, while those of the nonlocal version have a mixed character between N- and (N+1)-electron orbitals. As a result, the orbital energy gaps from the former approximate excitation energies well, with a small mean absolute error (MAE = 0.40 eV) for the Caricato benchmark set. The correction from time-dependent density functional theory with a simple local density approximation kernel further improves the accuracy by incorporating multi-configurational effects, resulting in a total MAE of 0.27 eV that outperforms conventional functionals except for MN15.
Submitted 28 October, 2016;
originally announced October 2016.