-
A Novel Method for Inference of Acyclic Chemical Compounds with Bounded Branch-height Based on Artificial Neural Networks and Integer Programming
Authors:
Naveed Ahmed Azam,
Jianshen Zhu,
Yanming Sun,
Yu Shi,
Aleksandar Shurbevski,
Liang Zhao,
Hiroshi Nagamochi,
Tatsuya Akutsu
Abstract:
Analysis of chemical graphs is a major research topic in computational molecular biology due to its potential applications to drug design. One approach is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which infers chemical structures from given chemical activities/properties. Recently, a framework has been proposed for inverse QSAR/QSPR using artificial neural networks (ANN) and mixed integer linear programming (MILP). This method consists of a prediction phase and an inverse prediction phase. In the first phase, a feature vector $f(G)$ of a chemical graph $G$ is introduced and a prediction function $ψ$ on a chemical property $π$ is constructed with an ANN. In the second phase, given a target value $y^*$ of property $π$, a feature vector $x^*$ is inferred by solving an MILP formulated from the trained ANN so that $ψ(x^*)$ is close to $y^*$, and then a set of chemical structures $G^*$ such that $f(G^*)= x^*$ is enumerated by a graph search algorithm. The framework has been applied to the case of chemical compounds with cycle index up to 2. Computational experiments on instances with $n$ non-hydrogen atoms show that a feature vector $x^*$ can be inferred for up to around $n=40$, whereas graphs $G^*$ can be enumerated only for up to $n=15$. When applied to the case of chemical acyclic graphs, the maximum computable diameter of $G^*$ was around 8. We introduce a new characterization of graph structure, "branch-height," based on which an MILP formulation and a graph search algorithm are designed for chemical acyclic graphs. The results of computational experiments using properties such as octanol/water partition coefficient, boiling point and heat of combustion suggest that the proposed method can infer chemical acyclic graphs $G^*$ with $n=50$ and diameter 30.
Submitted 21 September, 2020;
originally announced September 2020.
-
Enumerating Chemical Graphs with Two Disjoint Cycles Satisfying Given Path Frequency Specifications
Authors:
Kyousuke Yamashita,
Ryuji Masui,
Xiang Zhou,
Chenxi Wang,
Aleksandar Shurbevski,
Hiroshi Nagamochi,
Tatsuya Akutsu
Abstract:
Enumerating chemical graphs satisfying given constraints is a fundamental problem in mathematical and computational chemistry, and plays an essential part in a recently proposed framework for inverse QSAR/QSPR. In this paper, constraints are given by feature vectors, each of which consists of the frequencies of paths in a given set of paths. We consider the problem of enumerating chemical graphs that satisfy the path frequency constraints, which are given by a pair of feature vectors specifying upper and lower bounds on the frequency of each path. We design a branch-and-bound algorithm for enumerating chemical graphs of bi-block 2-augmented structure, that is, graphs that contain two edge-disjoint cycles. We present some computational experiments with an implementation of our proposed algorithm.
Submitted 18 April, 2020;
originally announced April 2020.
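The path frequency constraints described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' branch-and-bound algorithm. It only computes a path frequency vector for one small graph and checks it against lower and upper bounds. The function names `path_frequencies` and `within_bounds`, the representation of a path as a tuple of atom labels, and the choice to ignore bond multiplicity are all assumptions made for illustration.

```python
from collections import Counter


def path_frequencies(labels, edges, max_edges):
    """Count label sequences along simple paths with at most max_edges edges.

    labels: dict mapping integer vertex id -> atom label (hypothetical format).
    edges: list of (u, v) pairs; bond multiplicity is ignored in this sketch.
    """
    adj = {v: [] for v in labels}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    counts = Counter()

    def extend(path):
        # Every path with at least one edge is reached from both of its ends;
        # record it only when walking from the smaller endpoint id.
        if len(path) == 1 or path[0] < path[-1]:
            seq = tuple(labels[v] for v in path)
            # A sequence and its reverse describe the same undirected path.
            counts[min(seq, seq[::-1])] += 1
        if len(path) - 1 < max_edges:
            for w in adj[path[-1]]:
                if w not in path:
                    extend(path + [w])

    for v in labels:
        extend([v])
    return counts


def within_bounds(freq, lower, upper):
    """Check lower <= freq <= upper entrywise; missing entries count as 0."""
    keys = set(freq) | set(lower) | set(upper)
    return all(lower.get(k, 0) <= freq.get(k, 0) <= upper.get(k, 0) for k in keys)


# Heavy-atom skeleton C-C-O, used purely as a toy input.
freq = path_frequencies({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)], max_edges=2)
```

An enumeration algorithm would keep only those graphs whose vector passes a check like `within_bounds`; in this sketch, any path absent from `upper` is treated as forbidden.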
-
Enumerating Chemical Graphs with Mono-block 2-Augmented Tree Structure from Given Upper and Lower Bounds on Path Frequencies
Authors:
Yuui Tamura,
Yuhei Nishiyama,
Chenxi Wang,
Yanming Sun,
Aleksandar Shurbevski,
Hiroshi Nagamochi,
Tatsuya Akutsu
Abstract:
We consider the problem of enumerating chemical graphs from given constraints on their structures, which has an important application to a recently proposed method for inverse QSAR/QSPR. In this paper, the structure of a chemical graph is specified by a feature vector, each of whose entries represents the frequency of a prescribed path. We call a graph a 2-augmented tree if it is obtained from a tree (a connected acyclic graph) by adding edges between two pairs of nonadjacent vertices. Given a set of feature vectors specified as the interval between a lower-bound and an upper-bound feature vector, we design an efficient algorithm for enumerating chemical 2-augmented trees that satisfy the path frequencies specified by some feature vector in the set. We implemented the proposed algorithm and conducted some computational experiments.
Submitted 14 April, 2020;
originally announced April 2020.
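The definition of a 2-augmented tree given in the abstract above can be made concrete with a short Python sketch. It assumes a simple graph on vertices `0..n-1` and relies on the fact that a connected simple graph is a tree plus two extra edges exactly when it has `n + 1` edges. The function name `is_2_augmented_tree` and the input format are hypothetical, and the check is not part of the enumeration algorithm itself.

```python
from collections import deque


def is_2_augmented_tree(n, edges):
    """Return True if the simple graph on vertices 0..n-1 is a 2-augmented tree.

    Assumption for this sketch: a 2-augmented tree is a connected simple
    graph with exactly n + 1 edges (a spanning tree plus two extra edges).
    """
    edge_set = {frozenset(e) for e in edges}
    # Reject self-loops and repeated edges, since the graph must be simple.
    if len(edge_set) != len(edges) or any(len(e) != 2 for e in edge_set):
        return False
    if n < 1 or len(edges) != n + 1:
        return False

    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Breadth-first search from vertex 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n


# A 4-cycle with one chord: 4 vertices, 5 edges, connected.
square_with_chord = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
result = is_2_augmented_tree(4, square_with_chord)
```

Both cycles in this example share edges within a single block. Two triangles joined by a bridge would also pass the check, but with two edge-disjoint cycles.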