$\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials
Authors:
Kuzma Khrabrov,
Anton Ber,
Artem Tsypin,
Konstantin Ushenin,
Egor Rumiantsev,
Alexander Telepov,
Dmitry Protasov,
Ilya Shenbin,
Anton Alekseev,
Mikhail Shirokikh,
Sergey Nikolenko,
Elena Tutubalina,
Artur Kadurin
Abstract:
Methods of computational quantum chemistry provide accurate approximations of molecular properties crucial for computer-aided drug discovery and other areas of chemical science. However, high computational complexity limits the scalability of their applications. Neural network potentials (NNPs) are a promising alternative to quantum chemistry methods, but they require large and diverse datasets for training. This work presents a new dataset and benchmark called $\nabla^2$DFT that is based on nablaDFT. It contains twice as many molecular structures, three times more conformations, new data types and tasks, and state-of-the-art models. The dataset includes energies, forces, 17 molecular properties, Hamiltonian and overlap matrices, and a wavefunction object. All calculations were performed at the DFT level ($\omega$B97X-D/def2-SVP) for each conformation. Moreover, $\nabla^2$DFT is the first dataset that contains relaxation trajectories for a substantial number of drug-like molecules. We also introduce a novel benchmark for evaluating NNPs in molecular property prediction, Hamiltonian prediction, and conformational optimization tasks. Finally, we propose an extendable framework for training NNPs and implement 10 models within it.
Submitted 20 June, 2024;
originally announced June 2024.
Optimal partitions of the flat torus into parts of smaller diameter
Authors:
Dmitry Protasov,
Alexander Tolmachev,
Vsevolod Voronov
Abstract:
We consider the problem of partitioning a two-dimensional flat torus $T^2$ into $m$ sets in order to minimize the maximal diameter of a part. For $m \leqslant 25$ we give numerical estimates for the maximal diameter $d_m(T^2)$ at which the partition exists. Several approaches are proposed to obtain such estimates. In particular, we use a search for mesh partitions via a SAT solver, a global optimization approach for polygonal partitions, and the optimization of periodic hexagonal tilings. For $m=3$, the exact estimate is proved using elementary topological reasoning.
Submitted 6 February, 2024;
originally announced February 2024.
Coverings of planar and three-dimensional sets with subsets of smaller diameter
Authors:
Alexander Tolmachev,
Dmitry Protasov,
Vsevolod Voronov
Abstract:
Quantitative estimates related to the classical Borsuk problem of splitting a set in Euclidean space into subsets of smaller diameter are considered. For a given $k$ there is a minimal diameter of subsets at which there exists a covering with $k$ subsets of any planar set of unit diameter. In order to find an upper estimate of the minimal diameter we propose an algorithm for finding sub-optimal partitions. In the cases $10 \leqslant k \leqslant 17$ some upper and lower estimates of the minimal diameter are improved. Another result is that any set $M \subset \mathbb{R}^3$ of unit diameter can be partitioned into four subsets of diameter not greater than $0.966$.
Submitted 22 October, 2022;
originally announced October 2022.