-
Machine learning Hubbard parameters with equivariant neural networks
Authors:
Martin Uhrin,
Austin Zadoks,
Luca Binci,
Nicola Marzari,
Iurii Timrov
Abstract:
Density-functional theory with extended Hubbard functionals (DFT+$U$+$V$) provides a robust framework to accurately describe complex materials containing transition-metal or rare-earth elements. It does so by mitigating self-interaction errors inherent to semi-local functionals which are particularly pronounced in systems with partially-filled $d$ and $f$ electronic states. However, achieving accuracy in this approach hinges upon the accurate determination of the on-site $U$ and inter-site $V$ Hubbard parameters. In practice, these are obtained either by semi-empirical tuning, requiring prior knowledge, or, more correctly, by using predictive but expensive first-principles calculations. Here, we present a machine learning model based on equivariant neural networks which uses atomic occupation matrices as descriptors, directly capturing the electronic structure, local chemical environment, and oxidation states of the system at hand. We target here the prediction of Hubbard parameters computed self-consistently with iterative linear-response calculations, as implemented in density-functional perturbation theory (DFPT), and structural relaxations. Remarkably, when trained on data from 11 materials spanning various crystal structures and compositions, our model achieves mean absolute relative errors of 3% and 5% for Hubbard $U$ and $V$ parameters, respectively. By circumventing computationally expensive DFT or DFPT self-consistent protocols, our model significantly expedites the prediction of Hubbard parameters with negligible computational overhead, while approaching the accuracy of DFPT. Moreover, owing to its robust transferability, the model facilitates accelerated materials discovery and design via high-throughput calculations, with relevance for various technological applications.
Submitted 4 June, 2024;
originally announced June 2024.
-
Optimal sports betting strategies in practice: an experimental review
Authors:
Matej Uhrín,
Gustav Šourek,
Ondřej Hubáček,
Filip Železný
Abstract:
We investigate the most popular approaches to the problem of sports betting investment based on modern portfolio theory and the Kelly criterion. We define the problem setting, the formal investment strategies, and review their common modifications used in practice. The underlying purpose of the reviewed modifications is to mitigate the additional risk stemming from the unrealistic mathematical assumptions of the formal strategies. We test the resulting methods using a unified evaluation protocol for three sports: horse racing, basketball and soccer. The results show the practical necessity of the additional risk-control methods and demonstrate their individual benefits. Particularly, we show that an adaptive variant of the popular ``fractional Kelly'' method is a very suitable choice across a wide range of settings.
Submitted 15 July, 2021;
originally announced July 2021.
-
Data Management Plans: the Importance of Data Management in the BIG-MAP Project
Authors:
Ivano E. Castelli,
Daniel J. Arismendi-Arrieta,
Arghya Bhowmik,
Isidora Cekic-Laskovic,
Simon Clark,
Robert Dominko,
Eibar Flores,
Jackson Flowers,
Karina Ulvskov Frederiksen,
Jesper Friis,
Alexis Grimaud,
Karin Vels Hansen,
Laurence J. Hardwick,
Kersti Hermansson,
Lukas Königer,
Hanne Lauritzen,
Frédéric Le Cras,
Hongjiao Li,
Sandrine Lyonnard,
Henning Lorrmann,
Nicola Marzari,
Leszek Niedzicki,
Giovanni Pizzi,
Fuzhan Rahmanian,
Helge Stein
, et al. (5 additional authors not shown)
Abstract:
Open access to research data is increasingly important for accelerating research. Grant authorities therefore request detailed plans for how data is managed in the projects they finance. We have recently developed such a plan for the EU-H2020 BIG-MAP project - a cross-disciplinary project targeting disruptive battery-material discoveries. Essential for reaching the goal is extensive sharing of research data across scales, disciplines and stakeholders, not limited to BIG-MAP and the European BATTERY 2030+ initiative but within the entire battery community. The key challenges faced in developing the data management plan for such a large and complex project were to generate an overview of the enormous amount of data that will be produced, to build an understanding of the data flow within the project and to agree on a roadmap for making all data FAIR. This paper describes the process we followed and how we structured the plan.
Submitted 3 June, 2021;
originally announced June 2021.
-
Common workflows for computing material properties using different quantum engines
Authors:
Sebastiaan P. Huber,
Emanuele Bosoni,
Marnik Bercx,
Jens Bröder,
Augustin Degomme,
Vladimir Dikan,
Kristjan Eimre,
Espen Flage-Larsen,
Alberto Garcia,
Luigi Genovese,
Dominik Gresch,
Conrad Johnston,
Guido Petretto,
Samuel Poncé,
Gian-Marco Rignanese,
Christopher J. Sewell,
Berend Smit,
Vasily Tseplyaev,
Martin Uhrin,
Daniel Wortmann,
Aliaksandr V. Yakutovich,
Austin Zadoks,
Pezhman Zarabadi-Poor,
Bonan Zhu,
Nicola Marzari
, et al. (1 additional authors not shown)
Abstract:
The prediction of material properties through electronic-structure simulations based on density-functional theory has become routinely common, thanks, in part, to the steady increase in the number and robustness of available simulation packages. This plurality of codes and methods aiming to solve similar problems is both a boon and a burden. While providing great opportunities for cross-verification, these packages adopt different methods, algorithms, and paradigms, making it challenging to choose, master, and efficiently use any one for a given task. Leveraging recent advances in managing reproducible scientific workflows, we demonstrate how developing common interfaces for workflows that automatically compute material properties can tackle the challenge mentioned above, greatly simplifying interoperability and cross-verification. We introduce design rules for reproducible and reusable code-agnostic workflow interfaces to compute well-defined material properties, which we implement for eleven different quantum engines and use to compute three different material properties. Each implementation encodes carefully selected simulation parameters and workflow logic, making the implementer's expertise of the quantum engine directly available to non-experts. Full provenance and reproducibility of the workflows is guaranteed through the use of the AiiDA infrastructure. All workflows are made available as open-source and come pre-installed with the Quantum Mobile virtual machine, making their use straightforward.
Submitted 11 May, 2021;
originally announced May 2021.
-
Through the eyes of a descriptor: Constructing complete, invertible descriptions of atomic environments
Authors:
Martin Uhrin
Abstract:
In this work we apply methods for describing 3D images to the problem of encoding atomic environments in a way that is invariant to rotations, translations, and permutations of the atoms and, crucially, can be decoded back into the original environment modulo global orientation without the need for training a model. From the point of view of decoding, the descriptor is optimally complete and can be extended to arbitrary order, allowing for a systematic convergence of the fidelity of the description. In experiments on molecules ranging from 3 to 29 atoms in size, we demonstrate that positions can be decoded with a 97% success rate and positions plus species with a 70% rate of success, rising to 95% if a second fingerprint is used. In all cases, consistent recovery is observed for molecules with 17 or fewer atoms. Additionally, we evaluate the descriptor's performance in predicting the energies and forces of bulk Ni, Cu, Li, Mo, Si and Ge by means of a neural network model trained on DFT data. When comparing to six machine learning interaction potential methods that use various descriptors and regression schemes, our descriptor is found to be competitive, in several cases outperforming well established methods. The combined ability to both decode and make property predictions from a representation that does not need to be learned lays the foundations for a novel way of building generative models that are tasked with solving the inverse problem of predicting atomic arrangements that are statistically likely to have certain desired properties.
Submitted 27 October, 2021; v1 submitted 19 April, 2021;
originally announced April 2021.
-
OPTIMADE, an API for exchanging materials data
Authors:
Casper W. Andersen,
Rickard Armiento,
Evgeny Blokhin,
Gareth J. Conduit,
Shyam Dwaraknath,
Matthew L. Evans,
Ádám Fekete,
Abhijith Gopakumar,
Saulius Gražulis,
Andrius Merkys,
Fawzi Mohamed,
Corey Oses,
Giovanni Pizzi,
Gian-Marco Rignanese,
Markus Scheidgen,
Leopold Talirz,
Cormac Toher,
Donald Winston,
Rossella Aversa,
Kamal Choudhary,
Pauline Colinet,
Stefano Curtarolo,
Davide Di Stefano,
Claudia Draxl,
Suleyman Er
, et al. (31 additional authors not shown)
Abstract:
The Open Databases Integration for Materials Design (OPTIMADE) consortium has designed a universal application programming interface (API) to make materials databases accessible and interoperable. We outline the first stable release of the specification, v1.0, which is already supported by many leading databases and several software packages. We illustrate the advantages of the OPTIMADE API through worked examples on each of the public materials databases that support the full API specification.
Submitted 25 August, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows
Authors:
Martin Uhrin,
Sebastiaan P. Huber,
Jusong Yu,
Nicola Marzari,
Giovanni Pizzi
Abstract:
Over the last two decades, the field of computational science has seen a dramatic shift towards incorporating high-throughput computation and big-data analysis as fundamental pillars of the scientific discovery process. This has necessitated the development of tools and techniques to deal with the generation, storage and processing of large amounts of data. In this work we present an in-depth look at the workflow engine powering AiiDA, a widely adopted, highly flexible and database-backed informatics infrastructure with an emphasis on data reproducibility. We detail many of the design choices that were made which were informed by several important goals: the ability to scale from running on individual laptops up to high-performance supercomputers, managing jobs with runtimes spanning from fractions of a second to weeks and scaling up to thousands of jobs concurrently, and all this while maximising robustness. In short, AiiDA aims to be a Swiss army knife for high-throughput computational science. As well as the architecture, we outline important API design choices made to give workflow writers a great deal of liberty whilst guiding them towards writing robust and modular workflows, ultimately enabling them to encode their scientific knowledge to the benefit of the wider scientific community.
Submitted 21 July, 2020; v1 submitted 17 July, 2020;
originally announced July 2020.
-
kiwiPy: Robust, high-volume, messaging for big-data and computational science workflows
Authors:
Martin Uhrin,
Sebastiaan P. Huber
Abstract:
In this work we present kiwiPy, a Python library designed to support robust message based communication for high-throughput, big-data, applications while being general enough to be useful wherever high-volumes of messages need to be communicated in a predictable manner. KiwiPy relies on the RabbitMQ protocol, an industry standard message broker, while providing a simple and intuitive interface that can be used in both multithreaded and coroutine based applications. To demonstrate some of kiwiPy's functionality we give examples from AiiDA, a high-throughput simulation platform, where kiwiPy is used as a key component of the workflow engine.
Submitted 15 May, 2020;
originally announced May 2020.
-
Materials Cloud, a platform for open computational science
Authors:
Leopold Talirz,
Snehal Kumbhar,
Elsa Passaro,
Aliaksandr V. Yakutovich,
Valeria Granata,
Fernando Gargiulo,
Marco Borelli,
Martin Uhrin,
Sebastiaan P. Huber,
Spyros Zoupanos,
Carl S. Adorf,
Casper W. Andersen,
Ole Schütt,
Carlo A. Pignedoli,
Daniele Passerone,
Joost VandeVondele,
Thomas C. Schulthess,
Berend Smit,
Giovanni Pizzi,
Nicola Marzari
Abstract:
Materials Cloud is a platform designed to enable open and seamless sharing of resources for computational science, driven by applications in materials modelling. It hosts 1) archival and dissemination services for raw and curated data, together with their provenance graph, 2) modelling services and virtual machines, 3) tools for data analytics, and pre-/post-processing, and 4) educational materials. Data is citable and archived persistently, providing a comprehensive embodiment of the FAIR principles that extends to computational workflows. Materials Cloud leverages the AiiDA framework to record the provenance of entire simulation pipelines (calculations performed, codes used, data generated) in the form of graphs that allow to retrace and reproduce any computed result. When an AiiDA database is shared on Materials Cloud, peers can browse the interconnected record of simulations, download individual files or the full database, and start their research from the results of the original authors. The infrastructure is agnostic to the specific simulation codes used and can support diverse applications in computational science that transcend its initial materials domain.
Submitted 27 March, 2020;
originally announced March 2020.
-
AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance
Authors:
Sebastiaan P. Huber,
Spyros Zoupanos,
Martin Uhrin,
Leopold Talirz,
Leonid Kahle,
Rico Häuselmann,
Dominik Gresch,
Tiziano Müller,
Aliaksandr V. Yakutovich,
Casper W. Andersen,
Francisco F. Ramirez,
Carl S. Adorf,
Fernando Gargiulo,
Snehal Kumbhar,
Elsa Passaro,
Conrad Johnston,
Andrius Merkys,
Andrea Cepellotti,
Nicolas Mounet,
Nicola Marzari,
Boris Kozinsky,
Giovanni Pizzi
Abstract:
The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supercomputers will harden these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (http://www.aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands processes/hour, while automatically preserving and storing the full data provenance in a relational database making it queryable and traversable, thus enabling high-performance data analytics. AiiDA's workflow language provides advanced automation, error handling features and a flexible plugin model to allow interfacing with any simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible.
Submitted 24 March, 2020;
originally announced March 2020.
-
Single-Layered Hittorf's Phosphorus: A Wide-Bandgap High Mobility 2D Material
Authors:
Georg Schusteritsch,
Martin Uhrin,
Chris J. Pickard
Abstract:
We propose here a two-dimensional material based on a single layer of violet or Hittorf's phosphorus. Using first-principles density functional theory, we find it to be energetically very stable, comparable to other previously proposed single-layered phosphorus structures. It requires only a small energetic cost of approximately $0.04~\text{eV/atom}$ to be created from its bulk structure, Hittorf's phosphorus, or a binding energy of $0.3-0.4~\text{J/m}^2$ per layer, suggesting the possibility of exfoliation in experiments. We find single-layered Hittorf's phosphorus to be a wide band gap semiconductor with a direct band gap of approximately $2.5$~eV and our calculations show it is expected to have a high and highly anisotropic hole mobility with an upper bound lying between $3000-7000$~cm$^2$V$^{-1}$s$^{-1}$. These combined properties make single-layered Hittorf's phosphorus a very good candidate for future applications in a wide variety of technologies, in particular for high frequency electronics, and optoelectronic devices operating in the low wavelength blue color range.
Submitted 14 April, 2016;
originally announced April 2016.
-
Predicting Non-Square 2D Dice Probabilities
Authors:
G. A. T. Pender,
M. Uhrin
Abstract:
The prediction of the final state probabilities of a general cuboid randomly thrown onto a surface is a problem that naturally arises in the minds of men and women familiar with regular cubic dice and the basic concepts of probability. Indeed, it was considered by Newton in 1664 [1]. In this paper we make progress on the 2D problem (which can be realised in 3D by considering a long cuboid, or alternatively a rectangular cross-sectioned dreidel).
For the two-dimensional case we suggest a model that predicts this based on the side length ratio. We test this theory both experimentally and computationally, and find good agreement between our theory, experimental and computational results.
Our theory is known, from its derivation, to be an approximation for particularly bouncy or grippy surfaces where the die rolls through many revolutions before settling. On real surfaces we would expect (and we observe) that the true probability ratio for a 2D die is somewhat closer to unity than predicted by our theory.
This problem may also have wider relevance in the testing of physics engines.
Submitted 23 July, 2014;
originally announced July 2014.
-
The MOLDY short-range molecular dynamics package
Authors:
GJ Ackland,
K D'Mellow,
SL Daraszewicz,
DJ Hepburn,
M Uhrin,
K. Stratford
Abstract:
We describe a parallelised version of the MOLDY molecular dynamics program. This Fortran code is aimed at systems which may be described by short-range potentials and specifically those which may be addressed with the embedded atom method. This includes a wide range of transition metals and alloys. MOLDY provides a range of options in terms of the molecular dynamics ensemble used and the boundary conditions which may be applied. A number of standard potentials are provided, and the modular structure of the code allows new potentials to be added easily. The code is parallelised using OpenMP and can therefore be run on shared memory systems, including modern multicore processors. Particular attention is paid to the updates required in the main force loop, where synchronisation is often required in OpenMP implementations of molecular dynamics. We examine the performance of the parallel code in detail and give some examples of applications to realistic problems, including the dynamic compression of copper and carbon migration in an iron-carbon alloy.
Submitted 13 July, 2011;
originally announced July 2011.