Shared Metadata for Data-Centric Materials Science
Authors:
Luca M. Ghiringhelli,
Carsten Baldauf,
Tristan Bereau,
Sandor Brockhauser,
Christian Carbogno,
Javad Chamanara,
Stefano Cozzini,
Stefano Curtarolo,
Claudia Draxl,
Shyam Dwaraknath,
Ádám Fekete,
James Kermode,
Christoph T. Koch,
Markus Kühbach,
Alvin Noe Ladines,
Patrick Lambrix,
Maja-Olivia Lenz-Himmer,
Sergey Levchenko,
Micael Oliveira,
Adam Michalchuk,
Ron Miller,
Berk Onat,
Pasquale Pavone,
Giovanni Pizzi,
Benjamin Regler,
et al. (10 additional authors not shown)
Abstract:
The expansive production of data in materials science, together with their widespread sharing and repurposing, requires educated support and stewardship. To ensure that this need helps rather than hinders scientific work, the implementation of the FAIR-data principles (Findable, Accessible, Interoperable, and Reusable) must not be too narrow. Moreover, the wider materials-science community ought to agree on the strategies to tackle the challenges that are specific to its data, both from computations and experiments. In this paper, we present the results of the discussions held at the workshop on "Shared Metadata and Data Formats for Big-Data Driven Materials Science". We start from an operative definition of metadata and of the features that a FAIR-compliant metadata schema should have. We focus mainly on computational materials-science data and propose a constructive approach for the FAIRification of the (meta)data related to ground-state and excited-states calculations, potential-energy sampling, and generalized workflows. Finally, challenges with the FAIRification of experimental (meta)data and materials-science ontologies are presented, together with an outlook on how to meet them.
Submitted 23 August, 2023; v1 submitted 29 May, 2022;
originally announced May 2022.
TCMI: a non-parametric mutual-dependence estimator for multivariate continuous distributions
Authors:
Benjamin Regler,
Matthias Scheffler,
Luca M. Ghiringhelli
Abstract:
The identification of relevant features, i.e., the driving variables that determine a process or the properties of a system, is an essential part of the analysis of data sets with a large number of variables. A mathematically rigorous approach to quantifying the relevance of these features is mutual information. Mutual information determines the relevance of features in terms of their joint mutual dependence with the property of interest. However, mutual information requires probability distributions as input, which cannot be reliably estimated from continuous distributions such as physical quantities like lengths or energies. Here, we introduce total cumulative mutual information (TCMI), a measure of the relevance of mutual dependences that extends mutual information to random variables of continuous distribution based on cumulative probability distributions. TCMI is a non-parametric, robust, and deterministic measure that facilitates comparisons and rankings between feature sets with different cardinality. The ranking induced by TCMI allows for feature selection, i.e., the identification of variable sets that are nonlinearly statistically related to a property of interest, taking into account the number of data samples as well as the cardinality of the set of variables. We evaluate the performance of our measure with simulated data, compare its performance with similar multivariate-dependence measures, and demonstrate the effectiveness of our feature-selection method on a set of standard data sets and a typical scenario in materials science.
Submitted 30 July, 2022; v1 submitted 30 January, 2020;
originally announced January 2020.
Nonequilibrium Quantum Phase Transitions in the Dicke Model
Authors:
V. M. Bastidas,
C. Emary,
B. Regler,
T. Brandes
Abstract:
We establish a set of nonequilibrium quantum phase transitions in the Dicke model by considering a monochromatic nonadiabatic modulation of the atom-field coupling. For weak driving, the system exhibits a set of sidebands that allow circumvention of the no-go theorem, which otherwise forbids the occurrence of superradiant phase transitions. At strong driving, we show that the system has a rich multistable structure and exhibits both first- and second-order nonequilibrium quantum phase transitions.
Submitted 30 January, 2012; v1 submitted 15 August, 2011;
originally announced August 2011.