-
Machine-Learning Based Selection and Synthesis of Candidate Metal-Insulator Transition Metal Oxides
Authors:
Alexandru B. Georgescu,
Peiwen Ren,
Christopher Karpovich,
Elsa Olivetti,
James M. Rondinelli
Abstract:
The discovery of materials that exhibit a metal-insulator transition (MIT) is key to the development of multiple types of novel efficient microelectronic and optoelectronic devices. However, identifying MIT materials is challenging due to a combination of high computational cost of electronic structure calculations needed to understand their mechanism, the mechanisms' complexity, and the labor-intensive experimental validation process. To that end, we use a machine learning classification model to rapidly screen a high-throughput crystal structure database to identify candidate compounds exhibiting thermally-driven MITs. We focus on three candidate oxides, Ca$_2$Fe$_3$O$_8$, CaCo$_2$O$_4$, and CaMn$_2$O$_4$, and identify their MIT mechanism using high-fidelity density functional theory calculations. Then, we provide a probabilistic estimate of which synthesis reactions may lead to their realization. Our approach couples physics-informed machine learning, density functional theory calculations, and machine learning-suggested synthesis to reduce the time to discovery and synthesis of new technologically relevant materials.
Submitted 20 March, 2024;
originally announced April 2024.
-
brainlife.io: A decentralized and open source cloud platform to support neuroscience research
Authors:
Soichi Hayashi,
Bradley A. Caron,
Anibal Sólon Heinsfeld,
Sophia Vinci-Booher,
Brent McPherson,
Daniel N. Bullock,
Giulia Bertò,
Guiomar Niso,
Sandra Hanekamp,
Daniel Levitas,
Kimberly Ray,
Anne MacKenzie,
Lindsey Kitchell,
Josiah K. Leong,
Filipi Nascimento-Silva,
Serge Koudoro,
Hanna Willis,
Jasleen K. Jolly,
Derek Pisner,
Taylor R. Zuidema,
Jan W. Kurzawski,
Kyriaki Mikellidou,
Aurore Bussalb,
Christopher Rorden,
Conner Victory, et al. (39 additional authors not shown)
Abstract:
Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR (Findable, Accessible, Interoperable, and Reusable) data analysis to portions of the worldwide research community. brainlife.io was developed to reduce these burdens and democratize modern neuroscience research across institutions and career levels. Using community software and hardware infrastructure, the platform provides open-source data standardization, management, visualization, and processing and simplifies the data pipeline. brainlife.io automatically tracks the provenance history of thousands of data objects, supporting simplicity, efficiency, and transparency in neuroscience research. Here brainlife.io's technology and data services are described and evaluated for validity, reliability, reproducibility, replicability, and scientific utility. Using data from 4 modalities and 3,200 participants, we demonstrate that brainlife.io's services produce outputs that adhere to best practices in modern neuroscience research.
Submitted 11 August, 2023; v1 submitted 3 June, 2023;
originally announced June 2023.
-
Supervised Tractogram Filtering using Geometric Deep Learning
Authors:
Pietro Astolfi,
Ruben Verhagen,
Laurent Petit,
Emanuele Olivetti,
Silvio Sarubbo,
Jonathan Masci,
Davide Boscaini,
Paolo Avesani
Abstract:
A tractogram is a virtual representation of the brain white matter. It is composed of millions of virtual fibers, encoded as 3D polylines, which approximate the white matter axonal pathways. To date, tractograms are the most accurate white matter representation and thus are used for tasks like presurgical planning and investigations of neuroplasticity, brain disorders, or brain networks. However, it is a well-known issue that a large portion of tractogram fibers is not anatomically plausible and can be considered artifacts of the tracking procedure. With Verifyber, we tackle the problem of filtering out such non-plausible fibers using a novel fully-supervised learning approach. Unlike other approaches based on signal reconstruction and/or brain topology regularization, we guide our method with the existing anatomical knowledge of the white matter. Using tractograms annotated according to anatomical principles, we train our model, Verifyber, to classify fibers as either anatomically plausible or non-plausible. The proposed Verifyber model is an original Geometric Deep Learning method that can deal with variable-size fibers, while being invariant to fiber orientation. Our model considers each fiber as a graph of points, and by learning features of the edges between consecutive points via the proposed sequence Edge Convolution, it can capture the underlying anatomical properties. The resulting filtering is highly accurate and robust across an extensive set of experiments, and fast: with a 12GB GPU, filtering a tractogram of 1M fibers requires less than a minute. Verifyber implementation and trained models are available at https://github.com/FBK-NILab/verifyber.
Submitted 6 December, 2022;
originally announced December 2022.
-
Polycrystalline MnBi as a transverse thermoelectric material
Authors:
Alessandro Sola,
Elena Olivetti,
Luca Martino,
Vittorio Basso
Abstract:
To assess the potential of polycrystalline MnBi as a transverse thermoelectric material, we have experimentally investigated its anomalous Nernst effect (ANE) by means of the heat flux method. We prepared MnBi samples by powder metallurgy; this technique allows the preparation of samples in arbitrary shapes with the possibility to tailor their magnetic properties. In the material exhibiting the highest remanent magnetization, we found a value of the ANE thermopower of -1.1 $μ$V/K at 1 T, after the compensation of the ordinary Nernst effect from pure bismuth present inside the polycrystalline sample. This value is comparable with those reported in the literature for single crystals.
Submitted 17 November, 2022;
originally announced November 2022.
-
MatKG: The Largest Knowledge Graph in Materials Science -- Entities, Relations, and Link Prediction through Graph Representation Learning
Authors:
Vineeth Venugopal,
Sumit Pai,
Elsa Olivetti
Abstract:
This paper introduces MatKG, a novel graph database of key concepts in material science spanning the traditional material-structure-property-processing paradigm. MatKG is autonomously generated through transformer-based, large language models and generates pseudo ontological schema through statistical co-occurrence mapping. At present, MatKG contains over 2 million unique relationship triples derived from 80,000 entities. This allows the curated analysis, querying, and visualization of materials knowledge at unique resolution and scale. Further, Knowledge Graph Embedding models are used to learn embedding representations of nodes in the graph which are used for downstream tasks such as link prediction and entity disambiguation. MatKG allows the rapid dissemination and assimilation of data when used as a knowledge base, while enabling the discovery of new relations when trained as an embedding model.
Submitted 31 October, 2022;
originally announced October 2022.
-
Deep Reinforcement Learning for Inverse Inorganic Materials Design
Authors:
Elton Pan,
Christopher Karpovich,
Elsa Olivetti
Abstract:
A major obstacle to the realization of novel inorganic materials with desirable properties is the inability to perform efficient optimization across both materials properties and synthesis of those materials. In this work, we propose a reinforcement learning (RL) approach to inverse inorganic materials design, which can identify promising compounds with specified properties and synthesizability constraints. Our model learns chemical guidelines such as charge and electronegativity neutrality while maintaining chemical diversity and uniqueness. We demonstrate a multi-objective RL approach, which can generate novel compounds with targeted materials properties including formation energy and bulk/shear modulus alongside synthesis objectives such as a lower sintering temperature. Using this approach, the model can predict promising compounds of interest, while suggesting an optimized chemical design space for inorganic materials discovery.
Submitted 21 October, 2022;
originally announced October 2022.
-
Data-driven prediction of room temperature density for multicomponent silicate-based glasses
Authors:
Kai Gong,
Elsa Olivetti
Abstract:
Density is one of the most commonly measured or estimated materials properties, especially for glasses and melts that are of significant interest to many fields, including metallurgy, geology, materials science and sustainable cements. Here, two types of machine learning (ML) models (i.e., random forest (RF) and artificial neural network (ANN)) have been developed to predict the room-temperature density of glasses in the compositional space of CaO-MgO-Al2O3-SiO2-TiO2-FeO-Fe2O3-Na2O-K2O-MnO (CMASTFNKM), based on ~2100 data points mined from ~140 literature studies. The results show that the RF and ANN models give accurate predictions of glass density with R2 values, RMSE, and MAPE of ~0.96-0.98, ~0.02-0.03 g/cm3 and ~0.59-0.79%, respectively, for the 15% testing set, which are more accurate compared with empirical density models based on ionic packing ratio (with R2 values, RMSE, and MAPE of ~0.28-0.91, ~0.05-0.15 g/cm3, and ~1.40-4.61%, respectively). Furthermore, glass density is shown to be a reliable reactivity indicator for a range of CaO-Al2O3-SiO2 (CAS) and volcanic glasses due to its strong correlation (R2 values above ~0.90) with the average metal-oxygen dissociation energy (a structural descriptor) of these glasses. Analysis of the predicted density-composition relationships from these models (for selected compositional subspaces) suggests that the ANN model exhibits a certain level of transferability (i.e., ability to extrapolate to compositional space not (or less) covered in the database) and captures known features including the mixed alkaline earth effects for (CaO-MgO)0.5-(Al2O3-SiO2)0.5 glasses.
Submitted 5 September, 2022;
originally announced September 2022.
-
Augmenting Scientific Creativity with Retrieval across Knowledge Domains
Authors:
Hyeonsu B. Kang,
Sheshera Mysore,
Kevin Huang,
Haw-Shiuan Chang,
Thorben Prein,
Andrew McCallum,
Aniket Kittur,
Elsa Olivetti
Abstract:
Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas. While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas outside such domains. In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification. To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains. Furthermore, end-users can 'zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters. Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration.
Submitted 14 December, 2022; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Inorganic Synthesis Reaction Condition Prediction with Generative Machine Learning
Authors:
Christopher Karpovich,
Zach Jensen,
Vineeth Venugopal,
Elsa Olivetti
Abstract:
Data-driven synthesis planning with machine learning is a key step in the design and discovery of novel inorganic compounds with desirable properties. Inorganic materials synthesis is often guided by chemists' prior knowledge and experience, built upon experimental trial-and-error that is both time and resource consuming. Recent developments in natural language processing (NLP) have enabled large-scale text mining of scientific literature, providing open source databases of synthesis information of synthesized compounds, material precursors, and reaction conditions (temperatures, times). In this work, we employ a conditional variational autoencoder (CVAE) to predict suitable inorganic reaction conditions for the crucial inorganic synthesis steps of calcination and sintering. We find that the CVAE model is capable of learning subtle differences in target material composition, precursor compound identities, and choice of synthesis route (solid-state, sol-gel) that are present in the inorganic synthesis space. Moreover, the CVAE can generalize well to unseen chemical entities and shows promise for predicting reaction conditions for previously unsynthesized compounds of interest.
Submitted 17 December, 2021;
originally announced December 2021.
-
Development of structural descriptors to predict dissolution rate of volcanic glasses: molecular dynamic simulations
Authors:
Kai Gong,
Elsa Olivetti
Abstract:
Establishing the composition-structure-property relationships for amorphous materials is critical for many important natural and engineering processes, including the dissolution of highly complex volcanic glasses. In this investigation, we performed force field molecular dynamics (MD) simulations to generate detailed structural representations for ten natural CaO-MgO-Al2O3-SiO2-TiO2-FeO-Fe2O3-Na2O-K2O glasses with compositions ranging from rhyolitic to basaltic. Based on the resulting atomic structural representations at 300 K, we have calculated the partial radial distribution functions, nearest interatomic distances and coordination number, which are consistent with the literature data on silicate-based glasses. Based on these structural attributes and classical bond valence models, we have introduced a novel structural descriptor, i.e., average metal-oxygen (M-O) bond strength parameter, which has captured the log dissolution rates of the ten glasses at both acidic and basic conditions (based on literature data) with R2 values of ~0.80-0.92 based on linear regression. This structural descriptor is seen to outperform several other structural descriptors also derived from MD simulation results, including the average metal oxide dissociation energy, the average self-diffusion coefficient of all the atoms at their melting points, and the energy barrier of self-diffusion. Furthermore, we showed that the MD-derived descriptors generally exhibit better predictive performance than the degree of depolymerization parameter commonly used to describe glass and mineral reactivity. The results suggest that the structural descriptors derived from MD simulations, especially the average M-O bond strength parameter, are promising structural descriptors for connecting composition with dissolution rates of highly complex natural glasses.
Submitted 29 July, 2021;
originally announced July 2021.
-
Learning the Crystal Structure Genome for Property Classification
Authors:
Yiqun Wang,
Xiao-Jie Zhang,
Fei Xia,
Elsa A. Olivetti,
Ram Seshadri,
James M. Rondinelli
Abstract:
Materials property predictions have improved from advances in machine learning algorithms, delivering materials discoveries and novel insights through data-driven models of structure-property relationships. Nearly all available models rely on featurization of materials composition; however, whether the exclusive use of structural knowledge in such models has the capacity to make comparable predictions remains unknown. Here we employ a deep neural network model to decode structure-property relationships in crystalline materials without explicitly considering chemical compositions. The focus is on classification of crystal systems, mechanical elasticity, electronic band gap, and phase stability. Our model utilizes a three-dimensional (3D) momentum space representation of structure from elastic x-ray scattering theory that exhibits rotation and permutation invariance. We perform novel ablation studies to help interpret the model performance by perturbing the physically meaningful input features (i.e., the diffraction patterns) instead of tuning the architecture of the learning model as in conventional ablation methods. We find that the spatial symmetry of the 3D diffraction patterns, which reflects crystalline symmetry operations, is more important than the diffraction intensities contained within for the model to make a successful classification. Our work showcases the potential of using statistical learning models to help understand materials physics, rather than performing predictive and generative tasks as in most materials informatics research. We also argue that learning the crystal structure genome in a chemistry-agnostic manner demonstrates that some crystal structures inherently host high propensities for optimal materials properties, which enables the decoupling of structure and composition for future codesign of multifunctionality.
Submitted 11 April, 2022; v1 submitted 5 January, 2021;
originally announced January 2021.
-
Database, Features, and Machine Learning Model to Identify Thermally Driven Metal-Insulator Transition Compounds
Authors:
Alexandru B. Georgescu,
Peiwen Ren,
Aubrey R. Toland,
Shengtong Zhang,
Kyle D. Miller,
Daniel W. Apley,
Elsa A. Olivetti,
Nicholas Wagner,
James M. Rondinelli
Abstract:
Metal-insulator transition (MIT) compounds are materials that may exhibit insulating or metallic behavior, depending on the physical conditions, and are of immense fundamental interest owing to their potential applications in emerging microelectronics. There is a dearth of thermally-driven MIT materials, however, which makes delineating these compounds from those that are exclusively insulating or metallic challenging. Here we report a material database comprising temperature-controlled MITs (and metals and insulators with similar chemical composition and stoichiometries to the MIT compounds) from high quality experimental literature, built through a combination of materials-domain knowledge and natural language processing. We featurize the dataset using compositional, structural, and energetic descriptors, including two MIT relevant energy scales, an estimated Hubbard interaction and the charge transfer energy, as well as the structure-bond-stress metric referred to as the global-instability index (GII). We then perform supervised classification, constructing three electronic-state classifiers: metal vs non-metal (M), insulator vs non-insulator (I), and MIT vs non-MIT (T). We identify two important descriptors that separate metals, insulators, and MIT materials in a 2D feature space: the average deviation of the covalent radius and the range of the Mendeleev number. We further elaborate on other important features (GII and Ewald energy), and examine how they affect classification of binary vanadium and titanium oxides. We discuss the relationship of these atomic features to the physical interactions underlying MITs in the rare-earth nickelate family. Last, we implement an online version of the classifiers, enabling quick probabilistic class predictions by uploading a crystallographic structure file.
Submitted 21 July, 2021; v1 submitted 25 October, 2020;
originally announced October 2020.
-
Tractogram filtering of anatomically non-plausible fibers with geometric deep learning
Authors:
Pietro Astolfi,
Ruben Verhagen,
Laurent Petit,
Emanuele Olivetti,
Jonathan Masci,
Davide Boscaini,
Paolo Avesani
Abstract:
Tractograms are virtual representations of the white matter fibers of the brain. They are of primary interest for tasks like presurgical planning, and investigation of neuroplasticity or brain disorders. Each tractogram is composed of millions of fibers encoded as 3D polylines. Unfortunately, a large portion of those fibers are not anatomically plausible and can be considered artifacts of the tracking algorithms. Common methods for tractogram filtering are based on signal reconstruction, a principled approach, but unable to consider the knowledge of brain anatomy. In this work, we address the problem of tractogram filtering as a supervised learning problem by exploiting the ground truth annotations obtained with a recent heuristic method, which labels fibers as either anatomically plausible or non-plausible according to well-established anatomical properties. The intuitive idea is to model a fiber as a point cloud and the goal is to investigate whether and how a geometric deep learning model might capture its anatomical properties. Our contribution is an extension of the Dynamic Edge Convolution model that exploits the sequential relations of points in a fiber and discriminates with high accuracy plausible/non-plausible fibers.
Submitted 9 July, 2020; v1 submitted 24 March, 2020;
originally announced March 2020.
-
Automatic Tissue Segmentation with Deep Learning in Patients with Congenital or Acquired Distortion of Brain Anatomy
Authors:
Gabriele Amorosino,
Denis Peruzzo,
Pietro Astolfi,
Daniela Redaelli,
Paolo Avesani,
Filippo Arrigoni,
Emanuele Olivetti
Abstract:
Brains with complex distortion of cerebral anatomy present several challenges to automatic tissue segmentation methods of T1-weighted MR images. First, the very high variability in the morphology of the tissues can be incompatible with the prior knowledge embedded within the algorithms. Second, the availability of MR images of distorted brains is very scarce, so the methods in the literature have not addressed such cases so far. In this work, we present the first evaluation of state-of-the-art automatic tissue segmentation pipelines on T1-weighted images of brains with different severity of congenital or acquired brain distortion. We compare traditional pipelines and a deep learning model, i.e. a 3D U-Net trained on normal-appearing brains. Unsurprisingly, traditional pipelines completely fail to segment the tissues with strong anatomical distortion. Surprisingly, the 3D U-Net provides useful segmentations that can be a valuable starting point for manual refinement by experts/neuroradiologists.
Submitted 24 March, 2020;
originally announced March 2020.
-
A Test for Shared Patterns in Cross-modal Brain Activation Analysis
Authors:
Elena Kalinina,
Fabian Pedregosa,
Vittorio Iacovella,
Emanuele Olivetti,
Paolo Avesani
Abstract:
Determining the extent to which different cognitive modalities (understood here as the set of cognitive processes underlying the elaboration of a stimulus by the brain) rely on overlapping neural representations is a fundamental issue in cognitive neuroscience. In the last decade, the identification of shared activity patterns has been mostly framed as a supervised learning problem. For instance, a classifier is trained to discriminate categories (e.g. faces vs. houses) in modality I (e.g. perception) and tested on the same categories in modality II (e.g. imagery). This type of analysis is often referred to as cross-modal decoding. In this paper we take a different approach and instead formulate the problem of assessing shared patterns across modalities within the framework of statistical hypothesis testing. We propose both an appropriate test statistic and a scheme based on permutation testing to compute the significance of this test while making only minimal distributional assumptions. We denote this test cross-modal permutation test (CMPT). We also provide empirical evidence on synthetic datasets that our approach has greater statistical power than the cross-modal decoding method while maintaining low Type I errors (rejecting a true null hypothesis). We compare both approaches on an fMRI dataset with three different cognitive modalities (perception, imagery, visual search). Finally, we show how CMPT can be combined with Searchlight analysis to explore spatial distribution of shared activity patterns.
Submitted 8 October, 2019;
originally announced October 2019.
-
Anatomically-Informed Multiple Linear Assignment Problems for White Matter Bundle Segmentation
Authors:
Giulia Bertò,
Paolo Avesani,
Franco Pestilli,
Daniel Bullock,
Bradley Caron,
Emanuele Olivetti
Abstract:
Segmenting white matter bundles from human tractograms is a task of interest for several applications. Current methods for bundle segmentation consider either only prior knowledge about the relative anatomical position of a bundle, or only its geometrical properties. Our aim is to improve the results of segmentation by proposing a method that takes into account information about both the underlying anatomy and the geometry of bundles at the same time. To achieve this goal, we extend a state-of-the-art example-based method based on the Linear Assignment Problem (LAP) by including prior anatomical information within the optimization process. The proposed method shows a significant improvement with respect to the original method, in particular on small bundles.
Submitted 16 July, 2019;
originally announced July 2019.
-
The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures
Authors:
Sheshera Mysore,
Zach Jensen,
Edward Kim,
Kevin Huang,
Haw-Shiuan Chang,
Emma Strubell,
Jeffrey Flanigan,
Andrew McCallum,
Elsa Olivetti
Abstract:
Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text. Large-scale analysis of these synthesis procedures would facilitate deeper scientific understanding of materials synthesis and enable automated synthesis planning. Such analysis requires extracting structured representations of synthesis procedures from the raw text as a first step. To facilitate the training and evaluation of synthesis extraction models, we introduce a dataset of 230 synthesis procedures annotated by domain experts with labeled graphs that express the semantics of the synthesis sentences. The nodes in this graph are synthesis operations and their typed arguments, and labeled edges specify relations between the nodes. We describe this new resource in detail and highlight some specific challenges to annotating scientific text with shallow semantic structure. We make the corpus available to the community to promote further research and development of scientific information extraction systems.
Submitted 13 July, 2019; v1 submitted 16 May, 2019;
originally announced May 2019.
-
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
Authors:
Edward Kim,
Zach Jensen,
Alexander van Grootel,
Kevin Huang,
Matthew Staib,
Sheshera Mysore,
Haw-Shiuan Chang,
Emma Strubell,
Andrew McCallum,
Stefanie Jegelka,
Elsa Olivetti
Abstract:
Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated method for connecting scientific literature to synthesis insights. Starting from natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for arbitrary materials. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties, and that the model's behavior complements existing thermodynamic knowledge. Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds.
Submitted 17 February, 2019; v1 submitted 31 December, 2018;
originally announced January 2019.
-
Graph similarity drives zeolite diffusionless transformations and intergrowth
Authors:
Daniel Schwalbe-Koda,
Zach Jensen,
Elsa Olivetti,
Rafael Gomez-Bombarelli
Abstract:
Predicting and directing polymorphic transformations is a critical challenge in zeolite synthesis. Although interzeolite transformations enable selective crystallization, their design lacks predictions to connect framework similarity and experimental observations. Here, computational and theoretical tools are combined to data-mine, analyze and explain interzeolite relations. It is observed that building units are weak predictors of topology interconversion and insufficient to explain intergrowth. By introducing a supercell-invariant metric that compares crystal structures using graph theory, we show that topotactic and reconstructive (diffusionless) transformations occur only between graph-similar pairs. Furthermore, all known instances of intergrowth occur between either structurally-similar or graph-similar frameworks. Backed by exhaustive literature results, we identify promising pairs for realizing novel diffusionless transformations and intergrowth. Hundreds of low-distance pairs are identified among known zeolites, and thousands of hypothetical frameworks are connected to known zeolite counterparts. The theory opens an avenue to understand and control zeolite polymorphism.
Submitted 10 March, 2021; v1 submitted 6 December, 2018;
originally announced December 2018.
-
Automatically Extracting Action Graphs from Materials Science Synthesis Procedures
Authors:
Sheshera Mysore,
Edward Kim,
Emma Strubell,
Ao Liu,
Haw-Shiuan Chang,
Srikrishna Kompella,
Kevin Huang,
Andrew McCallum,
Elsa Olivetti
Abstract:
Computational synthesis planning approaches have achieved recent success in organic chemistry, where tabulated synthesis procedures are readily available for supervised learning. The syntheses of inorganic materials, however, exist primarily as natural language narratives contained within scientific journal articles. This synthesis information must first be extracted from the text in order to enable analogous synthesis planning methods for inorganic materials. In this work, we present a system for automatically extracting structured representations of synthesis procedures from the texts of materials science journal articles that describe explicit, experimental syntheses of inorganic compounds. We define the structured representation as a set of linked events made up of extracted scientific entities and evaluate two unsupervised approaches for extracting these structures on expert-annotated articles: a strong heuristic baseline and a generative model of procedural text. We also evaluate a variety of supervised models for extracting scientific entities. Our results provide insight into the nature of the data and directions for further work in this exciting new area of research.
Submitted 28 November, 2017; v1 submitted 18 November, 2017;
originally announced November 2017.
-
Comparison of Distances for Supervised Segmentation of White Matter Tractography
Authors:
Emanuele Olivetti,
Giulia Bertò,
Pietro Gori,
Nusrat Sharmin,
Paolo Avesani
Abstract:
Tractograms are mathematical representations of the main paths of axons within the white matter of the brain, from diffusion MRI data. Such representations are in the form of polylines, called streamlines, and one streamline approximates the common path of tens of thousands of axons. The analysis of tractograms is a task of interest in multiple fields, like neurosurgery and neurology. A basic building block of many pipelines of analysis is the definition of a distance function between streamlines. Multiple distance functions have been proposed in the literature, and different authors use different distances, usually without a specific reason other than invoking the "common practice". In this work we test such common practices in order to obtain factual reasons for choosing one distance over another. To do so, we compare many streamline distance functions available in the literature. We focus on the common task of automatic bundle segmentation and we adopt the recent approach of supervised segmentation from expert-based examples. Using the HCP dataset, we compare several distances, obtaining guidelines on which distance function one should use for supervised bundle segmentation.
Submitted 4 August, 2017;
originally announced August 2017.
-
Heterogeneous nucleation and heat flux avalanches in La(Fe,Si)$_{13}$ magnetocaloric compounds near the critical point
Authors:
C. Bennati,
L. Gozzelino,
E. S. Olivetti,
V. Basso
Abstract:
The phase transformation kinetics of the LaFe$_{11.41}$Mn$_{0.30}$Si$_{1.29}$-H$_{1.65}$ magnetocaloric compound is addressed by low-rate calorimetry experiments. Scans at 1 mK/s show that its first-order phase transitions consist of multiple heat flux avalanches. Very close to the critical point, the step-like discontinuous behavior associated with avalanches is smoothed out and thermal hysteresis disappears. This result is confirmed by magneto-resistivity measurements and allows accurate values to be measured for the zero-field hysteresis ($ΔT_{hyst}$ = 0.37 K) and the critical field (H$_c$ = 1.19 T). The number and magnitude of heat flux avalanches change with magnetic field, showing the interplay between the intrinsic energy barrier between phases and the microstructural disorder of the sample.
Submitted 1 September, 2016;
originally announced September 2016.
-
Mapping Tractography Across Subjects
Authors:
Thien Bao Nguyen,
Emanuele Olivetti,
Paolo Avesani
Abstract:
Diffusion magnetic resonance imaging (dMRI) and tractography provide means to study the anatomical structures within the white matter of the brain. When studying tractography data across subjects, it is usually necessary to align, i.e. to register, tractographies together. This registration step is most often performed by applying the transformation resulting from the registration of other volumetric images (T1, FA). In contrast with registration methods that "transform" tractographies, in this work we try to find which streamline in one tractography corresponds to which streamline in the other tractography, without any transformation. In other words, we try to find a "mapping" between the tractographies. We propose a graph-based solution for the tractography mapping problem and we explain similarities and differences with the related well-known graph matching problem. Specifically, we define a loss function based on the pairwise streamline distance and reformulate the mapping problem as combinatorial optimization of that loss function. We show promising preliminary results where we compare the proposed method, implemented with simulated annealing, against a standard registration technique in a task of segmentation of the corticospinal tract.
Submitted 29 January, 2016;
originally announced January 2016.
-
The Kernel Two-Sample Test for Brain Networks
Authors:
Emanuele Olivetti,
Sandro Vega-Pons,
Paolo Avesani
Abstract:
In clinical and neuroscientific studies, systematic differences between two populations of brain networks are investigated in order to characterize mental diseases or processes. Those networks are usually represented as graphs built from neuroimaging data and studied by means of graph analysis methods. The typical machine learning approach to study these brain graphs creates a classifier and tests its ability to discriminate the two populations. In contrast to this approach, in this work we propose to directly test whether two populations of graphs are different or not, by using the kernel two-sample test (KTST), without creating the intermediate classifier. We claim that, in general, the two approaches provide similar results and that the KTST requires much less computation. Additionally, in the regime of low sample size, we claim that the KTST has a lower frequency of Type II error than the classification approach. Besides providing algorithmic considerations to support these claims, we show strong evidence through experiments and one simulation.
Submitted 19 November, 2015;
originally announced November 2015.
-
The Approximation of the Dissimilarity Projection
Authors:
Emanuele Olivetti,
Thien Bao Nguyen,
Paolo Avesani
Abstract:
Diffusion magnetic resonance imaging (dMRI) data allow the 3D pathways of axons within the white matter of the brain to be reconstructed as a tractography. The analysis of tractographies has drawn attention from the machine learning and pattern recognition communities, providing novel challenges such as finding an appropriate representation space for the data. Many of the current learning algorithms require the input to be from a vectorial space. This requirement contrasts with the intrinsic nature of the tractography because its basic elements, called streamlines or tracks, have different lengths and different numbers of points, and for this reason they cannot be directly represented in a common vectorial space. In this work we propose the adoption of the dissimilarity representation, which is a Euclidean embedding technique defined by selecting a set of streamlines called prototypes and then mapping any new streamline to the vector of distances from prototypes. We investigate the degree of approximation of this projection under different prototype selection policies and prototype set sizes in order to characterise its use on tractography data. Additionally, we propose the use of a scalable approximation of the most effective prototype selection policy that provides fast and accurate dissimilarity approximations of complete tractographies.
Submitted 2 April, 2015;
originally announced April 2015.
-
MEG Decoding Across Subjects
Authors:
Emanuele Olivetti,
Seyed Mostafa Kia,
Paolo Avesani
Abstract:
Brain decoding is a data analysis paradigm for neuroimaging experiments that is based on predicting the stimulus presented to the subject from the concurrent brain activity. In order to make inference at the group level, a straightforward but sometimes unsuccessful approach is to train a classifier on the trials of a group of subjects and then to test it on unseen trials from new subjects. The extreme difficulty is related to the structural and functional variability across the subjects. We call this approach "decoding across subjects". In this work, we address the problem of decoding across subjects for magnetoencephalographic (MEG) experiments and we provide the following contributions: first, we formally describe the problem and show that it belongs to a machine learning sub-field called transductive transfer learning (TTL). Second, we propose to use a simple TTL technique that accounts for the differences between train data and test data. Third, we propose the use of ensemble learning, and specifically of stacked generalization, to address the variability across subjects within train data, with the aim of producing more stable classifiers. On a face vs. scramble task MEG dataset of 16 subjects, we compare the standard approach of not modelling the differences across subjects, to the proposed one of combining TTL and ensemble learning. We show that the proposed approach is consistently more accurate than the standard one.
Submitted 16 April, 2014;
originally announced April 2014.
-
Influence of sample geometry on inductive damping measurement methods
Authors:
N. Liebing,
S. Serrano-Guisan,
A. Caprile,
E. S. Olivetti,
F. Celegato,
M. Pasquale,
A. Müller,
H. W. Schumacher
Abstract:
We study the precession frequency and effective damping of patterned permalloy thin films of different geometry using integrated inductive test structures. The test structures consist of coplanar wave guides fabricated onto patterned permalloy stripes of different geometry. The width, length and position of the permalloy stripe with respect to the center conductor of the wave guide are varied. The precession frequency and effective damping of the different devices are derived by inductive measurements in the time and frequency domains in in-plane magnetic fields. While the precession frequencies do not reveal a significant dependence on the sample geometry, we find a decrease of the measured damping with increasing width of the permalloy centered underneath the center conductor of the coplanar wave guide. We attribute this effect to an additional damping contribution due to inhomogeneous line broadening at the edges of the permalloy stripes, which does not contribute to the inductive signal provided the permalloy stripe is wider than the center conductor. Consequences for inductive determination of the effective damping using such integrated reference samples are discussed.
Submitted 29 October, 2013;
originally announced October 2013.