Leveraging Language Representation for Material Recommendation, Ranking, and Exploration
Authors:
Jiaxing Qu,
Yuxuan Richard Xie,
Kamil M. Ciesielski,
Claire E. Porter,
Eric S. Toberer,
Elif Ertekin
Abstract:
Data-driven approaches for material discovery and design have been accelerated by emerging efforts in machine learning. However, general representations of crystals to explore the vast material search space remain limited. We introduce a material discovery framework that uses natural language embeddings derived from language models as representations of compositional and structural features. The discovery framework consists of a joint scheme that first recalls relevant candidates and then ranks them based on multiple target properties. The contextual knowledge encoded in language representations conveys information about material properties and structures, enabling both representational similarity analysis for recall and multi-task learning to share information across related properties. By applying the framework to thermoelectrics, we demonstrate diversified recommendations of prototype structures and identify under-studied high-performance material spaces. The recommended materials are corroborated by first-principles calculations and experiments, revealing novel materials with potentially high performance. Our framework provides a task-agnostic means for effective material recommendation and can be applied to various material systems.
Submitted 19 May, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
Semantic classifier approach to document classification
Authors:
Piotr Borkowski,
Krzysztof Ciesielski,
Mieczysław A. Kłopotek
Abstract:
In this paper we propose a new document classification method that bridges discrepancies (the so-called semantic gap) between the training set and the application sets of textual data. We demonstrate its superiority over classical text classification approaches, including traditional classifier ensembles. The method consists of combining a document categorization technique with a single classifier or a classifier ensemble (the SEMCOM algorithm, Committee with Semantic Categorizer).
Submitted 16 January, 2017;
originally announced January 2017.