-
Addressing materials' microstructure diversity using transfer learning
Authors:
Aurèle Goetz,
Ali Riza Durmaz,
Martin Müller,
Akhil Thomas,
Dominik Britz,
Pierre Kerfriden,
Chris Eberl
Abstract:
Materials' microstructures are signatures of their alloying composition and processing history. Therefore, microstructures exist in a wide variety. As materials become increasingly complex to comply with engineering demands, advanced computer vision (CV) approaches such as deep learning (DL) inevitably gain relevance for quantifying microstructures' constituents from micrographs. While DL can outperform classical CV techniques for many tasks, its shortcomings are poor data efficiency and generalizability across datasets. This is inherently in conflict with the expense associated with annotating materials data through experts and with the extensive diversity of materials. To tackle poor domain generalizability and the lack of labeled data simultaneously, we propose to apply a sub-class of transfer learning methods called unsupervised domain adaptation (UDA). These algorithms address the task of finding domain-invariant features when supplied with annotated source data and unannotated target data, such that performance on the latter distribution is optimized despite the absence of annotations. As an example, this study is conducted on a lath-shaped bainite segmentation task in complex phase steel micrographs. Here, the domains to bridge are selected to be different metallographic specimen preparations (surface etchings) and distinct imaging modalities. We show that a state-of-the-art UDA approach surpasses the naïve application of source domain trained models on the target domain (generalization baseline) to a large extent. This holds true independent of the domain shift, despite using little data, and even when the baseline models were pre-trained or employed data augmentation. Through UDA, mIoU was improved over generalization baselines from 82.2%, 61.0%, 49.7% to 84.7%, 67.3%, 73.3% on three target datasets, respectively. This underlines this technique's potential to cope with materials variance.
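The abstract reports results as mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal sketch in Python. The function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the paper's own evaluation code may differ.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class intersection-over-union of two label sequences.

    Classes that appear in neither sequence are skipped (an assumption;
    other conventions count them as IoU = 1 or 0).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both sequences.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one sequence.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in pred or target")
    return sum(ious) / len(ious)


# Toy flattened segmentation masks with two classes (0 and 1).
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target, num_classes=2))
```

In this toy case each class has two correctly labeled pixels out of a union of four, so both per-class IoUs and their mean are 0.5.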
Submitted 29 July, 2021;
originally announced July 2021.
-
Pommerman: A Multi-Agent Playground
Authors:
Cinjon Resnick,
Wes Eldridge,
David Ha,
Denny Britz,
Jakob Foerster,
Julian Togelius,
Kyunghyun Cho,
Joan Bruna
Abstract:
We present Pommerman, a multi-agent environment based on the classic console game Bomberman. Pommerman consists of a set of scenarios, each having at least four players and containing both cooperative and competitive aspects. We believe that success in Pommerman will require a diverse set of tools and methods, including planning, opponent/teammate modeling, game theory, and communication, and consequently can serve well as a multi-agent benchmark. To date, we have already hosted one competition, and our next one will be featured in the NIPS 2018 competition track.
Submitted 21 April, 2022; v1 submitted 19 September, 2018;
originally announced September 2018.
-
JESC: Japanese-English Subtitle Corpus
Authors:
Reid Pryzant,
Yongjoo Chung,
Dan Jurafsky,
Denny Britz
Abstract:
In this paper we describe the Japanese-English Subtitle Corpus (JESC). JESC is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It consists of more than 3.2 million examples, making it the largest freely available dataset of its kind. The corpus was assembled by crawling and aligning subtitles found on the web. The assembly process incorporates a number of novel preprocessing elements to ensure high monolingual fluency and accurate bilingual alignments. We summarize its contents and evaluate its quality using human experts and baseline machine translation (MT) systems.
Submitted 21 February, 2018; v1 submitted 29 October, 2017;
originally announced October 2017.
-
Efficient Attention using a Fixed-Size Memory Representation
Authors:
Denny Britz,
Melody Y. Guan,
Minh-Thang Luong
Abstract:
The standard content-based attention mechanism typically used in sequence-to-sequence models is computationally expensive as it requires the comparison of large encoder and decoder states at each time step. In this work, we propose an alternative attention mechanism based on a fixed-size memory representation that is more efficient. Our technique predicts a compact set of K attention contexts during encoding and lets the decoder compute an efficient lookup that does not need to consult the memory. We show that our approach performs on par with the standard attention mechanism while yielding inference speedups of 20% for real-world translation tasks and more for tasks with longer sequences. By visualizing attention scores we demonstrate that our models learn distinct, meaningful alignments.
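The idea of summarizing the encoder into K contexts and letting the decoder select among them can be sketched as follows. This is one possible reading of the abstract, not the authors' implementation: the projection matrices `w_enc` and `w_dec`, the function names, and the use of plain softmax scoring are all assumptions introduced for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def encode_memory(enc_states, w_enc):
    """Compress T encoder states of size D into K context vectors.

    enc_states: (T, D); w_enc: (D, K), a hypothetical learned projection.
    """
    scores = enc_states @ w_enc        # (T, K): one score per position and slot
    alpha = softmax(scores, axis=0)    # each slot's weights sum to 1 over positions
    return alpha.T @ enc_states        # (K, D): fixed size, independent of T


def decode_lookup(dec_state, memory, w_dec):
    """Mix the K contexts for one decoder step; cost does not depend on T.

    dec_state: (D,); w_dec: (D, K), a hypothetical learned projection.
    """
    beta = softmax(dec_state @ w_dec, axis=0)  # (K,) weights over memory slots
    return beta @ memory, beta                 # (D,) context vector and weights


rng = np.random.default_rng(0)
T, D, K = 50, 8, 4
enc_states = rng.normal(size=(T, D))
w_enc = rng.normal(size=(D, K))
w_dec = rng.normal(size=(D, K))

memory = encode_memory(enc_states, w_enc)
context, beta = decode_lookup(rng.normal(size=D), memory, w_dec)
print(memory.shape, context.shape)
```

In this sketch each decoder step touches only the K memory vectors rather than all T encoder states, which is where a speedup for long sequences would come from under these assumptions.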
Submitted 1 July, 2017;
originally announced July 2017.
-
Advanced Steel Microstructural Classification by Deep Learning Methods
Authors:
Seyed Majid Azimi,
Dominik Britz,
Michael Engstler,
Mario Fritz,
Frank Mücklich
Abstract:
The inner structure of a material is called microstructure. It stores the genesis of a material and determines all its physical and chemical properties. While microstructural characterization is widespread and well known, microstructural classification is mostly done manually by human experts, which gives rise to uncertainties due to subjectivity. Since the microstructure could be a combination of different phases or constituents with complex substructures, its automatic classification is very challenging and only a few prior studies exist. Prior works focused on features designed and engineered by experts and classified microstructures separately from the feature extraction step. Recently, Deep Learning methods have shown strong performance in vision applications by learning the features from data together with the classification step. In this work, we propose a Deep Learning method for microstructural classification, using certain microstructural constituents of low carbon steel as examples. This novel method employs pixel-wise segmentation via Fully Convolutional Neural Networks (FCNN) accompanied by a max-voting scheme. Our system achieves 93.94% classification accuracy, drastically outperforming the state-of-the-art method of 48.89% accuracy. Beyond the strong performance of our method, this line of research offers a more robust and, above all, objective way to approach the difficult task of steel quality appreciation.
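A max-voting step on top of pixel-wise segmentation can be illustrated with a short Python sketch: the class predicted for the most pixels becomes the label of the whole region. The function name `max_vote`, the `ignore_label` parameter, and the class names in the example are hypothetical; the abstract does not specify how the authors delimit regions or break ties.

```python
from collections import Counter


def max_vote(pixel_labels, ignore_label=None):
    """Return the most frequent label in a 2D grid of per-pixel predictions.

    Pixels equal to ignore_label (for example a background class) do not vote.
    Ties go to the label encountered first, which is an arbitrary choice.
    """
    counts = Counter(
        label for row in pixel_labels for label in row if label != ignore_label
    )
    if not counts:
        raise ValueError("no pixels left to vote")
    return counts.most_common(1)[0][0]


# Toy per-pixel predictions for one object; class names are illustrative only.
segmentation = [
    ["martensite", "martensite", "bainite"],
    ["martensite", "background", "bainite"],
]
print(max_vote(segmentation, ignore_label="background"))
```

Here "martensite" receives three votes against two for "bainite", so the region is labeled "martensite".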
Submitted 15 February, 2018; v1 submitted 20 June, 2017;
originally announced June 2017.
-
Massive Exploration of Neural Machine Translation Architectures
Authors:
Denny Britz,
Anna Goldie,
Minh-Thang Luong,
Quoc Le
Abstract:
Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users. One major drawback of current architectures is that they are expensive to train, typically requiring days to weeks of GPU time to converge. This makes exhaustive hyperparameter search, as is commonly done with other neural network architectures, prohibitively expensive. In this work, we present the first large-scale analysis of NMT architecture hyperparameters. We report empirical results and variance numbers for several hundred experimental runs, corresponding to over 250,000 GPU hours on the standard WMT English to German translation task. Our experiments lead to novel insights and practical advice for building and extending NMT architectures. As part of this contribution, we release an open-source NMT framework that enables researchers to easily experiment with novel techniques and reproduce state of the art results.
Submitted 21 March, 2017; v1 submitted 10 March, 2017;
originally announced March 2017.
-
Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models
Authors:
Louis Shao,
Stephan Gouws,
Denny Britz,
Anna Goldie,
Brian Strope,
Ray Kurzweil
Abstract:
Sequence-to-sequence models have been applied to the conversation response generation problem where the source sequence is the conversation history and the target sequence is the response. Unlike translation, conversation responding is inherently creative. The generation of long, informative, coherent, and diverse responses remains a hard task. In this work, we focus on the single turn setting. We add self-attention to the decoder to maintain coherence in longer responses, and we propose a practical approach, called the glimpse-model, for scaling to large datasets. We introduce a stochastic beam-search algorithm with segment-by-segment reranking which lets us inject diversity earlier in the generation process. We trained on a combined data set of over 2.3B conversation messages mined from the web. In human evaluation studies, our method produces longer responses overall, with a higher proportion rated as acceptable and excellent as length increases, compared to baseline sequence-to-sequence models with explicit length-promotion. A back-off strategy produces better responses overall, in the full spectrum of lengths.
Submitted 31 July, 2017; v1 submitted 11 January, 2017;
originally announced January 2017.
-
Crystal field states of Kondo lattice heavy fermions CeRuSn3 and CeRhSn3
Authors:
V. K. Anand,
D. T. Adroja,
D. Britz,
A. M. Strydom,
J. W. Taylor,
W. Kockelmann
Abstract:
Inelastic neutron scattering (INS) experiments have been carried out to determine the crystal field states of the Kondo lattice heavy fermions CeRuSn3 and CeRhSn3. Both compounds crystallize in the LaRuSn3-type cubic structure (space group Pm-3n) in which the Ce atoms occupy two distinct crystallographic sites with cubic (m-3) and tetragonal (-4m.2) point symmetries. The INS data of CeRuSn3 reveal the presence of a broad excitation centered around 6-8 meV which is accounted for by a model based on crystal electric field (CEF) excitations. On the other hand, the INS data of isostructural CeRhSn3 reveal three CEF excitations around 7.0, 12.2 and 37.2 meV. The neutron intensity sum rule indicates that the Ce ions at both cubic and tetragonal Ce sites are in the Ce3+ state in both CeRuSn3 and CeRhSn3. The CEF level schemes for both compounds are deduced. We estimate the Kondo temperature T_K = 3.1(2) K for CeRuSn3 from the neutron quasielastic linewidth, in excellent agreement with that determined from the scaling of magnetoresistance, which gives T_K = 3.2(1) K. For CeRhSn3 the neutron quasielastic linewidth gives T_K = 4.6 K. For both CeRuSn3 and CeRhSn3, the ground state of Ce3+ turns out to be a quartet for the cubic site and a doublet for the tetragonal site.
Submitted 18 July, 2016;
originally announced July 2016.
-
Field-insensitive heavy fermion features and phase transition in the caged-structure quasi-skutterudite Sm$_3$Ru$_4$Ge$_{13}$
Authors:
Harikrishnan S. Nair,
Kumar K. Ramesh,
Douglas Britz,
Sarit K. Ghosh,
Christian Reinke,
A. M. Strydom
Abstract:
The robust field-insensitive heavy fermion features in Sm$_3$Ru$_4$Ge$_{13}$ and the magnetic phase transition at $T_N \approx$ 5~K are studied using magnetization $M(T)$, specific heat $C_p(T)$, resistivity $ρ(T)$ and thermal conductivity $κ_T(T)$. The average crystal structure of Sm$_3$Ru$_4$Ge$_{13}$ conforms to the cubic space group $Pm\bar{3}n$; however, signatures of subtle structural distortions are obtained from the X-ray data. The magnetic susceptibility, $χ(T)$, follows a modified Curie-Weiss law indicating the presence of crystal fields of Sm$^{3+}$ and the significance of van Vleck terms. No sign of ferromagnetism is observed in $M(H)$ of Sm$_3$Ru$_4$Ge$_{13}$, which yields only 0.025~$μ_\mathrm{B}$/f.u.-Sm at 2~K, 7~T. The Sommerfeld coefficient, $γ\approx$ 220~mJ/mol-Sm K$^2$, estimated from the analysis of low temperature specific heat suggests the formation of heavy quasiparticles at low temperature. Though a ln$T$ dependence of $ρ(T)$ is observed up to 60~K, the resistivity behavior is accounted for by assuming a two-band model for activated behavior of charge carriers. The field scans of resistivity, $ρ(H)$, below $T_N$ display significant nonlinearity while those above $T_N$ are more metal-like. Low values of thermal conductivity, $κ_T(T)$, are observed in Sm$_3$Ru$_4$Ge$_{13}$; however, $κ_T(T)$ displays an anomaly at $T_N$ which signifies magnetoelastic coupling. A fairly high value of the Seebeck coefficient, $S \approx$ 40~$μ$V/K, is observed at 300~K. We identify Sm$_3$Ru$_4$Ge$_{13}$ as a low charge carrier density system with unusual field-insensitive heavy fermion features very similar to the filled skutterudites.
Submitted 2 February, 2016;
originally announced February 2016.
-
Physical properties of noncentrosymmetric superconductor LaIrSi3: A μSR study
Authors:
V. K. Anand,
D. Britz,
A. Bhattacharyya,
D. T. Adroja,
A. D. Hillier,
A. M. Strydom,
W. Kockelmann,
B. D. Rainford,
K. A. McEwen
Abstract:
The results of heat capacity C_p(T, H) and electrical resistivity ρ(T, H) measurements down to 0.35 K as well as muon spin relaxation and rotation (μSR) measurements on a noncentrosymmetric superconductor LaIrSi3 are presented. Powder neutron diffraction confirmed the reported noncentrosymmetric body-centered tetragonal BaNiSn3-type structure (space group I4mm) of LaIrSi3. Bulk superconductivity is observed below T_c = 0.72(1) K. The intrinsic ΔC_e/γ_n T_c = 1.09(3) is significantly smaller than the BCS value of 1.43, and this reduction is accounted for by the α-model of BCS superconductivity. The analysis of the superconducting state C_e(T) data by the single-band α-model indicates a moderately anisotropic order parameter with the s-wave gap Δ(0)/k_B T_c = 1.54(2), which is lower than the BCS value of 1.764. Our estimates of various normal and superconducting state parameters indicate a weakly coupled electron-phonon driven type-I s-wave superconductivity in LaIrSi3. The μSR results also confirm the conventional type-I superconductivity in LaIrSi3 with a preserved time reversal symmetry and hence a singlet pairing superconducting ground state.
Submitted 24 July, 2014;
originally announced July 2014.
-
Towards a fullerene-based quantum computer
Authors:
Simon C Benjamin,
Arzhang Ardavan,
G Andrew D Briggs,
David A Britz,
Daniel Gunlycke,
John Jefferson,
Mark A G Jones,
David F Leigh,
Brendon W Lovett,
Andrei N Khlobystov,
S A Lyon,
John J L Morton,
Kyriakos Porfyrakis,
Mark R Sambrook,
Alexei M Tyryshkin
Abstract:
Molecular structures appear to be natural candidates for a quantum technology: individual atoms can support quantum superpositions for long periods, and such atoms can in principle be embedded in a permanent molecular scaffolding to form an array. This would be true nanotechnology, with dimensions of the order of a nanometre. However, the challenges of realising such a vision are immense. One must identify a suitable elementary unit and demonstrate its merits for qubit storage and manipulation, including input/output. These units must then be formed into large arrays corresponding to a functional quantum architecture, including a mechanism for gate operations. Here we report our efforts, both experimental and theoretical, to create such a technology based on endohedral fullerenes or 'buckyballs'. We describe our successes with respect to these criteria, along with the obstacles we are currently facing and the questions that remain to be addressed.
Submitted 21 November, 2005;
originally announced November 2005.