-
Harnessing metastability for grain size control in multiprincipal element alloys during additive manufacturing
Authors:
Akane Wakai,
Jenniffer Bustillos,
Noah Sargent,
Jamesa Stokes,
Wei Xiong,
Timothy M. Smith,
Atieh Moridi
Abstract:
Controlling microstructure in fusion-based metal additive manufacturing (AM) remains a challenge due to numerous parameters directly impacting solidification conditions. Multiprincipal element alloys (MPEAs) offer a vast compositional design space for microstructural engineering due to their chemical complexity and exceptional properties. Here, we establish a novel alloy design paradigm in MPEAs for AM using the FeMnCoCr system. By exploiting the decreasing phase stability with increasing Mn content, we achieve notable grain refinement and breakdown of columnar grain growth. We combine thermodynamic modeling, operando synchrotron X-ray diffraction, multiscale microstructural characterization, and mechanical testing to gain insight into the solidification physics and its ramifications on the resulting microstructure. This work paves the way for tailoring grain sizes through targeted manipulation of phase stability, thereby advancing microstructure control in AM.
Submitted 6 May, 2024;
originally announced May 2024.
-
Limits of dispersoid size and number density in oxide dispersion strengthened alloys fabricated with powder bed fusion-laser beam
Authors:
Nathan A. Wassermann,
Yongchang Li,
Alexander J. Myers,
Christopher A. Kantzos,
Timothy M. Smith,
Jack L. Beuth,
Jonathan A. Malen,
Lin Shao,
Alan J. H. McGaughey,
Sneha P. Narra
Abstract:
Previous work on additively-manufactured oxide dispersion strengthened alloys focused on experimental approaches, resulting in larger dispersoid sizes and lower number densities than can be achieved with conventional powder metallurgy. To improve the as-fabricated microstructure, this work integrates experiments with a thermodynamic and kinetic modeling framework to probe the limits of the dispersoid sizes and number densities that can be achieved with powder bed fusion-laser beam. Bulk samples of a Ni-20Cr $+$ 1 wt.% Y$_2$O$_3$ alloy are fabricated using a range of laser power and scanning velocity combinations. Scanning transmission electron microscopy characterization is performed to quantify the dispersoid size distributions across the processing space. The smallest mean dispersoid diameter (29 nm) is observed at 300 W and 1200 mm/s, with a number density of 1.0$\times$10$^{20}$ m$^{-3}$. The largest mean diameter (72 nm) is observed at 200 W and 200 mm/s, with a number density of 1.5$\times$10$^{19}$ m$^{-3}$. Scanning electron microscopy suggests that a considerable fraction of the oxide added to the feedstock is lost during processing, due to oxide agglomeration and the ejection of oxide-rich spatter from the melt pool. After accounting for these losses, the model predictions for the dispersoid diameter and number density align with the experimental trends. The results suggest that the mechanism that limits the final number density is collision coarsening of dispersoids in the melt pool. The modeling framework is leveraged to propose processing strategies to limit dispersoid size and increase number density.
Submitted 16 January, 2024; v1 submitted 18 October, 2023;
originally announced October 2023.
-
Explaining Reinforcement Learning with Shapley Values
Authors:
Daniel Beechey,
Thomas M. S. Smith,
Özgür Şimşek
Abstract:
For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
Submitted 9 June, 2023;
originally announced June 2023.
-
arXiv:2304.06869
cond-mat.mtrl-sci
cond-mat.other
physics.app-ph
physics.chem-ph
physics.comp-ph
Energy-composition relations in Ni$_3$(Al$_{1-x}$X$_x$) phases
Authors:
Nikolai A. Zarkevich,
Timothy M. Smith,
John W. Lawson
Abstract:
The secondary phase, such as Ni$_3$Al-based $L1_2$ $\gamma^\prime$, is crucially important for precipitation strengthening of superalloys. Composition-structure-property relations provide useful insights for guided alloy design. Here we use density functional theory combined with the multiple scattering theory to compute dependencies of the structural energies and equilibrium volumes versus composition for ternary Ni$_3$(Al$_{1-x}$X$_x$) alloys with X=(Ti, Zr, Hf; V, Nb, Ta; Cr, Mo, W) in $L1_2$, $D0_{24}$, and $D0_{19}$ phases with a homogeneous chemical disorder on the (Al$_{1-x}$X$_x$) sublattice. Our results provide a better understanding of the physics in Ni$_3$Al-based precipitates and facilitate design of next-generation nickel superalloys with precipitation strengthening.
Submitted 2 June, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Energy landscape in NiCoCr-based middle-entropy alloys
Authors:
Nikolai A. Zarkevich,
Timothy M. Smith,
John W. Lawson
Abstract:
NiCoCr middle-entropy alloy is known for its exceptional strength at both low and elevated operating temperatures. Mechanical properties of NiCoCr-based alloys are affected by certain features of the energy landscape, such as the energy difference between the hcp and fcc phases (which is known to correlate with the stacking fault energy in the fcc phase) and curvature of the energy surface. We compute formation energies in the Ni-Co-Cr ternary and related quaternary systems and investigate dependences of the relative energies on composition. Such computed composition-structure-property relations can be useful for tuning composition and designing next-generation alloys with improved strength.
Submitted 13 April, 2023;
originally announced April 2023.
-
High Rayleigh number variational multiscale large eddy simulations of Rayleigh-Bénard Convection
Authors:
David Sondak,
Thomas M. Smith,
Roger P. Pawlowski,
Sidafa Conde,
John N. Shadid
Abstract:
The variational multiscale (VMS) formulation is used to develop residual-based VMS large eddy simulation (LES) models for Rayleigh-Bénard convection. The resulting model is a mixed model that incorporates the VMS model and an eddy viscosity model. The Wall-Adapting Local Eddy-viscosity (WALE) model is used as the eddy viscosity model in this work. The new LES models were implemented in the finite element code Drekar. Simulations are performed using continuous, piecewise linear finite elements. The simulations ranged from $Ra = 10^6$ to $Ra = 10^{14}$ and were conducted at $Pr = 1$ and $Pr = 7$. Two domains were considered: a two-dimensional domain of aspect ratio 2 with a fluid confined between two parallel plates and a three-dimensional cylinder of aspect ratio $1/4$. The Nusselt number from the VMS results is compared against three-dimensional direct numerical simulations and experiments. In all cases, the VMS results are in good agreement with existing literature.
Submitted 20 May, 2020;
originally announced May 2020.
-
The MOMMS Family of Matrix Multiplication Algorithms
Authors:
Tyler M. Smith,
Robert A. van de Geijn
Abstract:
As the ratio between the rate of computation and rate with which data can be retrieved from various layers of memory continues to deteriorate, a question arises: Will the current best algorithms for computing matrix-matrix multiplication on future CPUs continue to be (near) optimal? This paper provides compelling analytical and empirical evidence that the answer is "no". The analytical results guide us to a new family of algorithms of which the current state-of-the-art "Goto's algorithm" is but one member. The empirical results, on architectures that were custom built to reduce the amount of bandwidth to main memory, show that under different circumstances, different and particular members of the family become superior. Thus, this family will likely start playing a prominent role going forward.
Submitted 11 April, 2019;
originally announced April 2019.
-
A Tight I/O Lower Bound for Matrix Multiplication
Authors:
Tyler Michael Smith,
Bradley Lowery,
Julien Langou,
Robert A. van de Geijn
Abstract:
A tight lower bound for required I/O when computing an ordinary matrix-matrix multiplication on a processor with two layers of memory is established. Prior work obtained weaker lower bounds by reasoning about the number of segments needed to perform $C:=AB$, for distinct matrices $A$, $B$, and $C$, where each segment is a series of operations involving $M$ reads and writes to and from fast memory, and $M$ is the size of fast memory. A lower bound on the number of segments was then determined by obtaining an upper bound on the number of elementary multiplications performed per segment. This paper follows the same high-level approach, but improves the lower bound by (1) transforming algorithms for MMM so that they perform all computation via fused multiply-add instructions (FMAs) and using this to reason about only the cost associated with reading the matrices, and (2) decoupling the per-segment I/O cost from the size of fast memory. For $n \times n$ matrices, the lower bound's leading-order term is $2n^3/\sqrt{M}$. A theoretical algorithm whose leading-order term attains this bound is introduced. To what extent the state-of-the-art Goto's Algorithm attains the lower bound is discussed.
Submitted 6 February, 2019; v1 submitted 3 February, 2017;
originally announced February 2017.
-
Automating the Last-Mile for High Performance Dense Linear Algebra
Authors:
Richard Michael Veras,
Tze Meng Low,
Tyler Michael Smith,
Robert van de Geijn,
Franz Franchetti
Abstract:
High performance dense linear algebra (DLA) libraries often rely on a general matrix multiply (Gemm) kernel that is implemented using assembly or with vector intrinsics. In particular, the real-valued Gemm kernels provide the overwhelming fraction of performance for the complex-valued Gemm kernels, along with the entire level-3 BLAS and many of the real and complex LAPACK routines. Thus, achieving high performance for the Gemm kernel translates into a high performance linear algebra stack above this kernel. However, it is a monumental task for a domain expert to manually implement the kernel for every library-supported architecture. This leads to the belief that the craft of a Gemm kernel is more dark art than science. It is this premise that drives the popularity of autotuning with code generation in the domain of DLA.
This paper, instead, focuses on an analytical approach to code generation of the Gemm kernel for different architectures, in order to shed light on the details or voodoo required for implementing a high performance Gemm kernel. We distill the implementation of the kernel into an even smaller kernel, an outer-product, and analytically determine how available SIMD instructions can be used to compute the outer-product efficiently. We codify this approach into a system to automatically generate a high performance SIMD implementation of the Gemm kernel. Experimental results demonstrate that our approach yields generated kernels with performance that is competitive with kernels implemented manually or using empirical search.
Submitted 28 April, 2017; v1 submitted 23 November, 2016;
originally announced November 2016.
-
Implementing Strassen's Algorithm with BLIS
Authors:
Jianyu Huang,
Tyler M. Smith,
Greg M. Henry,
Robert A. van de Geijn
Abstract:
We dispel "street wisdom" regarding the practical implementation of Strassen's algorithm for matrix-matrix multiplication (DGEMM). Conventional wisdom: it is only practical for very large matrices. Our implementation is practical for small matrices. Conventional wisdom: the matrices being multiplied should be relatively square. Our implementation is practical for rank-k updates, where k is relatively small (a shape of importance for libraries like LAPACK). Conventional wisdom: it inherently requires substantial workspace. Our implementation requires no workspace beyond buffers already incorporated into conventional high-performance DGEMM implementations. Conventional wisdom: a Strassen DGEMM interface must pass in workspace. Our implementation requires no such workspace and can be plug-compatible with the standard DGEMM interface. Conventional wisdom: it is hard to demonstrate speedup on multi-core architectures. Our implementation demonstrates speedup over conventional DGEMM even on an Intel(R) Xeon Phi(TM) coprocessor utilizing 240 threads. We show how a distributed memory matrix-matrix multiplication also benefits from these advances.
Submitted 3 May, 2016;
originally announced May 2016.
-
A new class of finite element variational multiscale turbulence models for incompressible magnetohydrodynamics
Authors:
David Sondak,
John N. Shadid,
Assad A. Oberai,
Roger P. Pawlowski,
Eric C. Cyr,
Tom M. Smith
Abstract:
New large eddy simulation (LES) turbulence models for incompressible magnetohydrodynamics (MHD) derived from the variational multiscale (VMS) formulation for finite element simulations are introduced. The new models include the variational multiscale formulation, a residual-based eddy viscosity model, and a mixed model that combines both of these component models. Each model contains terms that are proportional to the residual of the incompressible MHD equations and is therefore numerically consistent. Moreover, each model is also dynamic, in that its effect vanishes when this residual is small. The new models are tested on the decaying MHD Taylor-Green vortex at low and high Reynolds numbers. The evaluation of the models is based on comparisons with available data from direct numerical simulations (DNS) of the time evolution of energies as well as energy spectra at various discrete times. A numerical study on a sequence of meshes is presented, demonstrating that the large eddy simulation approaches the DNS solution for these quantities with spatial mesh refinement.
Submitted 2 December, 2014;
originally announced December 2014.