-
The infrastructure powering IBM's Gen AI model development
Authors:
Talia Gershon,
Seetharami Seelam,
Brian Belgodere,
Milton Bonilla,
Lan Hoang,
Danny Barnett,
I-Hsin Chung,
Apoorve Mohan,
Ming-Hung Chen,
Lixiang Luo,
Robert Walkup,
Constantinos Evangelinos,
Shweta Salaria,
Marc Dombrowa,
Yoonho Park,
Apo Kayi,
Liran Schour,
Alim Alim,
Ali Sydney,
Pavlos Maniotis,
Laurent Schares,
Bernard Metzler,
Bengi Karacali-Akyamac,
Sophia Wen,
Tatsuhiro Chiba, et al. (121 additional authors not shown)
Abstract:
AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.
Submitted 7 July, 2024;
originally announced July 2024.
-
Understanding first order Raman spectra of boron carbides across the homogeneity range
Authors:
Guido Roma,
Kevin Gillet,
Antoine Jay,
Nathalie Vast,
Gaëlle Gutierrez
Abstract:
Boron carbide, a lightweight, high temperature material, has various applications as a structural material and as a neutron absorber. The large solubility range of carbon in boron, between $\approx$ 9% and 20%, stems from the thermodynamical stability of three icosahedral phases at low temperature, with respective carbon atomic concentrations: 8.7% (B$_{10.5}$C, named OPO$_1$), 13.0% (B$_{6.7}$C, named OPO$_2$), whose theoretical Raman spectra are still unknown, and 20% (B$_4$C), for which the nature of some of the Raman peaks is still debated. We report theoretical and experimental results of the first order, non-resonant, Raman spectrum of boron carbide. Density functional perturbation theory enables us to obtain the Raman spectra of the OPO$_1$ and OPO$_2$ phases, which are perfectly ordered structures with a complex crystalline motif of 414 atoms, due to charge compensation effects. Moreover, for the carbon-rich B$_4$C, with a simpler 15-atom unit cell, we study the influence of the low energy point defects and of their concentrations on the Raman spectrum, in connection with experiments, thus providing insights into the sensitivity of experimental spectra to sample preparation, experimental conditions and setup. In particular, this enables us to propose a new structure at 19.2% atomic carbon concentration, B$_{4.2}$C, that lies very close to the convex hull of boron carbide, on the carbon-rich side. This new phase, derived from what we name the "3+1" defect complex, helps in reconciling the experimentally observed Raman spectrum with the theory around 1000 cm$^{-1}$. Finally, we predict the intensity variations induced by the experimental geometry and quantitatively assess the localisation of bulk and defect vibrational modes and their character, with an analysis of "chain" and "icosahedral" modes.
Submitted 5 May, 2021;
originally announced May 2021.
-
Atomic scale mechanisms controlling the oxidation of polyethylene: a first principles study
Authors:
Yunho Ahn,
Xavier Colin,
Guido Roma
Abstract:
Understanding the degradation mechanisms of aliphatic polymers by thermal oxidation and radio-oxidation is very important in order to assess their lifetime in a variety of industrial applications. We focus here on polyethylene as a prototypical aliphatic polymer. Kinetic models describing the time evolution of the concentration of chain defects and radical species in the material identify a relevant step in the formation and subsequent decomposition of transient hydroperoxide species, finally leading to carbonyl defects, in particular ketones. In this paper we first summarize the most relevant mechanistic paths proposed in the literature for hydroperoxide formation and decomposition and, second, revisit them using first principles calculations based on Density Functional Theory (DFT). We investigate the reaction paths for several chemical reactions, for both isolated alkane molecules and a crystalline model of polyethylene, and confirm, in some cases, the accepted activation energies; in some other cases, we challenge the accepted view, finding alternative, more favourable, reaction paths for which we estimate the activation energy. We highlight the influence of the environment -- crystalline or not -- on the outcome of some of the studied chemical reactions. A remarkable result of our calculations is that hydroxyl radicals play an important role in the decomposition of hydroperoxides. Based on our findings, it should be possible to improve the set of equations and parameters used in current kinetic simulations of polyethylene radio-oxidation.
Submitted 7 June, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
From latent ferroelectricity to hyperferroelectricity in alkali lead halide perovskites
Authors:
Guido Roma,
Arthur Marronnier,
Jacky Even
Abstract:
Using first principles calculations we show that several alkali lead halides potentially present collective ferroelectric polarization. This should occur at least at the nanoscale; it could be detected macroscopically provided it is not concealed by lattice vibrations in the temperature range of stability of the cubic perovskite phase. For potassium lead halides and for alkali lead fluorides, remarkably, the ferroelectric behavior turns hyper-ferroelectric, suggesting a more robust ferroelectric polarization in spite of depolarization potentials induced by charge accumulation on surfaces or interfaces.
Submitted 6 March, 2020; v1 submitted 24 January, 2020;
originally announced January 2020.
-
Influence of Disorder and Anharmonic Fluctuations on the Dynamical Rashba Effect in Purely Inorganic Lead-Halide Perovskites
Authors:
Arthur Marronnier,
Guido Roma,
Marcelo Carignano,
Yvan Bonnassieux,
Claudine Katan,
Jacky Even,
Edoardo Mosconi,
Filippo De Angelis
Abstract:
Doping organic metal-halide perovskites with cesium could be the best solution to stabilize highly-efficient perovskite solar cells. The understanding of the respective roles of the organic molecule, on one hand, and the inorganic lattice, on the other, is thus crucial in order to be able to optimize the physical properties of the mixed-cation structures. In particular, the study of the recombination mechanisms is thought to be one of the key challenges towards full comprehension of their working principles. Using molecular dynamics and frozen phonons, we evidence sub-picosecond anharmonic fluctuations in the fully inorganic $CsPbI_3$ perovskite. We reveal the effect of these fluctuations, combined with spin-orbit coupling, on the electronic band structure, evidencing a dynamical Rashba effect. Our study shows that under certain conditions space disorder can quench the Rashba effect. As for time disorder, we evidence a dynamical Rashba effect which is similar to what was found for $MAPbI_3$ and which is still sizable despite temperature disorder, the large investigated supercell, and the absence of the organic cations' motion. We show that the spin texture associated with the Rashba splitting cannot be deemed responsible for a consistent reduction of recombination rates, although the spin mismatch between valence and conduction band increases with the ferroelectric distortion causing the Rashba splitting.
Submitted 23 November, 2018; v1 submitted 26 October, 2018;
originally announced October 2018.
-
Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks
Authors:
Emad M. Grais,
Gerard Roma,
Andrew J. R. Simpson,
Mark D. Plumbley
Abstract:
The sources separated by most single channel audio source separation techniques are usually distorted and each separated source contains residual signals from the other sources. To tackle this problem, we propose to enhance the separated sources to decrease the distortion and interference between the separated sources using deep neural networks (DNNs). Two different DNNs are used in this work. The first DNN is used to separate the sources from the mixed signal. The second DNN is used to enhance the separated signals. To consider the interactions between the separated sources, we propose to use a single DNN to enhance all the separated sources together. To reduce the residual signals of one source from the other separated sources (interference), we train the DNN for enhancement discriminatively to maximize the dissimilarity between the predicted sources. The experimental results show that using discriminative enhancement decreases the distortion and interference between the separated sources.
Submitted 20 December, 2016; v1 submitted 6 September, 2016;
originally announced September 2016.
-
Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network
Authors:
Andrew J. R. Simpson,
Gerard Roma,
Mark D. Plumbley
Abstract:
Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then complete separation is not necessary and hence separation difficulty and separation quality are dependent on the nature of the re-mix. Here, we use a convolutional deep neural network (DNN), trained to estimate 'ideal' binary masks for separating voice from music, to perform re-mixing of the vocal balance by operating directly on the individual magnitude components of the musical mixture spectrogram. Our results demonstrate that small changes in vocal gain may be applied with very little distortion to the ultimate re-mix. Our method may be useful for re-mixing existing mixes.
Submitted 1 May, 2015;
originally announced May 2015.
-
Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network
Authors:
Andrew J. R. Simpson,
Gerard Roma,
Mark D. Plumbley
Abstract:
Identification and extraction of singing voice from within musical mixtures is a key challenge in source separation and machine audition. Recently, deep neural networks (DNN) have been used to estimate 'ideal' binary masks for carefully controlled cocktail party speech separation problems. However, it is not yet known whether these methods are capable of generalizing to the discrimination of voice and non-voice in the context of musical mixtures. Here, we trained a convolutional DNN (of around a billion parameters) to provide probabilistic estimates of the ideal binary mask for separation of vocal sounds from real-world musical mixtures. We contrast our DNN results with more traditional linear methods. Our approach may be useful for automatic removal of vocal sounds from musical mixtures for 'karaoke' type applications.
Submitted 17 April, 2015;
originally announced April 2015.
-
Energetics and metastability of the silicon vacancy in cubic SiC
Authors:
Fabien Bruneval,
Guido Roma
Abstract:
The silicon vacancy is a prominent intrinsic defect of cubic SiC (3C-SiC) to which much effort has been devoted so far, experimentally and theoretically. We calculate its properties using the GW approximation that does not suffer from the band gap problem. The obtained formation and transition energies deviate significantly from the usual density functional theory evaluations and now compare favorably with experiment. A new assignment for the main line of photoluminescence is then proposed. We further perform GW calculations for the saddle point of reaction paths. The resulting barrier energies explain the thermal annealing experiments thanks to an original mechanism mediated by a minority charge configuration.
Submitted 14 March, 2011; v1 submitted 4 January, 2011;
originally announced January 2011.
-
On the Replica Approach to Spin Glass
Authors:
Giorgio Parisi (Dipartimento di Fisica, Roma I)
Abstract:
In this talk I will review the approach to spin glasses based on the spontaneously broken replica symmetry. I will concentrate my attention mostly on more general ideas, skipping technical details and stressing the characteristic predictions of this approach. After the introduction of the replica method, the predicted structure of states is investigated in detail, paying particular attention to the local overlaps and to the structure of the clusters. I will finally study the behaviour of the system near the lower critical dimension and I will show that the technique of coupling real replicas is able to give relevant information.
Submitted 1 December, 1994;
originally announced December 1994.