Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training
Authors:
Michael Benington,
Leo Phan,
Chris Pierre Paul,
Evan Shoemaker,
Priyanka Ranade,
Torstein Collett,
Grant Hodgson Perez,
Christopher Krieger
Abstract:
AI accelerator processing capabilities and memory constraints largely dictate the scale at which machine learning workloads (e.g., training and inference) can be executed within a desirable time frame. Training a state-of-the-art transformer-based model today requires GPU-accelerated high-performance computers with high-speed interconnects. As datasets and models continue to increase in size, the computational requirements and memory demands of AI also continue to grow. These challenges have inspired the development of distributed-algorithm and circuit-based optimization techniques that make it possible to progressively scale models in multi-node environments, efficiently minimize neural network cost functions for faster convergence, and fit more parameters into a fixed set of available resources. In our research project, we focus on parallel and distributed machine learning algorithm development, specifically for optimizing the data processing and pre-training of a set of 5 encoder-decoder LLMs, ranging from 580 million to 13 billion parameters. We performed a fine-grained study to quantify the relationships between three ML parallelism methods, specifically exploring Microsoft DeepSpeed Zero Redundancy Optimizer (ZeRO) stages.
Submitted 10 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
Identifying Circumgalactic Medium Absorption in QSO Spectra: A Bayesian Approach
Authors:
Jennifer E. Scott,
Emileigh S. Shoemaker,
Colin D. Hamill
Abstract:
We present a study of candidate galaxy-absorber pairs for 43 low redshift QSO sightlines ($0.06 < z < 0.85$) observed with the {\it Hubble Space Telescope}/Cosmic Origins Spectrograph that lie within the footprint of the Sloan Digital Sky Survey. We use a statistical approach to match absorbers with galaxies near the QSO lines of sight using only the SDSS Data Release 12 photometric data for the galaxies, including estimates of their redshifts. Our Bayesian methods combine the SDSS photometric information with measured properties of the circumgalactic medium to find the most probable galaxy match, if any, for each absorber in the line of sight QSO spectrum. We find $\sim$630 candidate galaxy-absorber pairs using two different statistics. The methods are able to reproduce pairs reported in the targeted spectroscopic studies upon which we base the statistics at a rate of 72\%. The galaxies in the candidate pairs have median redshift, luminosity, and stellar mass, all estimated from the photometric data, of $z=0.13$, $L=0.1L^*$, and $\log(M_*/M_{Sun}) = 9.7$. The median impact parameter of the candidate pairs is $\sim$430~kpc, or $\sim 3.5$ times the galaxy virial radius. The results are broadly consistent with the high Ly$\alpha$ covering fraction out to this radius found in previous studies. This method of matching absorbers and galaxies can be used to prioritize targets for spectroscopic studies, and we present specific examples of promising systems for such follow-up.
Submitted 7 November, 2021;
originally announced November 2021.