Showing 1–1 of 1 results for author: Paul, C P

  1. arXiv:2310.05350

    cs.DC cs.LG

    Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training

    Authors: Michael Benington, Leo Phan, Chris Pierre Paul, Evan Shoemaker, Priyanka Ranade, Torstein Collett, Grant Hodgson Perez, Christopher Krieger

    Abstract: AI accelerator processing capabilities and memory constraints largely dictate the scale at which machine learning workloads (e.g., training and inference) can be executed within a desirable time frame. Training a state-of-the-art, transformer-based model today requires the use of GPU-accelerated high-performance computers with high-speed interconnects. As datasets and models continue to increase in si…

    Submitted 10 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Journal ref: Supercomputing 2023 (SC23) Student Research Poster Track