Showing 1–5 of 5 results for author: Meisburger, N

Searching in archive cs.
  1. BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware

    Authors: Nicholas Meisburger, Vihan Lakshman, Benito Geordie, Joshua Engels, David Torres Ramos, Pratik Pranav, Benjamin Coleman, Benjamin Meisburger, Shubh Gupta, Yashwanth Adunukota, Tharun Medini, Anshumali Shrivastava

    Abstract: Efficient large-scale neural network training and inference on commodity CPU hardware is of immense practical significance in democratizing deep learning (DL) capabilities. Presently, the process of training massive models consisting of hundreds of millions to billions of parameters requires the extensive use of specialized hardware accelerators, such as GPUs, which are only accessible to a limite…

    Submitted 12 September, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: 6 pages, 5 tables, 3 figures. CIKM 2023 (Applied Research Track)

  2. arXiv:2201.12667

    cs.DC cs.LG

    Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity

    Authors: Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava

    Abstract: More than 70% of cloud computing is paid for but sits idle. A large fraction of these idle compute are cheap CPUs with few cores that are not utilized during the less busy hours. This paper aims to enable those CPU cycles to train heavyweight AI models. Our goal is against mainstream frameworks, which focus on leveraging expensive specialized ultra-high bandwidth interconnect to address the commun…

    Submitted 29 January, 2022; originally announced January 2022.

  3. arXiv:2103.10891

    cs.LG cs.DC cs.PF

    Accelerating SLIDE Deep Learning on Modern CPUs: Vectorization, Quantizations, Memory Optimizations, and More

    Authors: Shabnam Daghaghi, Nicholas Meisburger, Mengnan Zhao, Yong Wu, Sameh Gobriel, Charlie Tai, Anshumali Shrivastava

    Abstract: Deep learning implementations on CPUs (Central Processing Units) are gaining more traction. Enhanced AI capabilities on commodity x86 architectures are commercially appealing due to the reuse of existing hardware and virtualization ease. A notable work in this direction is the SLIDE system. SLIDE is a C++ implementation of a sparse hash table based back-propagation, which was shown to be significa…

    Submitted 5 March, 2021; originally announced March 2021.

  4. arXiv:2012.15843

    cs.LG cs.AI cs.DS cs.IR

    A Tale of Two Efficient and Informative Negative Sampling Distributions

    Authors: Shabnam Daghaghi, Tharun Medini, Nicholas Meisburger, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava

    Abstract: Softmax classifiers with a very large number of classes naturally occur in many applications such as natural language processing and information retrieval. The calculation of full softmax is costly from the computational and energy perspective. There have been various sampling approaches to overcome this challenge, popularly known as negative sampling (NS). Ideally, NS should sample negative class…

    Submitted 28 July, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Published at ICML 2021

  5. arXiv:2008.03260

    cs.DB cs.DC cs.IR cs.LG

    Distributed Tera-Scale Similarity Search with MPI: Provably Efficient Similarity Search over billions without a Single Distance Computation

    Authors: Nicholas Meisburger, Anshumali Shrivastava

    Abstract: We present SLASH (Sketched LocAlity Sensitive Hashing), an MPI (Message Passing Interface) based distributed system for approximate similarity search over terabyte scale datasets. SLASH provides a multi-node implementation of the popular LSH (locality sensitive hashing) algorithm, which is generally implemented on a single machine. We show how we can append the LSH algorithm with heavy hitters ske…

    Submitted 17 August, 2020; v1 submitted 5 August, 2020; originally announced August 2020.