Showing 1–11 of 11 results for author: Medini, T

Searching in archive cs.
  1. BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware

    Authors: Nicholas Meisburger, Vihan Lakshman, Benito Geordie, Joshua Engels, David Torres Ramos, Pratik Pranav, Benjamin Coleman, Benjamin Meisburger, Shubh Gupta, Yashwanth Adunukota, Tharun Medini, Anshumali Shrivastava

    Abstract: Efficient large-scale neural network training and inference on commodity CPU hardware is of immense practical significance in democratizing deep learning (DL) capabilities. Presently, the process of training massive models consisting of hundreds of millions to billions of parameters requires the extensive use of specialized hardware accelerators, such as GPUs, which are only accessible to a limite…

    Submitted 12 September, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: 6 pages, 5 tables, 3 figures. CIKM 2023 (Applied Research Track)

  2. arXiv:2201.12667 [pdf, other]

    cs.DC cs.LG

    Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity

    Authors: Minghao Yan, Nicholas Meisburger, Tharun Medini, Anshumali Shrivastava

    Abstract: More than 70% of cloud computing is paid for but sits idle. A large fraction of this idle compute consists of cheap CPUs with few cores that are not utilized during the less busy hours. This paper aims to enable those CPU cycles to train heavyweight AI models. Our goal runs counter to mainstream frameworks, which focus on leveraging expensive specialized ultra-high bandwidth interconnect to address the commun…

    Submitted 29 January, 2022; originally announced January 2022.

  3. arXiv:2103.09944 [pdf, other]

    cs.IR cs.LG

    IRLI: Iterative Re-partitioning for Learning to Index

    Authors: Gaurav Gupta, Tharun Medini, Anshumali Shrivastava, Alexander J Smola

    Abstract: Neural models have transformed the fundamental information retrieval problem of mapping a query to a giant set of items. However, the need for efficient and low latency inference forces the community to reconsider efficient approximate near-neighbor search in the item space. To this end, learning to index is gaining much interest in recent times. Methods have to trade between obtaining high accura…

    Submitted 17 March, 2021; originally announced March 2021.

    Comments: 12 pages

  4. arXiv:2012.15843 [pdf, other]

    cs.LG cs.AI cs.DS cs.IR

    A Tale of Two Efficient and Informative Negative Sampling Distributions

    Authors: Shabnam Daghaghi, Tharun Medini, Nicholas Meisburger, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava

    Abstract: Softmax classifiers with a very large number of classes naturally occur in many applications such as natural language processing and information retrieval. The calculation of full softmax is costly from the computational and energy perspective. There have been various sampling approaches to overcome this challenge, popularly known as negative sampling (NS). Ideally, NS should sample negative class…

    Submitted 28 July, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Published at ICML 2021

  5. arXiv:2008.13225 [pdf, other]

    cs.LG cs.AI cs.DC cs.IR stat.ML

    SOLAR: Sparse Orthogonal Learned and Random Embeddings

    Authors: Tharun Medini, Beidi Chen, Anshumali Shrivastava

    Abstract: Dense embedding models are commonly deployed in commercial search engines, wherein all the document vectors are pre-computed, and near-neighbor search (NNS) is performed with the query vector to find relevant documents. However, the bottleneck of indexing a large number of dense vectors and performing an NNS hurts the query time and accuracy of these models. In this paper, we argue that high-dimen…

    Submitted 30 August, 2020; originally announced August 2020.

    Comments: Under review at NeurIPS 2020

  6. arXiv:1910.13830 [pdf, other]

    cs.LG cs.DS cs.IR stat.ML

    Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products

    Authors: Tharun Medini, Qixuan Huang, Yiqiu Wang, Vijai Mohan, Anshumali Shrivastava

    Abstract: In the last decade, it has been shown that many hard AI tasks, especially in NLP, can be naturally modeled as extreme classification problems leading to improved precision. However, such models are prohibitively expensive to train due to the memory blow-up in the last layer. For example, a reasonable softmax layer for the dataset of interest in this paper can easily reach well beyond 100 billion p…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: Published at NeurIPS 2019. arXiv admin note: text overlap with arXiv:1810.04254

  7. arXiv:1910.04358 [pdf, other]

    q-bio.GN cs.IR

    Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO)

    Authors: Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava

    Abstract: DNA sequencing, especially of microbial genomes and metagenomes, has been at the core of recent research advances in large-scale comparative genomics. The data deluge has resulted in exponential growth in genomic datasets over the past years and has shown no sign of slowing down. Several recent attempts have been made to tame the computational burden of sequence search on these terabyte and petaby…

    Submitted 30 April, 2022; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 9 pages

  8. arXiv:1910.02611 [pdf, other]

    cs.DS cs.IR

    RAMBO: Repeated And Merged BloOm Filter for Ultra-fast Multiple Set Membership Testing (MSMT) on Large-Scale Data

    Authors: Gaurav Gupta, Minghao Yan, Benjamin Coleman, R. A. Leo Elworth, Tharun Medini, Todd Treangen, Anshumali Shrivastava

    Abstract: Multiple Set Membership Testing (MSMT) is a well-known problem in a variety of search and query applications. Given a dataset of K different sets and a query q, it aims to find all of the sets containing the query. Trivially, an MSMT instance can be reduced to K membership testing instances, each with the same q, leading to O(K) query time with a simple array of Bloom Filters. We propose a data-st…

    Submitted 17 July, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: 14 pages, 5 figures

  9. arXiv:1909.04790 [pdf, other]

    cs.CV cs.LG

    SDM-Net: A Simple and Effective Model for Generalized Zero-Shot Learning

    Authors: Shabnam Daghaghi, Tharun Medini, Anshumali Shrivastava

    Abstract: Zero-Shot Learning (ZSL) is a classification task where we do not have even a single training labeled example from a set of unseen classes. Instead, we only have prior information (or description) about seen and unseen classes, often in the form of physically realizable or descriptive attributes. Lack of any single training example from a set of classes prohibits use of standard classification tec…

    Submitted 31 December, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

  10. arXiv:1903.03129 [pdf, other]

    cs.DC cs.LG

    SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems

    Authors: Beidi Chen, Tharun Medini, James Farwell, Sameh Gobriel, Charlie Tai, Anshumali Shrivastava

    Abstract: Deep Learning (DL) algorithms are the central focus of modern machine learning systems. As data volumes keep growing, it has become customary to train large neural networks with hundreds of millions of parameters to maintain enough capacity to memorize these volumes and obtain state-of-the-art accuracy. To get around the costly computations associated with large models and data, the community is i…

    Submitted 29 February, 2020; v1 submitted 7 March, 2019; originally announced March 2019.

    Comments: Published at MLSys 2020

  11. arXiv:1810.04254 [pdf, other]

    cs.DC cs.CC cs.LG

    Extreme Classification in Log Memory

    Authors: Qixuan Huang, Yiqiu Wang, Tharun Medini, Anshumali Shrivastava

    Abstract: We present Merged-Averaged Classifiers via Hashing (MACH) for K-classification with ultra-large values of K. Compared to traditional one-vs-all classifiers that require O(Kd) memory and inference cost, MACH needs only O(d log K) memory (d is the dimensionality) while requiring only O(K log K + d log K) operations for inference. MACH is a generic K-classification algorithm, with provably theoretical guar…

    Submitted 9 October, 2018; originally announced October 2018.