Showing 1–7 of 7 results for author: Borzunov, A

  1. arXiv:2312.08361

    cs.LG cs.DC

    Distributed Inference and Fine-tuning of Large Language Models Over The Internet

    Authors: Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, Colin Raffel

    Abstract: Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, making them inaccessible to most researchers. In this work, we investigate methods for cost-efficient inference and fine-tuning of LLMs, comparing local and distributed strategie…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted to Conference on Neural Information Processing Systems (NeurIPS) 2023. 20 pages, 3 figures

  2. arXiv:2306.03078

    cs.CL cs.LG

    SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

    Authors: Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh

    Abstract: Recent advances in large language model (LLM) pretraining have led to high-quality LLMs with impressive abilities. By compressing such LLMs via quantization to 3-4 bits per parameter, they can fit into memory-limited devices such as laptops and mobile phones, enabling personalized use. However, quantization down to 3-4 bits per parameter usually leads to moderate-to-high accuracy losses, especiall…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Extended preprint

  3. arXiv:2301.11913

    cs.DC cs.LG

    SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

    Authors: Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov

    Abstract: Many deep learning applications benefit from using large models with billions of parameters. Training these models is notoriously expensive due to the need for specialized HPC clusters. In this work, we consider alternative setups for training large models: using cheap "preemptible" instances or pooling existing resources from multiple regions. We analyze the performance of existing model-parallel…

    Submitted 29 June, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted to International Conference on Machine Learning (ICML) 2023. 25 pages, 8 figures

  4. arXiv:2209.01188

    cs.LG cs.DC

    Petals: Collaborative Inference and Fine-tuning of Large Models

    Authors: Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel

    Abstract: Many NLP tasks benefit from using large language models (LLMs) that often have more than 100 billion parameters. With the release of BLOOM-176B and OPT-175B, everyone can download pretrained models of this scale. Still, using these models requires high-end hardware unavailable to many researchers. In some cases, LLMs can be used more affordably via RAM offloading or hosted APIs. However, these tec…

    Submitted 2 March, 2023; v1 submitted 2 September, 2022; originally announced September 2022.

    Comments: 10 pages, 4 figures. Version 2 updates the benchmarks and the description of the chat application. Source code and docs: https://petals.ml

  5. arXiv:2207.03481

    cs.LG cs.DC

    Training Transformers Together

    Authors: Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf

    Abstract: The infrastructure necessary for training state-of-the-art models is becoming overly expensive, which makes training such models affordable only to large corporations and institutions. Recent work proposes several methods for training such models collaboratively, i.e., by pooling together hardware from many independent parties and training a shared model over the Internet. In this demonstration, w…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: Accepted to NeurIPS 2021 Demonstration Track. 10 pages, 2 figures. Link: https://training-transformers-together.github.io

  6. arXiv:2106.11257

    cs.LG cs.DC math.OC

    Secure Distributed Training at Scale

    Authors: Eduard Gorbunov, Alexander Borzunov, Michael Diskin, Max Ryabinin

    Abstract: Many areas of deep learning benefit from using increasingly larger neural networks trained on public data, as is the case for pre-trained models for NLP and computer vision. Training such models requires a lot of computational resources (e.g., HPC clusters) that are not available to small research groups and independent researchers. One way to address it is for several smaller groups to pool their…

    Submitted 1 January, 2023; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted to International Conference on Machine Learning (ICML) 2022. 61 pages, 10 figures. Version 4 fixes inaccuracies in the proofs of Lemmas E.2 and E.4. Code: https://github.com/yandex-research/btard

  7. arXiv:2106.10207

    cs.LG cs.DC

    Distributed Deep Learning in Open Collaborations

    Authors: Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

    Abstract: Modern deep learning applications require increasingly more compute to train state-of-the-art models. To address this demand, large corporations and institutions use dedicated High-Performance Computing clusters, whose construction and maintenance are both environmentally costly and well beyond the budget of most organizations. As a result, some research directions become the exclusive domain of a…

    Submitted 8 November, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: Accepted to Conference on Neural Information Processing Systems (NeurIPS) 2021. 32 pages, 10 figures. Code: https://github.com/yandex-research/DeDLOC