Skip to main content

Showing 1–2 of 2 results for author: Ruwase, O

Searching in archive stat. Search in all archives.
.
  1. arXiv:2312.08583  [pdf, other

    cs.CL stat.ML

    ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

    Authors: Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao

    Abstract: This study examines 4-bit quantization methods like GPTQ in large language models (LLMs), highlighting GPTQ's overfitting and limited enhancement in Zero-Shot tasks. While prior works merely focusing on zero-shot measurement, we extend task scope to more generative categories such as code generation and abstractive summarization, in which we found that INT4 quantization can significantly underperf… ▽ More

    Submitted 18 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

  2. arXiv:1910.02054  [pdf, other

    cs.LG cs.DC stat.ML

    ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

    Authors: Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He

    Abstract: Large deep learning models offer significant accuracy gains, but training billions to trillions of parameters is challenging. Existing solutions such as data and model parallelisms exhibit fundamental limitations to fit these models into limited device memory, while obtaining computation, communication and development efficiency. We develop a novel solution, Zero Redundancy Optimizer (ZeRO), to op… ▽ More

    Submitted 13 May, 2020; v1 submitted 4 October, 2019; originally announced October 2019.