Showing 1–1 of 1 results for author: Dixon, M G

  1. arXiv:2308.03290  [pdf, other]

    cs.CV cs.LG

    FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search

    Authors: Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li

    Abstract: Quantization has become a mainstream compression technique for reducing model size, computational requirements, and energy consumption for modern deep neural networks (DNNs). With improved numerical support in recent hardware, including multiple variants of integer and floating point, mixed-precision quantization has become necessary to achieve high-quality results with low model cost. Prior mixed…

    Submitted 1 May, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Accepted to AutoML 2024