Showing 1–1 of 1 results for author: Kligys, S
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Authors: Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko
Abstract:
The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy after quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
Submitted 15 December, 2017; originally announced December 2017.
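
The abstract names integer-only inference without spelling out the arithmetic. As a rough illustration only, the sketch below assumes the common affine mapping r ≈ S(q − Z), with a real-valued scale S and an integer zero point Z, and 8-bit unsigned storage. All function names here are hypothetical and are not taken from the paper or from any library. The final rescale is kept in floating point for readability; a fully integer pipeline would replace it with a fixed-point multiply.

```python
def choose_params(lo, hi, qmin=0, qmax=255):
    """Pick (scale, zero_point) so the real range [lo, hi] maps onto [qmin, qmax]."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)       # keep real 0.0 exactly representable
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero range
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map real values to clamped integers: q = clamp(round(r / S) + Z)."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate real values: r = S * (q - Z)."""
    return [scale * (q - zero_point) for q in q_values]


def int_dot(qa, za, qb, zb):
    """Dot product accumulated purely in integers (zero points subtracted first)."""
    return sum((a - za) * (b - zb) for a, b in zip(qa, qb))


if __name__ == "__main__":
    weights = [-1.0, 0.5, 0.25]       # illustrative values, not from the paper
    activations = [0.0, 2.0, 4.0]

    sw, zw = choose_params(min(weights), max(weights))
    sa, za = choose_params(min(activations), max(activations))
    qw = quantize(weights, sw, zw)
    qa = quantize(activations, sa, za)

    acc = int_dot(qw, zw, qa, za)     # integer accumulator
    approx = sw * sa * acc            # single rescale back to a real value
    exact = sum(w * a for w, a in zip(weights, activations))
    print("exact:", exact, "quantized estimate:", approx)
```

The point of the sketch is that the inner loop touches only integers; the scales enter once, after accumulation. The quantized estimate should land close to the exact dot product, with an error bounded by the rounding step of each scale.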