DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference

Khabbazan, Bahareh; Riera, Marc; González, Antonio

arXiv:2306.16430 (cs)

[Submitted on 28 Jun 2023 (v1), last revised 22 Nov 2023 (this version, v2)]

Abstract: Quantization is commonly used in Deep Neural Networks (DNNs) to reduce storage and computational complexity by decreasing the arithmetic precision of activations and weights, collectively referred to as tensors. Efficient hardware architectures employ linear quantization to enable the deployment of recent DNNs on embedded systems and mobile devices. However, linear uniform quantization usually cannot reduce the numerical precision below 8 bits without sacrificing model accuracy. This accuracy loss arises because tensors do not follow uniform distributions. In this paper, we show that a significant number of tensors fit an exponential distribution. We then propose DNA-TEQ, which exponentially quantizes DNN tensors with an adaptive scheme that achieves the best trade-off between numerical precision and accuracy loss. Experimental results show that DNA-TEQ provides a much lower quantization bit-width than previous proposals, resulting in an average compression ratio of 40% over the linear INT8 baseline, with negligible accuracy loss and without retraining the DNNs. In addition, DNA-TEQ leads the way in performing dot-product operations in the exponential domain, which saves 66% of energy consumption on average for a set of widely used DNNs.
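The abstract does not state the quantization formula, so the following is only a generic sketch of exponential (logarithmic) quantization, not the DNA-TEQ algorithm itself. It assumes each value is approximated as sign * base**exponent with a small integer exponent, and that inputs are normalized so |x| <= 1. The function names, the `base` and `bits` parameters, and the clamping range are all hypothetical choices made for illustration. The sketch also shows why a dot product can be carried out in the exponential domain: multiplying two quantized values reduces to adding their integer exponents.

```python
import math

def exp_quantize(x, base, bits):
    """Return (sign, exponent) such that x is approximated by sign * base**exponent.

    Assumption for this sketch: |x| <= 1, so exponents lie in [-(2**bits - 1), 0].
    Zero is encoded with sign 0.
    """
    if x == 0.0:
        return 0, 0
    sign = 1 if x > 0 else -1
    # Round log_base(|x|) to the nearest integer exponent.
    exponent = round(math.log(abs(x), base))
    # Clamp to the range representable with the given number of bits.
    lowest = -(2 ** bits - 1)
    exponent = max(lowest, min(0, exponent))
    return sign, exponent

def exp_dequantize(sign, exponent, base):
    """Map a (sign, exponent) pair back to a real value."""
    return sign * base ** exponent

def exp_dot(qa, qw, base):
    """Dot product of two exponentially quantized vectors.

    Each multiplication becomes an integer addition of exponents,
    because base**ea * base**ew == base**(ea + ew).
    """
    total = 0.0
    for (sa, ea), (sw, ew) in zip(qa, qw):
        total += sa * sw * base ** (ea + ew)
    return total

# Usage with base 2 and 4-bit exponents (hypothetical settings).
activations = [0.5, 0.25]
weights = [0.5, -0.125]
qa = [exp_quantize(a, 2.0, 4) for a in activations]
qw = [exp_quantize(w, 2.0, 4) for w in weights]
result = exp_dot(qa, qw, 2.0)
```

In this toy case the inputs are exact powers of two, so the quantized dot product matches the floating-point one. For general inputs the result depends on how well the chosen base fits the tensor's distribution, which is the kind of trade-off an adaptive scheme would have to tune.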

Comments:	10 pages, 8 figures, 5 tables
Subjects:	Machine Learning (cs.LG); Hardware Architecture (cs.AR)
Cite as:	arXiv:2306.16430 [cs.LG]
	(or arXiv:2306.16430v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.16430

Submission history

From: Bahareh Khabbazan
[v1] Wed, 28 Jun 2023 15:21:27 UTC (869 KB)
[v2] Wed, 22 Nov 2023 15:39:14 UTC (1,245 KB)
