Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices

Saeed, Aaqib

Abstract: Deep neural networks have significantly improved performance on a range of tasks, but at the cost of growing computational demands, which makes deployment on low-resource devices (with limited memory and battery power) infeasible. Binary neural networks (BNNs) address this issue to an extent, offering extreme compression and speed-up gains compared to real-valued models. We propose a simple but effective method to accelerate inference by unifying BNNs with an early-exiting strategy. Our approach attaches output layers to different intermediate layers and lets simple instances exit early based on a decision threshold, so that the entire binary model need not be executed. We extensively evaluate our method on three audio classification tasks and across four BNN architectures. It demonstrates favorable quality-efficiency trade-offs while remaining controllable through an entropy-based threshold specified by the system user. It also yields better speed-ups (latency less than 6 ms) with a single model based on existing BNN architectures, without retraining for different efficiency levels. Finally, it provides a straightforward way to estimate sample difficulty and to better understand the uncertainty around certain classes within the dataset.
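
The entropy-thresholded exit rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `early_exit_predict`, `blocks`, and `exit_heads` are hypothetical, the blocks are stand-in callables rather than binarized layers, and the use of unnormalized Shannon entropy (in nats) over a softmax output is an assumption about how the threshold is applied.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit_predict(
    x: Vector,
    blocks: Sequence[Callable[[Vector], Vector]],      # hypothetical backbone stages
    exit_heads: Sequence[Callable[[Vector], Vector]],  # hypothetical per-stage classifiers
    threshold: float,                                  # assumed: user-chosen entropy bound
) -> Tuple[int, int]:
    """Return (predicted class, index of the exit that produced it).

    After each block, the attached head produces logits. If the entropy of
    the resulting distribution is at or below `threshold`, inference stops
    there; the last head always answers if no earlier exit was confident.
    """
    if not blocks or len(blocks) != len(exit_heads):
        raise ValueError("need one exit head per block, and at least one block")
    h = x
    last = len(blocks) - 1
    for i, (block, head) in enumerate(zip(blocks, exit_heads)):
        h = block(h)
        probs = softmax(head(h))
        if i == last or entropy(probs) <= threshold:
            return probs.index(max(probs)), i
    raise AssertionError("unreachable: the last exit always returns")
```

In this sketch a lower threshold sends more inputs through the deeper blocks, while a higher threshold lets more inputs leave at the first head, which is the single knob the abstract refers to for trading quality against latency without retraining.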

Comments:	Interspeech 2022
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2206.09029 [cs.LG]
	(or arXiv:2206.09029v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.09029

