BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge

Hosny, Abdelrahman; Neseem, Marina; Reda, Sherief

doi:10.1145/3453142.3491290

arXiv:2110.15362 (cs)

[Submitted on 29 Oct 2021]

Abstract: Training on the edge enables neural networks to learn continuously from new data after deployment on memory-constrained edge devices. Previous work is mostly concerned with reducing the number of model parameters, which benefits only inference. However, the memory footprint of activations is the main bottleneck for training on the edge. Existing incremental training methods fine-tune only the last few layers, sacrificing the accuracy gains of re-training the whole model. In this work, we investigate the memory footprint of training deep learning models, and use our observations to propose BitTrain. In BitTrain, we exploit activation sparsity and propose a novel bitmap compression technique that reduces the memory footprint during training. We save the activations in our proposed bitmap compression format during the forward pass of training, and restore them during the backward pass for the optimizer computations. The proposed method can be integrated seamlessly into the computation graph of modern deep learning frameworks. Our implementation is safe by construction, and has no negative impact on the accuracy of model training. Experimental results show up to a 34% reduction in the memory footprint at a sparsity level of 50%. Further pruning during training results in more than 70% sparsity, which can lead to up to a 56% reduction in memory footprint. BitTrain advances the efforts towards bringing more machine learning capabilities to edge devices. Our source code is available at this https URL.
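
To make the idea of storing sparse activations in a bitmap format concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation: the function names `compress`, `decompress`, and `compressed_bytes` are hypothetical, and the layout (one presence bit per element plus a packed array of the nonzero values, stored as 32-bit floats) is an assumption chosen for illustration.

```python
from array import array


def compress(values):
    """Pack a flat list of floats into (bitmap, nonzeros, length).

    Assumed layout: bit i of the bitmap is 1 when values[i] is nonzero;
    the nonzero values are kept in order as 32-bit floats.
    """
    bitmap = bytearray((len(values) + 7) // 8)
    nonzeros = array("f")
    for i, v in enumerate(values):
        if v != 0.0:
            bitmap[i >> 3] |= 1 << (i & 7)
            nonzeros.append(v)
    return bytes(bitmap), nonzeros, len(values)


def decompress(bitmap, nonzeros, length):
    """Rebuild the dense activation array from the bitmap format."""
    out = array("f", [0.0]) * length  # zero-filled dense buffer
    it = iter(nonzeros)
    for i in range(length):
        if bitmap[i >> 3] & (1 << (i & 7)):
            out[i] = next(it)
    return out


def compressed_bytes(bitmap, nonzeros):
    """Bytes used by the compressed form (bitmap plus nonzero values)."""
    return len(bitmap) + nonzeros.itemsize * len(nonzeros)


# Example: a ReLU-style activation vector where 5 of 8 entries are zero.
acts = [0.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.25, 0.0]
bitmap, nonzeros, length = compress(acts)      # forward pass: save compressed
restored = decompress(bitmap, nonzeros, length)  # backward pass: restore
assert list(restored) == acts
```

In this toy example the dense vector takes 8 × 4 = 32 bytes, while the compressed form takes 1 bitmap byte plus 3 × 4 bytes of nonzero values. Under this assumed layout the bitmap costs one bit per element regardless of sparsity, so the saving grows with the fraction of zeros; the figures reported in the abstract refer to the overall training memory footprint, not to this per-tensor ratio.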

Comments:	12 pages, 13 figures, to appear in the proceedings of The Sixth ACM/IEEE Symposium on Edge Computing (SEC 2021)
Subjects:	Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2110.15362 [cs.LG]
	(or arXiv:2110.15362v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2110.15362
Related DOI:	https://doi.org/10.1145/3453142.3491290

Submission history

From: Marina Neseem
[v1] Fri, 29 Oct 2021 16:30:57 UTC (719 KB)
