Showing 1–4 of 4 results for author: Jung, V J B

  1. arXiv:2406.09804  [pdf, other]

    cs.AR

    Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms

    Authors: Steven Colleman, Arne Symons, Victor J. B. Jung, Marian Verhelst

    Abstract: The impact of transformer networks is booming, yet, they come with significant computational complexity. It is therefore essential to understand how to optimally map and execute these networks on modern neural processor hardware. So far, literature on transformer scheduling optimization has been focusing on deployment on GPU and specific ASICs. This work enables extensive hardware/mapping explorat…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to ISQED2024

  2. arXiv:2404.02945  [pdf, other]

    cs.LG cs.AI cs.DC cs.PF

    Optimizing the Deployment of Tiny Transformers on Low-Power MCUs

    Authors: Victor J. B. Jung, Alessio Burrello, Moritz Scherer, Francesco Conti, Luca Benini

    Abstract: Transformer networks are rapidly becoming SotA in many fields, such as NLP and CV. Similarly to CNN, there is a strong push for deploying Transformer models at the extreme edge, ultimately fitting the tiny power budget and memory footprint of MCUs. However, the early approaches in this direction are mostly ad-hoc, platform, and model-specific. This work aims to enable and optimize the flexible, mu…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Pre-print manuscript submitted for review to the IEEE Transactions on Computers

  3. arXiv:2307.03493  [pdf, other]

    cs.AR cs.LG

    ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers

    Authors: Gamze İslamoğlu, Moritz Scherer, Gianna Paulin, Tim Fischer, Victor J. B. Jung, Angelo Garofalo, Luca Benini

    Abstract: Transformer networks have emerged as the state-of-the-art approach for natural language processing tasks and are gaining popularity in other domains such as computer vision and audio processing. However, the efficient hardware acceleration of transformer models poses new challenges due to their high arithmetic intensities, large memory requirements, and complex dataflow dependencies. In this work,…

    Submitted 10 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted for publication at the 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)

  4. arXiv:2304.12931  [pdf, other]

    cs.AR cs.AI

    SALSA: Simulated Annealing based Loop-Ordering Scheduler for DNN Accelerators

    Authors: Victor J. B. Jung, Arne Symons, Linyan Mei, Marian Verhelst, Luca Benini

    Abstract: To meet the growing need for computational power for DNNs, multiple specialized hardware architectures have been proposed. Each DNN layer should be mapped onto the hardware with the most efficient schedule, however, SotA schedulers struggle to consistently provide optimum schedules in a reasonable time across all DNN-HW combinations. This paper proposes SALSA, a fast dual-engine scheduler to gen…

    Submitted 14 June, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: 5 pages, 6 figures, open-source at https://github.com/ZigZag-Project/zigzag