
Showing 1–12 of 12 results for author: Emer, J S

  1. arXiv:2407.01742  [pdf, other]

    cs.PL

    The Continuous Tensor Abstraction: Where Indices are Real

    Authors: Jaeyeon Won, Willow Ahrens, Joel S. Emer, Saman Amarasinghe

    Abstract: This paper introduces the continuous tensor abstraction, allowing indices to take real-number values (e.g., A[3.14]), and provides a continuous loop construct that iterates over the infinitely large set of real numbers. This paper expands the existing tensor abstraction to include continuous tensors that exhibit a piecewise-constant property, enabling the transformation of an infinite amount of co…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.10491  [pdf, other]

    cs.AR

    FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design

    Authors: Nandeeka Nayak, Xinrui Wu, Toluwanimi O. Odemuyiwa, Michael Pellauer, Joel S. Emer, Christopher W. Fletcher

    Abstract: Attention for transformers is a critical workload that has recently received significant "attention" as a target for custom acceleration. Yet, while prior work succeeds in reducing attention's memory-bandwidth requirements, it creates load imbalance between attention operators (resulting in severe compute under-utilization) and requires on-chip memory that scales with sequence length (which is exp…

    Submitted 25 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: 15 pages, 10 figures

  3. arXiv:2405.07266  [pdf, other]

    cs.ET cs.AR

    Architecture-Level Modeling of Photonic Deep Neural Network Accelerators

    Authors: Tanner Andrulis, Gohar Irfan Chaudhry, Vinith M. Suriyakumar, Joel S. Emer, Vivienne Sze

    Abstract: Photonics is a promising technology to accelerate Deep Neural Networks as it can use optical interconnects to reduce data movement energy and it enables low-energy, high-throughput optical-analog computations. To realize these benefits in a full system (accelerator + DRAM), designers must ensure that the benefits of using the electrical, optical, analog, and digital domains exceed the costs of c…

    Submitted 14 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

    Comments: Published at ISPASS 2024

  4. arXiv:2405.07259  [pdf, other]

    cs.AR

    CiMLoop: A Flexible, Accurate, and Fast Compute-In-Memory Modeling Tool

    Authors: Tanner Andrulis, Joel S. Emer, Vivienne Sze

    Abstract: Compute-In-Memory (CiM) is a promising solution to accelerate Deep Neural Networks (DNNs) as it can avoid energy-intensive DNN weight movement and use memory arrays to perform low-energy, high-density computations. These benefits have inspired research across the CiM stack, but CiM research often focuses on only one level of the stack (i.e., devices, circuits, architecture, workload, or mapping) o…

    Submitted 29 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

    Comments: Available at https://github.com/mit-emze/cimloop. Published in ISPASS 2024

  5. arXiv:2404.11591  [pdf, other]

    cs.DS

    The EDGE Language: Extended General Einsums for Graph Algorithms

    Authors: Toluwanimi O. Odemuyiwa, Joel S. Emer, John D. Owens

    Abstract: In this work, we propose a unified abstraction for graph algorithms: the Extended General Einsums language, or EDGE. The EDGE language expresses graph algorithms in the language of tensor algebra, providing a rigorous, succinct, and expressive mathematical framework. EDGE leverages two ideas: (1) the well-known foundations provided by the graph-matrix duality, where a graph is simply a 2D tensor,…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 79 pages, 14 figures

  6. arXiv:2404.06553  [pdf, other]

    cs.AR

    Modeling Analog-Digital-Converter Energy and Area for Compute-In-Memory Accelerator Design

    Authors: Tanner Andrulis, Ruicong Chen, Hae-Seung Lee, Joel S. Emer, Vivienne Sze

    Abstract: Analog Compute-in-Memory (CiM) accelerators use analog-digital converters (ADCs) to read the analog values that they compute. ADCs can consume significant energy and area, so architecture-level ADC decisions such as ADC resolution or number of ADCs can significantly impact overall CiM accelerator energy and area. Therefore, modeling how architecture-level decisions affect ADC energy and area is cr…

    Submitted 14 May, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  7. Tailors: Accelerating Sparse Tensor Algebra by Overbooking Buffer Capacity

    Authors: Zi Yu Xue, Yannan Nellie Wu, Joel S. Emer, Vivienne Sze

    Abstract: Sparse tensor algebra is a challenging class of workloads to accelerate due to low arithmetic intensity and varying sparsity patterns. Prior sparse tensor algebra accelerators have explored tiling sparse data to increase exploitable data reuse and improve throughput, but typically allocate tile size in a given buffer for the worst-case data occupancy. This severely limits the utilization of availa…

    Submitted 26 June, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 17 pages, 13 figures, in MICRO 2023

    Journal ref: 56th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO '23), 2023

  8. arXiv:2309.04119  [pdf, other]

    cs.CR cs.AR

    Penetrating Shields: A Systematic Analysis of Memory Corruption Mitigations in the Spectre Era

    Authors: Weon Taek Na, Joel S. Emer, Mengjia Yan

    Abstract: This paper provides the first systematic analysis of a synergistic threat model encompassing memory corruption vulnerabilities and microarchitectural side-channel vulnerabilities. We study speculative shield bypass attacks that leverage speculative execution attacks to leak secrets that are critical to the security of memory corruption mitigations (i.e., the shields), and then use the leaked secre…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 14 pages

    ACM Class: K.6.5; D.4.6; C.2.0

  9. HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

    Authors: Yannan Nellie Wu, Po-An Tsai, Saurav Muralidharan, Angshuman Parashar, Vivienne Sze, Joel S. Emer

    Abstract: Due to complex interactions among various deep neural network (DNN) optimization techniques, modern DNNs can have weights and activations that are dense or sparse with diverse sparsity degrees. To offer a good trade-off between accuracy and hardware performance, an ideal DNN accelerator should have high flexibility to efficiently translate DNN sparsity into reductions in energy and/or latency with…

    Submitted 1 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to MICRO 2023

  10. RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required!

    Authors: Tanner Andrulis, Joel S. Emer, Vivienne Sze

    Abstract: Processing-In-Memory (PIM) accelerators have the potential to efficiently run Deep Neural Network (DNN) inference by reducing costly data movement and by using resistive RAM (ReRAM) for efficient analog compute. Unfortunately, overall PIM accelerator efficiency is limited by energy-intensive analog-to-digital converters (ADCs). Furthermore, existing accelerators that reduce ADC cost do so by chang…

    Submitted 16 April, 2023; originally announced April 2023.

    Comments: 16 pages; 15 figures; Accepted at ISCA 2023 (the International Symposium on Computer Architecture)

    ACM Class: C.1.3

  11. TeAAL: A Declarative Framework for Modeling Sparse Tensor Accelerators

    Authors: Nandeeka Nayak, Toluwanimi O. Odemuyiwa, Shubham Ugare, Christopher W. Fletcher, Michael Pellauer, Joel S. Emer

    Abstract: Over the past few years, the explosion in sparse tensor algebra workloads has led to a corresponding rise in domain-specific accelerators to service them. Due to the irregularity present in sparse tensors, these accelerators employ a wide variety of novel solutions to achieve good performance. At the same time, prior work on design-flexible sparse accelerator modeling does not express this full ra…

    Submitted 11 June, 2024; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: 17 pages, 13 figures

  12. arXiv:2205.05826  [pdf, other]

    cs.AR cs.CV cs.DC

    Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

    Authors: Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S. Emer

    Abstract: In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications (e.g., sparse neural networks). However, these proposals are single points in a large and diverse design space. The lack of systematic description and modeling support for these sparse tensor accelerators impedes hardware designers from efficient and effective design space exploration. T…

    Submitted 9 January, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: Update website link, update UOP format description