Showing 1–10 of 10 results for author: Abellan, J L

  1. arXiv:2404.15510  [pdf, other]

    cs.AR cs.DC cs.LG cs.NE

    NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator

    Authors: Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, Jose L. Abellan, Ajay Joshi, John Kim, David Kaeli

    Abstract: Graph Neural Networks (GNNs) are emerging as a formidable tool for processing non-euclidean data across various domains, ranging from social network analysis to bioinformatics. Despite their effectiveness, their adoption has not been pervasive because of scalability challenges associated with large-scale graph datasets, particularly when leveraging message passing. To tackle these challenges, we…

    Submitted 26 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Visit https://neurachip.us for WebGUI based simulations

  2. AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators

    Authors: Nicolas Bohm Agostini, Jude Haris, Perry Gibson, Malith Jayaweera, Norm Rubin, Antonino Tumeo, José L. Abellán, José Cano, David Kaeli

    Abstract: This paper addresses the need for automatic and efficient generation of host driver code for arbitrary custom AXI-based accelerators targeting linear algebra algorithms, an important workload in various applications, including machine learning and scientific computing. While existing tools have focused on automating accelerator prototyping, little attention has been paid to the host-accelerator in…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 13 pages, 17 figures, to appear in CGO2024

    ACM Class: D.3.3

  3. GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

    Authors: Kaustubh Shivdikar, Yuhui Bao, Rashmi Agrawal, Michael Shen, Gilbert Jonatan, Evelio Mora, Alexander Ingare, Neal Livesay, José L. Abellán, John Kim, Ajay Joshi, David Kaeli

    Abstract: Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its promise of strong data privacy and security guarantees, FHE introduces a slowdown of up to five orders of magnitude as compared to the same computatio…

    Submitted 19 September, 2023; originally announced September 2023.

  4. arXiv:2301.10852  [pdf, other]

    cs.AR

    Flexagon: A Multi-Dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN Processing

    Authors: Francisco Muñoz-Martínez, Raveesh Garg, José L. Abellán, Michael Pellauer, Manuel E. Acacio, Tushar Krishna

    Abstract: Sparsity is a growing trend in modern DNN models. Existing Sparse-Sparse Matrix Multiplication (SpMSpM) accelerators are tailored to a particular SpMSpM dataflow (i.e., Inner Product, Outer Product or Gustavsons), that determines their overall efficiency. We demonstrate that this static decision inherently results in a suboptimal dynamic solution. This is because different SpMSpM kernels show vary…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: To appear on ASPLOS 2023

  5. arXiv:2201.12027  [pdf, other]

    cs.AR cs.LG cs.PF

    Puppeteer: A Random Forest-based Manager for Hardware Prefetchers across the Memory Hierarchy

    Authors: Furkan Eris, Marcia S. Louis, Kubra Eris, Jose L. Abellan, Ajay Joshi

    Abstract: Over the years, processor throughput has steadily increased. However, the memory throughput has not increased at the same rate, which has led to the memory wall problem in turn increasing the gap between effective and theoretical peak processor performance. To cope with this, there has been an abundance of work in the area of data/instruction prefetcher designs. Broadly, prefetchers predict future…

    Submitted 28 January, 2022; originally announced January 2022.

  6. arXiv:2103.07977  [pdf, other]

    cs.DC cs.AR

    Understanding the Design-Space of Sparse/Dense Multiphase GNN dataflows on Spatial Accelerators

    Authors: Raveesh Garg, Eric Qin, Francisco Muñoz-Martínez, Robert Guirado, Akshay Jain, Sergi Abadal, José L. Abellán, Manuel E. Acacio, Eduard Alarcón, Sivasankaran Rajamanickam, Tushar Krishna

    Abstract: Graph Neural Networks (GNNs) have garnered a lot of recent interest because of their success in learning representations from graph-structured data across several critical applications in cloud and HPC. Owing to their unique compute and memory characteristics that come from an interplay between dense and sparse phases of computations, the emergence of reconfigurable dataflow (aka spatial) accelera…

    Submitted 6 March, 2022; v1 submitted 14 March, 2021; originally announced March 2021.

    Comments: Accepted for publication at the 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022)

  7. arXiv:2008.02300  [pdf, other]

    cs.AR

    MGPU-TSM: A Multi-GPU System with Truly Shared Memory

    Authors: Saiful A. Mojumder, Yifan Sun, Leila Delshadtehrani, Yenai Ma, Trinayan Baruah, José L. Abellán, John Kim, David Kaeli, Ajay Joshi

    Abstract: The sizes of GPU applications are rapidly growing. They are exhausting the compute and memory resources of a single GPU, and are demanding the move to multiple GPUs. However, the performance of these applications scales sub-linearly with GPU count because of the overhead of data movement across multiple GPUs. Moreover, a lack of hardware support for coherency exacerbates the problem because a prog…

    Submitted 8 August, 2020; v1 submitted 5 August, 2020; originally announced August 2020.

    Comments: 4 pages, 3 figures

  8. arXiv:2007.04292  [pdf, other]

    cs.AR

    HALCONE : A Hardware-Level Timestamp-based Cache Coherence Scheme for Multi-GPU systems

    Authors: Saiful A. Mojumder, Yifan Sun, Leila Delshadtehrani, Yenai Ma, Trinayan Baruah, José L. Abellán, John Kim, David Kaeli, Ajay Joshi

    Abstract: While multi-GPU (MGPU) systems are extremely popular for compute-intensive workloads, several inefficiencies in the memory hierarchy and data movement result in a waste of GPU resources and difficulties in programming MGPU systems. First, due to the lack of hardware-level coherence, the MGPU programming model requires the programmer to replicate and repeatedly transfer data between the GPUs' memor…

    Submitted 8 July, 2020; originally announced July 2020.

    Comments: 13 pages, 9 figures

  9. arXiv:2006.07137  [pdf, other]

    eess.SP cs.AR cs.LG

    STONNE: A Detailed Architectural Simulator for Flexible Neural Network Accelerators

    Authors: Francisco Muñoz-Martínez, José L. Abellán, Manuel E. Acacio, Tushar Krishna

    Abstract: The design of specialized architectures for accelerating the inference procedure of Deep Neural Networks (DNNs) is a booming area of research nowadays. First-generation rigid proposals have been rapidly replaced by more advanced flexible accelerator architectures able to efficiently support a variety of layer types and dimensions. As the complexity of the designs grows, it is more and more appeali…

    Submitted 10 June, 2020; originally announced June 2020.

  10. arXiv:1811.02884  [pdf, other]

    cs.DC cs.AR

    MGSim + MGMark: A Framework for Multi-GPU System Research

    Authors: Yifan Sun, Trinayan Baruah, Saiful A. Mojumder, Shi Dong, Rafael Ubal, Xiang Gong, Shane Treadway, Yuhui Bao, Vincent Zhao, José L. Abellán, John Kim, Ajay Joshi, David Kaeli

    Abstract: The rapidly growing popularity and scale of data-parallel workloads demand a corresponding increase in raw computational power of GPUs (Graphics Processing Units). As single-GPU systems struggle to satisfy the performance demands, multi-GPU systems have begun to dominate the high-performance computing world. The advent of such systems raises a number of design challenges, including the GPU microar…

    Submitted 13 November, 2018; v1 submitted 15 October, 2018; originally announced November 2018.

    Comments: Updated typo