Showing 1–7 of 7 results for author: Zadeh, A H

  1. arXiv:2204.13666  [pdf, other]

    cs.LG cs.AR

    Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

    Authors: Miloš Nikolić, Enrique Torres Sanchez, Jiahui Wang, Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Kareem Ibrahim, Andreas Moshovos

    Abstract: The transfer of tensors from/to memory during neural network training dominates time and energy. To improve energy efficiency and performance, research has been exploring ways to use narrower data representations. So far, these attempts relied on user-directed trial-and-error to achieve convergence. We present methods that relieve users from this responsibility. Our methods dynamically adjust the…

    Submitted 16 May, 2024; v1 submitted 28 April, 2022; originally announced April 2022.

  2. Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models

    Authors: Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Andreas Moshovos

    Abstract: Increasingly larger and better Transformer models keep advancing state-of-the-art accuracy and capability for Natural Language Processing applications. These models demand more computational power, storage, and energy. Mokey reduces the footprint of state-of-the-art 32-bit or 16-bit floating-point transformer models by quantizing all values to 4-bit indexes into dictionaries of representative 16-b…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted at the 49th IEEE/ACM International Symposium on Computer Architecture (ISCA '22)

  3. arXiv:2201.10093  [pdf, other]

    stat.ME math.PR

    Some applications of phase-type distributions in recurrent events

    Authors: Roufeh Asghari, Amin Hassan Zadeh

    Abstract: In this paper, recurrent events that can occur more than once over the follow-up time are modeled by phase-type distributions. We use a finite-state continuous-time Markov process with multiple states for patients with recurrent events. The number of recurrences until time $t$, the time spent in each state, and the time until death are of importance. The time till death is assumed to have…

    Submitted 24 January, 2022; originally announced January 2022.

  4. arXiv:2010.08065  [pdf, other]

    cs.AR cs.AI

    FPRaker: A Processing Element For Accelerating Neural Network Training

    Authors: Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos

    Abstract: We present FPRaker, a processing element for composing training accelerators. FPRaker processes several floating-point multiply-accumulation operations concurrently and accumulates their result into a higher precision accumulator. FPRaker boosts performance and energy efficiency during training by taking advantage of the values that naturally appear during training. Specifically, it processes the…

    Submitted 15 October, 2020; originally announced October 2020.

  5. TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

    Authors: Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Omar Mohamed Awad, Gennady Pekhimenko, Jorge Albericio, Andreas Moshovos

    Abstract: TensorDash is a hardware-level technique for enabling data-parallel MAC units to take advantage of sparsity in their input operand streams. When used to compose a hardware accelerator for deep learning, TensorDash can speed up the training process while also increasing energy efficiency. TensorDash combines a low-cost, sparse input operand interconnect comprising an 8-input multiplexer per multipli…

    Submitted 1 September, 2020; originally announced September 2020.

  6. GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference

    Authors: Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos

    Abstract: Attention-based models have demonstrated remarkable success in various natural language understanding tasks. However, efficient execution remains a challenge for these models, which are memory-bound due to their massive number of parameters. We present GOBO, a model quantization technique that compresses the vast majority (typically 99.9%) of the 32-bit floating-point parameters of state-of-the-art…

    Submitted 26 September, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

    Comments: Accepted at the 53rd IEEE/ACM International Symposium on Microarchitecture - MICRO 2020

  7. arXiv:1806.10111  [pdf, ps, other]

    stat.AP

    Modelling Joint Lifetimes of Couples by Using Bivariate Phase-type Distributions

    Authors: Amin Hassan Zadeh, Soroush Amirhashchi

    Abstract: Many insurance products and pension plans provide benefits that are related to couples and are thus influenced by the survival status of two lives. Some studies show that the future lifetimes of couples are correlated. Three reasons are offered to support this: (1) catastrophic events that affect both lives, (2) the impact of spousal death and (3) the long-term association due to common life sty…

    Submitted 26 June, 2018; originally announced June 2018.

    Comments: 20 pages