Showing 1–7 of 7 results for author: Haj-Ali, A

  1. arXiv:2006.06762  [pdf, other]

    cs.LG cs.NE cs.PF cs.PL stat.ML

    Ansor: Generating High-Performance Tensor Programs for Deep Learning

    Authors: Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica

    Abstract: High-performance tensor programs are crucial to guarantee efficient execution of deep neural networks. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously challenging. Currently, deep learning systems rely on vendor-provided kernel libraries or various search strategies to get performant tensor programs. These approaches either require…

    Submitted 15 October, 2023; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: OSDI 2020

  2. arXiv:2005.13685  [pdf, other]

    cs.DC cs.AI cs.LG cs.PF cs.PL

    ProTuner: Tuning Programs with Monte Carlo Tree Search

    Authors: Ameer Haj-Ali, Hasan Genc, Qijing Huang, William Moses, John Wawrzynek, Krste Asanović, Ion Stoica

    Abstract: We explore applying the Monte Carlo Tree Search (MCTS) algorithm in a notoriously difficult task: tuning programs for high-performance deep learning and image processing. We build our framework on top of Halide and show that MCTS can outperform the state-of-the-art beam-search algorithm. Unlike beam search, which is guided by greedy intermediate performance comparisons between partial and less mea…

    Submitted 27 May, 2020; originally announced May 2020.

  3. arXiv:2003.00671  [pdf, other]

    cs.DC cs.LG cs.PL

    AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning

    Authors: Qijing Huang, Ameer Haj-Ali, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek

    Abstract: The performance of the code a compiler generates depends on the order in which it applies the optimization passes. Choosing a good order--often referred to as the phase-ordering problem--is an NP-hard problem. As a result, existing solutions rely on a variety of heuristics. In this paper, we evaluate a new technique to address the phase-ordering problem: deep reinforcement learning. To this end, w…

    Submitted 4 March, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: text overlap with arXiv:1901.04615

  4. arXiv:1911.09925  [pdf, other]

    cs.DC cs.AR cs.LG cs.PF

    Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration

    Authors: Hasan Genc, Seah Kim, Alon Amid, Ameer Haj-Ali, Vighnesh Iyer, Pranav Prakash, Jerry Zhao, Daniel Grubb, Harrison Liew, Howard Mao, Albert Ou, Colin Schmidt, Samuel Steffl, John Wright, Ion Stoica, Jonathan Ragan-Kelley, Krste Asanovic, Borivoje Nikolic, Yakun Sophia Shao

    Abstract: DNN accelerators are often developed and evaluated in isolation without considering the cross-stack, system-level effects in real-world environments. This makes it difficult to appreciate the impact of System-on-Chip (SoC) resource contention, OS overheads, and programming-stack inefficiencies on overall performance/energy-efficiency. To address this challenge, we present Gemmini, an open-source*,…

    Submitted 9 July, 2021; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: To appear at the 58th IEEE/ACM Design Automation Conference (DAC), December 2021, San Francisco, CA, USA

  5. arXiv:1909.13639  [pdf, other]

    cs.DC cs.PF cs.PL

    NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning

    Authors: Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Sophia Shao, Krste Asanovic, Ion Stoica

    Abstract: One of the key challenges arising when compilers vectorize loops for today's SIMD-compatible architectures is to decide if vectorization or interleaving is beneficial. Then, the compiler has to determine how many instructions to pack together and how many loop iterations to interleave. Compilers are designed today to use fixed-cost models that are based on heuristics to make vectorization decision…

    Submitted 4 January, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

  6. arXiv:1908.01275  [pdf, other]

    cs.LG cs.AI eess.SY

    A View on Deep Reinforcement Learning in System Optimization

    Authors: Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Joseph Gonzalez, Krste Asanovic, Ion Stoica

    Abstract: Many real-world systems problems require reasoning about the long-term consequences of actions taken to configure and manage the system. These problems, with delayed and often sequentially aggregated reward, are often inherently reinforcement learning problems and present the opportunity to leverage the recent substantial advances in deep reinforcement learning. However, in some cases, it is not cl…

    Submitted 4 September, 2019; v1 submitted 4 August, 2019; originally announced August 2019.

  7. arXiv:1901.04615  [pdf, other]

    cs.PL cs.LG

    AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning

    Authors: Ameer Haj-Ali, Qijing Huang, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek

    Abstract: The performance of the code generated by a compiler depends on the order in which the optimization passes are applied. In high-level synthesis, the quality of the generated circuit relates directly to the code generated by the front-end compiler. Choosing a good order--often referred to as the phase-ordering problem--is an NP-hard problem. In this paper, we evaluate a new technique to address the…

    Submitted 3 April, 2019; v1 submitted 14 January, 2019; originally announced January 2019.