Showing 1–7 of 7 results for author: Lyubomirsky, S

  1. FPGA Technology Mapping Using Sketch-Guided Program Synthesis

    Authors: Gus Henry Smith, Ben Kushigian, Vishal Canumalla, Andrew Cheung, Steven Lyubomirsky, Sorawee Porncharoenwase, René Just, Gilbert Louis Bernstein, Zachary Tatlock

    Abstract: FPGA technology mapping is the process of implementing a hardware design expressed in high-level HDL (hardware design language) code using the low-level, architecture-specific primitives of the target FPGA. As FPGAs become increasingly heterogeneous, achieving high performance requires hardware synthesis tools that better support mapping to complex, highly configurable primitives like digital sign…

    Submitted 29 January, 2024; originally announced January 2024.

  2. arXiv:2311.02103  [pdf, other]

    cs.LG cs.AI cs.PL

    Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

    Authors: Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen

    Abstract: Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven demand for deploying them to a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces first-class symbolic shape…

    Submitted 1 November, 2023; originally announced November 2023.

  3. arXiv:2203.00218  [pdf, other]

    cs.AR cs.PL

    Application-Level Validation of Accelerator Designs Using a Formal Software/Hardware Interface

    Authors: Bo-Yuan Huang, Steven Lyubomirsky, Yi Li, Mike He, Gus Henry Smith, Thierry Tambe, Akash Gaonkar, Vishal Canumalla, Andrew Cheung, Gu-Yeon Wei, Aarti Gupta, Zachary Tatlock, Sharad Malik

    Abstract: Ideally, accelerator development should be as easy as software development. Several recent design languages/tools are working toward this goal, but actually testing early designs on real applications end-to-end remains prohibitively difficult due to the costs of building specialized compiler and simulator support. We propose a new first-in-class, mostly automated methodology termed "3LA" to enable…

    Submitted 22 August, 2023; v1 submitted 28 February, 2022; originally announced March 2022.

  4. Pure Tensor Program Rewriting via Access Patterns (Representation Pearl)

    Authors: Gus Henry Smith, Andrew Liu, Steven Lyubomirsky, Scott Davidson, Joseph McMahan, Michael Taylor, Luis Ceze, Zachary Tatlock

    Abstract: Tensor kernels in machine learning (ML) often correspond to pure mathematical expressions, making term rewriting an attractive strategy for optimization and mapping to specialized hardware accelerators. However, existing ML intermediate representations (IRs) tend to either be \textit{pure but high-level}, making low-level rewrites to hardware targets inexpressible, or \textit{low-level but impure}…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: To be published at MAPS 2021

  5. arXiv:2006.09616  [pdf, other]

    cs.LG cs.PL stat.ML

    Dynamic Tensor Rematerialization

    Authors: Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock

    Abstract: Checkpointing enables the training of deep learning models under restricted memory budgets by freeing intermediate activations from memory and recomputing them on demand. Current checkpointing techniques statically plan these recomputations offline and assume static computation graphs. We demonstrate that a simple online algorithm can achieve comparable performance by introducing Dynamic Tensor Re…

    Submitted 18 March, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 31 pages, 12 figures, implementation available here: https://github.com/uwsampl/dtr-prototype, OpenReview: https://openreview.net/forum?id=Vfs_2RnOD0H

    ACM Class: C.3

  6. arXiv:1904.08368  [pdf, other]

    cs.LG cs.PL stat.ML

    Relay: A High-Level Compiler for Deep Learning

    Authors: Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Logan Weber, Josh Pollock, Luis Vega, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock

    Abstract: Frameworks for writing, compiling, and optimizing deep learning (DL) models have recently enabled progress in areas like computer vision and natural language processing. Extending these frameworks to accommodate the rapidly diversifying landscape of DL models and hardware platforms presents challenging tradeoffs between expressivity, composability, and portability. We present Relay, a new compiler…

    Submitted 24 August, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

  7. Relay: A New IR for Machine Learning Frameworks

    Authors: Jared Roesch, Steven Lyubomirsky, Logan Weber, Josh Pollock, Marisa Kirisame, Tianqi Chen, Zachary Tatlock

    Abstract: Machine learning powers diverse services in industry including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; in order to better accommodate them we propose a new high-level intermediate representation…

    Submitted 25 September, 2018; originally announced October 2018.