Showing 1–4 of 4 results for author: Clemons, J

  1. arXiv:2212.02687  [pdf, other]

    cs.CV cs.AR

    Vision Transformer Computation and Resilience for Dynamic Inference

    Authors: Kavya Sreedhar, Jason Clemons, Rangharajan Venkatesan, Stephen W. Keckler, Mark Horowitz

    Abstract: State-of-the-art deep learning models for computer vision tasks are based on the transformer architecture and often deployed in real-time applications. In this scenario, the resources available for every inference can vary, so it is useful to be able to dynamically adapt execution to trade accuracy for efficiency. To create dynamic models, we leverage the resilience of vision transformers to pruni…

    Submitted 15 April, 2024; v1 submitted 5 December, 2022; originally announced December 2022.

    Journal ref: 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

  2. arXiv:1806.00512  [pdf, other]

    cs.LG cs.CL stat.ML

    Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

    Authors: Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

    Abstract: Exploiting sparsity enables hardware systems to run neural networks faster and more energy-efficiently. However, most prior sparsity-centric optimization techniques only accelerate the forward pass of neural networks and usually require an even longer training process with iterative pruning and retraining. We observe that artificially inducing sparsity in the gradients of the gates in an LSTM cell…

    Submitted 1 June, 2018; originally announced June 2018.

  3. arXiv:1611.06256  [pdf, other]

    cs.LG

    Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

    Authors: Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

    Abstract: We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for othe…

    Submitted 2 March, 2017; v1 submitted 18 November, 2016; originally announced November 2016.

  4. arXiv:1602.08124  [pdf, other]

    cs.DC cs.LG cs.NE

    vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design

    Authors: Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler

    Abstract: The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher's flexibility to study different machine learning algorithms, forcing them to either use a less desirable network architecture or parallelize the processing across multiple GPUs. We prop…

    Submitted 28 July, 2016; v1 submitted 25 February, 2016; originally announced February 2016.

    Comments: Published as a conference paper at the 49th IEEE/ACM International Symposium on Microarchitecture (MICRO-49), 2016