Showing 1–2 of 2 results for author: Tarjan, D

  1. arXiv:2403.09636  [pdf, other]

    cs.CL

    Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference

    Authors: Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski, David Tarjan, Edoardo M. Ponti

    Abstract: Transformers have emerged as the backbone of large language models (LLMs). However, generation remains inefficient due to the need to store in memory a cache of key-value representations for past tokens, whose size scales linearly with the input sequence length and batch size. As a solution, we propose Dynamic Memory Compression (DMC), a method for on-line key-value cache compression at inference…

    Submitted 14 March, 2024; originally announced March 2024.

  2. arXiv:1811.00684  [pdf, other]

    cs.CV

    SDCNet: Video Prediction Using Spatially-Displaced Convolution

    Authors: Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro

    Abstract: We present an approach for high-resolution video frame prediction by conditioning on both past frames and past optical flows. Previous approaches rely on resampling past frames, guided by a learned future optical flow, or on direct generation of pixels. Resampling based on flow is insufficient because it cannot deal with disocclusions. Generative models currently lead to blurry results. Recent app…

    Submitted 27 March, 2021; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: Published in ECCV 2018. Code available at https://github.com/NVIDIA/semantic-segmentation/tree/sdcnet/sdcnet. Project page available at https://nv-adlr.github.io/publication/2018-SDCNet