Showing 1–2 of 2 results for author: Hendon, D

Searching in archive cs.
  1. arXiv:2312.14125  [pdf, other]

    cs.CV cs.AI

    VideoPoet: A Large Language Model for Zero-Shot Video Generation

    Authors: Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, et al. (6 additional authors not shown)

    Abstract: We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder-only transformer architecture that processes multimodal inputs -- including images, videos, text, and audio. The training protocol follows that of Large Language Models (LLMs), consisting of two stages: pretraining and tas…

    Submitted 4 June, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear at ICML 2024; Project page: http://sites.research.google/videopoet/

  2. arXiv:2304.04687  [pdf, other]

    cs.CV cs.HC

    Learning to Detect Touches on Cluttered Tables

    Authors: Norberto Adrian Goussies, Kenji Hata, Shruthi Prabhakara, Abhishek Amit, Tony Aube, Carl Cepress, Diana Chang, Li-Te Cheng, Horia Stefan Ciurdar, Mike Cleron, Chelsey Fleming, Ashwin Ganti, Divyansh Garg, Niloofar Gheissari, Petra Luna Grutzik, David Hendon, Daniel Iglesia, ** Kim, Stuart Kyle, Chris LaRosa, Roman Lewkow, Peter F McDermott, Chris Melancon, Paru Nackeeran, Neal Norwitz, et al. (6 additional authors not shown)

    Abstract: We present a novel self-contained camera-projector tabletop system with a lamp form-factor that brings digital intelligence to our tables. We propose a real-time, on-device, learning-based touch detection algorithm that makes any tabletop interactive. The top-down configuration and learning-based algorithm makes our method robust to the presence of clutter, a main limitation of existing camera-pro…

    Submitted 10 April, 2023; originally announced April 2023.