Showing 1–10 of 10 results for author: Schindler, G

  1. arXiv:2405.13195

    cs.CV cs.AI

    CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers

    Authors: Andrew Marmon, Grant Schindler, José Lezama, Dan Kondratyuk, Bryan Seybold, Irfan Essa

    Abstract: We extend multimodal transformers to include 3D camera motion as a conditioning signal for the task of video generation. Generative video models are becoming increasingly powerful, thus focusing research efforts on methods of controlling the output of such models. We propose to add virtual 3D camera controls to generative video methods by conditioning generated video on an encoding of three-dimens…

    Submitted 21 May, 2024; originally announced May 2024.

  2. arXiv:2404.11419

    cs.CV

    SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping

    Authors: Vincent Cartillier, Grant Schindler, Irfan Essa

    Abstract: We present SLAIM - Simultaneous Localization and Implicit Mapping. We propose a novel coarse-to-fine tracking model tailored for Neural Radiance Field SLAM (NeRF-SLAM) to achieve state-of-the-art tracking performance. Notably, existing NeRF-SLAM systems consistently exhibit inferior tracking performance compared to traditional SLAM algorithms. NeRF-SLAM methods solve camera tracking via image alig…

    Submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2312.14125

    cs.CV cs.AI

    VideoPoet: A Large Language Model for Zero-Shot Video Generation

    Authors: Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, et al. (6 additional authors not shown)

    Abstract: We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder-only transformer architecture that processes multimodal inputs -- including images, videos, text, and audio. The training protocol follows that of Large Language Models (LLMs), consisting of two stages: pretraining and tas…

    Submitted 4 June, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear at ICML 2024; Project page: http://sites.research.google/videopoet/

  4. arXiv:2010.11773

    cs.LG cs.AI stat.ML

    On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks

    Authors: Wolfgang Roth, Günther Schindler, Holger Fröning, Franz Pernkopf

    Abstract: We present two methods to reduce the complexity of Bayesian network (BN) classifiers. First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to few bits. Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach by also considering the model size. Both methods are motiva…

    Submitted 22 September, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted at ICPR 2020, fixed Figure 5

  5. arXiv:2007.11477

    eess.AS cs.LG cs.SD

    Resource-Efficient Speech Mask Estimation for Multi-Channel Speech Enhancement

    Authors: Lukas Pfeifenberger, Matthias Zöhrer, Günther Schindler, Wolfgang Roth, Holger Fröning, Franz Pernkopf

    Abstract: While machine learning techniques are traditionally resource intensive, we are currently witnessing an increased interest in hardware and energy efficient approaches. This need for resource-efficient machine learning is primarily driven by the demand for embedded systems and their usage in ubiquitous computing and IoT applications. In this article, we provide a resource-efficient approach for mult…

    Submitted 22 July, 2020; originally announced July 2020.

  6. arXiv:2006.14008

    cs.AR cs.LG

    On the Difficulty of Designing Processor Arrays for Deep Neural Networks

    Authors: Kevin Stehle, Günther Schindler, Holger Fröning

    Abstract: Systolic arrays are a promising computing concept which is in particular in line with CMOS technology trends and linear algebra operations found in the processing of artificial neural networks. The recent success of such deep learning methods in a wide set of applications has led to a variety of models, which albeit conceptually similar as based on convolutions and fully-connected layers, in detail s…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: 12 pages, 6 figures

  7. arXiv:2001.03048

    stat.ML cs.LG

    Resource-Efficient Neural Networks for Embedded Systems

    Authors: Wolfgang Roth, Günther Schindler, Bernhard Klein, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani

    Abstract: While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully chosen trade-off between performance and resource consumption in terms of computation and energy. The development of such approaches is among the major challenges…

    Submitted 7 April, 2024; v1 submitted 7 January, 2020; originally announced January 2020.

    Comments: arXiv admin note: text overlap with arXiv:1812.02240; accepted at JMLR

  8. arXiv:1906.05180

    cs.LG stat.ML

    Parameterized Structured Pruning for Deep Neural Networks

    Authors: Guenther Schindler, Wolfgang Roth, Franz Pernkopf, Holger Froening

    Abstract: As a result of the growing size of Deep Neural Networks (DNNs), the gap to hardware capabilities in terms of memory and compute increases. To effectively compress DNNs, quantization and connection pruning are usually considered. However, unconstrained pruning usually leads to unstructured parallelism, which maps poorly to massively parallel processors, and substantially reduces the efficiency of g…

    Submitted 12 June, 2019; originally announced June 2019.

  9. arXiv:1812.02240

    cs.LG stat.ML

    Efficient and Robust Machine Learning for Real-World Systems

    Authors: Franz Pernkopf, Wolfgang Roth, Matthias Zoehrer, Lukas Pfeifenberger, Guenther Schindler, Holger Froening, Sebastian Tschiatschek, Robert Peharz, Matthew Mattina, Zoubin Ghahramani

    Abstract: While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation and the vision of the Internet-of-Things fuel the interest in resource efficient approaches. These approaches require a carefully chosen trade-off between performance and resource consumption in terms of computation and energy. On top of this, it is crucial to treat uncertainty in a consisten…

    Submitted 5 December, 2018; originally announced December 2018.

  10. Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition

    Authors: Vinay Bettadapura, Grant Schindler, Thomaz Plotz, Irfan Essa

    Abstract: We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that i…

    Submitted 7 October, 2015; originally announced October 2015.

    Comments: 8 pages

    Journal ref: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013) -- Pages 2619 - 2626