Showing 1–7 of 7 results for author: Hornung, R

Searching in archive cs.

  1. arXiv:2402.13217  [pdf, other]

    cs.CV cs.AI

    VideoPrism: A Foundational Visual Encoder for Video Understanding

    Authors: Long Zhao, Nitesh B. Gundavarapu, Liangzhe Yuan, Hao Zhou, Shen Yan, Jennifer J. Sun, Luke Friedman, Rui Qian, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Ting Liu, Boqing Gong

    Abstract: We introduce VideoPrism, a general-purpose video encoder that tackles diverse video understanding tasks with a single frozen model. We pretrain VideoPrism on a heterogeneous corpus containing 36M high-quality video-caption pairs and 582M video clips with noisy parallel text (e.g., ASR transcripts). The pretraining approach improves upon masked autoencoding by global-local distillation of semantic…

    Submitted 15 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted to ICML 2024. v2: added retrieval results on MSRVTT (1K-A), more data analyses, and ablation studies

  2. arXiv:2312.14125  [pdf, other]

    cs.CV cs.AI

    VideoPoet: A Large Language Model for Zero-Shot Video Generation

    Authors: Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, et al. (6 additional authors not shown)

    Abstract: We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder-only transformer architecture that processes multimodal inputs -- including images, videos, text, and audio. The training protocol follows that of Large Language Models (LLMs), consisting of two stages: pretraining and tas…

    Submitted 4 June, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear at ICML 2024; Project page: http://sites.research.google/videopoet/

  3. arXiv:2310.15108  [pdf, other]

    stat.ML cs.LG stat.AP stat.CO stat.ME

    Evaluating machine learning models in non-standard settings: An overview and new findings

    Authors: Roman Hornung, Malte Nalenz, Lennart Schneider, Andreas Bender, Ludwig Bothmann, Bernd Bischl, Thomas Augustin, Anne-Laure Boulesteix

    Abstract: Estimating the generalization error (GE) of machine learning models is fundamental, with resampling methods being the most common approach. However, in non-standard settings, particularly those where observations are not independently and identically distributed, resampling using simple random data divisions may lead to biased GE estimates. This paper strives to present well-grounded guidelines fo…

    Submitted 23 October, 2023; originally announced October 2023.

  4. arXiv:2305.06324  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM eess.IV

    Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

    Authors: Hassan Akbari, Dan Kondratyuk, Yin Cui, Rachel Hornung, Huisheng Wang, Hartwig Adam

    Abstract: We present Integrated Multimodal Perception (IMP), a simple and scalable multimodal multi-task training and modeling approach. IMP integrates multimodal inputs including image, video, text, and audio into a single Transformer encoder with minimal modality-specific components. IMP makes use of a novel design that combines Alternating Gradient Descent (AGD) and Mixture-of-Experts (MoE) for efficient…

    Submitted 11 December, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

  5. arXiv:2302.03991  [pdf, other]

    q-bio.GN cs.AI cs.LG stat.AP stat.CO

    Prediction approaches for partly missing multi-omics covariate data: A literature review and an empirical comparison study

    Authors: Roman Hornung, Frederik Ludwigs, Jonas Hagenberg, Anne-Laure Boulesteix

    Abstract: As the availability of omics data has increased in the last few years, more multi-omics data have been generated, that is, high-dimensional molecular data consisting of several types such as genomic, transcriptomic, or proteomic data, all obtained from the same patients. Such data lend themselves to being used as covariates in automatic outcome prediction because each omics type may contribute uni…

    Submitted 8 February, 2023; originally announced February 2023.

  6. arXiv:2206.01284  [pdf, other]

    stat.ME cs.LG stat.AP stat.CO stat.ML

    Sequential Permutation Testing of Random Forest Variable Importance Measures

    Authors: Alexander Hapfelmeier, Roman Hornung, Bernhard Haller

    Abstract: Hypothesis testing of random forest (RF) variable importance measures (VIMP) remains the subject of ongoing research. Among recent developments, heuristic approaches to parametric testing have been proposed whose distributional assumptions are based on empirical evidence. Other formal tests under regularity conditions were derived analytically. However, these approaches can be computationally expe…

    Submitted 2 June, 2022; originally announced June 2022.

    Journal ref: Computational Statistics & Data Analysis 181 (2023): 107689

  7. arXiv:2003.03621  [pdf, ps, other]

    stat.ML cs.LG stat.AP stat.ME

    Large-scale benchmark study of survival prediction methods using multi-omics data

    Authors: Moritz Herrmann, Philipp Probst, Roman Hornung, Vindi Jurinovic, Anne-Laure Boulesteix

    Abstract: Multi-omics data, that is, datasets containing different types of high-dimensional molecular variables (often in addition to classical clinical variables), are increasingly generated for the investigation of various diseases. Nevertheless, questions remain regarding the usefulness of multi-omics data for the prediction of disease outcomes such as survival time. It is also unclear which methods are…

    Submitted 7 March, 2020; originally announced March 2020.

    Comments: 23 pages, 6 tables, 3 figures

    Journal ref: Briefings in Bioinformatics (2020) bbaa167