
Showing 1–6 of 6 results for author: Hoffmann, D T

Searching in archive cs.
  1. arXiv:2404.07983

    cs.CV cs.LG

    Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning

    Authors: Simon Schrodi, David T. Hoffmann, Max Argus, Volker Fischer, Thomas Brox

    Abstract: Contrastive vision-language models like CLIP have gained popularity for their versatile applicable learned representations in various downstream tasks. Despite their successes in some tasks, like zero-shot image recognition, they also perform surprisingly poor on other tasks, like attribute detection. Previous work has attributed these challenges to the modality gap, a separation of image and text…

    Submitted 11 April, 2024; originally announced April 2024.

  2. arXiv:2310.12956

    cs.LG cs.AI cs.CV

    Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems

    Authors: David T. Hoffmann, Simon Schrodi, Jelena Bratulić, Nadine Behrmann, Volker Fischer, Thomas Brox

    Abstract: In this work, we study rapid improvements of the training loss in transformers when being confronted with multi-step decision tasks. We found that transformers struggle to learn the intermediate task and both training and validation loss saturate for hundreds of epochs. When transformers finally learn the intermediate task, they do this rapidly and unexpectedly. We call these abrupt improvements E…

    Submitted 6 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Accepted at ICML 2024

  3. arXiv:2201.11736

    cs.CV

    Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives

    Authors: David T. Hoffmann, Nadine Behrmann, Juergen Gall, Thomas Brox, Mehdi Noroozi

    Abstract: This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples. In contrast to the standard InfoNCE loss, which requires a strict binary separation of the training pairs into similar and dissimilar samples, RINCE can exploit information about a similarity ranking for learning a corresponding…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: AAAI 2022 (Main Track)

  4. arXiv:2104.14643

    cs.CV

    AGORA: Avatars in Geography Optimized for Regression Analysis

    Authors: Priyanka Patel, Chun-Hao P. Huang, Joachim Tesch, David T. Hoffmann, Shashank Tripathi, Michael J. Black

    Abstract: While the accuracy of 3D human pose estimation from images has steadily improved on benchmark datasets, the best methods still fail in many real-world scenarios. This suggests that there is a domain gap between current datasets and common scenes containing people. To obtain ground-truth 3D pose, current datasets limit the complexity of clothing, environmental conditions, number of subjects, and oc…

    Submitted 29 April, 2021; originally announced April 2021.

    Journal ref: CVPR 2021

  5. Learning Multi-Human Optical Flow

    Authors: Anurag Ranjan, David T. Hoffmann, Dimitrios Tzionas, Siyu Tang, Javier Romero, Michael J. Black

    Abstract: The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body…

    Submitted 4 December, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

    Comments: arXiv admin note: text overlap with arXiv:1806.05666

    Report number: 2019

    Journal ref: International Journal of Computer Vision (IJCV) 2019

  6. arXiv:1908.00967

    cs.CV

    Learning to Train with Synthetic Humans

    Authors: David T. Hoffmann, Dimitrios Tzionas, Michael J. Black, Siyu Tang

    Abstract: Neural networks need big annotated datasets for training. However, manual annotation can be too expensive or even unfeasible for certain tasks, like multi-person 2D pose estimation with severe occlusions. A remedy for this is synthetic data with perfect ground truth. Here we explore two variations of synthetic data for this challenging problem; a dataset with purely synthetic humans and a real dat…

    Submitted 2 August, 2019; originally announced August 2019.

    Comments: In German Conference on Pattern Recognition (GCPR)