Showing 1–8 of 8 results for author: Schrodi, S

  1. arXiv:2404.07983  [pdf, other]

    cs.CV cs.LG

    Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning

    Authors: Simon Schrodi, David T. Hoffmann, Max Argus, Volker Fischer, Thomas Brox

    Abstract: Contrastive vision-language models like CLIP have gained popularity for their versatile applicable learned representations in various downstream tasks. Despite their successes in some tasks, like zero-shot image recognition, they also perform surprisingly poor on other tasks, like attribute detection. Previous work has attributed these challenges to the modality gap, a separation of image and text…

    Submitted 11 April, 2024; originally announced April 2024.

  2. arXiv:2402.03170  [pdf, other]

    cs.LG

    Is Mamba Capable of In-Context Learning?

    Authors: Riccardo Grazzi, Julien Siems, Simon Schrodi, Thomas Brox, Frank Hutter

    Abstract: State of the art foundation models such as GPT-4 perform surprisingly well at in-context learning (ICL), a variant of meta-learning concerning the learned ability to solve tasks during a neural network forward pass, exploiting contextual information provided as input to the model. This useful ability emerges as a side product of the foundation model's massive pretraining. While transformer models…

    Submitted 24 April, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  3. arXiv:2310.12956  [pdf, other]

    cs.LG cs.AI cs.CV

    Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems

    Authors: David T. Hoffmann, Simon Schrodi, Jelena Bratulić, Nadine Behrmann, Volker Fischer, Thomas Brox

    Abstract: In this work, we study rapid improvements of the training loss in transformers when being confronted with multi-step decision tasks. We found that transformers struggle to learn the intermediate task and both training and validation loss saturate for hundreds of epochs. When transformers finally learn the intermediate task, they do this rapidly and unexpectedly. We call these abrupt improvements E…

    Submitted 6 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Accepted at ICML 2024

  4. arXiv:2310.06668  [pdf, other]

    cs.LG cs.CV

    Latent Diffusion Counterfactual Explanations

    Authors: Karim Farid, Simon Schrodi, Max Argus, Thomas Brox

    Abstract: Counterfactual explanations have emerged as a promising method for elucidating the behavior of opaque black-box models. Recently, several works leveraged pixel-space diffusion models for counterfactual generation. To handle noisy, adversarial gradients during counterfactual generation -- causing unrealistic artifacts or mere adversarial perturbations -- they required either auxiliary adversarially…

    Submitted 10 October, 2023; originally announced October 2023.

  5. arXiv:2310.05691  [pdf, other]

    cs.CV physics.ao-ph

    Climate-sensitive Urban Planning through Optimization of Tree Placements

    Authors: Simon Schrodi, Ferdinand Briegel, Max Argus, Andreas Christen, Thomas Brox

    Abstract: Climate change is increasing the intensity and frequency of many extreme weather events, including heatwaves, which results in increased thermal discomfort and mortality rates. While global mitigation action is undoubtedly necessary, so is climate adaptation, e.g., through climate-sensitive urban planning. Among the most promising strategies is harnessing the benefits of urban trees in shading and…

    Submitted 9 October, 2023; originally announced October 2023.

  6. arXiv:2211.01842  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars

    Authors: Simon Schrodi, Danny Stoll, Binxin Ru, Rhea Sukthanker, Thomas Brox, Frank Hutter

    Abstract: The discovery of neural architectures from simple building blocks is a long-standing goal of Neural Architecture Search (NAS). Hierarchical search spaces are a promising step towards this goal but lack a unifying search space design framework and typically only search over some limited aspect of architectures. In this work, we introduce a unifying search space design framework based on context-fre…

    Submitted 8 December, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2023

  7. arXiv:2105.01015  [pdf, other]

    cs.LG cs.AI stat.ML

    Bag of Baselines for Multi-objective Joint Neural Architecture Search and Hyperparameter Optimization

    Authors: Julia Guerrero-Viu, Sven Hauns, Sergio Izquierdo, Guilherme Miotto, Simon Schrodi, Andre Biedenkapp, Thomas Elsken, Difan Deng, Marius Lindauer, Frank Hutter

    Abstract: Neural architecture search (NAS) and hyperparameter optimization (HPO) make deep learning accessible to non-experts by automatically finding the architecture of the deep neural network to use and tuning the hyperparameters of the used training pipeline. While both NAS and HPO have been studied extensively in recent years, NAS methods typically assume fixed hyperparameters and vice versa - there ex…

    Submitted 3 May, 2021; originally announced May 2021.

  8. arXiv:2103.16255  [pdf, other]

    cs.CV

    Towards Understanding Adversarial Robustness of Optical Flow Networks

    Authors: Simon Schrodi, Tonmoy Saikia, Thomas Brox

    Abstract: Recent work demonstrated the lack of robustness of optical flow networks to physical patch-based adversarial attacks. The possibility to physically attack a basic component of automotive systems is a reason for serious concerns. In this paper, we analyze the cause of the problem and show that the lack of robustness is rooted in the classical aperture problem of optical flow estimation in combinati…

    Submitted 15 June, 2022; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: CVPR 2022