-
Most discriminative stimuli for functional cell type clustering
Authors:
Max F. Burg,
Thomas Zenkel,
Michaela Vystrčilová,
Jonathan Oesterle,
Larissa Höfling,
Konstantin F. Willeke,
Jan Lause,
Sarah Müller,
Paul G. Fahey,
Zhiwei Ding,
Kelli Restivo,
Shashwat Sridhar,
Tim Gollisch,
Philipp Berens,
Andreas S. Tolias,
Thomas Euler,
Matthias Bethge,
Alexander S. Ecker
Abstract:
Identifying cell types and understanding their functional properties is crucial for unraveling the mechanisms underlying perception and cognition. In the retina, functional types can be identified by carefully selected stimuli, but this requires expert domain knowledge and biases the procedure towards previously known cell types. In the visual cortex, it is still unknown what functional types exist and how to identify them. Thus, for unbiased identification of the functional cell types in retina and visual cortex, new approaches are needed. Here we propose an optimization-based clustering approach using deep predictive models to obtain functional clusters of neurons using Most Discriminative Stimuli (MDS). Our approach alternates between stimulus optimization and cluster reassignment, akin to an expectation-maximization algorithm. The algorithm recovers functional clusters in mouse retina, marmoset retina and macaque visual area V4. This demonstrates that our approach can successfully find discriminative stimuli across species, stages of the visual system and recording techniques. The resulting most discriminative stimuli can be used to assign functional cell types quickly and on the fly, without the need to train complex predictive models or show a large natural scene dataset, paving the way for experiments that were previously limited by experimental time. Crucially, MDS are interpretable: they visualize the distinctive stimulus patterns that most unambiguously identify a specific type of neuron.
Submitted 14 March, 2024; v1 submitted 29 November, 2023;
originally announced January 2024.
-
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models
Authors:
Vishaal Udandarao,
Max F. Burg,
Samuel Albanie,
Matthias Bethge
Abstract:
Recent advances in the development of vision-language models (VLMs) are yielding remarkable success in recognizing visual semantic content, including impressive instances of compositional image understanding. Here, we introduce the novel task of Visual Data-Type Identification, a basic perceptual skill with implications for data curation (e.g., noisy data-removal from large datasets, domain-specific retrieval) and autonomous vision (e.g., distinguishing changing weather conditions from camera lens staining). We develop two datasets consisting of animal images altered across a diverse set of 27 visual data-types, spanning four broad categories. An extensive zero-shot evaluation of 39 VLMs, ranging from 100M to 80B parameters, shows a nuanced performance landscape. While VLMs are reasonably good at identifying certain stylistic data-types, such as cartoons and sketches, they struggle with simpler data-types arising from basic manipulations like image rotations or additive noise. Our findings reveal that (i) model scaling alone yields marginal gains for contrastively-trained models like CLIP, and (ii) there is a pronounced drop in performance for the largest auto-regressively trained VLMs like OpenFlamingo. This finding points to a blind spot in current frontier VLMs: they excel in recognizing semantic content but fail to acquire an understanding of visual data-types through scaling. By analyzing the pre-training distributions of these models and incorporating data-type information into the captions during fine-tuning, we achieve a significant enhancement in performance. By exploring this previously uncharted task, we aim to set the stage for further advancing VLMs to equip them with visual data-type understanding. Code and datasets are released at https://github.com/bethgelab/DataTypeIdentification.
Submitted 6 December, 2023; v1 submitted 12 October, 2023;
originally announced October 2023.
-
HARD: Hard Augmentations for Robust Distillation
Authors:
Arne F. Nix,
Max F. Burg,
Fabian H. Sinz
Abstract:
Knowledge distillation (KD) is a simple and successful method to transfer knowledge from a teacher to a student model solely based on functional activity. However, current KD has a few shortcomings: it has recently been shown that this method is unsuitable to transfer simple inductive biases like shift equivariance, struggles to transfer out-of-domain generalization, and requires optimization times that are orders of magnitude longer than default non-KD model training. To improve these aspects of KD, we propose Hard Augmentations for Robust Distillation (HARD), a generally applicable data augmentation framework that generates synthetic data points for which the teacher and the student disagree. We show in a simple toy example that our augmentation framework solves the problem of transferring simple equivariances with KD. We then apply our framework in real-world tasks for a variety of augmentation models, ranging from simple spatial transformations to unconstrained image manipulations with a pretrained variational autoencoder. We find that our learned augmentations significantly improve KD performance on in-domain and out-of-domain evaluation. Moreover, our method outperforms even state-of-the-art data augmentations, and since the augmented training inputs can be visualized, they offer a qualitative insight into the properties that are transferred from the teacher to the student. Thus HARD represents a generally applicable, dynamically optimized data augmentation technique tailored to improve the generalization and convergence speed of models trained with KD.
Submitted 25 May, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Image retrieval outperforms diffusion models on data augmentation
Authors:
Max F. Burg,
Florian Wenzel,
Dominik Zietlow,
Max Horn,
Osama Makansi,
Francesco Locatello,
Chris Russell
Abstract:
Many approaches have been proposed to use diffusion models to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large datasets, often with noisy annotations, and it remains an open question to what extent these models contribute to downstream classification performance. In particular, it remains unclear if they generalize enough to improve over directly using the additional data of their pre-training process for augmentation. We systematically evaluate a range of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. Personalizing diffusion models towards the target data outperforms simpler prompting strategies. However, using the pre-training data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure, leads to even stronger downstream performance. Our study explores the potential of diffusion models in generating new training data, and surprisingly finds that these sophisticated models are not yet able to beat a simple and strong image retrieval baseline on simple downstream vision tasks.
Submitted 30 November, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
The Sensorium competition on predicting large-scale mouse primary visual cortex activity
Authors:
Konstantin F. Willeke,
Paul G. Fahey,
Mohammad Bashiri,
Laura Pede,
Max F. Burg,
Christoph Blessing,
Santiago A. Cadena,
Zhiwei Ding,
Konstantin-Klemens Lurz,
Kayla Ponder,
Taliah Muhammad,
Saumil S. Patel,
Alexander S. Ecker,
Andreas S. Tolias,
Fabian H. Sinz
Abstract:
The neural underpinning of the biological visual system is challenging to study experimentally, in particular as the neuronal activity becomes increasingly nonlinear with respect to visual input. Artificial neural networks (ANNs) can serve a variety of goals for improving our understanding of this complex system, not only serving as predictive digital twins of sensory cortex for novel hypothesis generation in silico, but also incorporating bio-inspired architectural motifs to progressively bridge the gap between biological and machine vision. The mouse has recently emerged as a popular model system to study visual information processing, but no standardized large-scale benchmark to identify state-of-the-art models of the mouse visual system has been established. To fill this gap, we propose the Sensorium benchmark competition. We collected a large-scale dataset from mouse primary visual cortex containing the responses of more than 28,000 neurons across seven mice stimulated with thousands of natural images, together with simultaneous behavioral measurements that include running speed, pupil dilation, and eye movements. The benchmark challenge will rank models based on predictive performance for neuronal responses on a held-out test set, and includes two tracks for model input limited to either stimulus only (Sensorium) or stimulus plus behavior (Sensorium+). We provide a starting kit to lower the barrier for entry, including tutorials, pre-trained baseline models, and APIs with one-line commands for data loading and submission. We would like to see this as a starting point for regular challenges and data releases, and as a standard tool for measuring progress in large-scale neural system identification models of the mouse visual system and beyond.
Submitted 17 June, 2022;
originally announced June 2022.
-
Number fluctuations induce persistent congestion
Authors:
Verena Krall,
Max F. Burg,
Malte Schröder,
Marc Timme
Abstract:
The capacity of a street segment quantifies the maximal density of vehicles before congestion arises. Here we show in a simple mathematical model that fluctuations in the instantaneous number of vehicles entering a street segment are sufficient to induce persistent congestion. Congestion emerges even if the average flow is below the segment's capacity where congestion is absent without fluctuations. We explain how this fluctuation-induced congestion emerges due to a self-amplifying reduction of the average vehicle velocities.
Submitted 28 May, 2021;
originally announced May 2021.
-
Obscuring digital route choice information prevents delay-induced congestion
Authors:
Verena Krall,
Max F. Burg,
Friedrich Pagenkopf,
Henrik Wolf,
Marc Timme,
Malte Schröder
Abstract:
Although routing applications increasingly affect individual mobility choices, their impact on collective traffic dynamics remains largely unknown. Smart communication technologies provide accurate traffic data for choosing one route over other alternatives, yet inherent delays undermine the potential usefulness of such information. Here we introduce and analyze a simple model of collective traffic dynamics which result from route choice relying on outdated traffic information. We find for sufficiently small information delays that traffic flows are stable against perturbations. However, delays beyond a bifurcation point induce self-organized flow oscillations of increasing amplitude -- congestion arises. Providing delayed information averaged over sufficiently long periods of time or, more intriguingly, reducing the number of vehicles adhering to the route recommendations may prevent such delay-induced congestion. We reveal the fundamental mechanisms underlying these phenomena in a minimal two-road model and demonstrate their generality in microscopic, agent-based simulations of a road network system. Our findings provide a way to conceptually understand system-wide traffic dynamics caused by broadly used non-instantaneous routing information and suggest how resulting unintended collective traffic states could be avoided.
Submitted 13 October, 2021; v1 submitted 25 May, 2021;
originally announced May 2021.