-
Loci-Segmented: Improving Scene Segmentation Learning
Authors:
Manuel Traub,
Frederic Becker,
Adrian Sauter,
Sebastian Otte,
Martin V. Butz
Abstract:
Current slot-oriented approaches for compositional scene segmentation from images and videos rely on provided background information or slot assignments. We present a segmented location and identity tracking system, Loci-Segmented (Loci-s), which does not require either of this information. It learns to dynamically segment scenes into interpretable background and slot-based object encodings, separating rgb, mask, location, and depth information for each. The results reveal largely superior video decomposition performance in the MOVi datasets and in another established dataset collection targeting scene segmentation. The system's well-interpretable, compositional latent encodings may serve as a foundation model for downstream tasks.
Submitted 6 February, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Learning Object Permanence from Videos via Latent Imaginations
Authors:
Manuel Traub,
Frederic Becker,
Sebastian Otte,
Martin V. Butz
Abstract:
While human infants exhibit knowledge about object permanence from two months of age onwards, deep-learning approaches still largely fail to recognize objects' continued existence. We introduce a slot-based autoregressive deep learning system, the looped location and identity tracking model Loci-Looped, which learns to adaptively fuse latent imaginations with pixel-space observations into consistent latent object-specific what and where encodings over time. The novel loop empowers Loci-Looped to learn the physical concepts of object permanence, directional inertia, and object solidity through observation alone. As a result, Loci-Looped tracks objects through occlusions, anticipates their reappearance, and shows signs of surprise and internal revisions when observing implausible object behavior. Notably, Loci-Looped outperforms state-of-the-art baseline models in handling object occlusions and temporary sensory interruptions while exhibiting more compositional, interpretable internal activity patterns. Our work thus introduces the first self-supervised interpretable learning model that learns about object permanence directly from video data without supervision.
Submitted 11 April, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Learning What and Where: Disentangling Location and Identity Tracking Without Supervision
Authors:
Manuel Traub,
Sebastian Otte,
Tobias Menge,
Matthias Karlbauer,
Jannik Thümmel,
Martin V. Butz
Abstract:
Our brain can almost effortlessly decompose visual data streams into background and salient objects. Moreover, it can anticipate object motion and interactions, which are crucial abilities for conceptual planning and reasoning. Recent object reasoning datasets, such as CATER, have revealed fundamental shortcomings of current vision-based AI systems, particularly when targeting explicit object representations, object permanence, and object reasoning. Here we introduce a self-supervised LOCation and Identity tracking system (Loci), which excels on the CATER tracking challenge. Inspired by the dorsal and ventral pathways in the brain, Loci tackles the binding problem by processing separate, slot-wise encodings of `what' and `where'. Loci's predictive coding-like processing encourages active error minimization, such that individual slots tend to encode individual objects. Interactions between objects and object dynamics are processed in the disentangled latent space. Truncated backpropagation through time combined with forward eligibility accumulation significantly speeds up learning and improves memory efficiency. Besides exhibiting superior performance in current benchmarks, Loci effectively extracts objects from video streams and separates them into location and Gestalt components. We believe that this separation offers a representation that will facilitate effective planning and reasoning on conceptual levels.
Submitted 7 February, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Many-Joint Robot Arm Control with Recurrent Spiking Neural Networks
Authors:
Manuel Traub,
Robert Legenstein,
Sebastian Otte
Abstract:
In the paper, we show how scalable, low-cost trunk-like robotic arms can be constructed using only basic 3D-printing equipment and simple electronics. The design is based on uniform, stackable joint modules with three degrees of freedom each. Moreover, we present an approach for controlling these robots with recurrent spiking neural networks. At first, a spiking forward model learns motor-pose correlations from movement observations. After training, intentions can be projected back through unrolled spike trains of the forward model essentially routing the intention-driven motor gradients towards the respective joints, which unfolds goal-direction navigation. We demonstrate that spiking neural networks can thus effectively control trunk-like robotic arms with up to 75 articulated degrees of freedom with near millimeter accuracy.
Submitted 8 April, 2021;
originally announced April 2021.
-
Learning Precise Spike Timings with Eligibility Traces
Authors:
Manuel Traub,
Martin V. Butz,
R. Harald Baayen,
Sebastian Otte
Abstract:
Recent research in the field of spiking neural networks (SNNs) has shown that recurrent variants of SNNs, namely long short-term SNNs (LSNNs), can be trained via error gradients just as effective as LSTMs. The underlying learning method (e-prop) is based on a formalization of eligibility traces applied to leaky integrate and fire (LIF) neurons. Here, we show that the proposed approach cannot fully unfold spike timing dependent plasticity (STDP). As a consequence, this limits in principle the inherent advantage of SNNs, that is, the potential to develop codes that rely on precise relative spike timings. We show that STDP-aware synaptic gradients naturally emerge within the eligibility equations of e-prop when derived for a slightly more complex spiking neuron model, here at the example of the Izhikevich model. We also present a simple extension of the LIF model that provides similar gradients. In a simple experiment we demonstrate that the STDP-aware LIF neurons can learn precise spike timings from an e-prop-based gradient signal.
Submitted 8 May, 2020;
originally announced June 2020.
-
Towards Automatic Embryo Staging in 3D+T Microscopy Images using Convolutional Neural Networks and PointNets
Authors:
Manuel Traub,
Johannes Stegmaier
Abstract:
Automatic analyses and comparisons of different stages of embryonic development largely depend on a highly accurate spatiotemporal alignment of the investigated data sets. In this contribution, we assess multiple approaches for automatic staging of developing embryos that were imaged with time-resolved 3D light-sheet microscopy. The methods comprise image-based convolutional neural networks as well as an approach based on the PointNet architecture that directly operates on 3D point clouds of detected cell nuclei centroids. The experiments with four wild-type zebrafish embryos render both approaches suitable for automatic staging with average deviations of 21 - 34 minutes. Moreover, a proof-of-concept evaluation based on simulated 3D+t point cloud data sets shows that average deviations of less than 7 minutes are possible.
Submitted 29 July, 2020; v1 submitted 1 October, 2019;
originally announced October 2019.
-
Evaluating Tag Recommendations for E-Book Annotation Using a Semantic Similarity Metric
Authors:
Emanuel Lacic,
Dominik Kowald,
Dieter Theiler,
Matthias Traub,
Lucky Kuffer,
Stefanie Lindstaedt,
Elisabeth Lex
Abstract:
In this paper, we present our work to support publishers and editors in finding descriptive tags for e-books through tag recommendations. We propose a hybrid tag recommendation system for e-books, which leverages search query terms from Amazon users and e-book metadata, which is assigned by publishers and editors. Our idea is to mimic the vocabulary of users in Amazon, who search for and review e-books, and to combine these search terms with editor tags in a hybrid tag recommendation approach. In total, we evaluate 19 tag recommendation algorithms on the review content of Amazon users, which reflects the readers' vocabulary. Our results show that we can improve the performance of tag recommender systems for e-books both concerning tag recommendation accuracy, diversity as well as a novel semantic similarity metric, which we also propose in this paper.
Submitted 12 August, 2019;
originally announced August 2019.
-
Using the Open Meta Kaggle Dataset to Evaluate Tripartite Recommendations in Data Markets
Authors:
Dominik Kowald,
Matthias Traub,
Dieter Theiler,
Heimo Gursch,
Emanuel Lacic,
Stefanie Lindstaedt,
Roman Kern,
Elisabeth Lex
Abstract:
This work addresses the problem of providing and evaluating recommendations in data markets. Since most of the research in recommender systems is focused on the bipartite relationship between users and items (e.g., movies), we extend this view to the tripartite relationship between users, datasets and services, which is present in data markets. Between these entities, we identify four use cases for recommendations: (i) recommendation of datasets for users, (ii) recommendation of services for users, (iii) recommendation of services for datasets, and (iv) recommendation of datasets for services. Using the open Meta Kaggle dataset, we evaluate the recommendation accuracy of a popularity-based as well as a collaborative filtering-based algorithm for these four use cases and find that the recommendation accuracy strongly depends on the given use case. The presented work contributes to the tripartite recommendation problem in general and to the under-researched portfolio of evaluating recommender systems for data markets in particular.
Submitted 27 August, 2019; v1 submitted 12 August, 2019;
originally announced August 2019.
-
SModelS v1.1 user manual
Authors:
Federico Ambrogi,
Sabine Kraml,
Suchita Kulkarni,
Ursula Laa,
Andre Lessa,
Veronika Magerl,
Jory Sonneveld,
Michael Traub,
Wolfgang Waltenberger
Abstract:
SModelS is an automatised tool for the interpretation of simplified model results from the LHC. It allows to decompose models of new physics obeying a Z2 symmetry into simplified model components, and to compare these against a large database of experimental results. The first release of SModelS, v1.0, used only cross section upper limit maps provided by the experimental collaborations. In this new release, v1.1, we extend the functionality of SModelS to efficiency maps. This increases the constraining power of the software, as efficiency maps allow to combine contributions to the same signal region from different simplified models. Other new features of version 1.1 include likelihood and chi-square calculations, extended information on the topology coverage, an extended database of experimental results as well as major speed upgrades for both the code and the database. We describe in detail the concepts and procedures used in SModelS, explaining in particular how upper limits and efficiency map results are dealt with in parallel. Detailed instructions for code usage are also provided.
Submitted 7 February, 2018; v1 submitted 23 January, 2017;
originally announced January 2017.
-
SModelS v1.0: a short user guide
Authors:
Sabine Kraml,
Suchita Kulkarni,
Ursula Laa,
Andre Lessa,
Veronika Magerl,
Wolfgang Magerl,
Doris Proschofsky-Spindler,
Michael Traub,
Wolfgang Waltenberger
Abstract:
SModelS is a tool for the automatic interpretation of simplified-model results from the LHC. Version 1.0 of the code is now publicly available. This document provides a quick user guide for installing and running SModelS v1.0.
Submitted 4 December, 2014;
originally announced December 2014.