Search | arXiv e-print repository

TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks

Authors: Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White

Abstract: While tabular classification has traditionally relied on from-scratch training, a recent breakthrough called prior-data fitted networks (PFNs) challenges this approach. Similar to large language models, PFNs make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. However, current PFNs have limitations that prohibit their widespread adoption. Notably, TabPFN achieves very strong performance on small tabular datasets but is not designed to make predictions for datasets of size larger than 1000. In this work, we overcome these limitations and substantially improve the performance of PFNs by developing context optimization techniques for PFNs. Specifically, we propose TuneTables, a novel prompt-tuning strategy that compresses large datasets into a smaller learned context. TuneTables scales TabPFN to be competitive with state-of-the-art tabular classification methods on larger datasets, while having a substantially lower inference time than TabPFN. Furthermore, we show that TuneTables can be used as an interpretability tool and can even be used to mitigate biases by optimizing a fairness objective.

Submitted 18 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

arXiv:2310.07029 [pdf, other]

Brain Age Revisited: Investigating the State vs. Trait Hypotheses of EEG-derived Brain-Age Dynamics with Deep Learning

Authors: Lukas AW Gemein, Robin T Schirrmeister, Joschka Boedecker, Tonio Ball

Abstract: The brain's biological age has been considered as a promising candidate for a neurologically significant biomarker. However, recent results based on longitudinal magnetic resonance imaging data have raised questions on its interpretation. A central question is whether an increased biological age of the brain is indicative of brain pathology and if changes in brain age correlate with diagnosed pathology (state hypothesis). Alternatively, could the discrepancy in brain age be a stable characteristic unique to each individual (trait hypothesis)? To address this question, we present a comprehensive study on brain aging based on clinical EEG, which is complementary to previous MRI-based investigations. We apply a state-of-the-art Temporal Convolutional Network (TCN) to the task of age regression. We train on recordings of the Temple University Hospital EEG Corpus (TUEG) explicitly labeled as non-pathological and evaluate on recordings of subjects with non-pathological as well as pathological recordings, both with examinations at a single point in time and repeated examinations over time. Therefore, we created four novel subsets of TUEG that include subjects with multiple recordings: I) all labeled non-pathological; II) all labeled pathological; III) at least one recording labeled non-pathological followed by at least one recording labeled pathological; IV) similar to III) but with opposing transition (first pathological then non-pathological).
The results show that our TCN reaches state-of-the-art performance in age decoding with a mean absolute error of 6.6 years. Our extensive analyses demonstrate that the model significantly underestimates the age of non-pathological and pathological subjects (-1 and -5 years, paired t-test, p <= 0.18 and p <= 0.0066). Furthermore, the brain age gap biomarker is not indicative of pathological EEG.
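The quantities named in this abstract (brain age gap, mean absolute error, paired t-test) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the ages are made-up placeholder values, and only the t statistic is computed (a p-value would additionally need a t-distribution, e.g. from a statistics library).

```python
import math
import statistics


def brain_age_gap(decoded, chronological):
    """Per-recording gap: decoded age minus chronological age (years)."""
    return [d - c for d, c in zip(decoded, chronological)]


def mean_absolute_error(decoded, chronological):
    """Average magnitude of the gap, ignoring its sign."""
    gaps = brain_age_gap(decoded, chronological)
    return sum(abs(g) for g in gaps) / len(gaps)


def paired_t_statistic(decoded, chronological):
    """t statistic of a paired t-test on decoded vs. chronological age.

    A negative value means the decoder tends to underestimate age.
    """
    gaps = brain_age_gap(decoded, chronological)
    standard_error = statistics.stdev(gaps) / math.sqrt(len(gaps))
    return statistics.mean(gaps) / standard_error


# Placeholder ages in years (hypothetical, not data from the study).
chronological_ages = [30, 45, 60, 52]
decoded_ages = [28, 41, 57, 53]

gaps = brain_age_gap(decoded_ages, chronological_ages)
mae = mean_absolute_error(decoded_ages, chronological_ages)
t_stat = paired_t_statistic(decoded_ages, chronological_ages)
print(gaps, mae, round(t_stat, 3))
```

A mean gap below zero corresponds to the underestimation reported above; whether it is significant depends on the p-value, which this sketch leaves out.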

Submitted 22 September, 2023; originally announced October 2023.

arXiv:2212.10426 [pdf, other]

Deep Riemannian Networks for EEG Decoding

Authors: Daniel Wilson, Robin Tibor Schirrmeister, Lukas Alexander Wilhelm Gemein, Tonio Ball

Abstract: State-of-the-art performance in electroencephalography (EEG) decoding tasks is currently often achieved with either Deep-Learning (DL) or Riemannian-Geometry-based decoders (RBDs). Recently, there is growing interest in Deep Riemannian Networks (DRNs) possibly combining the advantages of both previous classes of methods. However, there is still a range of topics where additional insight is needed to pave the way for a more widespread application of DRNs in EEG. These include architecture design questions such as network size and end-to-end ability. How these factors affect model performance has not been explored. Additionally, it is not clear how the data within these networks is transformed, and whether this would correlate with traditional EEG decoding. Our study aims to lay the groundwork in the area of these topics through the analysis of DRNs for EEG with a wide range of hyperparameters. Networks were tested on two public EEG datasets and compared with state-of-the-art ConvNets. Here we propose end-to-end EEG SPDNet (EE(G)-SPDNet), and we show that this wide, end-to-end DRN can outperform the ConvNets, and in doing so use physiologically plausible frequency regions. We also show that the end-to-end approach learns more complex filters than traditional band-pass filters targeting the classical alpha, beta, and gamma frequency bands of the EEG, and that performance can benefit from channel specific filtering approaches.
Additionally, architectural analysis revealed areas for further improvement due to the possible loss of Riemannian specific information throughout the network. Our study thus shows how to design and train DRNs to infer task-related information from the raw EEG without the need of handcrafted filterbanks and highlights the potential of end-to-end DRNs such as EE(G)-SPDNet for high-performance EEG decoding.

Submitted 1 August, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

Comments: 27 pages, 13 Figures

arXiv:2207.07875 [pdf, other]

On the Importance of Hyperparameters and Data Augmentation for Self-Supervised Learning

Authors: Diane Wagner, Fabio Ferreira, Danny Stoll, Robin Tibor Schirrmeister, Samuel Müller, Frank Hutter

Abstract: Self-Supervised Learning (SSL) has become a very active area of Deep Learning research where it is heavily used as a pre-training method for classification and other tasks. However, the rapid pace of advancements in this area comes at a price: training pipelines vary significantly across papers, which presents a potentially crucial confounding factor. Here, we show that, indeed, the choice of hyperparameters and data augmentation strategies can have a dramatic impact on performance. To shed light on these neglected factors and help maximize the power of SSL, we hyperparameterize these components and optimize them with Bayesian optimization, showing improvements across multiple datasets for the SimSiam SSL approach. Realizing the importance of data augmentations for SSL, we also introduce a new automated data augmentation algorithm, GroupAugment, which considers groups of augmentations and optimizes the sampling across groups. In contrast to algorithms designed for supervised learning, GroupAugment achieved consistently high linear evaluation accuracy across all datasets we considered. Overall, our results indicate the importance and likely underestimated role of data augmentation for SSL.

Submitted 16 July, 2022; originally announced July 2022.

Comments: Accepted at the ICML 2022 Pre-training Workshop

arXiv:2201.05610 [pdf, other]

When less is more: Simplifying inputs aids neural network understanding

Authors: Robin Tibor Schirrmeister, Rosanne Liu, Sara Hooker, Tonio Ball

Abstract: How do neural network image classifiers respond to simpler and simpler inputs? And what do such responses reveal about the learning process? To answer these questions, we need a clear measure of input simplicity (or inversely, complexity), an optimization objective that correlates with simplification, and a framework to incorporate such objective into training and inference. Lastly we need a variety of testbeds to experiment and evaluate the impact of such simplification on learning. In this work, we measure simplicity with the encoding bit size given by a pretrained generative model, and minimize the bit size to simplify inputs in training and inference. We investigate the effect of such simplification in several scenarios: conventional training, dataset condensation and post-hoc explanations. In all settings, inputs are simplified along with the original classification task, and we investigate the trade-off between input simplicity and task performance. For images with injected distractors, such simplification naturally removes superfluous information. For dataset condensation, we find that inputs can be simplified with almost no accuracy degradation. When used in post-hoc explanation, our learning-based simplification approach offers a valuable new tool to explore the basis of network decisions.

Submitted 1 February, 2022; v1 submitted 14 January, 2022; originally announced January 2022.

ACM Class: I.2.6

arXiv:2006.10848 [pdf, other]

Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features

Authors: Robin Tibor Schirrmeister, Yuxuan Zhou, Tonio Ball, Dan Zhang

Abstract: Deep generative networks trained via maximum likelihood on a natural image dataset like CIFAR10 often assign high likelihoods to images from datasets with different objects (e.g., SVHN). We refine previous investigations of this failure at anomaly detection for invertible generative networks and provide a clear explanation of it as a combination of model bias and domain prior: Convolutional networks learn similar low-level feature distributions when trained on any natural image dataset and these low-level features dominate the likelihood. Hence, when the discriminative features between inliers and outliers are on a high-level, e.g., object shapes, anomaly detection becomes particularly challenging. To remove the negative impact of model bias and domain prior on detecting high-level differences, we propose two methods, first, using the log likelihood ratios of two identical models, one trained on the in-distribution data (e.g., CIFAR10) and the other one on a more general distribution of images (e.g., 80 Million Tiny Images). We also derive a novel outlier loss for the in-distribution network on samples from the more general distribution to further improve the performance. Secondly, using a multi-scale model like Glow, we show that low-level features are mainly captured at early scales. Therefore, using only the likelihood contribution of the final scale performs remarkably well for detecting high-level feature differences of the out-of-distribution and the in-distribution.
This method is especially useful if one does not have access to a suitable general distribution. Overall, our methods achieve strong anomaly detection performance in the unsupervised setting, and only slightly underperform state-of-the-art classifier-based methods in the supervised setting. Code can be found at https://github.com/boschresearch/hierarchical_anomaly_detection.
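The first method in this abstract, scoring inputs by a log-likelihood ratio between an in-distribution model and a general-distribution model, can be sketched as below. This is an illustrative assumption of how such scores might be combined, not the code from the linked repository: the function names, the threshold, and the log-likelihood values are all hypothetical.

```python
def log_likelihood_ratio(logp_in_dist, logp_general):
    """Score = log p_in(x) - log p_general(x) for each input x.

    Subtracting the general model's log-likelihood is meant to cancel the
    shared low-level image statistics that dominate the raw likelihood.
    """
    return [a - b for a, b in zip(logp_in_dist, logp_general)]


def flag_anomalies(scores, threshold):
    """Flag inputs whose ratio score falls below the threshold."""
    return [score < threshold for score in scores]


# Hypothetical log-likelihoods for two images (placeholder numbers).
# The second image has the higher raw in-distribution likelihood, yet the
# general model explains it even better, so its ratio is negative.
logp_in_dist = [-3000.0, -2500.0]
logp_general = [-3100.0, -2400.0]

scores = log_likelihood_ratio(logp_in_dist, logp_general)
flags = flag_anomalies(scores, threshold=0.0)
print(scores, flags)
```

In practice a threshold would be chosen on held-out data or avoided altogether by reporting a threshold-free metric such as AUROC; the fixed value of 0.0 here is only for illustration.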

Submitted 2 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Published at NeurIPS 2020. Code can be found at https://github.com/boschresearch/hierarchical_anomaly_detection

ACM Class: I.2.6

arXiv:2002.05115 [pdf, other]

doi 10.1016/j.neuroimage.2020.117021

Machine-Learning-Based Diagnostics of EEG Pathology

Authors: Lukas Alexander Wilhelm Gemein, Robin Tibor Schirrmeister, Patryk Chrabąszcz, Daniel Wilson, Joschka Boedecker, Andreas Schulze-Bonhage, Frank Hutter, Tonio Ball

Abstract: Machine learning (ML) methods have the potential to automate clinical EEG analysis. They can be categorized into feature-based (with handcrafted features), and end-to-end approaches (with learned features). Previous studies on EEG pathology decoding have typically analyzed a limited number of features, decoders, or both. For a I) more elaborate feature-based EEG analysis, and II) in-depth comparisons of both approaches, here we first develop a comprehensive feature-based framework, and then compare this framework to state-of-the-art end-to-end methods. To this aim, we apply the proposed feature-based framework and deep neural networks including an EEG-optimized temporal convolutional network (TCN) to the task of pathological versus non-pathological EEG classification. For a robust comparison, we chose the Temple University Hospital (TUH) Abnormal EEG Corpus (v2.0.0), which contains approximately 3000 EEG recordings. The results demonstrate that the proposed feature-based decoding framework can achieve accuracies on the same level as state-of-the-art deep neural networks. We find accuracies across both approaches in an astonishingly narrow range from 81–86%. Moreover, visualizations and analyses indicated that both approaches used similar aspects of the data, e.g., delta and theta band power at temporal electrode locations.
We argue that the accuracies of current binary EEG pathology decoders could saturate near 90% due to the imperfect inter-rater agreement of the clinical labels, and that such decoders are already clinically useful, such as in areas where clinical EEG experts are rare. We make the proposed feature-based framework available open source and thus offer a new tool for EEG machine learning research.

Submitted 11 February, 2020; originally announced February 2020.

Journal ref: NeuroImage, Volume 220, 15 October 2020, 117021

arXiv:1907.07746 [pdf]

Deep Invertible Networks for EEG-based brain-signal decoding

Authors: Robin Tibor Schirrmeister, Tonio Ball

Abstract: In this manuscript, we investigate deep invertible networks for EEG-based brain signal decoding and find them to generate realistic EEG signals as well as classify novel signals above chance. Further ideas for their regularization towards better decoding accuracies are discussed.

Submitted 17 July, 2019; originally announced July 2019.

arXiv:1810.02584 [pdf]

Deep Learning for micro-Electrocorticographic (μECoG) Data

Authors: Xi Wang, C. Alexis Gkogkidis, Robin T. Schirrmeister, Felix A. Heilmeyer, Mortimer Gierthmuehlen, Fabian Kohler, Martin Schuettler, Thomas Stieglitz, Tonio Ball

Abstract: Machine learning can extract information from neural recordings, e.g., surface EEG, ECoG and μECoG, and therefore plays an important role in many research and clinical applications. Deep learning with artificial neural networks has recently seen increasing attention as a new approach in brain signal decoding. Here, we apply a deep learning approach using convolutional neural networks to μECoG data obtained with a wireless, chronically implanted system in an ovine animal model. Regularized linear discriminant analysis (rLDA), a filter bank component spatial pattern (FBCSP) algorithm and convolutional neural networks (ConvNets) were applied to auditory evoked responses captured by μECoG. We show that compared with rLDA and FBCSP, significantly higher decoding accuracy can be obtained by ConvNets trained in an end-to-end manner, i.e., without any predefined signal features. Deep learning thus proves a promising technique for μECoG-based brain-machine interfacing applications.

Submitted 5 October, 2018; originally announced October 2018.

Comments: 6 pages, 7 figures, 2018 IEEE EMBS conference

arXiv:1807.01597 [pdf, other]

doi 10.5220/0006934900610066

The role of robot design in decoding error-related information from EEG signals of a human observer

Authors: Joos Behncke, Robin Tibor Schirrmeister, Wolfram Burgard, Tonio Ball

Abstract: For utilization of robotic assistive devices in everyday life, means for detection and processing of erroneous robot actions are a focal aspect in the development of collaborative systems, especially when controlled via brain signals. However, the variety of possible scenarios and the diversity of used robotic systems pose a challenge for error decoding from recordings of brain signals such as via EEG. For example, it is unclear whether humanoid appearances of robotic assistants have an influence on the performance. In this paper, we designed a study in which two different robots executed the same task both in an erroneous and a correct manner. We find error-related EEG signals of human observers indicating that the performance of the error decoding was independent of robot design. However, we can show that it was possible to identify which robot performed the instructed task by means of the EEG signals. In this case, deep convolutional neural networks (deep ConvNets) could reach significantly higher accuracies than both regularized Linear Discriminant Analysis (rLDA) and filter bank common spatial patterns (FB-CSP) combined with rLDA. Our findings indicate that decoding information about robot action success from the EEG, particularly when using deep neural networks, may be an applicable approach for a broad range of robot designs.

Submitted 18 July, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

arXiv:1806.09532 [pdf, other]

Cross-paradigm pretraining of convolutional networks improves intracranial EEG decoding

Authors: Joos Behncke, Robin Tibor Schirrmeister, Martin Völker, Jiří Hammer, Petr Marusič, Andreas Schulze-Bonhage, Wolfram Burgard, Tonio Ball

Abstract: When it comes to the classification of brain signals in real-life applications, the training and the prediction data are often described by different distributions. Furthermore, diverse data sets, e.g., recorded from various subjects or tasks, can even exhibit distinct feature spaces. The fact that data that have to be classified are often only available in small amounts reinforces the need for techniques to generalize learned information, as performances of brain-computer interfaces (BCIs) are enhanced by increasing quantity of available data. In this paper, we apply transfer learning to a framework based on deep convolutional neural networks (deep ConvNets) to prove the transferability of learned patterns in error-related brain signals across different tasks. The experiments described in this paper demonstrate the usefulness of transfer learning, especially improving performances when only little data can be used to distinguish between erroneous and correct realization of a task. This effect could be delimited from a transfer of merely general brain signal characteristics, underlining the transfer of error-specific information. Furthermore, we could extract similar patterns in time-frequency analyses in identical channels, leading to selective high signal correlations between the two different paradigms. Classification on the intracranial data yields median accuracies up to $(81.50 \pm 9.49)\,\%$.
Decoding on only $10\%$ of the data without pre-training reaches performances of $(54.76 \pm 3.56)\,\%$, compared to $(64.95 \pm 0.79)\,\%$ with pre-training.

Submitted 20 July, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

arXiv:1806.07741 [pdf, other]

doi 10.1109/SMC.2018.00185

A large-scale evaluation framework for EEG deep learning architectures

Authors: Felix A. Heilmeyer, Robin T. Schirrmeister, Lukas D. J. Fiederer, Martin Völker, Joos Behncke, Tonio Ball

Abstract: EEG is the most common signal source for noninvasive BCI applications. For such applications, the EEG signal needs to be decoded and translated into appropriate actions. A recently emerging EEG decoding approach is deep learning with Convolutional or Recurrent Neural Networks (CNNs, RNNs) with many different architectures already published. Here we present a novel framework for the large-scale evaluation of different deep-learning architectures on different EEG datasets. This framework comprises (i) a collection of EEG datasets currently including 100 examples (recording sessions) from six different classification problems, (ii) a collection of different EEG decoding algorithms, and (iii) a wrapper linking the decoders to the data as well as handling structured documentation of all settings and (hyper-) parameters and statistics, designed to ensure transparency and reproducibility. As an application example, we used our framework to compare three publicly available CNN architectures: the Braindecode Deep4 ConvNet, Braindecode Shallow ConvNet, and two versions of EEGNet. We also show how our framework can be used to study similarities and differences in the performance of different decoding methods across tasks. We argue that the deep learning EEG framework as described here could help to tap the full potential of deep learning for BCI applications.

Submitted 25 July, 2018; v1 submitted 18 June, 2018; originally announced June 2018.

Comments: 7 pages, 3 figures, final version accepted for presentation at IEEE SMC 2018 conference

arXiv:1806.01875 [pdf, other]

EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals

Authors: Kay Gregor Hartmann, Robin Tibor Schirrmeister, Tonio Ball

Abstract: Generative adversarial networks (GANs) have recently been highly successful in generative applications involving images and are starting to be applied to time series data. Here we describe EEG-GAN as a framework to generate electroencephalographic (EEG) brain signals. We introduce a modification to the improved training of Wasserstein GANs to stabilize training and investigate a range of architectural choices critical for time series generation (most notably up- and down-sampling). For evaluation we consider and compare different metrics such as Inception score, Frechet inception distance and sliced Wasserstein distance, together showing that our EEG-GAN framework generated naturalistic EEG examples. It thus opens up a range of new generative application scenarios in the neuroscientific and neurological context, such as data augmentation in brain-computer interfacing tasks, EEG super-sampling, or restoration of corrupted data segments. The possibility to generate signals of a certain class and/or with specific properties may also open a new avenue for research into the underlying structure of brain signals.

Submitted 5 June, 2018; originally announced June 2018.

Comments: 6 pages, 6 figures

arXiv:1806.01610 [pdf, other]

Training Generative Reversible Networks

Authors: Robin Tibor Schirrmeister, Patryk Chrabąszcz, Frank Hutter, Tonio Ball

Abstract: Generative models with an encoding component such as autoencoders currently receive great interest. However, training of autoencoders is typically complicated by the need to train a separate encoder and decoder model that have to be enforced to be reciprocal to each other. To overcome this problem, by-design reversible neural networks (RevNets) had been previously used as generative models either directly optimizing the likelihood of the data under the model or using an adversarial approach on the generated data. Here, we instead investigate their performance using an adversary on the latent space in the adversarial autoencoder framework. We investigate the generative performance of RevNets on the CelebA dataset, showing that generative RevNets can generate coherent faces with similar quality as Variational Autoencoders. This first attempt to use RevNets inside the adversarial autoencoder framework slightly underperformed relative to recent advanced generative models using an autoencoder component on CelebA, but this gap may diminish with further optimization of the training setup of generative RevNets. In addition to the experiments on CelebA, we show a proof-of-principle experiment on the MNIST dataset suggesting that adversary-free trained RevNets can discover meaningful latent dimensions without pre-specifying the number of dimensions of the latent sampling distribution. In summary, this study shows that RevNets can be employed in different generative training settings.
Source code for this study is at https://github.com/robintibor/generative-reversible

Submitted 23 August, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

Comments: Source code for this study is at https://github.com/robintibor/generative-reversible

arXiv:1805.01667 [pdf, other]

Intracranial Error Detection via Deep Learning

Authors: Martin Völker, Jiří Hammer, Robin T. Schirrmeister, Joos Behncke, Lukas D. J. Fiederer, Andreas Schulze-Bonhage, Petr Marusič, Wolfram Burgard, Tonio Ball

Abstract: Deep learning techniques have revolutionized the field of machine learning and were recently successfully applied to various classification problems in noninvasive electroencephalography (EEG). However, these methods were so far only rarely evaluated for use in intracranial EEG. We employed convolutional neural networks (CNNs) to classify and characterize the error-related brain response as measured in 24 intracranial EEG recordings. Decoding accuracies of CNNs were significantly higher than those of a regularized linear discriminant analysis. Using time-resolved deep decoding, it was possible to classify errors in various regions in the human brain, and further to decode errors over 200 ms before the actual erroneous button press, e.g., in the precentral gyrus. Moreover, deeper networks performed better than shallower networks in distinguishing correct from error trials in all-channel decoding. In single recordings, up to 100 % decoding accuracy was achieved. Visualization of the networks' learned features indicated that multivariate decoding on an ensemble of channels yields related, albeit non-redundant information compared to single-channel decoding. In summary, here we show the usefulness of deep learning for both intracranial error decoding and mapping of the spatio-temporal structure of the human error processing network.

Submitted 2 November, 2018; v1 submitted 4 May, 2018; originally announced May 2018.

Comments: 8 pages, 6 figures. Accepted at the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC2018)

ACM Class: I.2.6; I.2.8; I.5.0; J.2; J.3

arXiv:1711.07792 [pdf, other]

doi 10.1109/IWW-BCI.2018.8311493

Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding

Authors: Kay Gregor Hartmann, Robin Tibor Schirrmeister, Tonio Ball

Abstract: Recently, there is increasing interest and research on the interpretability of machine learning models, for example how they transform and internally represent EEG signals in Brain-Computer Interface (BCI) applications. This can help to understand the limits of the model and how it may be improved, in addition to possibly provide insight about the data itself. Schirrmeister et al. (2017) have recently reported promising results for EEG decoding with deep convolutional neural networks (ConvNets) trained in an end-to-end manner and, with a causal visualization approach, showed that they learn to use spectral amplitude changes in the input. In this study, we investigate how ConvNets represent spectral features through the sequence of intermediate stages of the network. We show higher sensitivity to EEG phase features at earlier stages and higher sensitivity to EEG amplitude features at later stages. Intriguingly, we observed a specialization of individual stages of the network to the classical EEG frequency bands alpha, beta, and high gamma. Furthermore, we find first evidence that particularly in the last convolutional layer, the network learns to detect more complex oscillatory patterns beyond spectral phase and amplitude, reminiscent of the representation of complex visual features in later layers of ConvNets in computer vision tasks.
Our findings thus provide insights into how ConvNets hierarchically represent spectral EEG features in their intermediate layers and suggest that ConvNets can exploit and might help to better understand the compositional structure of EEG time series.

Submitted 15 December, 2017; v1 submitted 21 November, 2017; originally announced November 2017.

Comments: 6 pages, 7 figures, The 6th International Winter Conference on Brain-Computer Interface

arXiv:1711.06068 [pdf]

doi 10.1109/IWW-BCI.2018.8311531

The signature of robot action success in EEG signals of a human observer: Decoding and visualization using deep convolutional neural networks

Authors: Joos Behncke, Robin Tibor Schirrmeister, Wolfram Burgard, Tonio Ball

Abstract: The importance of robotic assistive devices grows in our work and everyday life. Cooperative scenarios involving both robots and humans require safe human-robot interaction. One important aspect here is the management of robot errors, including fast and accurate online robot-error detection and correction. Analysis of brain signals from a human interacting with a robot may help identifying robot errors, but accuracies of such analyses have still substantial space for improvement. In this paper we evaluate whether a novel framework based on deep convolutional neural networks (deep ConvNets) could improve the accuracy of decoding robot errors from the EEG of a human observer, both during an object grasping and a pouring task. We show that deep ConvNets reached significantly higher accuracies than both regularized Linear Discriminant Analysis (rLDA) and filter bank common spatial patterns (FB-CSP) combined with rLDA, both widely used EEG classifiers. Deep ConvNets reached mean accuracies of 75% +/- 9 %, rLDA 65% +/- 10% and FB-CSP + rLDA 63% +/- 6% for decoding of erroneous vs. correct trials. Visualization of the time-domain EEG features learned by the ConvNets to decode errors revealed spatiotemporal patterns that reflected differences between the two experimental paradigms.
Across subjects, ConvNet decoding accuracies were significantly correlated with those obtained with rLDA, but not CSP, indicating that in the present context ConvNets behaved more 'rLDA-like' (but consistently better), while in a previous decoding study with another task but the same ConvNet architecture, it was found to behave more 'CSP-like'. Our findings thus provide further support for the assumption that deep ConvNets are a versatile addition to the existing toolbox of EEG decoding techniques, and we discuss steps how ConvNet EEG decoding performance could be further optimized.

Submitted 16 November, 2017; originally announced November 2017.

arXiv:1710.09139 [pdf]

Deep Transfer Learning for Error Decoding from Non-Invasive EEG

Authors: Martin Völker, Robin T. Schirrmeister, Lukas D. J. Fiederer, Wolfram Burgard, Tonio Ball

Abstract: We recorded high-density EEG in a flanker task experiment (31 subjects) and an online BCI control paradigm (4 subjects). On these datasets, we evaluated the use of transfer learning for error decoding with deep convolutional neural networks (deep ConvNets). In comparison with a regularized linear discriminant analysis (rLDA) classifier, ConvNets were significantly better in both intra- and inter-subject decoding, achieving an average accuracy of 84.1 % within subject and 81.7 % on unknown subjects (flanker task). Neither method was, however, able to generalize reliably between paradigms. Visualization of features the ConvNets learned from the data showed plausible patterns of brain activity, revealing both similarities and differences between the different kinds of errors. Our findings indicate that deep learning techniques are useful to infer information about the correctness of action in BCI applications, particularly for the transfer of pre-trained classifiers to new recording sessions or subjects.

Submitted 10 January, 2018; v1 submitted 25 October, 2017; originally announced October 2017.

Comments: 6 pages, 9 figures, The 6th International Winter Conference on Brain-Computer Interface 2018

ACM Class: I.2.6; I.2.8; I.5.0; J.2; J.3

arXiv:1708.08012 [pdf, other]

Deep learning with convolutional neural networks for decoding and visualization of EEG pathology

Authors: Robin Tibor Schirrmeister, Lukas Gemein, Katharina Eggensperger, Frank Hutter, Tonio Ball

Abstract: We apply convolutional neural networks (ConvNets) to the task of distinguishing pathological from normal EEG recordings in the Temple University Hospital EEG Abnormal Corpus. We use two basic, shallow and deep ConvNet architectures recently shown to decode task-related information from EEG at least as well as established algorithms designed for this purpose. In decoding EEG pathology, both ConvNets reached substantially better accuracies (about 6% better, ~85% vs. ~79%) than the only published result for this dataset, and were still better when using only 1 minute of each recording for training and only six seconds of each recording for testing. We used automated methods to optimize architectural hyperparameters and found intriguingly different ConvNet architectures, e.g., with max pooling as the only nonlinearity. Visualizations of the ConvNet decoding behavior showed that they used spectral power changes in the delta (0-4 Hz) and theta (4-8 Hz) frequency range, possibly alongside other features, consistent with expectations derived from spectral analysis of the EEG data and from the textual medical reports. Analysis of the textual medical reports also highlighted the potential for accuracy increases by integrating contextual information, such as the age of subjects. In summary, the ConvNets and visualization techniques used in this study constitute a next step towards clinically useful automated EEG diagnosis and establish a new baseline for future work on this topic.

Submitted 11 January, 2018; v1 submitted 26 August, 2017; originally announced August 2017.

Comments: Published at IEEE SPMB 2017 https://www.ieeespmb.org/2017/

ACM Class: I.2.6

arXiv:1708.01465 [pdf]

doi 10.17185/duepublico/44533

Brain Responses During Robot-Error Observation

Authors: Dominik Welke, Joos Behncke, Marina Hader, Robin Tibor Schirrmeister, Andreas Schönau, Boris Eßmann, Oliver Müller, Wolfram Burgard, Tonio Ball

Abstract: Brain-controlled robots are a promising new type of assistive device for severely impaired persons. Little is however known about how to optimize the interaction of humans and brain-controlled robots. Information about the human's perceived correctness of robot performance might provide a useful teaching signal for adaptive control algorithms and thus help enhancing robot control. Here, we studied whether watching robots perform erroneous vs. correct action elicits differential brain responses that can be decoded from single trials of electroencephalographic (EEG) recordings, and whether brain activity during human-robot interaction is modulated by the robot's visual similarity to a human. To address these topics, we designed two experiments. In experiment I, participants watched a robot arm pour liquid into a cup. The robot performed the action either erroneously or correctly, i.e. it either spilled some liquid or not. In experiment II, participants observed two different types of robots, humanoid and non-humanoid, grabbing a ball. The robots either managed to grab the ball or not. We recorded high-resolution EEG during the observation tasks in both experiments to train a Filter Bank Common Spatial Pattern (FBCSP) pipeline on the multivariate EEG signal and decode for the correctness of the observed action, and for the type of the observed robot. Our findings show that it was possible to decode both correctness and robot type for the majority of participants significantly, although often just slightly, above chance level.
Our findings suggest that non-invasive recordings of brain responses elicited when observing robots indeed contain decodable information about the correctness of the robot's action and the type of observed robot.

Submitted 16 August, 2017; v1 submitted 4 August, 2017; originally announced August 2017.

arXiv:1707.06633 [pdf, other]

doi 10.1109/ECMR.2017.8098658

Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication Skills

Authors: Felix Burget, Lukas Dominique Josef Fiederer, Daniel Kuhner, Martin Völker, Johannes Aldinger, Robin Tibor Schirrmeister, Chau Do, Joschka Boedecker, Bernhard Nebel, Tonio Ball, Wolfram Burgard

Abstract: As autonomous service robots become more affordable and thus available also for the general public, there is a growing need for user friendly interfaces to control the robotic system. Currently available control modalities typically expect users to be able to express their desire through either touch, speech or gesture commands. While this requirement is fulfilled for the majority of users, paralyzed users may not be able to use such systems. In this paper, we present a novel framework, that allows these users to interact with a robotic service assistant in a closed-loop fashion, using only thoughts. The brain-computer interface (BCI) system is composed of several interacting components, i.e., non-invasive neuronal signal recording and decoding, high-level task planning, motion and manipulation planning as well as environment perception. In various experiments, we demonstrate its applicability and robustness in real world scenarios, considering fetch-and-carry tasks and tasks involving human-robot interaction. As our results demonstrate, our system is capable of adapting to frequent changes in the environment and reliably completing given tasks within a reasonable amount of time. Combined with high-level planning and autonomous robotic systems, interesting new perspectives open up for non-invasive BCI-based human-robot interactions.

Submitted 12 June, 2018; v1 submitted 20 July, 2017; originally announced July 2017.

Comments: * FB, LDJF, DK, MV and JA contributed equally to the work. Accepted as a conference paper at the European Conference on Mobile Robotics 2017 (ECMR 2017), 6 pages, 3 figures

ACM Class: I.2.4; I.2.6; I.2.8; I.2.9; I.2.10; I.4.8; I.5.1

Journal ref: 2017 European Conference on Mobile Robots (ECMR)

arXiv:1703.05051 [pdf, other]

doi 10.1002/hbm.23730

Deep learning with convolutional neural networks for EEG decoding and visualization

Authors: Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, Tonio Ball

Abstract: PLEASE READ AND CITE THE REVISED VERSION at Human Brain Mapping: http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full Code available here: https://github.com/robintibor/braindecode

Submitted 8 June, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

Comments: A revised manuscript (with the new title) has been accepted at Human Brain Mapping, see http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full

ACM Class: I.2.6

Showing 1–22 of 22 results for author: Schirrmeister, R T