Search | arXiv e-print repository

Energy-Efficient Seizure Detection Suitable for low-power Applications

Authors: Julia Werner, Bhavya Kohli, Paul Palomero Bernardo, Christoph Gerum, Oliver Bringmann

Abstract: Epilepsy is the most common, chronic, neurological disease worldwide and is typically accompanied by reoccurring seizures. Neuro implants can be used for effective treatment by suppressing an upcoming seizure upon detection. Due to the restricted size and limited battery lifetime of those medical devices, the employed approach also needs to be limited in size and have low energy requirements. We present an energy-efficient seizure detection approach involving a TC-ResNet and time-series analysis which is suitable for low-power edge devices. The presented approach allows for accurate seizure detection without preceding feature extraction while considering the stringent hardware requirements of neural implants. The approach is validated using the CHB-MIT Scalp EEG Database with a 32-bit floating point model and a hardware-suitable 4-bit fixed point model. The presented method achieves an accuracy of 95.28%, a sensitivity of 92.34% and an AUC score of 0.9384 on this dataset with 4-bit fixed point representation. Furthermore, the power consumption of the model is measured with the low-power AI accelerator UltraTrail, which only requires 495 nW on average. Due to this low power consumption, this classification approach is suitable for real-time seizure detection on low-power wearable devices such as neural implants.

Submitted 19 June, 2024; originally announced June 2024.

Comments: Accepted at IJCNN 2024 (Preprint)

arXiv:2310.07895 [pdf, other]

Precise localization within the GI tract by combining classification of CNNs and time-series analysis of HMMs

Authors: Julia Werner, Christoph Gerum, Moritz Reiber, Jörg Nick, Oliver Bringmann

Abstract: This paper presents a method to efficiently classify the gastroenterologic section of images derived from Video Capsule Endoscopy (VCE) studies by exploring the combination of a Convolutional Neural Network (CNN) for classification with the time-series analysis properties of a Hidden Markov Model (HMM). It is demonstrated that successive time-series analysis identifies and corrects errors in the CNN output. Our approach achieves an accuracy of $98.04\%$ on the Rhode Island (RI) Gastroenterology dataset. This allows for precise localization within the gastrointestinal (GI) tract while requiring only approximately 1M parameters and thus provides a method suitable for low-power devices.

Submitted 11 October, 2023; originally announced October 2023.

Comments: Accepted at MLMI 2023
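
The idea in the abstract above, correcting isolated CNN misclassifications with an HMM, can be illustrated with Viterbi decoding over per-frame class probabilities. The following is a minimal sketch and not the authors' implementation: the three states, the left-to-right transition matrix, the prior, and the frame probabilities are all made-up values, and treating the CNN's softmax outputs as HMM emission scores is a simplifying assumption.

```python
import math

def viterbi_smooth(frame_probs, transition, prior):
    """Most likely state sequence given per-frame class probabilities.

    frame_probs[t][s]: classifier probability of state s at frame t
                       (used here as an emission score; an assumption).
    transition[r][s]:  probability of moving from state r to state s.
    prior[s]:          probability of starting in state s.
    """
    n_states = len(prior)
    eps = 1e-12  # avoids log(0) for forbidden transitions

    def log(p):
        return math.log(max(p, eps))

    # score[s]: best log-probability of any path ending in state s
    score = [log(prior[s]) + log(frame_probs[0][s]) for s in range(n_states)]
    backptr = []
    for probs in frame_probs[1:]:
        new_score, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log(transition[r][s]))
            ptr.append(best_prev)
            new_score.append(score[best_prev]
                             + log(transition[best_prev][s])
                             + log(probs[s]))
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical example: three consecutive sections; the capsule only moves forward.
transition = [[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.0, 0.0, 1.0]]
prior = [1.0, 0.0, 0.0]
frame_probs = [[0.80, 0.10, 0.10],
               [0.70, 0.20, 0.10],
               [0.30, 0.60, 0.10],   # frame-wise argmax jumps ahead to state 1
               [0.80, 0.10, 0.10],   # and then back to state 0
               [0.20, 0.70, 0.10],
               [0.10, 0.80, 0.10],
               [0.05, 0.05, 0.90],
               [0.05, 0.05, 0.90]]

raw = [max(range(3), key=lambda s: p[s]) for p in frame_probs]
smoothed = viterbi_smooth(frame_probs, transition, prior)
print(raw)       # frame-wise argmax
print(smoothed)  # sequence after Viterbi decoding
```

Because the transition matrix forbids moving backwards, the decoded sequence is forced to be monotone, which removes the isolated jump in the frame-wise argmax.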

arXiv:2209.03807 [pdf, other]

Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices

Authors: Christoph Gerum, Adrian Frischknecht, Tobias Hald, Paul Palomero Bernardo, Konstantin Lübeck, Oliver Bringmann

Abstract: The increasing spread of artificial neural networks does not stop at ultralow-power edge devices. However, these very often have high computational demand and require specialized hardware accelerators to ensure the design meets power and performance constraints. The manual optimization of neural networks along with the corresponding hardware accelerators can be very challenging. This paper presents HANNAH (Hardware Accelerator and Neural Network seArcH), a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators for resource and power-constrained edge devices. The optimization approach uses an evolution-based search algorithm, a neural network template technique, and analytical KPI models for the configurable UltraTrail hardware accelerator template to find an optimized neural network and accelerator configuration. We demonstrate that HANNAH can find suitable neural networks with minimized power consumption and high accuracy for different audio classification tasks such as single-class wake word detection, multi-class keyword detection, and voice activity detection, which are superior to the related work.

Submitted 29 September, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: Accepted Version for: EUROMICRO DSD 2022

arXiv:2209.02582 [pdf, other]

Improving the Accuracy and Robustness of CNNs Using a Deep CCA Neural Data Regularizer

Authors: Cassidy Pirlot, Richard C. Gerum, Cory Efird, Joel Zylberberg, Alona Fyshe

Abstract: As convolutional neural networks (CNNs) become more accurate at object recognition, their representations become more similar to the primate visual system. This finding has inspired us and other researchers to ask if the implication also runs the other way: If CNN representations become more brain-like, does the network become more accurate? Previous attempts to address this question showed very modest gains in accuracy, owing in part to limitations of the regularization method. To overcome these limitations, we developed a new neural data regularizer for CNNs that uses Deep Canonical Correlation Analysis (DCCA) to optimize the resemblance of the CNN's image representations to that of the monkey visual cortex. Using this new neural data regularizer, we see much larger performance gains in both classification accuracy and within-super-class accuracy, as compared to the previous state-of-the-art neural data regularizers. These networks are also more robust to adversarial attacks than their unregularized counterparts. Together, these results confirm that neural data regularization can push CNN performance higher, and introduce a new method that obtains a larger performance boost.

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2208.10576 [pdf, other]

Different Spectral Representations in Optimized Artificial Neural Networks and Brains

Authors: Richard C. Gerum, Cassidy Pirlot, Alona Fyshe, Joel Zylberberg

Abstract: Recent studies suggest that artificial neural networks (ANNs) that match the spectral properties of the mammalian visual cortex -- namely, the $\sim 1/n$ eigenspectrum of the covariance matrix of neural activities -- achieve higher object recognition performance and robustness to adversarial attacks than those that do not. To our knowledge, however, no previous work systematically explored how modifying the ANN's spectral properties affects performance. To fill this gap, we performed a systematic search over spectral regularizers, forcing the ANN's eigenspectrum to follow $1/n^α$ power laws with different exponents $α$. We found that larger powers (around 2--3) lead to better validation accuracy and more robustness to adversarial attacks on dense networks. This surprising finding applied to both shallow and deep networks and it overturns the notion that the brain-like spectrum (corresponding to $α\sim 1$) always optimizes ANN performance and/or robustness. For convolutional networks, the best $α$ values depend on the task complexity and evaluation metric: lower $α$ values optimized validation accuracy and robustness to adversarial attack for networks performing a simple object recognition task (categorizing MNIST images of handwritten digits); for a more complex task (categorizing CIFAR-10 natural images), we found that lower $α$ values optimized validation accuracy whereas higher $α$ values optimized adversarial robustness. These results have two main implications. First, they cast doubt on the notion that brain-like spectral properties ($α\sim 1$) \emph{always} optimize ANN performance. Second, they demonstrate the potential for fine-tuned spectral regularizers to optimize a chosen design metric, i.e., accuracy and/or robustness.

Submitted 22 August, 2022; originally announced August 2022.

arXiv:2111.10009 [pdf, other]

doi 10.3847/1538-4357/ac4399

ExoMiner: A Highly Accurate and Explainable Deep Learning Classifier that Validates 301 New Exoplanets

Authors: Hamed Valizadegan, Miguel Martinho, Laurent S. Wilkens, Jon M. Jenkins, Jeffrey Smith, Douglas A. Caldwell, Joseph D. Twicken, Pedro C. Gerum, Nikash Walia, Kaylie Hausknecht, Noa Y. Lubin, Stephen T. Bryson, Nikunj C. Oza

Abstract: The Kepler and TESS missions have generated over 100,000 potential transit signals that must be processed in order to create a catalog of planet candidates. During the last few years, there has been a growing interest in using machine learning to analyze these data in search of new exoplanets. Different from the existing machine learning works, ExoMiner, the proposed deep learning classifier in this work, mimics how domain experts examine diagnostic tests to vet a transit signal. ExoMiner is a highly accurate, explainable, and robust classifier that 1) allows us to validate 301 new exoplanets from the MAST Kepler Archive and 2) is general enough to be applied across missions such as the on-going TESS mission. We perform an extensive experimental study to verify that ExoMiner is more reliable and accurate than the existing transit signal classifiers in terms of different classification and ranking metrics. For example, for a fixed precision value of 99%, ExoMiner retrieves 93.6% of all exoplanets in the test set (i.e., recall=0.936) while this rate is 76.3% for the best existing classifier. Furthermore, the modular design of ExoMiner favors its explainability. We introduce a simple explainability framework that provides experts with feedback on why ExoMiner classifies a transit signal into a specific class label (e.g., planet candidate or not planet candidate).

Submitted 8 December, 2021; v1 submitted 18 November, 2021; originally announced November 2021.

Comments: Accepted for Publication in Astrophysical Journals, November 12, 2021

MSC Class: J.2; I.2.6
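
The "recall at a fixed precision" figure quoted in the abstract above is a ranking metric: among all score thresholds whose precision meets the target, report the highest recall. The sketch below shows one common way to compute it; the function name, the tie handling, and the toy scores and labels are illustrative assumptions and do not come from the ExoMiner paper.

```python
def recall_at_precision(scores, labels, min_precision):
    """Highest recall over all thresholds whose precision is >= min_precision.

    scores: classifier scores, higher means more likely positive.
    labels: 1 for positive, 0 for negative.
    Returns 0.0 if no threshold reaches the requested precision.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    best = 0.0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Items with equal scores fall on the same side of any threshold,
        # so only evaluate once the whole tie group has been counted.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        if tp / (tp + fp) >= min_precision:
            best = max(best, tp / total_pos)
    return best

# Toy data (made up): six signals, four of them true positives.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50]
labels = [1, 1, 0, 1, 1, 0]

r99 = recall_at_precision(scores, labels, 0.99)
r80 = recall_at_precision(scores, labels, 0.80)
print(r99, r80)
```

On this toy data only the top two signals can be accepted at 99% precision, so half of the positives are recovered; relaxing the target to 80% admits the top five and recovers all of them.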

arXiv:2109.07930 [pdf, other]

doi 10.1007/978-3-030-86362-3_30

Behavior of Keyword Spotting Networks Under Noisy Conditions

Authors: Anwesh Mohanty, Adrian Frischknecht, Christoph Gerum, Oliver Bringmann

Abstract: Keyword spotting (KWS) is becoming a ubiquitous need with the advancement in artificial intelligence and smart devices. Recent work in this field has focused on several different architectures to achieve good results on datasets with low to moderate noise. However, the performance of these models deteriorates under high noise conditions as shown by our experiments. In our paper, we present an extensive comparison between state-of-the-art KWS networks under various noisy conditions. We also suggest adaptive batch normalization as a technique to improve the performance of the networks when the noise files are unknown during the training phase. The results of such high noise characterization enable future work in developing models that perform better in the aforementioned conditions.

Submitted 15 September, 2021; originally announced September 2021.

Comments: 11 pages, 5 figures, Published in Lecture Notes in Computer Science book series (LNCS, volume 12891)

Journal ref: ICANN 2021. Lecture Notes in Computer Science, vol 12891, pp 369-378. Springer

arXiv:2004.13532 [pdf, other]

doi 10.1162/neco_a_01424

Integration of Leaky-Integrate-and-Fire-Neurons in Deep Learning Architectures

Authors: Richard C. Gerum, Achim Schilling

Abstract: Up to now, modern Machine Learning is mainly based on fitting high dimensional functions to enormous data sets, taking advantage of huge hardware resources. We show that biologically inspired neuron models such as the Leaky-Integrate-and-Fire (LIF) neurons provide novel and efficient ways of information encoding. They can be integrated in Machine Learning models, and are a potential target to improve Machine Learning performance. Thus, we derived simple update-rules for the LIF units from the differential equations, which are easy to numerically integrate. We apply a novel approach to train the LIF units supervisedly via backpropagation, by assigning a constant value to the derivative of the neuron activation function exclusively for the backpropagation step. This simple mathematical trick helps to distribute the error between the neurons of the pre-connected layer. We apply our method to the IRIS blossoms image data set and show that the training technique can be used to train LIF neurons on image classification tasks. Furthermore, we show how to integrate our method in the KERAS (tensorflow) framework and efficiently run it on GPUs. To generate a deeper understanding of the mechanisms during training we developed interactive illustrations, which we provide online. With this study we want to contribute to the current efforts to enhance Machine Intelligence by integrating principles from biology.

Submitted 11 September, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

Journal ref: Neural Computation (2021) 33 (10)
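
To make the notion of a "simple update rule derived from the differential equation" in the abstract above concrete, here is a minimal sketch of a leaky integrate-and-fire neuron. It assumes the textbook membrane equation dV/dt = (I - V) / tau, a forward-Euler step, and a reset to zero after each spike; the paper's exact update rule and parameter values are not given in this listing, so every name and constant below is hypothetical. Only the forward simulation is shown, not the backpropagation trick with a constant derivative.

```python
def simulate_lif(inputs, tau=10.0, dt=1.0, threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron with forward Euler.

    inputs: input current at each time step.
    Returns a list of 0/1 spike indicators, one per time step.
    """
    v = v_reset
    spikes = []
    for current in inputs:
        # Euler step of dV/dt = (I - V) / tau: leak toward the input level
        v += dt / tau * (current - v)
        if v >= threshold:
            spikes.append(1)
            v = v_reset  # membrane potential resets after a spike
        else:
            spikes.append(0)
    return spikes

# A constant input above threshold produces regular spiking ...
strong = simulate_lif([2.0] * 21)
# ... while a constant input below threshold never reaches it.
weak = simulate_lif([0.5] * 100)

print([t for t, s in enumerate(strong) if s])  # time steps at which spikes occur
print(sum(weak))                               # number of spikes for the weak input
```

With these assumed constants the membrane potential under the strong input follows 2 * (1 - 0.9^n) after n steps, so it first crosses the threshold on the seventh step and then repeats with the same period.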

arXiv:1911.10988 [pdf, other]

doi 10.1016/j.neunet.2020.05.007

Sparsity through evolutionary pruning prevents neuronal networks from overfitting

Authors: Richard C. Gerum, André Erpenbeck, Patrick Krauss, Achim Schilling

Abstract: Modern Machine learning techniques take advantage of the exponentially rising calculation power in new generation processor units. Thus, the number of parameters which are trained to resolve complex tasks was highly increased over the last decades. However, the networks still fail - in contrast to our brain - to develop general intelligence in the sense of being able to solve several complex tasks with only one network architecture. This could be the case because the brain is not a randomly initialized neural network, which has to be trained by simply investing a lot of calculation power, but has from birth some fixed hierarchical structure. To make progress in decoding the structural basis of biological neural networks we here chose a bottom-up approach, where we evolutionarily trained small neural networks in performing a maze task. This simple maze task requires dynamical decision making with delayed rewards. We were able to show that during the evolutionary optimization random severance of connections leads to better generalization performance of the networks compared to fully connected networks. We conclude that sparsity is a central property of neural networks and should be considered for modern Machine learning approaches.

Submitted 31 January, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

arXiv:1908.10293 [pdf]

Modelagem de um Problema de Dimensionamento de Lotes com Demanda Variavel e Deterministica e Efeitos de Learning e Forgetting

Authors: Pedro Cesar Lopes Gerum

Abstract: The main goal of this paper was to analyze the importance that the effects of learning and forgetting might have in a lot-sizing problem. It assumes that the learning curve and the economies of scale are present in several industries yet are, in most cases, not considered when dealing with a lot-sizing problem. The importance of the effects was demonstrated and quantified, showing that there is still space for developments in this field. However, as the problem becomes quadratic, there is a possibility that the current algorithms are not able to solve the problem to optimality. Thus, future improvements in the algorithms may further improve the results. However, the overall results found with current algorithms show that the contribution of a discount from a learning curve can be very considerable, even if it is a minimal amount.

Submitted 20 August, 2019; originally announced August 2019.

Comments: in Portuguese

Showing 1–10 of 10 results for author: Gerum, C