
Open-Source Drift Detection Tools in Action: Insights from Two Use Cases

Authors: Rieke Müller, Mohamed Abdelaal, Davor Stjelja

Abstract: Data drifts pose a critical challenge in the lifecycle of machine learning (ML) models, affecting their performance and reliability. In response to this challenge, we present a microbenchmark study, called D3Bench, which evaluates the efficacy of open-source drift detection tools. D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases. We prioritize assessing the functional suitability of these tools to identify and analyze data drifts. Furthermore, we consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands. Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of shifts and evaluating their consequent effects on predictive accuracy.

Submitted 10 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16042 [pdf]

How explainable AI affects human performance: A systematic review of the behavioural consequences of saliency maps

Authors: Romy Müller

Abstract: Saliency maps can explain how deep neural networks classify images. But are they actually useful for humans? The present systematic review of 68 user studies found that while saliency maps can enhance human performance, null effects or even costs are quite common. To investigate what modulates these effects, the empirical outcomes were organised along several factors related to the human tasks, AI performance, XAI methods, images to be classified, human participants and comparison conditions. In image-focused tasks, benefits were less common than in AI-focused tasks, but the effects depended on the specific cognitive requirements. Moreover, benefits were usually restricted to incorrect AI predictions in AI-focused tasks but to correct ones in image-focused tasks. XAI-related factors had surprisingly little impact. The evidence was limited for image- and human-related factors and the effects were highly dependent on the comparison conditions. These findings may support the design of future user studies.

Submitted 26 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

arXiv:2402.10318 [pdf, other]

Multi-Antenna Towards Inband Shift Keying

Authors: Ralf R. Müller

Abstract: We propose a new continuous phase frequency shift keying that is particularly suited for multi-antenna communications when the link budget is critical and beam alignment is problematic. It combines the constant envelope of frequency modulation with low-rate repetition coding in order to compensate for the absence of transmit beamforming. Although it is a frequency modulation, its transmit signal shows close to rectangular spectral shape. Similar to GSM's Gaussian minimum shift keying, it can be well approximated by linear modulation, when combined with differential precoding. This allows for easy coherent demodulation by means of a windowed fast Fourier transform.

Submitted 8 April, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: The initial version of this paper contains an error. It calculates the beamforming gain of transmit beamforming for N antennas as N, but it should be N^2. This error has been corrected in the latest version

arXiv:2402.10307

A New Radio to Overcome Critical Link Budgets

Authors: Ralf R. Müller

Abstract: We propose Multi-Antenna (MA) Towards Inband Shift Keying (TISK): a new multi-carrier radio concept to cope with critical link budgets. In contrast to common proposals that rely on analog beamforming at both transmitter and receiver, MA-TISK does not require beam alignment. The transmitted signals have all constant envelope in continuous time, which allows for efficient, low-cost power amplification and up-conversion. The concept is compatible with any linear PSK-modulation as well as pulse position modulation. Each sub-carrier is sent over a separate antenna that is equipped with a voltage-controlled oscillator. The phases of these oscillators are controlled by digital baseband. Temporal signal combining makes up for the lack of beamforming gain at the transmitter. A common message may be broadcast to many receivers, simultaneously. Demodulation can be efficiently implemented by means of fast Fourier transform. MA-TISK does not suffer from spectral re-growth issues plaguing other constant envelope modulations like GMSK. Almost rectangular signal spectra similar to those for linear modulation with root-raised-cosine pulse shaping are possible. For the 100 MHz-wide spectral mask of 5G downlink, QPSK-modulation allows for 160 MBit/s with 5.74 MHz subcarrier spacing when using 16 transmit antennas. The wide carrier spacing makes the signals insensitive to Doppler effects. There is no loss in link budget gain compared to spatial beamforming at the transmitter.

Submitted 20 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: The paper is not correct. The paper calculates the beamforming gain of transmit beamforming for N antennas as N, but it should be N^2

arXiv:2401.08249 [pdf, other]

Graph-based Algorithms for Linear Computation Coding

Authors: Hans Rosenberger, Ali Bereyhi, Ralf R. Müller

Abstract: We revisit existing linear computation coding (LCC) algorithms, and introduce a new framework that measures the computational cost of computing multidimensional linear functions, not only in terms of the number of additions, but also with respect to their suitability for parallel processing. Utilizing directed acyclic graphs, which correspond to signal flow graphs in hardware, we propose a novel LCC algorithm that controls the trade-off between the total number of operations and their parallel executability. Numerical evaluations show that the proposed algorithm, constrained to a fully parallel structure, outperforms existing schemes.

Submitted 16 January, 2024; originally announced January 2024.

Comments: Accepted at the 2024 International Zurich Seminar on Information and Communication

arXiv:2401.08241 [pdf]

Adapt/Exchange decisions or generic choices: Does framing influence how people integrate qualitatively different risks?

Authors: Romy Müller, Alexander Blunk

Abstract: In complex systems, decision makers often have to consider qualitatively different risks when choosing between options. Do their strategies of integrating these risks depend on the framing of problem contents? In the present study, participants were either instructed that they were choosing between two ways of solving a complex problem, or between two generic options. The former was framed as a modular plant scenario that required choices between modifying parameter settings in a current module (Adapt) and replacing the module by another one (Exchange). The risk was higher for Adapt to harm the product and for Exchange to harm the plant. These risks were presented as probabilities, and participants were either told that the consequences of both risks were equally severe (content-same group), or that harming the plant was much worse (content-different group). A third group made decisions based on the same probabilities, but received a generic task framing (no-content group). We expected framing to affect risk integration, leading the content-same group to make different choices than the no-content group. Contrary to this hypothesis, these two groups were strikingly similar in their decision outcomes and strategies, but clearly differed from the content-different group. These findings question whether ecological validity can be enhanced merely by framing a task in terms of real-world problem contents.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2401.03504 [pdf, other]

ClusterComm: Discrete Communication in Decentralized MARL using Internal Representation Clustering

Authors: Robert Müller, Hasan Turalic, Thomy Phan, Michael Kölle, Jonas Nüßlein, Claudia Linnhoff-Popien

Abstract: In the realm of Multi-Agent Reinforcement Learning (MARL), prevailing approaches exhibit shortcomings in aligning with human learning, robustness, and scalability. Addressing this, we introduce ClusterComm, a fully decentralized MARL framework where agents communicate discretely without a central control unit. ClusterComm utilizes Mini-Batch-K-Means clustering on the last hidden layer's activations of an agent's policy network, translating them into discrete messages. This approach outperforms no communication and competes favorably with unbounded, continuous communication and hence poses a simple yet effective strategy for enhancing collaborative task-solving in MARL.

Submitted 7 January, 2024; originally announced January 2024.

Comments: Accepted at ICAART 2024

arXiv:2401.02505 [pdf]

Adapt/Exchange decisions depend on structural and surface features: Effects of solution costs and presentation format

Authors: Romy Müller

Abstract: Problem solvers often need to choose between adapting a current solution and exchanging it for a new one. How do such decisions depend on structural and surface features of the task? The present study investigated the interplay between the costs of the two solutions (a structural feature) and the format in which this information was presented (a surface feature). In a computer-based modular plant scenario, participants chose between process parameter modifications (Adapt) and reconfigurations of the module setup (Exchange). Solution costs were presented either as graphs depicting parameter relations, separate numbers for each parameter, or integrated numbers for each solution. It was hypothesised that graphs induce satisficing (i.e., basing decisions only on Adapt), whereas the numeric formats foster a comparison of the solutions (i.e., basing decisions on the Adapt/Exchange ratio). The hypothesised effects were restricted to situations with medium Adapt costs. A second experiment replicated these findings while adjusting the scale of numeric formats. We conclude that Adapt/Exchange decisions are shaped by an interaction of structural and surface features.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2401.00195 [pdf]

Information acquisition in Adapt/Exchange decisions: When do people check alternative solution principles?

Authors: Romy Müller, Maria Pohl

Abstract: Many problems can be solved in two ways: either by adapting an existing solution, or by exchanging it for a new one. To investigate under what conditions people consider new solutions, we traced their information acquisition processes in a simulated mechanical engineering task. Within a multi-step optimisation procedure, participants could either adapt the properties of a currently used machine component, or exchange this component for a new one. They had the opportunity to check whether the solutions met a set of requirements, which was varied systematically. We investigated whether participants would consistently check both solutions, or whether they would satisfice, ignoring the new solution as long as the current one was good enough. The results clearly refuted consistent checking, but only partly confirmed satisficing. On the one hand, participants indeed checked the new solution least often when the current one was applicable without problems. On the other hand, in this case the new solution still was not fully ignored. However, the latter finding could be traced back to a few participants who diverged from our anticipated strategy of first checking the current solution, and directly went for the new one. The results suggest that in Adapt/Exchange decisions, people do not usually check both solutions in an unbiased manner, but rely on existing solutions as long as they are good enough.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2312.14919 [pdf, other]

Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers

Authors: James Gunn, Zygmunt Lenyk, Anuj Sharma, Andrea Donati, Alexandru Buburuzan, John Redford, Romain Mueller

Abstract: Combining complementary sensor modalities is crucial to providing robust perception for safety-critical robotics applications such as autonomous driving (AD). Recent state-of-the-art camera-lidar fusion methods for AD rely on monocular depth estimation which is a notoriously difficult task compared to using depth information from the lidar directly. Here, we find that this approach does not leverage depth as expected and show that naively improving depth estimation does not lead to improvements in object detection performance. Strikingly, we also find that removing depth estimation altogether does not degrade object detection performance substantially, suggesting that relying on monocular depth could be an unnecessary architectural bottleneck during camera-lidar fusion. In this work, we introduce a novel fusion method that bypasses monocular depth estimation altogether and instead selects and fuses camera and lidar features in a bird's-eye-view grid using a simple attention mechanism. We show that our model can modulate its use of camera features based on the availability of lidar features and that it yields better 3D object detection on the nuScenes dataset than baselines relying on monocular depth estimation.

Submitted 21 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Updated method figure; camera ready

arXiv:2312.09174 [pdf, other]

Towards Efficient Quantum Anomaly Detection: One-Class SVMs using Variable Subsampling and Randomized Measurements

Authors: Michael Kölle, Afrae Ahouzi, Pascal Debus, Robert Müller, Danielle Schuman, Claudia Linnhoff-Popien

Abstract: Quantum computing, with its potential to enhance various machine learning tasks, allows significant advancements in kernel calculation and model precision. Utilizing the one-class Support Vector Machine alongside a quantum kernel, known for its classically challenging representational capacity, notable improvements in average precision compared to classical counterparts were observed in previous studies. Conventional calculations of these kernels, however, present a quadratic time complexity concerning data size, posing challenges in practical applications. To mitigate this, we explore two distinct approaches: utilizing randomized measurements to evaluate the quantum kernel and implementing the variable subsampling ensemble method, both targeting linear time complexity. Experimental results demonstrate a substantial reduction in training and inference times by up to 95% and 25% respectively, employing these methods. Although unstable, the average precision of randomized measurements discernibly surpasses that of the classical Radial Basis Function kernel, suggesting a promising direction for further research in scalable, efficient quantum computing applications in machine learning.

Submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted at ICAART 2024

arXiv:2311.12722 [pdf, other]

Attacking Motion Planners Using Adversarial Perception Errors

Authors: Jonathan Sadeghi, Nicholas A. Lord, John Redford, Romain Mueller

Abstract: Autonomous driving (AD) systems are often built and tested in a modular fashion, where the performance of different modules is measured using task-specific metrics. These metrics should be chosen so as to capture the downstream impact of each module and the performance of the system as a whole. For example, high perception quality should enable prediction and planning to be performed safely. Even though this is true in general, we show here that it is possible to construct planner inputs that score very highly on various perception quality metrics but still lead to planning failures. In an analogy to adversarial attacks on image classifiers, we call such inputs adversarial perception errors and show they can be systematically constructed using a simple boundary-attack algorithm. We demonstrate the effectiveness of this algorithm by finding attacks for two different black-box planners in several urban and highway driving scenarios using the CARLA simulator. Finally, we analyse the properties of these attacks and show that they are isolated in the input space of the planner, and discuss their implications for AD system deployment and testing.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.12481 [pdf]

Interpretability is in the eye of the beholder: Human versus artificial classification of image segments generated by humans versus XAI

Authors: Romy Müller, Marius Thoß, Julian Ullrich, Steffen Seitz, Carsten Knoll

Abstract: The evaluation of explainable artificial intelligence is challenging, because automated and human-centred metrics of explanation quality may diverge. To clarify their relationship, we investigated whether human and artificial image classification will benefit from the same visual explanations. In three experiments, we analysed human reaction times, errors, and subjective ratings while participants classified image segments. These segments either reflected human attention (eye movements, manual selections) or the outputs of two attribution methods explaining a ResNet (Grad-CAM, XRAI). We also had this model classify the same segments. Humans and the model largely agreed on the interpretability of attribution methods: Grad-CAM was easily interpretable for indoor scenes and landscapes, but not for objects, while the reverse pattern was observed for XRAI. Conversely, human and model performance diverged for human-generated segments. Our results caution against general statements about interpretability, as it varies with the explanation method, the explained images, and the agent interpreting them.

Submitted 12 February, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

arXiv:2310.01220 [pdf]

The benefits and costs of explainable artificial intelligence in visual quality control: Evidence from fault detection performance and eye movements

Authors: Romy Müller, David F. Reindel, Yannick D. Stadtfeld

Abstract: Visual inspection tasks often require humans to cooperate with AI-based image classifiers. To enhance this cooperation, explainable artificial intelligence (XAI) can highlight those image areas that have contributed to an AI decision. However, the literature on visual cueing suggests that such XAI support might come with costs of its own. To better understand how the benefits and cost of XAI depend on the accuracy of AI classifications and XAI highlights, we conducted two experiments that simulated visual quality control in a chocolate factory. Participants had to decide whether chocolate moulds contained faulty bars or not, and were always informed whether the AI had classified the mould as faulty or not. In half of the experiment, they saw additional XAI highlights that justified this classification. While XAI speeded up performance, its effects on error rates were highly dependent on (X)AI accuracy. XAI benefits were observed when the system correctly detected and highlighted the fault, but XAI costs were evident for misplaced highlights that marked an intact area while the actual fault was located elsewhere. Eye movement analyses indicated that participants spent less time searching the rest of the mould and thus looked at the fault less often. However, we also observed large interindividual differences. Taken together, the results suggest that despite its potentials, XAI can discourage people from investing effort into their own information analysis.

Submitted 13 November, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.14198 [pdf, other]

(Predictable) Performance Bias in Unsupervised Anomaly Detection

Authors: Felix Meissen, Svenja Breuer, Moritz Knolle, Alena Buyx, Ruth Müller, Georgios Kaissis, Benedikt Wiestler, Daniel Rückert

Abstract: Background: With the ever-increasing amount of medical imaging data, the demand for algorithms to assist clinicians has amplified. Unsupervised anomaly detection (UAD) models promise to aid in the crucial first step of disease detection. While previous studies have thoroughly explored fairness in supervised models in healthcare, for UAD, this has so far been unexplored. Methods: In this study, we evaluated how dataset composition regarding subgroups manifests in disparate performance of UAD models along multiple protected variables on three large-scale publicly available chest X-ray datasets. Our experiments were validated using two state-of-the-art UAD models for medical images. Finally, we introduced a novel subgroup-AUROC (sAUROC) metric, which aids in quantifying fairness in machine learning. Findings: Our experiments revealed empirical "fairness laws" (similar to "scaling laws" for Transformers) for training-dataset composition: Linear relationships between anomaly detection performance within a subpopulation and its representation in the training data. Our study further revealed performance disparities, even in the case of balanced training data, and compound effects that exacerbate the drop in performance for subjects associated with multiple adversely affected groups. Interpretation: Our study quantified the disparate performance of UAD models against certain demographic subgroups. Importantly, we showed that this unfairness cannot be mitigated by balanced representation alone. Instead, the representation of some subgroups seems harder to learn by UAD models than that of others. The empirical fairness laws discovered in our study make disparate performance in UAD models easier to estimate and aid in determining the most desirable dataset composition.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 Figures, 1 panel

arXiv:2308.15079 [pdf, other]

Area Efficient Modular Reduction in Hardware for Arbitrary Static Moduli

Authors: Robin Müller, Willi Meier, Christoph F. Wildfeuer

Abstract: Modular reduction is a crucial operation in many post-quantum cryptographic schemes, including the Kyber key exchange method or Dilithium signature scheme. However, it can be computationally expensive and pose a performance bottleneck in hardware implementations. To address this issue, we propose a novel approach for computing modular reduction efficiently in hardware for arbitrary static moduli. Unlike other commonly used methods such as Barrett or Montgomery reduction, the method does not require any multiplications. It is not dependent on properties of any particular choice of modulus for good performance and low area consumption. Its major strength lies in its low area consumption, which was reduced by 60% for optimized and up to 90% for generic Barrett implementations for Kyber and Dilithium. Additionally, it is well suited for parallelization and pipelining and scales linearly in hardware resource consumption with increasing operation width. All operations can be performed in the bit-width of the modulus, rather than the size of the number being reduced. This shortens carry chains and allows for faster clocking. Moreover, our method can be executed in constant time, which is essential for cryptography applications where timing attacks can be used to obtain information about the secret key.

Submitted 29 August, 2023; originally announced August 2023.

Comments: 7 pages, 2 figures

arXiv:2307.13345 [pdf]

Do humans and Convolutional Neural Networks attend to similar areas during scene classification: Effects of task and image type

Authors: Romy Müller, Marcel Dürschmidt, Julian Ullrich, Carsten Knoll, Sascha Weber, Steffen Seitz

Abstract: Deep Learning models like Convolutional Neural Networks (CNN) are powerful image classifiers, but what factors determine whether they attend to similar image areas as humans do? While previous studies have focused on technological factors, little is known about the role of factors that affect human attention. In the present study, we investigated how the tasks used to elicit human attention maps interact with image characteristics in modulating the similarity between humans and CNN. We varied the intentionality of human tasks, ranging from spontaneous gaze during categorization over intentional gaze-pointing up to manual area selection. Moreover, we varied the type of image to be categorized, using either singular, salient objects, indoor scenes consisting of object arrangements, or landscapes without distinct objects defining the category. The human attention maps generated in this way were compared to the CNN attention maps revealed by explainable artificial intelligence (Grad-CAM). The influence of human tasks strongly depended on image type: For objects, human manual selection produced maps that were most similar to CNN, while the specific eye movement task has little impact. For indoor scenes, spontaneous gaze produced the least similarity, while for landscapes, similarity was equally low across all human tasks. To better understand these results, we also compared the different human attention maps to each other. Our results highlight the importance of taking human factors into account when comparing the attention of humans and CNN.

Submitted 15 October, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.11788 [pdf, other]

doi 10.1109/QCE57702.2023.10178

Applying QNLP to sentiment analysis in finance

Authors: Jonas Stein, Ivo Christ, Nicolas Kraus, Maximilian Balthasar Mansky, Robert Müller, Claudia Linnhoff-Popien

Abstract: As an application domain where the slightest qualitative improvements can yield immense value, finance is a promising candidate for early quantum advantage. Focusing on the rapidly advancing field of Quantum Natural Language Processing (QNLP), we explore the practical applicability of the two central approaches DisCoCat and Quantum-Enhanced Long Short-Term Memory (QLSTM) to the problem of sentiment analysis in finance. Utilizing a novel ChatGPT-based data generation approach, we conduct a case study with more than 1000 realistic sentences and find that QLSTMs can be trained substantially faster than DisCoCat while also achieving close to classical results for their available software implementations.

Submitted 11 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Journal ref: QCE'23 Companion: Proceedings of the Companion IEEE International Conference on Quantum Computing and Engineering, 2023, 20-25

arXiv:2306.15682 [pdf]

doi 10.1364/OE.498302

Fast non-iterative algorithm for 3D point-cloud holography

Authors: Nathan Tessema Ersaro, Cem Yalcin, Liz Murray, Leyla Kabuli, Laura Waller, Rikky Muller

Abstract: Recently developed iterative and deep learning-based approaches to computer-generated holography (CGH) have been shown to achieve high-quality photorealistic 3D images with spatial light modulators. However, such approaches remain overly cumbersome for patterning sparse collections of target points across a photoresponsive volume in applications including biological microscopy and material processing. Specifically, in addition to requiring heavy computation that cannot accommodate real-time operation in mobile or hardware-light settings, existing sampling-dependent 3D CGH methods preclude the ability to place target points with arbitrary precision, limiting accessible depths to a handful of planes. Accordingly, we present a non-iterative point cloud holography algorithm that employs fast deterministic calculations in order to efficiently allocate patches of SLM pixels to different target points in the 3D volume and spread the patterning of all points across multiple time frames. Compared to a matched-performance implementation of the iterative Gerchberg-Saxton algorithm, our algorithm's relative computation speed advantage was found to increase with SLM pixel count, exceeding 100,000x at 512x512 array format.

Submitted 7 September, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

Comments: 22 pages, 11 figures, manuscript and supplement

arXiv:2306.05776 [pdf, other]

Weight Re-Mapping for Variational Quantum Algorithms

Authors: Michael Kölle, Alessandro Giovagnoli, Jonas Stein, Maximilian Balthasar Mansky, Julian Hager, Tobias Rohe, Robert Müller, Claudia Linnhoff-Popien

Abstract: Inspired by the remarkable success of artificial neural networks across a broad spectrum of AI tasks, variational quantum circuits (VQCs) have recently seen an upsurge in quantum machine learning applications. The promising outcomes shown by VQCs, such as improved generalization and reduced parameter training requirements, are attributed to the robust algorithmic capabilities of quantum computing. However, the current gradient-based training approaches for VQCs do not adequately accommodate the fact that trainable parameters (or weights) are typically used as angles in rotational gates. To address this, we extend the concept of weight re-mapping for VQCs, as introduced by Kölle et al. (2023). This approach unambiguously maps the weights to an interval of length $2\pi$, mirroring data rescaling techniques in conventional machine learning that have proven to be highly beneficial in numerous scenarios. In our study, we employ seven distinct weight re-mapping functions to assess their impact on eight classification datasets, using variational classifiers as a representative example. Our results indicate that weight re-mapping can enhance the convergence speed of the VQC. We assess the efficacy of various re-mapping functions across all datasets and measure their influence on the VQC's average performance. Our findings indicate that weight re-mapping not only consistently accelerates the convergence of VQCs, regardless of the specific re-mapping function employed, but also significantly increases accuracy in certain cases.

Submitted 9 June, 2023; originally announced June 2023.

arXiv:2301.05615 [pdf, other]

Linear Computation Coding: Exponential Search and Reduced-State Algorithms

Authors: Hans Rosenberger, Johanna S. Fröhlich, Ali Bereyhi, Ralf R. Müller

Abstract: Linear computation coding is concerned with the compression of multidimensional linear functions, i.e., with reducing the computational effort of multiplying an arbitrary vector by an arbitrary, but known, constant matrix. This paper advances over the state of the art, which is based on a discrete matching pursuit (DMP) algorithm, by a step-wise optimal search. While offering significant performance gains over DMP, this search is computationally infeasible for large matrices and high accuracy. Therefore, a reduced-state algorithm is introduced that offers performance superior to DMP, while still being computationally feasible even for large matrices. Depending on the matrix size, the performance gain over DMP is on the order of at least 10%.

Submitted 13 January, 2023; originally announced January 2023.

Comments: Accepted as paper for presentation at Data Compression Conference (DCC) 2023, Snowbird, UT. 10 pages, 4 figures

arXiv:2212.13680 [pdf, ps, other]

Statistical-CSI-Based Antenna Selection and Precoding in Uplink MIMO

Authors: Chongjun Ouyang, Ali Bereyhi, Saba Asaad, Ralf R. Müller, Hongwen Yang

Abstract: Classical antenna selection schemes require instantaneous channel state information (CSI). This leads to high signaling overhead in the system. This work proposes a novel joint receive antenna selection and precoding scheme for multiuser multiple-input multiple-output uplink transmission that relies only on the long-term statistics of the CSI. The proposed scheme designs the switching network and the uplink precoders such that the expected throughput of the system in the long term is maximized. Invoking results from random matrix theory, we derive a closed-form expression for the expected throughput of the system. We then develop a tractable iterative algorithm to tackle the throughput maximization problem, capitalizing on the alternating optimization and majorization-maximization (MM) techniques. Numerical results substantiate the efficiency of the proposed approach and its superior performance as compared with the baseline.

Submitted 27 December, 2022; originally announced December 2022.

Comments: 6 pages

arXiv:2212.11842 [pdf, other]

A High-Level Comparison of Recent Technologies for Massive MIMO Architectures

Authors: Hans Rosenberger, Bernhard Gäde, Ali Bereyhi, Doaa Ahmed, Vahid Jamali, Ralf R. Müller, Georg Fischer, Gaoning He, Mérouane Debbah

Abstract: Since the introduction of massive MIMO (mMIMO), the design of a transceiver with feasible complexity has been a challenging problem. Initially, it was believed that the main issue in this respect is the overall RF cost. However, as mMIMO is increasingly becoming a key technology for future wireless networks, it has been realized that the RF cost is only one of many implementational challenges and design trade-offs. In this paper, we present, analyze and compare various novel mMIMO architectures, considering recent emerging technologies such as intelligent surface-assisted and Rotman lens based architectures. These are compared to the conventional fully digital (FD) and hybrid analog-digital beamforming (HADB) approaches. To enable a fair comparison, we account for various hardware imperfections and losses and utilize a novel, universal algorithm for signal precoding. Based on our thorough investigations, we draw a generic efficiency-to-quality trade-off for various mMIMO architectures. We find that in a typical cellular communication setting the reflect/transmit array based architectures offer the best overall trade-off. Further, we show that in a qualitative ranking the power efficiency of the considered architectures is independent of the frequency range.

Submitted 22 December, 2022; originally announced December 2022.

Comments: 11 pages, 3 tables, 3 figures

arXiv:2212.11085 [pdf, other]

doi 10.5220/0010818500003116

Empirical Analysis of Limits for Memory Distance in Recurrent Neural Networks

Authors: Steffen Illium, Thore Schillman, Robert Müller, Thomas Gabor, Claudia Linnhoff-Popien

Abstract: Common to all different kinds of recurrent neural networks (RNNs) is the intention to model relations between data points through time. When there is no immediate relationship between subsequent data points (e.g., when the data points are generated at random), we show that RNNs are still able to remember a few data points back into the sequence by memorizing them by heart using standard backpropagation. However, we also show that for classical RNNs, LSTM and GRU networks the distance of data points between recurrent calls that can be reproduced this way is highly limited (compared to even a loose connection between data points) and subject to various constraints imposed by the type and size of the RNN in question. This implies the existence of a hard limit (way below the information-theoretic one) for the distance between related data points within which RNNs are still able to recognize said relation.

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2212.10093 [pdf, other]

doi 10.21437/Interspeech.2021-273

Visual Transformers for Primates Classification and Covid Detection

Authors: Steffen Illium, Robert Müller, Andreas Sedlmeier, Claudia Linnhoff-Popien

Abstract: We apply the vision transformer, a deep machine learning model built around the attention mechanism, to mel-spectrogram representations of raw audio recordings. When adding mel-based data augmentation techniques and sample-weighting, we achieve comparable performance on both (PRS and CCS challenge) tasks of ComParE21, outperforming most single model baselines. We further introduce overlapping vertical patching and evaluate the influence of parameter configurations. Index Terms: audio classification, attention, mel-spectrogram, unbalanced data-sets, computational paralinguistics

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2212.10074 [pdf]

doi 10.1016/j.jbiomech.2023.111605

'Virtual pivot point' in human walking: always experimentally observed but simulations suggest it may not be necessary for stability

Authors: L. Schreff, D. F. B. Haeufle, A. Badri-Spröwitz, J. Vielemeyer, R. Müller

Abstract: The intersection of ground reaction forces near a point above the center of mass has been observed in computer simulation models and human walking experiments. Observed so ubiquitously, the intersection point (IP) is commonly assumed to provide postural stability for bipedal walking. In this study, we challenge this assumption by questioning if walking without an IP is possible. Deriving gaits with a neuromuscular reflex model through multi-stage optimization, we found stable walking patterns that show no signs of the IP-typical intersection of ground reaction forces. The non-IP gaits found are stable and successfully rejected step-down perturbations, which indicates that an IP is not necessary for locomotion robustness or postural stability. A collision-based analysis shows that non-IP gaits feature center of mass (CoM) dynamics with vectors of the CoM velocity and ground reaction force increasingly opposing each other, indicating an increased mechanical cost of transport. Although our computer simulation results have yet to be confirmed through experimental studies, they already indicate that the role of the IP in postural stability should be further investigated. Moreover, our observations on the CoM dynamics and gait efficiency suggest that the IP may have an alternative or additional function that should be considered.

Submitted 8 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

arXiv:2212.09814 [pdf, ps, other]

Analysis of Sparse Recovery Algorithms via the Replica Method

Authors: Ali Bereyhi, Ralf R. Müller, Hermann Schulz-Baldes

Abstract: This manuscript goes through the fundamental connections between statistical mechanics and estimation theory by focusing on the particular problem of compressive sensing. We first show that the asymptotic analysis of a sparse recovery algorithm is mathematically equivalent to the problem of calculating the free energy of a spin glass in the thermodynamic limit. We then use the replica method from statistical mechanics to evaluate the performance in the asymptotic regime. The asymptotic results have several applications in communications and signal processing. We briefly go through two instances of these applications: characterization of joint sparse recovery algorithms used in distributed compressive sensing, and tuning of receivers employed for detection of spatially modulated signals.

Submitted 19 December, 2022; originally announced December 2022.

Comments: "A Comprehensive Introduction to the Applications of the Replica Method in Analysis of Large Inference Problems". Initial version of the contribution to the book "Compressed Sensing in Information Processing"; 32 pages, 2 figures

arXiv:2211.03449 [pdf, other]

How to Coordinate Edge Devices for Over-the-Air Federated Learning?

Authors: Mohammad Ali Sedaghat, Ali Bereyhi, Saba Asaad, Ralf R. Mueller

Abstract: This work studies the task of device coordination in wireless networks for over-the-air federated learning (OTA-FL). For conventional metrics of aggregation error, the task is shown to describe the zero-forcing (ZF) and minimum mean squared error (MMSE) schemes and reduces to the NP-hard problem of subset selection. We tackle this problem by studying properties of the optimal scheme. Our analytical results reveal that this scheme is found by searching among the leaves of a tree with favorable monotonic features. Invoking these features, we develop a low-complexity algorithm that approximates the optimal scheme by tracking a dominant path of the tree sequentially. Our numerical investigations show that the proposed algorithm closely tracks the optimal scheme.

Submitted 8 November, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

arXiv:2210.13345 [pdf, ps, other]

A Novel Antenna Placement Algorithm for Compressive Sensing MIMO Radar

Authors: Bastian Eisele, Ali Bereyhi, Ingrid Ullmann, Ralf Müller

Abstract: In colocated compressive sensing MIMO radar, the measurement matrix is specified by antenna placement. To guarantee an acceptable recovery performance, this measurement matrix should satisfy certain properties, e.g., a small coherence. Prior work in the literature often employs randomized placement algorithms which optimize the prior distribution of antenna locations. The performance of these algorithms is suboptimal, as they can be easily enhanced via expurgation. In this paper, we suggest an iterative antenna placement algorithm which determines the antenna locations deterministically. The proposed algorithm jointly locates the antenna elements on the transmit and receive arrays, such that the coherence of the resulting measurement matrix is minimized. Numerical simulations demonstrate that the proposed algorithm significantly outperforms the benchmark, even after expurgation.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 6 pages, 3 figures

arXiv:2210.02168 [pdf, other]

An Active Learning Reliability Method for Systems with Partially Defined Performance Functions

Authors: Jonathan Sadeghi, Romain Mueller, John Redford

Abstract: In engineering design, one often wishes to calculate the probability that the performance of a system is satisfactory under uncertainty. State-of-the-art algorithms exist to solve this problem using active learning with Gaussian process models. However, these algorithms cannot be applied to problems which often occur in the autonomous vehicle domain where the performance of a system may be undefined under certain circumstances. To solve this problem, we introduce a hierarchical model for the system performance, where undefined performance is classified before the performance is regressed. This enables active learning Gaussian process methods to be applied to problems where the performance of the system is sometimes undefined, and we demonstrate the effectiveness of our approach by testing our methodology on synthetic numerical examples for the autonomous driving domain.

Submitted 2 November, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: To appear in NeurIPS 2022 Workshop on Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems (GPSMDMS). The code to generate these experiments is available as an open source repository, see http://github.com/fiveai/hGP_experiments/

arXiv:2209.11559 [pdf, other]

doi 10.1609/aaai.v37i12.26717

Query-based Hard-Image Retrieval for Object Detection at Test Time

Authors: Edward Ayers, Jonathan Sadeghi, John Redford, Romain Mueller, Puneet K. Dokania

Abstract: There is a longstanding interest in capturing the error behaviour of object detectors by finding images where their performance is likely to be unsatisfactory. In real-world applications such as autonomous driving, it is also crucial to characterise potential failures beyond simple requirements of detection performance. For example, a missed detection of a pedestrian close to an ego vehicle will generally require closer inspection than a missed detection of a car in the distance. The problem of predicting such potential failures at test time has largely been overlooked in the literature and conventional approaches based on detection uncertainty fall short in that they are agnostic to such fine-grained characterisation of errors. In this work, we propose to reformulate the problem of finding "hard" images as a query-based hard image retrieval task, where queries are specific definitions of "hardness", and offer a simple and intuitive method that can solve this task for a large family of queries. Our method is entirely post-hoc, does not require ground-truth annotations, is independent of the choice of a detector, and relies on an efficient Monte Carlo estimation that uses a simple stochastic model in place of the ground-truth. We show experimentally that it can be applied successfully to a wide variety of queries for which it can reliably identify hard images for a given detector without any labelled data. We provide results on ranking and classification tasks using the widely used RetinaNet, Faster-RCNN, Mask-RCNN, and Cascade Mask-RCNN object detectors. The code for this project is available at https://github.com/fiveai/hardest.

Submitted 29 June, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 37(12), 14692-14700 (2023)

arXiv:2209.09157 [pdf, other]

RESHAPE: Explaining Accounting Anomalies in Financial Statement Audits by enhancing SHapley Additive exPlanations

Authors: Ricardo Müller, Marco Schreyer, Timur Sattarov, Damian Borth

Abstract: Detecting accounting anomalies is a recurrent challenge in financial statement audits. Recently, novel methods derived from Deep-Learning (DL) have been proposed to audit the large volumes of a statement's underlying accounting records. However, due to their vast number of parameters, such models exhibit the drawback of being inherently opaque. At the same time, the concealing of a model's inner workings often hinders its real-world application. This observation holds particularly true in financial audits since auditors must reasonably explain and justify their audit decisions. Nowadays, various Explainable AI (XAI) techniques have been proposed to address this challenge, e.g., SHapley Additive exPlanations (SHAP). However, in unsupervised DL as often applied in financial audits, these methods explain the model output at the level of encoded variables. As a result, the explanations of Autoencoder Neural Networks (AENNs) are often hard to comprehend by human auditors. To mitigate this drawback, we propose RESHAPE, which explains the model output on an aggregated attribute-level. In addition, we introduce an evaluation framework to compare the versatility of XAI methods in auditing. Our experimental results show empirical evidence that RESHAPE results in versatile explanations compared to state-of-the-art baselines. We envision such attribute-level explanations as a necessary next step in the adoption of unsupervised DL techniques in financial auditing.

Submitted 19 September, 2022; originally announced September 2022.

Comments: 9 pages, 4 figures, 5 tables, preprint version, currently under review

arXiv:2208.13769 [pdf, ps, other]

doi 10.1007/s00466-023-02369-w

A Lattice Boltzmann Method for nonlinear solid mechanics in the reference configuration

Authors: Erik Faust, Alexander Schlüter, Henning Müller, Ralf Müller

Abstract: With a sufficiently fine discretisation, the Lattice Boltzmann Method (LBM) mimics a second order Crank-Nicolson scheme for certain types of balance laws (Farag et al. [2021]). This allows the explicit, highly parallelisable LBM to efficiently solve the fundamental equations of solid mechanics: the conservation of mass, the balance of linear momentum, and constitutive relations. To date, all LBM algorithms for solid simulation - see e.g. Murthy et al. [2017], Escande et al. [2020], Schlüter et al. [2021] - have been limited to the small strain case. Furthermore, the typical interpretation of the LBM in the current (Eulerian) configuration is not easily extensible to large strains, as large topological changes complicate the treatment of boundary conditions. In this publication, we propose a large deformation Lattice Boltzmann Method for geometrically and constitutively nonlinear solid mechanics. To facilitate versatile boundary modelling, the algorithm is defined in the reference (Lagrangian) configuration.

Submitted 25 August, 2022; originally announced August 2022.

Comments: 13 pages, 9 figures. Submitted to Computer Methods in Applied Mechanics and Engineering

Journal ref: Comput Mech (2023)

arXiv:2208.04088 [pdf, ps, other]

Dirichlet and Neumann boundary conditions in a Lattice Boltzmann Method for Elastodynamics

Authors: Erik Faust, Alexander Schlüter, Henning Müller, Ralf Müller

Abstract: Recently, Murthy et al. [2017] and Escande et al. [2020] adopted the Lattice Boltzmann Method (LBM) to model the linear elastodynamic behaviour of isotropic solids. The LBM is attractive as an elastodynamic solver because it can be parallelised readily and lends itself to finely discretised dynamic continuum simulations, allowing transient phenomena such as wave propagation to be modelled efficiently. This work proposes simple local boundary rules which approximate the behaviour of Dirichlet and Neumann boundary conditions with an LBM for elastic solids. Both lattice-conforming and non-lattice-conforming, curved boundary geometries are considered. For validation, we compare results produced by the LBM for the sudden loading of a stationary crack with an analytical solution. Furthermore, we investigate the performance of the LBM for the transient tension loading of a plate with a circular hole, using Finite Element (FEM) simulations as a reference.

Submitted 8 August, 2022; originally announced August 2022.

Comments: Submitted to Springer Computational Mechanics. 15 pages, 10 figures

arXiv:2207.07388 [pdf, other]

doi 10.24963/ijcai.2021/54

Stochastic Market Games

Authors: Kyrill Schmid, Lenz Belzner, Robert Müller, Johannes Tochtermann, Claudia Linnhoff-Popien

Abstract: Some of the most relevant future applications of multi-agent systems like autonomous driving or factories as a service display mixed-motive scenarios, where agents might have conflicting goals. In these settings agents are likely to learn undesirable outcomes in terms of cooperation under independent learning, such as overly greedy behavior. Motivated by real-world societies, in this work we propose to utilize market forces to provide incentives for agents to become cooperative. As demonstrated in an iterated version of the Prisoner's Dilemma, the proposed market formulation can change the dynamics of the game to consistently learn cooperative policies. Further, we evaluate our approach in spatially and temporally extended settings for varying numbers of agents. We empirically find that the presence of markets can improve both the overall result and agent individual returns via their trading activities.

Submitted 19 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

Comments: IJCAI-21

arXiv:2206.12412 [pdf, other]

doi 10.1007/s00419-022-02306-y

Dynamic Propagation of Mode III Cracks in a Lattice Boltzmann Method for Solids

Authors: Henning Müller, Ali Touil, Alexander Schlüter, Ralf Müller

Abstract: This work presents concepts and algorithms for the simulation of dynamic fractures with a Lattice Boltzmann method (LBM) for linear elastic solids. This LBM has been presented previously and solves the wave equation, which is interpreted as the governing equation for antiplane shear deformation. Besides the steady growth of a crack at a prescribed crack velocity, a fracture criterion based on stress intensity factors (SIF) has been implemented. This is the first time that crack propagation with a mechanically relevant criterion is considered in the context of LBMs. Numerical results are examined to validate the proposed method. The concepts of crack propagation introduced here are not limited to mode III cracks or the simplified deformation assumption of antiplane shear. By introducing a rather simple processing step into the existing LBM at the level of individual lattice sites, the overall performance of the LBM is maintained. Our findings underline the validity of the LBM as a numerical tool to simulate solids in general as well as dynamic fractures in particular.

Submitted 25 October, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

Comments: accepted for publication in Archive of Applied Mechanics

arXiv:2206.06679 [pdf, ps, other]

Matching Pursuit Based Scheduling for Over-the-Air Federated Learning

Authors: Ali Bereyhi, Adela Vagollari, Saba Asaad, Ralf R. Müller, Wolfgang Gerstacker, H. Vincent Poor

Abstract: This paper develops a class of low-complexity device scheduling algorithms for over-the-air federated learning via the method of matching pursuit. The proposed scheme tracks closely the close-to-optimal performance achieved by difference-of-convex programming, and outperforms significantly the well-known benchmark algorithms based on convex relaxation. Compared to the state-of-the-art, the proposed scheme poses a drastically lower computational load on the system: For $K$ devices and $N$ antennas at the parameter server, the benchmark complexity scales with $\left(N^2+K\right)^3 + N^6$ while the complexity of the proposed scheme scales with $K^p N^q$ for some $0 < p,q \leq 2$. The efficiency of the proposed scheme is confirmed via numerical experiments on the CIFAR-10 dataset.

Submitted 12 October, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 47 Pages and 10 Figures

arXiv:2206.05827 [pdf, other]

Case-Based Inverse Reinforcement Learning Using Temporal Coherence

Authors: Jonas Nüßlein, Steffen Illium, Robert Müller, Thomas Gabor, Claudia Linnhoff-Popien

Abstract: Providing expert trajectories in the context of Imitation Learning is often expensive and time-consuming. The goal must therefore be to create algorithms which require as little expert data as possible. In this paper we present an algorithm that imitates the higher-level strategy of the expert rather than just imitating the expert at the action level, which we hypothesize requires less expert data and makes training more stable. As a prior, we assume that the higher-level strategy is to reach an unknown target state area, which we hypothesize is a valid prior for many domains in Reinforcement Learning. The target state area is unknown, but since the expert has demonstrated how to reach it, the agent tries to reach states similar to the expert. Building on the idea of Temporal Coherence, our algorithm trains a neural network to predict whether two states are similar, in the sense that they may occur close in time. During inference, the agent compares its current state with expert states from a Case Base for similarity. The results show that our approach can still learn a near-optimal policy in settings with very little expert data, where algorithms that try to imitate the expert at the action level can no longer do so.

Submitted 12 June, 2022; originally announced June 2022.

Comments: accepted at ICCBR

arXiv:2205.15642 [pdf, ps, other]

How Should IRSs Scale to Harden Multi-Antenna Channels?

Authors: Ali Bereyhi, Saba Asaad, Chongjun Ouyang, Ralf R. Müller, Rafael F. Schaefer, H. Vincent Poor

Abstract: This work extends the concept of channel hardening to multi-antenna systems that are aided by intelligent reflecting surfaces (IRSs). For fading links between a multi-antenna transmitter and a single-antenna receiver, we derive an accurate approximation for the distribution of the input-output mutual information when the number of reflecting elements grows large. The asymptotic results demonstrate that by increasing the number of elements on the IRS, the end-to-end channel hardens as long as the physical dimensions of the IRS grow as well. The growth rate, however, need not be of a specific order and can be significantly sub-linear. The validity of the analytical result is confirmed by numerical experiments.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted for presentation at 2022 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM) in Trondheim, Norway; 5 pages and 2 figures. arXiv admin note: text overlap with arXiv:2203.11592

arXiv:2205.08782 [pdf, ps, other]

Secure Coding via Gaussian Random Fields

Authors: Ali Bereyhi, Bruno Loureiro, Florent Krzakala, Ralf R. Müller, Hermann Schulz-Baldes

Abstract: Inverse probability problems whose generative models are given by strictly nonlinear Gaussian random fields show the all-or-nothing behavior: There exists a critical rate at which Bayesian inference exhibits a phase transition. Below this rate, the optimal Bayesian estimator recovers the data perfectly, and above it the recovered data becomes uncorrelated. This study uses the replica method from the theory of spin glasses to show that this critical rate is the channel capacity. This interesting finding has a particular application to the problem of secure transmission: A strictly nonlinear Gaussian random field along with random binning can be used to securely encode a confidential message in a wiretap channel. Our large-system characterization demonstrates that this secure coding scheme asymptotically achieves the secrecy capacity of the Gaussian wiretap channel.

Submitted 18 May, 2022; originally announced May 2022.

Comments: Accepted for presentation in 2022 IEEE International Symposium on Information Theory (ISIT), 6 pages, 2 figures

arXiv:2203.11592 [pdf, ps, other]

Channel Hardening of IRS-Aided Multi-Antenna Systems: How Should IRSs Scale?

Authors: Ali Bereyhi, Saba Asaad, Chongjun Ouyang, Ralf R. Müller, Rafael F. Schaefer, H. Vincent Poor

Abstract: Unlike active array antennas, intelligent reflecting surfaces (IRSs) are efficiently implemented at large dimensions. This allows for traceable realizations of large-scale IRS-aided MIMO systems in which not necessarily the array antennas, but the passive IRSs are large. It is widely believed that large IRS-aided MIMO settings maintain the fundamental features of massive MIMO systems, and hence they are the implementationally feasible technology for establishing the performance of large-scale MIMO settings. This work gives a rigorous proof to this belief. We show that using a large passive IRS, the end-to-end MIMO channel between the transmitter and the receiver always hardens, even if the IRS elements are strongly correlated. For the fading direct and reflection links between the transmitter and the receiver, our derivations demonstrate that as the number of IRS elements grows large, the capacity of the end-to-end channel converges in distribution to a real-valued Gaussian random variable whose variance goes to zero. The order of this drop depends on how the physical dimensions of the IRS grow. We derive this order explicitly. Numerical experiments show that the analytical asymptotic distribution almost perfectly matches the histogram of the capacity, even in practical scenarios. As a sample application of the results, we use the asymptotic characterization to study the dimensional trade-off between the transmitter and the IRS. The result is intuitive: For a given target performance, the larger the IRS is, the fewer transmit antennas are required to achieve the target. For an arbitrary ergodic and outage performance, we characterize this trade-off analytically. Our investigations demonstrate that using a practical IRS size, the target performance can be achieved with significantly small end-to-end MIMO dimensions.

Submitted 22 March, 2022; originally announced March 2022.

Comments: 17 pages, 11 figures

arXiv:2203.08725 [pdf, other]

Attacking deep networks with surrogate-based adversarial black-box methods is easy

Authors: Nicholas A. Lord, Romain Mueller, Luca Bertinetto

Abstract: A recent line of work on black-box adversarial attacks has revived the use of transfer from surrogate models by integrating it into query-based search. However, we find that existing approaches of this type underperform their potential, and can be overly complicated besides. Here, we provide a short and simple algorithm which achieves state-of-the-art results through a search which uses the surrogate network's class-score gradients, with no need for other priors or heuristics. The guiding assumption of the algorithm is that the studied networks are in a fundamental sense learning similar functions, and that a transfer attack from one to the other should thus be fairly "easy". This assumption is validated by the extremely low query counts and failure rates achieved: e.g. an untargeted attack on a VGG-16 ImageNet network using a ResNet-152 as the surrogate yields a median query count of 6 at a success rate of 99.9%. Code is available at https://github.com/fiveai/GFCS.

Submitted 16 March, 2022; originally announced March 2022.

Comments: ICLR 2022

arXiv:2203.06732 [pdf, other]

BioSimulators: a central registry of simulation engines and services for recommending specific tools

Authors: Bilal Shaikh, Lucian P. Smith, Dan Vasilescu, Gnaneswara Marupilla, Michael Wilson, Eran Agmon, Henry Agnew, Steven S. Andrews, Azraf Anwar, Moritz E. Beber, Frank T. Bergmann, David Brooks, Lutz Brusch, Laurence Calzone, Kiri Choi, Joshua Cooper, John Detloff, Brian Drawert, Michel Dumontier, G. Bard Ermentrout, James R. Faeder, Andrew P. Freiburger, Fabian Fröhlich, Akira Funahashi, Alan Garny, et al. (46 additional authors not shown)

Abstract: Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find and use simulation tools, we developed BioSimulators (https://biosimulators.org), a central registry of the capabilities of simulation tools and consistent Python, command-line, and containerized interfaces to each version of each tool. The foundation of BioSimulators is standards, such as CellML, SBML, SED-ML, and the COMBINE archive format, and validation tools for simulation projects and simulation tools that ensure these standards are used consistently. To help modelers find tools for particular projects, we have also used the registry to develop recommendation services. We anticipate that BioSimulators will help modelers exchange, reproduce, and combine simulations.

Submitted 13 March, 2022; originally announced March 2022.

Comments: 6 pages, 2 figures

arXiv:2201.09986 [pdf, ps, other]

Bayesian Inference with Nonlinear Generative Models: Comments on Secure Learning

Authors: Ali Bereyhi, Bruno Loureiro, Florent Krzakala, Ralf R. Müller, Hermann Schulz-Baldes

Abstract: Unlike the classical linear model, nonlinear generative models have been addressed sparsely in the literature of statistical learning. This work aims to bring attention to these models and their secrecy potential. To this end, we invoke the replica method to derive the asymptotic normalized cross entropy in an inverse probability problem whose generative model is described by a Gaussian random field with a generic covariance function. Our derivations further demonstrate the asymptotic statistical decoupling of the Bayesian estimator and specify the decoupled setting for a given nonlinear model. The replica solution shows that strictly nonlinear models establish an all-or-nothing phase transition: There exists a critical load at which the optimal Bayesian inference changes from perfect to uncorrelated learning. Based on this finding, we design a new secure coding scheme which achieves the secrecy capacity of the wiretap channel. This interesting result implies that strictly nonlinear generative models are perfectly secured without any secure coding. We justify this latter statement through the analysis of an illustrative model for perfectly secure and reliable inference.

Submitted 13 July, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

Comments: 72 pages, 14 figures

arXiv:2201.08732 [pdf, ps, other]

Meta Learning MDPs with Linear Transition Models

Authors: Robert Müller, Aldo Pacchiano

Abstract: We study meta-learning in Markov Decision Processes (MDP) with linear transition models in the undiscounted episodic setting. Under a task sharedness metric based on model proximity we study task families characterized by a distribution over models specified by a bias term and a variance component. We then propose BUC-MatrixRL, a version of the UC-Matrix RL algorithm, and show it can meaningfully leverage a set of sampled training tasks to quickly solve a test task sampled from the same task distribution by learning an estimator of the bias parameter of the task distribution. The analysis leverages and extends results in the learning to learn linear regression and linear bandit setting to the more general case of MDPs with linear transition models. We prove that compared to learning the tasks in isolation, BUC-MatrixRL provides significant improvements in the transfer regret for high-bias, low-variance task distributions.

Submitted 21 January, 2022; originally announced January 2022.

arXiv:2201.05718 [pdf, other]

Parameter-free Online Test-time Adaptation

Authors: Malik Boudiaf, Romain Mueller, Ismail Ben Ayed, Luca Bertinetto

Abstract: Training state-of-the-art vision models has become prohibitively expensive for researchers and practitioners. For the sake of accessibility and resource reuse, it is important to focus on adapting these models to a variety of downstream scenarios. An interesting and practical paradigm is online test-time adaptation, according to which training data is inaccessible, no labelled data from the test distribution is available, and adaptation can only happen at test time and on a handful of samples. In this paper, we investigate how test-time adaptation methods fare for a number of pre-trained models on a variety of real-world scenarios, significantly extending the way they have been originally evaluated. We show that they perform well only in narrowly-defined experimental setups and sometimes fail catastrophically when their hyperparameters are not selected for the same scenario in which they are being tested. Motivated by the inherent uncertainty around the conditions that will ultimately be encountered at test time, we propose a particularly "conservative" approach, which addresses the problem with a Laplacian Adjusted Maximum-likelihood Estimation (LAME) objective. By adapting the model's output (not its parameters), and solving our objective with an efficient concave-convex procedure, our approach exhibits a much higher average accuracy across scenarios than existing methods, while being notably faster and having a much lower memory footprint. The code is available at https://github.com/fiveai/LAME.

Submitted 4 April, 2022; v1 submitted 14 January, 2022; originally announced January 2022.

Comments: CVPR 2022 (oral). Code available at https://github.com/fiveai/LAME

arXiv:2112.07263 [pdf, other]

Quantifying Multimodality in World Models

Authors: Andreas Sedlmeier, Michael Kölle, Robert Müller, Leo Baudrexel, Claudia Linnhoff-Popien

Abstract: Model-based Deep Reinforcement Learning (RL) assumes the availability of a model of an environment's underlying transition dynamics. This model can be used to predict future effects of an agent's possible actions. When no such model is available, it is possible to learn an approximation of the real environment, e.g. by using generative neural networks, sometimes also called World Models. As most real-world environments are stochastic in nature and the transition dynamics are oftentimes multimodal, it is important to use a modelling technique that is able to reflect this multimodal uncertainty. In order to safely deploy such learning systems in the real world, especially in an industrial context, it is paramount to consider these uncertainties. In this work, we analyze existing and propose new metrics for the detection and quantification of multimodal uncertainty in RL-based World Models. The correct modelling and detection of uncertain future states lays the foundation for handling critical situations in a safe way, which is a prerequisite for deploying RL systems in real-world settings.

Submitted 14 December, 2021; originally announced December 2021.

arXiv:2112.04415 [pdf, ps, other]

On the Ergodic Mutual Information of Keyhole MIMO Channels With Finite-Alphabet Inputs

Authors: Chongjun Ouyang, Ali Bereyhi, Saba Asaad, Ralf R. Müller, Julian Cheng, Hongwen Yang

Abstract: This letter studies the ergodic mutual information (EMI) of keyhole multiple-input multiple-output channels having finite-alphabet input signals. The EMI is first investigated for single-stream transmission considering both cases with and without the channel state information at the transmitter. Then, the derived results are extended to the scenario of multi-stream transmission. Asymptotic analyses are performed in the regime of high signal-to-noise ratio (SNR). The high-SNR EMI is shown to converge to a constant with its rate of convergence determined by the diversity order. On this basis, the influence of the keyhole effect on the EMI is discussed. The analytical results are validated by numerical simulations.

Submitted 8 September, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

Comments: 5 figures

MSC Class: 94A05

arXiv:2111.13436 [pdf]

Towards a Secure and Reliable IT-Ecosystem in Seaports

Authors: Tobias Brandt, Dieter Hutter, Christian Maeder, Rainer Müller

Abstract: Digitalization in seaports dovetails the IT infrastructure of various actors (e.g., shipping companies, terminals, customs, port authorities) to process complex workflows for shipping containers. The security of these workflows relies not only on the security of each individual actor but actors must also provide additional guarantees to other actors like, for instance, respecting obligations related to received data or checking the integrity of workflows observed so far. This paper analyses global security requirements (e.g., accountability, confidentiality) of the workflows and decomposes them - according to the way workflow data is stored and distributed - into requirements and obligations for the individual actors. Security mechanisms are presented to satisfy the resulting requirements, which together with the guarantees of all individual actors will guarantee the security of the overall workflow.

Submitted 9 December, 2021; v1 submitted 26 November, 2021; originally announced November 2021.

Comments: Presented at the 29th Conference of the International Association of Maritime Economists, Rotterdam, November 2021

ACM Class: C.2.2; J.7

arXiv:2110.02169 [pdf]

doi 10.1109/JSSC.2022.3172231

SOUL: An Energy-Efficient Unsupervised Online Learning Seizure Detection Classifier

Authors: Adelson Chua, Michael I. Jordan, Rikky Muller

Abstract: Implantable devices that record neural activity and detect seizures have been adopted to issue warnings or trigger neurostimulation to suppress epileptic seizures. Typical seizure detection systems rely on high-accuracy offline-trained machine learning classifiers that require manual retraining when seizure patterns change over long periods of time. For an implantable seizure detection system, a low power, at-the-edge, online learning algorithm can be employed to dynamically adapt to the neural signal drifts, thereby maintaining high accuracy without external intervention. This work proposes SOUL: Stochastic-gradient-descent-based Online Unsupervised Logistic regression classifier. After an initial offline training phase, continuous online unsupervised classifier updates are applied in situ, which improves sensitivity in patients with drifting seizure features. SOUL was tested on two human electroencephalography (EEG) datasets: the CHB-MIT scalp EEG dataset, and a long (>100 hours) NeuroVista intracranial EEG dataset. It was able to achieve an average sensitivity of 97.5% and 97.9% for the two datasets respectively, at >95% specificity. Sensitivity improved by at most 8.2% on long-term data when compared to a typical seizure detection classifier. SOUL was fabricated in TSMC's 28 nm process occupying 0.1 mm² and achieves 1.5 nJ/classification energy efficiency, which is at least 24x more efficient than state-of-the-art.

Submitted 17 May, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: Copyright 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal ref: IEEE Journal of Solid-State Circuits, 2022

Showing 1–50 of 150 results for author: Müller, R