-
Beyond human subjectivity and error: a novel AI grading system
Authors:
Alexandra Gobrecht,
Felix Tuma,
Moritz Möller,
Thomas Zöller,
Mark Zakhvatkin,
Alexandra Wuttig,
Holger Sommerfeldt,
Sven Schütt
Abstract:
The grading of open-ended questions is a high-effort, high-impact task in education. Automating this task promises a significant reduction in workload for education professionals, as well as more consistent grading outcomes for students, by circumventing human subjectivity and error. While recent breakthroughs in AI technology might facilitate such automation, this has not been demonstrated at scale. In this paper, we introduce a novel automatic short answer grading (ASAG) system. The system is based on a fine-tuned open-source transformer model which we trained on a large set of exam data from university courses across a large range of disciplines. We evaluated the trained model's performance against held-out test data in a first experiment and found high accuracy levels across a broad spectrum of unseen questions, even in unseen courses. We further compared the performance of our model with that of certified human domain experts in a second experiment: we first assembled another test dataset from real historical exams - the historic grades contained in that data were awarded to students in a regulated, legally binding examination process; we therefore considered them as ground truth for our experiment. We then asked certified human domain experts and our model to grade the historic student answers again without disclosing the historic grades. Finally, we compared the grades thus obtained with the historic grades (our ground truth). We found that for the courses examined, the model deviated less from the official historic grades than the human re-graders - the model's median absolute error was 44% smaller than the human re-graders', implying that the model is more consistent than humans in grading. These results suggest that leveraging AI-enhanced grading can reduce human subjectivity, improve consistency and thus ultimately increase fairness.
Submitted 7 May, 2024;
originally announced May 2024.
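The comparison described in the abstract above (model versus human re-graders, each measured against the historic grades) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and the grades are invented toy values rather than data from the paper.

```python
from statistics import median


def median_absolute_error(predicted, reference):
    """Median of |predicted - reference| over paired grades."""
    if len(predicted) != len(reference):
        raise ValueError("grade lists must have the same length")
    return median(abs(p - r) for p, r in zip(predicted, reference))


def relative_reduction(model_error, human_error):
    """Fraction by which the model's error is smaller than the human error."""
    return 1.0 - model_error / human_error


# Invented toy grades (points out of 100); NOT data from the paper.
historic = [70, 55, 90, 40, 65]   # treated as ground truth
model = [72, 50, 88, 43, 65]      # hypothetical model re-grades
human = [75, 45, 85, 48, 60]      # hypothetical human re-grades

model_error = median_absolute_error(model, historic)
human_error = median_absolute_error(human, historic)
reduction = relative_reduction(model_error, human_error)
print(f"model: {model_error}, human: {human_error}, reduction: {reduction:.0%}")
```

A statement such as "44% smaller" corresponds to `relative_reduction` returning 0.44 on the real data; the toy numbers here give a different value.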
-
Predicting Open-Hole Laminates Failure Using Support Vector Machines With Classical and Quantum Kernels
Authors:
Giorgio Tosti Balducci,
Boyang Chen,
Matthias Möller,
Marc Gerritsma,
Roeland De Breuker
Abstract:
Modeling open hole failure of composites is a complex task, consisting in a highly nonlinear response with interacting failure modes. Numerical modeling of this phenomenon has traditionally been based on the finite element method, but requires a trade-off between high fidelity and computational cost. To mitigate this shortcoming, recent work has leveraged machine learning to predict the strength of open hole composite specimens. Here, we also propose using data-based models, but we tackle open hole composite failure from a classification point of view. More specifically, we show how to train surrogate models to learn the ultimate failure envelope of an open hole composite plate under in-plane loading. To achieve this, we solve the classification problem via support vector machine (SVM) and test different classifiers by changing the SVM kernel function. The flexibility of kernel-based SVM also allows us to integrate the recently developed quantum kernels in our algorithm and compare them with the standard radial basis function (RBF) kernel. Finally, thanks to kernel-target alignment optimization, we tune the free parameters of all kernels to best separate safe and failure-inducing loading states. The results show classification accuracies higher than 90% for RBF, especially after alignment, followed closely by the quantum kernel classifiers.
Submitted 9 June, 2024; v1 submitted 5 May, 2024;
originally announced May 2024.
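The kernel-target alignment step mentioned in the abstract above can be illustrated with a small sketch. It assumes the standard alignment definition, A(K, y) = <K, y y^T>_F / (||K||_F ||y y^T||_F) with labels in {-1, +1}, and replaces the optimization with a plain grid search over the RBF width `gamma`. The data points, the grid, and all function names are hypothetical and do not come from the paper.

```python
import math


def rbf_kernel_matrix(points, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
        for p in points
    ]


def kernel_target_alignment(kernel, labels):
    """Frobenius-normalised alignment between K and the ideal kernel y y^T."""
    n = len(labels)
    inner = sum(
        kernel[i][j] * labels[i] * labels[j] for i in range(n) for j in range(n)
    )
    norm_k = math.sqrt(sum(kernel[i][j] ** 2 for i in range(n) for j in range(n)))
    # ||y y^T||_F equals n when every label is +1 or -1.
    return inner / (norm_k * n)


# Invented toy loading states: +1 = safe, -1 = failure-inducing.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
labels = [1, 1, -1, -1]

# Grid search stands in for the alignment optimisation (an assumption).
gammas = [0.01, 0.1, 1.0, 10.0, 100.0]
alignments = {
    g: kernel_target_alignment(rbf_kernel_matrix(points, g), labels) for g in gammas
}
best_gamma = max(alignments, key=alignments.get)
print(best_gamma, round(alignments[best_gamma], 3))
```

A very small `gamma` makes all kernel entries nearly equal and a very large one makes the matrix nearly diagonal; both lower the alignment, which is why an intermediate width wins on this toy set.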
-
VisAnywhere: Developing Multi-platform Scientific Visualization Applications
Authors:
Thomas Marrinan,
Madeleine Moeller,
Alina Kanayinkal,
Victor A. Mateevitsi,
Michael E. Papka
Abstract:
Scientists often explore and analyze large-scale scientific simulation data by leveraging two- and three-dimensional visualizations. The data and tasks can be complex and therefore best supported using myriad display technologies, from mobile devices to large high-resolution display walls to virtual reality headsets. Using a simulation of neuron connections in the human brain, we present our work leveraging various web technologies to create a multi-platform scientific visualization application. Users can spread visualization and interaction across multiple devices to support flexible user interfaces and both co-located and remote collaboration. Drawing inspiration from responsive web design principles, this work demonstrates that a single codebase can be adapted to develop scientific visualization applications that operate everywhere.
Submitted 26 April, 2024;
originally announced April 2024.
-
KATch: A Fast Symbolic Verifier for NetKAT
Authors:
Mark Moeller,
Jules Jacobs,
Olivier Savary Belanger,
David Darais,
Cole Schlesinger,
Steffen Smolka,
Nate Foster,
Alexandra Silva
Abstract:
We develop new data structures and algorithms for checking verification queries in NetKAT, a domain-specific language for specifying the behavior of network data planes. Our results extend the techniques obtained in prior work on symbolic automata and provide a framework for building efficient and scalable verification tools. We present KATch, an implementation of these ideas in Scala, featuring an extended set of NetKAT operators that are useful for expressing network-wide specifications, and a verification engine that constructs a bisimulation or generates a counter-example showing that none exists. We evaluate the performance of our implementation on real-world and synthetic benchmarks, verifying properties such as reachability and slice isolation, typically returning a result in well under a second, which is orders of magnitude faster than previous approaches. Our advancements underscore NetKAT's potential as a practical, declarative language for network specification and verification.
Submitted 21 June, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
Revolutionising Distance Learning: A Comparative Study of Learning Progress with AI-Driven Tutoring
Authors:
Moritz Möller,
Gargi Nirmal,
Dario Fabietti,
Quintus Stierstorfer,
Mark Zakhvatkin,
Holger Sommerfeld,
Sven Schütt
Abstract:
Generative AI is expected to have a vast, positive impact on education; however, at present, this potential has not yet been demonstrated at scale at university level. In this study, we present first evidence that generative AI can increase the speed of learning substantially in university students. We tested whether using the AI-powered teaching assistant Syntea affected the speed of learning of hundreds of distance learning students across more than 40 courses at the IU International University of Applied Sciences. Our analysis suggests that using Syntea reduced their study time substantially, by about 27% on average, in the third month after the release of Syntea. Taken together, the magnitude of the effect and the scalability of the approach implicate generative AI as a key lever to significantly improve and accelerate learning by personalisation.
Submitted 21 February, 2024;
originally announced March 2024.
-
Robustness and Exploration of Variational and Machine Learning Approaches to Inverse Problems: An Overview
Authors:
Alexander Auras,
Kanchana Vaishnavi Gandikota,
Hannah Droege,
Michael Moeller
Abstract:
This paper attempts to provide an overview of current approaches for solving inverse problems in imaging using variational methods and machine learning. A special focus lies on point estimators and their robustness against adversarial perturbations. In this context results of numerical experiments for a one-dimensional toy problem are provided, showing the robustness of different approaches and empirically verifying theoretical guarantees. Another focus of this review is the exploration of the subspace of data consistent solutions through explicit guidance to satisfy specific semantic or textural properties.
Submitted 19 February, 2024;
originally announced February 2024.
-
Evaluating Adversarial Robustness of Low dose CT Recovery
Authors:
Kanchana Vaishnavi Gandikota,
Paramanand Chandramouli,
Hannah Droege,
Michael Moeller
Abstract:
Low dose computed tomography (CT) acquisition using reduced radiation or sparse angle measurements is recommended to decrease the harmful effects of X-ray radiation. Recent works successfully apply deep networks to the problem of low dose CT recovery on benchmark datasets. However, their robustness needs a thorough evaluation before use in clinical settings. In this work, we evaluate the robustness of different deep learning approaches and classical methods for CT recovery. We show that deep networks, including model-based networks encouraging data consistency, are more susceptible to untargeted attacks. Surprisingly, we observe that data consistency is not heavily affected even for these poor quality reconstructions, motivating the need for better regularization for the networks. We demonstrate the feasibility of universal attacks and study attack transferability across different methods. We analyze robustness to attacks causing localized changes in clinically relevant regions. Both classical approaches and deep networks are affected by such attacks, leading to changes in the visual appearance of localized lesions, even for extremely small perturbations. As the resulting reconstructions have high data consistency with the original measurements, these localized attacks can be used to explore the solution space of the CT recovery problem.
Submitted 18 February, 2024;
originally announced February 2024.
-
Quantum Computing and Tensor Networks for Laminate Design: A Novel Approach to Stacking Sequence Retrieval
Authors:
Arne Wulff,
Boyang Chen,
Matthew Steinberg,
Yinglu Tang,
Matthias Möller,
Sebastian Feld
Abstract:
As with many tasks in engineering, structural design frequently involves navigating complex and computationally expensive problems. A prime example is the weight optimization of laminated composite materials, which to this day remains a formidable task, due to an exponentially large configuration space and non-linear constraints. The rapidly developing field of quantum computation may offer novel approaches for addressing these intricate problems. However, before applying any quantum algorithm to a given problem, it must be translated into a form that is compatible with the underlying operations on a quantum computer. Our work specifically targets stacking sequence retrieval with lamination parameters. To adapt this problem for quantum computational methods, we map the possible stacking sequences onto a quantum state space. We further derive a linear operator, the Hamiltonian, within this state space that encapsulates the loss function inherent to the stacking sequence retrieval problem. Additionally, we demonstrate the incorporation of manufacturing constraints on stacking sequences as penalty terms in the Hamiltonian. This quantum representation is suitable for a variety of classical and quantum algorithms for finding the ground state of a quantum Hamiltonian. For a practical demonstration, we chose a classical tensor network algorithm, the DMRG algorithm, to numerically validate our approach. For this purpose, we derived a matrix product operator representation of the loss function Hamiltonian and the penalty terms. Numerical trials with this algorithm successfully yielded approximate solutions, while exhibiting a tradeoff between accuracy and runtime. Although this work primarily concentrates on quantum computation, the application of tensor network algorithms presents a novel quantum-inspired approach for stacking sequence retrieval.
Submitted 9 February, 2024;
originally announced February 2024.
-
Temporal Action Localization for Inertial-based Human Activity Recognition
Authors:
Marius Bock,
Michael Moeller,
Kristof Van Laerhoven
Abstract:
A persistent trend in Deep Learning has been the applicability of machine learning concepts to areas other than those for which they were originally introduced. As of today, state-of-the-art activity recognition from wearable sensors relies on classifiers being trained on fixed windows of data. In contrast, video-based Human Activity Recognition has followed a segment-based prediction approach, localizing activity occurrences from start to end. This paper is the first to systematically demonstrate the applicability of state-of-the-art Temporal Action Localization (TAL) models for wearable Human Activity Recognition (HAR) using raw inertial data as input. Our results show that state-of-the-art TAL models are able to outperform popular inertial models on 4 out of 6 wearable activity recognition benchmark datasets, with improvements of up to 25% in F1-score. Introducing the TAL community's most popular metric to inertial-based HAR, namely mean Average Precision, our analysis shows that TAL models are able to produce more coherent segments along with an overall higher NULL-class accuracy across all datasets. Being the first to provide such an analysis, the TAL community offers an interesting new perspective to inertial-based HAR with yet to be explored design choices and training concepts, which could be of significant value for the inertial-based HAR community.
Submitted 27 November, 2023;
originally announced November 2023.
-
Quantum Neural Networks for Power Flow Analysis
Authors:
Zeynab Kaseb,
Matthias Moller,
Giorgio Tosti Balducci,
Peter Palensky,
Pedro P. Vergara
Abstract:
This paper explores the potential application of quantum and hybrid quantum-classical neural networks in power flow analysis. Experiments are conducted using two datasets based on 4-bus and 33-bus test systems. A systematic performance comparison is also conducted among quantum, hybrid quantum-classical, and classical neural networks. The comparison is based on (i) generalization ability, (ii) robustness, (iii) training dataset size needed, (iv) training error, and (v) training process stability. The results show that the developed hybrid quantum-classical neural network outperforms both quantum and classical neural networks, and hence can improve deep learning-based power flow analysis in the noisy-intermediate-scale quantum (NISQ) and fault-tolerant quantum (FTQ) era.
Submitted 10 March, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition
Authors:
Patrick Eickhoff,
Matthias Möller,
Theresa Pekarek Rosin,
Johannes Twiefel,
Stefan Wermter
Abstract:
In recent research in the domain of speech processing, large End-to-End (E2E) systems for Automatic Speech Recognition (ASR) have reported state-of-the-art performance on various benchmarks. These systems intrinsically learn how to handle and remove noise conditions from speech. Previous research has shown that it is possible to extract the denoising capabilities of these models into a preprocessor network, which can be used as a frontend for downstream ASR models. However, the proposed methods were limited to specific fully convolutional architectures. In this work, we propose a novel method to extract the denoising capabilities that can be applied to any encoder-decoder architecture. We propose the Cleancoder preprocessor architecture that extracts hidden activations from the Conformer ASR model and feeds them to a decoder to predict denoised spectrograms. We train our preprocessor on the Noisy Speech Database (NSD) to reconstruct denoised spectrograms from noisy inputs. Then, we evaluate our model as a frontend to a pretrained Conformer ASR model as well as a frontend to train smaller Conformer ASR models from scratch. We show that the Cleancoder is able to filter noise from speech and that it improves the total Word Error Rate (WER) of the downstream model in noisy conditions for both applications.
Submitted 5 September, 2023;
originally announced September 2023.
-
Kissing to Find a Match: Efficient Low-Rank Permutation Representation
Authors:
Hannah Dröge,
Zorah Lähner,
Yuval Bahat,
Onofre Martorell,
Felix Heide,
Michael Möller
Abstract:
Permutation matrices play a key role in matching and assignment problems across the fields, especially in computer vision and robotics. However, memory for explicitly representing permutation matrices grows quadratically with the size of the problem, prohibiting large problem instances. In this work, we propose to tackle the curse of dimensionality of large permutation matrices by approximating them using low-rank matrix factorization, followed by a nonlinearity. To this end, we rely on the Kissing number theory to infer the minimal rank required for representing a permutation matrix of a given size, which is significantly smaller than the problem size. This leads to a drastic reduction in computation and memory costs, e.g., up to $3$ orders of magnitude less memory for a problem of size $n=20000$, represented using $8.4\times10^5$ elements in two small matrices instead of using a single huge matrix with $4\times 10^8$ elements. The proposed representation allows for accurate representations of large permutation matrices, which in turn enables handling large problems that would have been infeasible otherwise. We demonstrate the applicability and merits of the proposed approach through a series of experiments on a range of problems that involve predicting permutation matrices, from linear and quadratic assignment to shape matching problems.
Submitted 25 August, 2023;
originally announced August 2023.
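The memory figures quoted in the abstract above can be checked with simple arithmetic, and the low-rank-plus-nonlinearity idea can be illustrated on a tiny case. In the sketch below, the rank of 21 is inferred from the abstract's element counts (8.4×10^5 = 2 · 20000 · 21) and is an assumption, as are all function names. The toy example uses six unit vectors in the plane (the kissing number in two dimensions is 6) and a row-wise argmax as the nonlinearity; it is not the authors' implementation.

```python
import math


def dense_vs_low_rank(n, rank):
    """Element counts: one n x n matrix versus two n x rank factors."""
    return n * n, 2 * n * rank


def unit_circle_points(n):
    """n unit vectors spaced evenly on the circle (pairwise dot product <= cos(2*pi/n))."""
    return [
        (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def recover_permutation(left, right):
    """Row-wise argmax of left @ right^T, computed without forming the full matrix."""
    perm = []
    for v in left:
        scores = [v[0] * w[0] + v[1] * w[1] for w in right]
        perm.append(max(range(len(right)), key=scores.__getitem__))
    return perm


# Memory check for the problem size quoted in the abstract.
dense, low_rank = dense_vs_low_rank(20000, 21)  # rank 21 is inferred, not stated
print(dense, low_rank, round(dense / low_rank))

# Toy permutation of size 6 represented by two 6 x 2 factors.
perm = [2, 0, 5, 1, 4, 3]      # row i is matched to column perm[i]
points = unit_circle_points(6)
left = points
right = [None] * 6
for i, j in enumerate(perm):
    right[j] = points[i]       # column perm[i] shares the vector of row i
recovered = recover_permutation(left, right)
print(recovered)
```

Each row's score is 1 for its matched column and at most 0.5 for every other column, so the argmax recovers the permutation exactly from 24 stored numbers instead of 36.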
-
SIGMA: Scale-Invariant Global Sparse Shape Matching
Authors:
Maolin Gao,
Paul Roetzer,
Marvin Eisenberger,
Zorah Lähner,
Michael Moeller,
Daniel Cremers,
Florian Bernard
Abstract:
We propose a novel mixed-integer programming (MIP) formulation for generating precise sparse correspondences for highly non-rigid shapes. To this end, we introduce a projected Laplace-Beltrami operator (PLBO) which combines intrinsic and extrinsic geometric information to measure the deformation quality induced by predicted correspondences. We integrate the PLBO, together with an orientation-aware regulariser, into a novel MIP formulation that can be solved to global optimality for many practical problems. In contrast to previous methods, our approach is provably invariant to rigid transformations and global scaling, initialisation-free, has optimality guarantees, and scales to high resolution meshes with (empirically observed) linear time. We show state-of-the-art results for sparse non-rigid matching on several challenging 3D datasets, including data with inconsistent meshing, as well as applications in mesh-to-point-cloud matching.
Submitted 3 April, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
-
An Evaluation of Zero-Cost Proxies -- from Neural Architecture Performance to Model Robustness
Authors:
Jovita Lukasik,
Michael Moeller,
Margret Keuper
Abstract:
Zero-cost proxies are nowadays frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights. These techniques allow for immense search speed-ups. So far the joint search for well-performing and robust architectures has received much less attention in the field of NAS. Therefore, the main focus of zero-cost proxies is the clean accuracy of architectures, whereas the model robustness should play an equally important part. In this paper, we analyze the ability of common zero-cost proxies to serve as performance predictors for robustness in the popular NAS-Bench-201 search space. We are interested in the single prediction task for robustness and the joint multi-objective of clean and robust accuracy. We further analyze the feature importance of the proxies and show that predicting the robustness makes the prediction task from existing zero-cost proxies more challenging. As a result, the joint consideration of several proxies becomes necessary to predict a model's robustness while the clean accuracy can be regressed from a single such feature.
Submitted 18 July, 2023;
originally announced July 2023.
-
Differentiable Sensor Layouts for End-to-End Learning of Task-Specific Camera Parameters
Authors:
Hendrik Sommerhoff,
Shashank Agnihotri,
Mohamed Saleh,
Michael Moeller,
Margret Keuper,
Andreas Kolb
Abstract:
The success of deep learning is frequently described as the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet, several design choices on the camera level, including the pixel layout of the sensor, are considered as pre-defined and fixed, and high resolution, regular pixel layouts are considered to be the most generic ones in computer vision and graphics, treating all regions of an image as equally important. While several works have considered non-uniform, e.g., hexagonal or foveated, pixel layouts in hardware and image processing, the layout has not been integrated into the end-to-end learning paradigm so far. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes the size and distribution of pixels on the imaging sensor jointly with the parameters of a given neural network on a specific task. We derive an analytic, differentiable approach for the sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes that retain a regular topology. We provide a drop-in module that approximates sensor simulation given existing high-resolution images to directly connect our method with existing deep learning models. We show that network predictions benefit from learnable pixel layouts for two different downstream tasks, classification and semantic segmentation.
Submitted 28 April, 2023;
originally announced April 2023.
-
WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
Authors:
Marius Bock,
Hilde Kuehne,
Kristof Van Laerhoven,
Michael Moeller
Abstract:
Though research has shown the complementarity of camera- and inertial-based data, datasets which offer both egocentric video and inertial-based sensor data remain scarce. In this paper, we introduce WEAR, an outdoor sports dataset for both vision- and inertial-based human activity recognition (HAR). The dataset comprises data from 18 participants performing a total of 18 different workout activities with untrimmed inertial (acceleration) and camera (egocentric video) data recorded at 10 different outside locations. Unlike previous egocentric datasets, WEAR provides a challenging prediction scenario marked by purposely introduced activity variations as well as an overall small information overlap across modalities. Benchmark results obtained using each modality separately show that each modality interestingly offers complementary strengths and weaknesses in its prediction performance. Further, in light of the recent success of temporal action localization models following the architecture design of the ActionFormer, we demonstrate their versatility by applying them in a plain fashion using vision, inertial and combined (vision + inertial) features as input. Results demonstrate both the applicability of vision-based temporal action localization models for inertial data and fusing both modalities by means of simple concatenation, with the combined approach (vision + inertial features) being able to produce the highest mean average precision and close-to-best F1-score. The dataset and code to reproduce experiments are publicly available via: https://mariusbock.github.io/wear/
Submitted 21 November, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
CCuantuMM: Cycle-Consistent Quantum-Hybrid Matching of Multiple Shapes
Authors:
Harshil Bhatia,
Edith Tretschk,
Zorah Lähner,
Marcel Seelbach Benkner,
Michael Moeller,
Christian Theobalt,
Vladislav Golyanik
Abstract:
Jointly matching multiple, non-rigidly deformed 3D shapes is a challenging, $\mathcal{NP}$-hard problem. A perfect matching is necessarily cycle-consistent: Following the pairwise point correspondences along several shapes must end up at the starting vertex of the original shape. Unfortunately, existing quantum shape-matching methods do not support multiple shapes and even less cycle consistency. This paper addresses the open challenges and introduces the first quantum-hybrid approach for 3D shape multi-matching; in addition, it is also cycle-consistent. Its iterative formulation is admissible to modern adiabatic quantum hardware and scales linearly with the total number of input shapes. Both these characteristics are achieved by reducing the $N$-shape case to a sequence of three-shape matchings, the derivation of which is our main technical contribution. Thanks to quantum annealing, high-quality solutions with low energy are retrieved for the intermediate $\mathcal{NP}$-hard objectives. On benchmark datasets, the proposed approach significantly outperforms extensions to multi-shape matching of a previous quantum-hybrid two-shape matching method and is on-par with classical multi-matching methods.
Submitted 28 March, 2023;
originally announced March 2023.
-
Convergent Data-driven Regularizations for CT Reconstruction
Authors:
Samira Kabri,
Alexander Auras,
Danilo Riccio,
Hartmut Bauermeister,
Martin Benning,
Michael Moeller,
Martin Burger
Abstract:
The reconstruction of images from their corresponding noisy Radon transform is a typical example of an ill-posed linear inverse problem as arising in the application of computerized tomography (CT). As the (naive) solution does not depend on the measured data continuously, regularization is needed to re-establish a continuous dependence. In this work, we investigate simple yet provably convergent approaches to learning linear regularization methods from data. More specifically, we analyze two approaches: One generic linear regularization that learns how to manipulate the singular values of the linear operator in an extension of our previous work, and one tailored approach in the Fourier domain that is specific to CT-reconstruction. We prove that such approaches become convergent regularization methods and that the reconstructions they provide are typically much smoother than the training data they were trained on. Finally, we compare the spectral as well as the Fourier-based approaches for CT-reconstruction numerically, discuss their advantages and disadvantages and investigate the effect of discretization errors at different resolutions.
Submitted 15 December, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
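As a rough illustration of the "generic linear regularization that learns how to manipulate the singular values" mentioned in the abstract above, the sketch below looks at a single singular component. It assumes a scalar model y = sigma * x + noise with zero-mean signal of variance `pi` and independent noise of variance `delta2` (hypothetical names and values, not taken from the paper), and uses the standard Wiener-type closed form for the filter factor that minimizes the expected squared error. Whether this coincides with the paper's exact formulation is an assumption.

```python
def expected_risk(g, sigma, pi, delta2):
    """Expected squared error E[(g*y - x)^2] for y = sigma*x + noise."""
    return (g * sigma - 1.0) ** 2 * pi + g ** 2 * delta2


def learned_filter(sigma, pi, delta2):
    """Filter factor g that minimizes expected_risk (Wiener-type form)."""
    return sigma * pi / (sigma ** 2 * pi + delta2)


if __name__ == "__main__":
    # Hypothetical singular values; signal variance 1, noise variance 0.01.
    for sigma in (1.0, 0.1, 0.01):
        g = learned_filter(sigma, pi=1.0, delta2=1e-2)
        print(f"sigma={sigma:g}  learned g={g:.4f}  naive 1/sigma={1 / sigma:g}")
```

For small singular values the filter factor stays bounded instead of growing like 1/sigma, which is the damping behaviour a regularization method needs.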
-
QuAnt: Quantum Annealing with Learnt Couplings
Authors:
Marcel Seelbach Benkner,
Maximilian Krahn,
Edith Tretschk,
Zorah Lähner,
Michael Moeller,
Vladislav Golyanik
Abstract:
Modern quantum annealers can find high-quality solutions to combinatorial optimisation objectives given as quadratic unconstrained binary optimisation (QUBO) problems. Unfortunately, obtaining suitable QUBO forms in computer vision remains challenging and currently requires problem-specific analytical derivations. Moreover, such explicit formulations impose tangible constraints on solution encodings. In stark contrast to prior work, this paper proposes to learn QUBO forms from data through gradient backpropagation instead of deriving them. As a result, the solution encodings can be chosen flexibly and compactly. Furthermore, our methodology is general and virtually independent of the specifics of the target problem type. We demonstrate the advantages of learnt QUBOs on the diverse problem types of graph matching, 2D point cloud alignment and 3D rotation estimation. Our results are competitive with the previous quantum state of the art while requiring far fewer logical and physical qubits, enabling our method to scale to larger problems. The code and the new dataset will be open-sourced.
Submitted 13 October, 2022;
originally announced October 2022.
-
On Adversarial Robustness of Deep Image Deblurring
Authors:
Kanchana Vaishnavi Gandikota,
Paramanand Chandramouli,
Michael Moeller
Abstract:
Recent approaches employ deep learning-based solutions for the recovery of a sharp image from its blurry observation. This paper introduces adversarial attacks against deep learning-based image deblurring methods and evaluates the robustness of these neural networks to untargeted and targeted attacks. We demonstrate that imperceptible distortion can significantly degrade the performance of state-of-the-art deblurring networks, even producing drastically different content in the output, indicating the strong need to include adversarially robust training not only in classification but also for image recovery.
Submitted 5 October, 2022;
originally announced October 2022.
-
A Simple Strategy to Provable Invariance via Orbit Mapping
Authors:
Kanchana Vaishnavi Gandikota,
Jonas Geiping,
Zorah Lähner,
Adam Czapliński,
Michael Moeller
Abstract:
Many applications require robustness, or ideally invariance, of neural networks to certain transformations of input data. Most commonly, this requirement is addressed by training data augmentation, using adversarial training, or defining network architectures that include the desired invariance by design. In this work, we propose a method to make network architectures provably invariant with respect to group actions by choosing one element from a (possibly continuous) orbit based on a fixed criterion. In a nutshell, we intend to 'undo' any possible transformation before feeding the data into the actual network. Further, we empirically analyze the properties of different approaches which incorporate invariance via training or architecture, and demonstrate the advantages of our method in terms of robustness and computational efficiency. In particular, we investigate the robustness with respect to rotations of images (which can hold up to discretization artifacts) as well as the provable orientation and scaling invariance of 3D point cloud classification.
Submitted 23 September, 2022;
originally announced September 2022.
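The idea in the abstract above, choosing one element from an orbit by a fixed criterion so that any transformation is "undone" before the network sees the data, can be sketched for 2D point clouds. This is a minimal illustration, not the authors' implementation: the criterion (put the point farthest from the origin onto (1, 0)) and the function names are assumptions, and the sketch presumes that the farthest point is unique and nonzero.

```python
import math


def canonicalize(points):
    """Map a 2D point cloud to a fixed representative of its orbit
    under rotations about the origin and positive scalings."""
    norms = [math.hypot(x, y) for x, y in points]
    k = max(range(len(points)), key=norms.__getitem__)  # farthest point
    scale = norms[k]
    ax, ay = points[k]
    c, s = ax / scale, ay / scale  # cos and sin of the anchor's angle
    # Rotate by minus that angle and divide by the scale, so the anchor
    # lands on (1, 0).
    return [((c * x + s * y) / scale, (-s * x + c * y) / scale)
            for x, y in points]


def transform(points, angle, factor):
    """Rotate by `angle` (radians) and scale by `factor`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(factor * (c * x - s * y), factor * (s * x + c * y))
            for x, y in points]
```

Any function applied to `canonicalize(cloud)` is invariant to rotation and scaling of `cloud` by construction, up to floating-point error, because all elements of an orbit are mapped to the same representative.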
-
QPack Scores: Quantitative performance metrics for application-oriented quantum computer benchmarking
Authors:
Huub Donkers,
Koen Mesman,
Zaid Al-Ars,
Matthias Möller
Abstract:
This paper presents the benchmark score definitions of QPack, an application-oriented cross-platform benchmarking suite for quantum computers and simulators, which makes use of scalable Quantum Approximate Optimization Algorithm and Variational Quantum Eigensolver applications. Using a varied set of benchmark applications, one can quantitatively assess how well a quantum computer or its simulator performs on a general NISQ-era application. This paper presents what quantum execution data can be collected and transformed into benchmark scores for application-oriented quantum benchmarking. Definitions are given for an overall benchmark score, as well as sub-scores based on runtime, accuracy, scalability and capacity performance. Using these scores, a comparison is made between various quantum computer simulators, running both locally and on vendors' remote cloud services. We also use the QPack benchmark to collect a small set of quantum execution data of the IBMQ Nairobi quantum processor. The goal of the QPack benchmark scores is to give a holistic insight into quantum performance and the ability to make easy and quick comparisons between different quantum computers.
Submitted 24 May, 2022;
originally announced May 2022.
-
Intrinsic Neural Fields: Learning Functions on Manifolds
Authors:
Lukas Koestler,
Daniel Grittner,
Michael Moeller,
Daniel Cremers,
Zorah Lähner
Abstract:
Neural fields have gained significant attention in the computer vision community due to their excellent performance in novel view synthesis, geometry reconstruction, and generative modeling. Some of their advantages are a sound theoretic foundation and an easy implementation in current deep learning frameworks. While neural fields have been applied to signals on manifolds, e.g., for texture reconstruction, their representation has been limited to extrinsically embedding the shape into Euclidean space. The extrinsic embedding ignores known intrinsic manifold properties and is inflexible with respect to transfer of the learned function. To overcome these limitations, this work introduces intrinsic neural fields, a novel and versatile representation for neural fields on manifolds. Intrinsic neural fields combine the advantages of neural fields with the spectral properties of the Laplace-Beltrami operator. We show theoretically that intrinsic neural fields inherit many desirable properties of the extrinsic neural field framework but exhibit additional intrinsic qualities, like isometry invariance. In experiments, we show intrinsic neural fields can reconstruct high-fidelity textures from images with state-of-the-art quality and are robust to the discretization of the underlying manifold. We demonstrate the versatility of intrinsic neural fields by tackling various applications: texture transfer between deformed shapes and different shapes, texture reconstruction from real-world images with view dependence, and discretization-agnostic learning on meshes and point clouds.
Submitted 23 March, 2022; v1 submitted 15 March, 2022;
originally announced March 2022.
-
Tutorial on Deep Learning for Human Activity Recognition
Authors:
Marius Bock,
Alexander Hoelzemann,
Michael Moeller,
Kristof Van Laerhoven
Abstract:
Activity recognition systems that are capable of estimating human activities from wearable inertial sensors have come a long way in the past decades. Not only have state-of-the-art methods moved away from feature engineering and fully adopted end-to-end deep learning approaches, but best practices for setting up experiments, preparing datasets, and validating activity recognition approaches have similarly evolved. This tutorial was first held at the 2021 ACM International Symposium on Wearable Computers (ISWC'21) and International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp'21). The tutorial, after a short introduction to the research field of activity recognition, provides a hands-on and interactive walk-through of the most important steps in the data pipeline for the deep learning of human activities. All presentation slides shown during the tutorial, which also contain links to all code exercises, as well as the link to the GitHub page of the tutorial can be found at: https://mariusbock.github.io/dl-for-har
Submitted 13 October, 2021;
originally announced October 2021.
-
Stochastic Training is Not Necessary for Generalization
Authors:
Jonas Geiping,
Micah Goldblum,
Phillip E. Pope,
Michael Moeller,
Tom Goldstein
Abstract:
It is widely believed that the implicit regularization of SGD is fundamental to the impressive generalization behavior we observe in neural networks. In this work, we demonstrate that non-stochastic full-batch training can achieve comparably strong performance to SGD on CIFAR-10 using modern architectures. To this end, we show that the implicit regularization of SGD can be completely replaced with explicit regularization even when comparing against a strong and well-researched baseline. Our observations indicate that the perceived difficulty of full-batch training may be the result of its optimization properties and the disproportionate time and effort spent by the ML community tuning optimizers and hyperparameters for small-batch training.
Submitted 19 April, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
Is Differentiable Architecture Search truly a One-Shot Method?
Authors:
Jonas Geiping,
Jovita Lukasik,
Margret Keuper,
Michael Moeller
Abstract:
Differentiable architecture search (DAS) is a widely researched tool for the discovery of novel architectures, due to its promising results for image classification. The main benefit of DAS is the effectiveness achieved through the weight-sharing one-shot paradigm, which allows efficient architecture search. In this work, we investigate DAS in a systematic case study of inverse problems, which allows us to analyze these potential benefits in a controlled manner. We demonstrate that the success of DAS can be extended from image classification to signal reconstruction, in principle. However, our experiments also expose three fundamental difficulties in the evaluation of DAS-based methods in inverse problems: First, the results show a large variance in all test cases. Second, the final performance is strongly dependent on the hyperparameters of the optimizer. And third, the performance of the weight-sharing architecture used during training does not reflect the final performance of the found architecture well. While the results on image reconstruction confirm the potential of the DAS paradigm, they challenge the common understanding of DAS as a one-shot method.
Submitted 20 February, 2023; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Improving Deep Learning for HAR with shallow LSTMs
Authors:
Marius Bock,
Alexander Hoelzemann,
Michael Moeller,
Kristof Van Laerhoven
Abstract:
Recent studies in Human Activity Recognition (HAR) have shown that Deep Learning methods are able to outperform classical Machine Learning algorithms. One popular Deep Learning architecture in HAR is the DeepConvLSTM. In this paper we propose to alter the DeepConvLSTM architecture to employ a 1-layered instead of a 2-layered LSTM. We validate our architecture change on 5 publicly available HAR datasets by comparing the predictive performance with and without the change, employing varying numbers of hidden units within the LSTM layer(s). Results show that across all datasets, our architecture consistently improves on the original one: Recognition performance increases by up to 11.7% for the F1-score, and our architecture significantly decreases the number of learnable parameters. This improvement over DeepConvLSTM decreases training time by as much as 48%. Our results stand in contrast to the belief that one needs at least a 2-layered LSTM when dealing with sequential data. Based on our results we argue that said claim might not be applicable to sensor-based HAR.
Submitted 5 August, 2021; v1 submitted 2 August, 2021;
originally announced August 2021.
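To make the parameter reduction claimed in the abstract above concrete, the following sketch counts the learnable parameters of a stacked LSTM using the textbook formula (four gates, one bias vector per gate). The input and hidden sizes in the usage note are hypothetical and not taken from the paper, and only the recurrent part is counted, not the convolutional layers or the classifier of DeepConvLSTM.

```python
def lstm_layer_params(input_size, hidden_size):
    """Weights and biases of one standard LSTM layer (four gates, one
    bias vector per gate; some libraries store two bias vectors)."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def stacked_lstm_params(input_size, hidden_size, num_layers):
    """Total parameters of `num_layers` stacked LSTM layers."""
    total = lstm_layer_params(input_size, hidden_size)
    # Every additional layer takes the previous layer's hidden state as input.
    total += (num_layers - 1) * lstm_layer_params(hidden_size, hidden_size)
    return total
```

With an assumed input size of 64 and 128 hidden units, `stacked_lstm_params(64, 128, 1)` gives 98816 parameters and `stacked_lstm_params(64, 128, 2)` gives 230400, so dropping the second layer removes more than half of the recurrent parameters in this hypothetical configuration.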
-
Lifting the Convex Conjugate in Lagrangian Relaxations: A Tractable Approach for Continuous Markov Random Fields
Authors:
Hartmut Bauermeister,
Emanuel Laude,
Thomas Möllenhoff,
Michael Moeller,
Daniel Cremers
Abstract:
Dual decomposition approaches in nonconvex optimization may suffer from a duality gap. This poses a challenge when applying them directly to nonconvex problems such as MAP-inference in a Markov random field (MRF) with continuous state spaces. To eliminate such gaps, this paper considers a reformulation of the original nonconvex task in the space of measures. This infinite-dimensional reformulation is then approximated by a semi-infinite one, which is obtained via a piecewise polynomial discretization in the dual. We provide a geometric intuition behind the primal problem induced by the dual discretization and draw connections to optimization over moment spaces. In contrast to existing discretizations which suffer from a grid bias, we show that a piecewise polynomial discretization better preserves the continuous nature of our problem. Invoking results from optimal transport theory and convex algebraic geometry we reduce the semi-infinite program to a finite one and provide a practical implementation based on semidefinite programming. We show, experimentally and in theory, that the approach successfully reduces the duality gap. To showcase the scalability of our approach, we apply it to the stereo matching problem between two images.
Submitted 16 May, 2022; v1 submitted 13 July, 2021;
originally announced July 2021.
-
Adiabatic Quantum Graph Matching with Permutation Matrix Constraints
Authors:
Marcel Seelbach Benkner,
Vladislav Golyanik,
Christian Theobalt,
Michael Moeller
Abstract:
Matching problems on 3D shapes and images are challenging as they are frequently formulated as combinatorial quadratic assignment problems (QAPs) with permutation matrix constraints, which are NP-hard. In this work, we address such problems with emerging quantum computing technology and propose several reformulations of QAPs as unconstrained problems suitable for efficient execution on quantum hardware. We investigate several ways to inject permutation matrix constraints in a quadratic unconstrained binary optimization problem which can be mapped to quantum hardware. We focus on obtaining a sufficient spectral gap, which further increases the probability of measuring optimal solutions and valid permutation matrices in a single run. We perform our experiments on the quantum computer D-Wave 2000Q (2^11 qubits, adiabatic). Despite the observed discrepancy between simulated adiabatic quantum computing and execution on real quantum hardware, our reformulation of permutation matrix constraints increases the robustness of the numerical computations over other penalty approaches in our experiments. The proposed algorithm has the potential to scale to higher dimensions on future quantum computing architectures, which opens up multiple new directions for solving matching problems in 3D computer vision and graphics.
Submitted 8 July, 2021;
originally announced July 2021.
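One common way to inject permutation matrix constraints into a QUBO, as discussed in the abstract above, is a quadratic penalty on the row and column sums. The sketch below shows this generic baseline, not the paper's specific reformulations. The cost matrix `W`, the penalty weight `lam`, and the brute-force solver standing in for the annealer are all illustrative assumptions. With entries of `W` in [0, 1) and n = 3, every permutation costs less than 9, while any infeasible bit vector pays at least 2 * lam, so lam = 10 is enough for the minimizer to be a valid permutation matrix.

```python
import itertools
import random


def energy(x, W, n, lam):
    """QAP cost x^T W x plus a quadratic penalty on row and column sums."""
    cost = sum(W[a][b] * x[a] * x[b]
               for a in range(n * n) for b in range(n * n))
    rows = sum((sum(x[i * n + j] for j in range(n)) - 1) ** 2 for i in range(n))
    cols = sum((sum(x[i * n + j] for i in range(n)) - 1) ** 2 for j in range(n))
    return cost + lam * (rows + cols)


def is_permutation(x, n):
    """True if the flattened n-by-n binary matrix has all row/column sums 1."""
    return (all(sum(x[i * n + j] for j in range(n)) == 1 for i in range(n))
            and all(sum(x[i * n + j] for i in range(n)) == 1 for j in range(n)))


def solve_brute_force(W, n, lam):
    """Exhaustive stand-in for the annealer: try all 2^(n*n) bit vectors."""
    return min(itertools.product((0, 1), repeat=n * n),
               key=lambda x: energy(x, W, n, lam))


if __name__ == "__main__":
    n = 3
    rng = random.Random(0)  # hypothetical random cost matrix
    W = [[rng.random() for _ in range(n * n)] for _ in range(n * n)]
    best = solve_brute_force(W, n, lam=10.0)
    print(best, is_permutation(best, n))
```

The penalty weight is the delicate part of this baseline: too small and invalid assignments win, too large and the cost term is drowned out on noisy hardware, which is the kind of trade-off that motivates alternative constraint formulations.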
-
Training or Architecture? How to Incorporate Invariance in Neural Networks
Authors:
Kanchana Vaishnavi Gandikota,
Jonas Geiping,
Zorah Lähner,
Adam Czapliński,
Michael Moeller
Abstract:
Many applications require the robustness, or ideally the invariance, of a neural network to certain transformations of input data. Most commonly, this requirement is addressed by either augmenting the training data, using adversarial training, or defining network architectures that include the desired invariance automatically. Unfortunately, the latter often relies on the ability to enumerate all possible transformations, which makes such approaches largely infeasible for infinite sets of transformations, such as arbitrary rotations or scaling. In this work, we propose a method for provably invariant network architectures with respect to group actions by choosing one element from a (possibly continuous) orbit based on a fixed criterion. In a nutshell, we intend to 'undo' any possible transformation before feeding the data into the actual network. We analyze properties of such approaches, extend them to equivariant networks, and demonstrate their advantages in terms of robustness as well as computational efficiency in several numerical examples. In particular, we investigate the robustness with respect to rotations of images (which can possibly hold up to discretization artifacts only) as well as the provable rotational and scaling invariance of 3D point cloud classification.
Submitted 18 June, 2021;
originally announced June 2021.
-
Q-Match: Iterative Shape Matching via Quantum Annealing
Authors:
Marcel Seelbach Benkner,
Zorah Lähner,
Vladislav Golyanik,
Christof Wunderlich,
Christian Theobalt,
Michael Moeller
Abstract:
Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP) that becomes infeasible for shapes with high sampling density. A promising research direction is to tackle such quadratic optimization problems over binary variables with quantum annealing, which, for some problems, allows a more efficient search in the solution space. Unfortunately, enforcing the linear equality constraints in QAPs via a penalty significantly limits the success probability of such methods on currently available quantum hardware. To address this limitation, this paper proposes Q-Match, i.e., a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm, which allows solving problems of an order of magnitude larger than current quantum methods. It implicitly enforces the QAP constraints by updating the current estimates in a cyclic fashion. Further, Q-Match can be applied iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems. Using the latest quantum annealer, the D-Wave Advantage, we evaluate the proposed method on a subset of QAPLIB as well as on isometric shape matching problems from the FAUST dataset.
Submitted 19 August, 2021; v1 submitted 6 May, 2021;
originally announced May 2021.
-
QPack: Quantum Approximate Optimization Algorithms as universal benchmark for quantum computers
Authors:
Koen Mesman,
Zaid Al-Ars,
Matthias Möller
Abstract:
In this paper, we present QPack, a universal benchmark for Noisy Intermediate-Scale Quantum (NISQ) computers based on Quantum Approximate Optimization Algorithms (QAOA). Unlike other evaluation metrics in the field, this benchmark evaluates not only one, but multiple important aspects of quantum computing hardware: the maximum problem size a quantum computer can solve, the required runtime, as well as the achieved accuracy. The applications MaxCut, dominating set and traveling salesman are included to provide variation in resource requirements. This will allow for a diverse benchmark that promotes optimal design considerations, avoiding hardware implementations for specific applications. We also discuss the design aspects that are taken into consideration for the QPack benchmark, with critical quantum benchmark requirements in mind. An implementation is presented, providing practical metrics. QPack is presented as a hardware-agnostic benchmark by making use of the XACC library. We demonstrate the application of the benchmark on various IBM machines, as well as a range of simulators.
Submitted 19 April, 2022; v1 submitted 31 March, 2021;
originally announced March 2021.
-
What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning
Authors:
Jonas Geiping,
Liam Fowl,
Gowthami Somepalli,
Micah Goldblum,
Michael Moeller,
Tom Goldstein
Abstract:
Data poisoning is a threat model in which a malicious actor tampers with training data to manipulate outcomes at inference time. A variety of defenses against this threat model have been proposed, but each suffers from at least one of the following flaws: they are easily overcome by adaptive attacks, they severely reduce testing performance, or they cannot generalize to diverse data poisoning threat models. Adversarial training, and its variants, are currently considered the only empirically strong defense against (inference-time) adversarial attacks. In this work, we extend the adversarial training framework to defend against (training-time) data poisoning, including targeted and backdoor attacks. Our method desensitizes networks to the effects of such attacks by creating poisons during training and injecting them into training batches. We show that this defense withstands adaptive attacks, generalizes to diverse threat models, and incurs a better performance trade-off than previous defenses such as DP-SGD or (evasion) adversarial training.
Submitted 17 February, 2022; v1 submitted 26 February, 2021;
originally announced February 2021.
-
Depthwise Separable Convolutions Allow for Fast and Memory-Efficient Spectral Normalization
Authors:
Christina Runkel,
Christian Etmann,
Michael Möller,
Carola-Bibiane Schönlieb
Abstract:
An increasing number of models require the control of the spectral norm of convolutional layers of a neural network. While there is an abundance of methods for estimating and enforcing upper bounds on those during training, they are typically costly in either memory or time. In this work, we introduce a very simple method for spectral normalization of depthwise separable convolutions, which introduces negligible computational and memory overhead. We demonstrate the effectiveness of our method on image classification tasks using standard architectures like MobileNetV2.
Submitted 12 February, 2021;
originally announced February 2021.
-
Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset
Authors:
Miklas Strøm Kristoffersen,
Martin Bo Møller,
Pablo Martínez-Nuevo,
Jan Østergaard
Abstract:
Knowledge of loudspeaker responses is useful in a number of applications, where a sound system is located inside a room that alters the listening experience depending on position within the room. Acquisition of sound fields for sound sources located in reverberant rooms can be achieved through labor-intensive measurements of impulse response functions covering the room, or alternatively by means of reconstruction methods which can potentially require significantly fewer measurements. This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms. The ISOBEL Sound Field dataset is publicly available, and aims to bridge the gap between synthetic and real-world sound fields in rectangular rooms. Moreover, the paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones, and proposes an approach for modeling both magnitude and phase response in a U-Net-like neural network architecture. The complex-valued sound field reconstruction demonstrates that the estimated room transfer functions are of high enough accuracy to allow for personalized sound zones with contrast ratios comparable to ideal room transfer functions using 15 microphones below 150 Hz.
Submitted 12 February, 2021;
originally announced February 2021.
-
Towards Accuracy and Scalability: Combining Isogeometric Analysis with Deflation to Obtain Scalable Convergence for the Helmholtz Equation
Authors:
Vandana Dwarka,
Roel Tielen,
Matthias Möller,
Kees Vuik
Abstract:
Finding fast yet accurate numerical solutions to the Helmholtz equation remains a challenging task. The pollution error (i.e. the discrepancy between the numerical and analytical wave number k) requires the mesh resolution to be kept fine enough to obtain accurate solutions. A recent study showed that the use of Isogeometric Analysis (IgA) for the spatial discretization significantly reduces the pollution error.
However, solving the resulting linear systems by means of a direct solver remains computationally expensive when large wave numbers or multiple dimensions are considered. An alternative lies in the use of (preconditioned) Krylov subspace methods. Recently, the use of the exact Complex Shifted Laplacian Preconditioner (CSLP) with a small complex shift has been shown to lead to wave number independent convergence while obtaining more accurate numerical solutions using IgA.
In this paper, we propose the use of deflation techniques combined with an approximated inverse of the CSLP using a geometric multigrid method. Numerical results obtained for both one- and two-dimensional model problems, including constant and non-constant wave numbers, show scalable convergence with respect to the wave number and approximation order p of the spatial discretization. Furthermore, when kh is kept constant, the proposed approach leads to a significant reduction of the computational time compared to the use of the exact inverse of the CSLP with a small shift.
Submitted 20 October, 2020;
originally announced October 2020.
-
Learning to Identify Physical Parameters from Video Using Differentiable Physics
Authors:
Rama Krishna Kandukuri,
Jan Achterhold,
Michael Möller,
Jörg Stückler
Abstract:
Video representation learning has recently attracted attention in computer vision due to its applications for activity and scene forecasting or vision-based planning and control. Video prediction models often learn a latent representation of video which is encoded from input frames and decoded back into images. Even when conditioned on actions, purely deep learning based architectures typically lack a physically interpretable latent space. In this study, we use a differentiable physics engine within an action-conditional video representation network to learn a physical latent representation. We propose supervised and self-supervised learning methods to train our network and identify physical properties. The latter uses spatial transformers to decode physical states back into images. The simulation scenarios in our experiments comprise pushing, sliding and colliding objects, for which we also analyze the observability of the physical properties. In experiments we demonstrate that our network can learn to encode images and identify physical properties like mass and friction from videos and action sequences in the simulated scenarios. We evaluate the accuracy of our supervised and self-supervised methods and compare it with a system identification baseline which directly learns from state trajectories. We also demonstrate the ability of our method to predict future video frames from input images and actions.
Submitted 17 September, 2020;
originally announced September 2020.
-
Nonlinear Spectral Geometry Processing via the TV Transform
Authors:
Marco Fumero,
Michael Moeller,
Emanuele Rodolà
Abstract:
We introduce a novel computational framework for digital geometry processing, based upon the derivation of a nonlinear operator associated with the total variation functional. Such an operator admits a generalized notion of spectral decomposition, yielding a sparse multiscale representation akin to Laplacian-based methods, while at the same time avoiding undesirable over-smoothing effects typical of such techniques. Our approach entails accurate, detail-preserving decomposition and manipulation of 3D shape geometry while taking an especially intuitive form: non-local semantic details are well separated into different bands, which can then be filtered and re-synthesized with a straightforward linear step. Our computational framework is flexible, can be applied to a variety of signals, and is easily adapted to different geometry representations, including triangle meshes and point clouds. We showcase our method throughout multiple applications in graphics, ranging from surface and signal denoising to detail transfer and cubic stylization.
Submitted 7 September, 2020;
originally announced September 2020.
-
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching
Authors:
Jonas Geiping,
Liam Fowl,
W. Ronny Huang,
Wojciech Czaja,
Gavin Taylor,
Michael Moeller,
Tom Goldstein
Abstract:
Data poisoning attacks modify training data to maliciously control a model trained on such data. In this work, we focus on targeted poisoning attacks which cause a reclassification of an unmodified test image and as such breach model integrity. We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data. Previous poisoning attacks against deep neural networks in this setting have been limited in scope and success, working only in simplified settings or being prohibitively expensive for large datasets. The central mechanism of the new attack is matching the gradient direction of malicious examples. We analyze why this works, supplement with practical considerations, and show its threat to real-world practitioners, finding that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset. Finally, we demonstrate the limitations of existing defensive strategies against such an attack, concluding that data poisoning is a credible threat, even for large-scale deep learning systems.
Submitted 10 May, 2021; v1 submitted 4 September, 2020;
originally announced September 2020.
-
Exploiting the Logits: Joint Sign Language Recognition and Spell-Correction
Authors:
Christina Runkel,
Stefan Dorenkamp,
Hartmut Bauermeister,
Michael Moeller
Abstract:
Machine learning techniques have excelled in the automatic semantic analysis of images, reaching human-level performance on challenging benchmarks. Yet, the semantic analysis of videos remains challenging due to the significantly higher dimensionality of the input data and the correspondingly higher need for annotated training examples. By studying the automatic recognition of German sign language videos, we demonstrate that on the relatively scarce training data of 2,800 videos, modern deep learning architectures for video analysis (such as ResNeXt), along with transfer learning on large gesture recognition tasks, can achieve about 75% character accuracy. Considering that this leaves us with a probability of under 25% that a 5-letter word is spelled correctly, spell-correction systems are crucial for producing readable outputs. The contribution of this paper is to propose a convolutional neural network for spell-correction that expects the softmax outputs of the character recognition network (instead of a misspelled word) as an input. We demonstrate that purely learning on softmax inputs in combination with scarce training data yields overfitting as the network learns the inputs by heart. In contrast, training the network on several variants of the logits of the classification output, i.e., scaling by a constant factor, adding random noise, mixing softmax and hardmax inputs, or purely training on hardmax inputs, leads to better generalization while benefitting from the significant information hidden in these outputs (which have 98% top-5 accuracy), yielding a readable text despite the comparably low character accuracy.
Submitted 1 July, 2020;
originally announced July 2020.
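The "under 25%" figure quoted in the abstract above can be checked with a short calculation. The sketch below assumes that character errors are independent, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Assumption (not stated in the abstract): each character is recognized
# independently, so a word is correct only if all of its characters are.
char_accuracy = 0.75   # "about 75% character accuracy" from the abstract
word_length = 5        # "a 5-letter word" from the abstract

word_accuracy = char_accuracy ** word_length
print(f"P(5-letter word spelled correctly) = {word_accuracy:.4f}")
```

Under this independence assumption the value comes out to roughly 0.24, consistent with the abstract's claim.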
-
A Simple Domain Shifting Network for Generating Low Quality Images
Authors:
Guruprasad Hegde,
Avinash Nittur Ramesh,
Kanchana Vaishnavi Gandikota,
Roman Obermaisser,
Michael Moeller
Abstract:
Deep learning systems have proven to be extremely successful for image recognition tasks for which significant amounts of training data are available, e.g., on the famous ImageNet dataset. We demonstrate that for robotics applications with cheap camera equipment, however, the low image quality influences the classification accuracy, and freely available databases cannot be exploited in a straightforward way to train classifiers to be used on a robot. As a solution, we propose to train a network to degrade the quality of images in order to mimic specific low quality imaging systems. Numerical experiments demonstrate that classification networks trained by using images produced by our quality degrading network along with the high quality images outperform classification networks trained only on high quality data when used on a real robot system, while being significantly easier to use than competing zero-shot domain adaptation techniques.
Submitted 30 June, 2020;
originally announced June 2020.
-
A Generative Model for Generic Light Field Reconstruction
Authors:
Paramanand Chandramouli,
Kanchana Vaishnavi Gandikota,
Andreas Goerlitz,
Andreas Kolb,
Michael Moeller
Abstract:
Recently, deep generative models have achieved impressive progress in modeling the distribution of training data. In this work, we present for the first time a generative model for 4D light field patches using variational autoencoders to capture the data distribution of light field patches. We develop a generative model conditioned on the central view of the light field and incorporate this as a prior in an energy minimization framework to address diverse light field reconstruction tasks. While pure learning-based approaches do achieve excellent results on each instance of such a problem, their applicability is limited to the specific observation model they have been trained on. In contrast, our trained light field generative model can be incorporated as a prior into any model-based optimization approach and therefore extends to diverse reconstruction tasks including light field view synthesis, spatial-angular super resolution and reconstruction from coded projections. Our proposed method demonstrates good reconstruction, with performance approaching end-to-end trained networks, while outperforming traditional model-based approaches on both synthetic and real scenes. Furthermore, we show that our approach enables reliable light field recovery despite distortions in the input.
Submitted 17 June, 2020; v1 submitted 13 May, 2020;
originally announced May 2020.
-
Fast Convex Relaxations using Graph Discretizations
Authors:
Jonas Geiping,
Fjedor Gaede,
Hartmut Bauermeister,
Michael Moeller
Abstract:
Matching and partitioning problems are fundamentals of computer vision applications with examples in multilabel segmentation, stereo estimation and optical-flow computation. These tasks can be posed as non-convex energy minimization problems and solved near-globally optimal by recent convex lifting approaches. Yet, applying these techniques comes with a significant computational effort, reducing their feasibility in practical applications. We discuss spatial discretization of continuous partitioning problems into a graph structure, generalizing discretization onto a Cartesian grid. This setup allows us to faithfully work on super-pixel graphs constructed by SLIC or Cut-Pursuit, massively decreasing the computational effort for lifted partitioning problems compared to a Cartesian grid, while optimal energy values remain similar: The global matching is still solved near-globally optimal. We discuss this methodology in detail and show examples in multi-label segmentation by minimal partitions and stereo estimation, where we demonstrate that the proposed graph discretization can reduce runtime as well as memory consumption of convex relaxations of matching problems by up to a factor of 10.
Submitted 11 September, 2020; v1 submitted 23 April, 2020;
originally announced April 2020.
-
Inverting Gradients -- How easy is it to break privacy in federated learning?
Authors:
Jonas Geiping,
Hartmut Bauermeister,
Hannah Dröge,
Michael Moeller
Abstract:
The idea of federated learning is to collaboratively train a neural network on a server. Each user receives the current weights of the network and in turn sends parameter updates (gradients) based on local data. This protocol has been designed not only to train neural networks data-efficiently, but also to provide privacy benefits for users, as their input data remains on device and only parameter gradients are shared. But how secure is sharing parameter gradients? Previous attacks have provided a false sense of security, by succeeding only in contrived settings - even for a single image. However, by exploiting a magnitude-invariant loss along with optimization strategies based on adversarial attacks, we show that it is actually possible to faithfully reconstruct images at high resolution from the knowledge of their parameter gradients, and demonstrate that such a break of privacy is possible even for trained deep networks. We analyze the effects of architecture as well as parameters on the difficulty of reconstructing an input image and prove that any input to a fully connected layer can be reconstructed analytically independent of the remaining architecture. Finally, we discuss settings encountered in practice and show that even averaging gradients over several iterations or several images does not protect the user's privacy in federated learning applications in computer vision.
Submitted 11 September, 2020; v1 submitted 31 March, 2020;
originally announced March 2020.
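The claim in the abstract above that an input to a fully connected layer can be reconstructed analytically can be illustrated with the chain rule. The following is a minimal sketch, not the authors' proof or code: it assumes a single layer y = Wx + b that has a bias term, a toy squared-error loss, and a single input (no batch averaging). The helper `recover_input` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): one fully connected layer y = W x + b.
x = rng.normal(size=4)            # private input held by the user
W = rng.normal(size=(3, 4))       # layer weights
b = rng.normal(size=3)            # layer bias
target = rng.normal(size=3)       # arbitrary regression target

# Squared-error loss L = 0.5 * ||y - target||^2, so dL/dy = y - target.
y = W @ x + b
grad_y = y - target

# Gradients a user would share: dL/dW = (dL/dy) x^T and dL/db = dL/dy.
grad_W = np.outer(grad_y, x)
grad_b = grad_y


def recover_input(grad_W, grad_b, eps=1e-12):
    """Recover x from the shared gradients of a biased linear layer.

    Row i of dL/dW equals (dL/db)_i * x, so dividing any row with a
    non-zero bias gradient by that bias gradient yields x.
    """
    i = int(np.argmax(np.abs(grad_b)))   # pick the best-conditioned row
    if abs(grad_b[i]) < eps:
        raise ValueError("all bias gradients are zero; x is not identifiable")
    return grad_W[i] / grad_b[i]


x_rec = recover_input(grad_W, grad_b)
print(np.allclose(x_rec, x))
```

The recovery uses only the two shared gradient tensors, which is why it does not depend on whatever layers follow; the sketch does not cover layers without a bias or gradients averaged over several inputs.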
-
Inferring the location of reflecting surfaces exploiting loudspeaker directivity
Authors:
Vincenzo Zaccà,
Pablo Martinez-Nuevo,
Martin Møller,
Jorge Martínez,
Richard Heusdens
Abstract:
Accurate sound field reproduction in rooms is often limited by the lack of knowledge of the room characteristics. Information about the room shape or nearby reflecting boundaries can, in principle, be used to improve the accuracy of the reproduction. In this paper, we propose a method to infer the location of nearby reflecting boundaries from measurements on a microphone array. As opposed to traditional methods, we explicitly exploit the loudspeaker directivity model (beyond omnidirectional radiation) and the microphone array geometry. This approach does not require noiseless timing information of the echoes as input, nor a tailored loudspeaker-wall-microphone measurement step. Simulations show the proposed model outperforms current methods that disregard directivity in reverberant environments.
Submitted 2 March, 2020;
originally announced March 2020.
-
Sound field reconstruction in rooms: inpainting meets super-resolution
Authors:
Francesc Lluís,
Pablo Martínez-Nuevo,
Martin Bo Møller,
Sven Ewan Shepstone
Abstract:
In this paper, a deep-learning-based method for sound field reconstruction is proposed. It is shown that the magnitude of the sound pressure in the frequency band 30-300 Hz can be reconstructed for an entire room by using a very low number of irregularly distributed, arbitrarily arranged microphones. Moreover, the approach is agnostic to the location of the measurements in the Euclidean space. In particular, the presented approach uses a limited number of arbitrary discrete measurements of the magnitude of the sound field pressure in order to extrapolate this field to a higher-resolution grid of discrete points in space with a low computational complexity. The method is based on a U-net-like neural network with partial convolutions trained solely on simulated data, which itself is constructed from numerical simulations of Green's function across thousands of common rectangular rooms. Although extensible to three dimensions and different room shapes, the method focuses on reconstructing a two-dimensional plane of a rectangular room from measurements of the three-dimensional sound field. Experiments using simulated data together with an experimental validation in a real listening room are shown. The results suggest a performance which may exceed that of conventional reconstruction techniques for a low number of microphones and computational requirements.
Submitted 6 August, 2020; v1 submitted 30 January, 2020;
originally announced January 2020.
-
Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory
Authors:
Micah Goldblum,
Jonas Geiping,
Avi Schwarzschild,
Michael Moeller,
Tom Goldstein
Abstract:
We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.
Submitted 28 April, 2020; v1 submitted 1 October, 2019;
originally announced October 2019.
-
Parametric Majorization for Data-Driven Energy Minimization Methods
Authors:
Jonas Geiping,
Michael Moeller
Abstract:
Energy minimization methods are a classical tool in a multitude of computer vision applications. While they are interpretable and well-studied, their regularity assumptions are difficult to design by hand. Deep learning techniques, on the other hand, are purely data-driven and often provide excellent results, but are very difficult to constrain to predefined physical or safety-critical models. A possible combination of the two approaches is to design a parametric energy and train the free parameters in such a way that minimizers of the energy correspond to the desired solutions on a set of training examples. Unfortunately, such formulations typically lead to bi-level optimization problems, on which common optimization algorithms are difficult to scale to modern requirements in data processing and efficiency. In this work, we present a new strategy to optimize these bi-level problems. We investigate surrogate single-level problems that majorize the target problems and can be implemented with existing tools, leading to efficient algorithms without collapse of the energy function. This framework of strategies enables new avenues to the training of parameterized energy minimization models from large data.
Submitted 16 August, 2019;
originally announced August 2019.
-
Training Auto-encoder-based Optimizers for Terahertz Image Reconstruction
Authors:
Tak Ming Wong,
Matthias Kahl,
Peter Haring Bolívar,
Andreas Kolb,
Michael Möller
Abstract:
Terahertz (THz) sensing is a promising imaging technology for a wide variety of different applications. Extracting the interpretable and physically meaningful parameters for such applications, however, requires solving an inverse problem in which a model function determined by these parameters needs to be fitted to the measured data. Since the underlying optimization problem is nonconvex and very costly to solve, we propose learning the prediction of suitable parameters from the measured data directly. More precisely, we develop a model-based autoencoder in which the encoder network predicts suitable parameters and the decoder is fixed to a physically meaningful model function, such that we can train the encoding network in an unsupervised way. We illustrate numerically that the resulting network is more than 140 times faster than classical optimization techniques while making predictions with only slightly higher objective values. Using such predictions as starting points of local optimization techniques allows us to converge to better local minima about twice as fast as optimization without the network-based initialization.
Submitted 29 October, 2019; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Controlling Neural Networks via Energy Dissipation
Authors:
Michael Moeller,
Thomas Möllenhoff,
Daniel Cremers
Abstract:
The last decade has shown a tremendous success in solving various computer vision problems with the help of deep learning techniques. Lately, many works have demonstrated that learning-based approaches with suitable network architectures even exhibit superior performance for the solution of (ill-posed) image reconstruction problems such as deblurring, super-resolution, or medical image reconstruction. The drawback of purely learning-based methods, however, is that they cannot provide provable guarantees for the trained network to follow a given data formation process during inference. In this work we propose energy dissipating networks that iteratively compute a descent direction with respect to a given cost function or energy at the currently estimated reconstruction. Therefore, an adaptive step size rule such as a line-search, along with a suitable number of iterations can guarantee the reconstruction to follow a given data formation model encoded in the energy to arbitrary precision, and hence control the model's behavior even during test time. We prove that under standard assumptions, descent using the direction predicted by the network converges (linearly) to the global minimum of the energy. We illustrate the effectiveness of the proposed approach in experiments on single image super resolution and computed tomography (CT) reconstruction, and further illustrate extensions to convex feasibility problems.
Submitted 20 August, 2019; v1 submitted 5 April, 2019;
originally announced April 2019.