Search | arXiv e-print repository

arXiv:2405.01967 [pdf, other]

Real-time multichannel deep speech enhancement in hearing aids: Comparing monaural and binaural processing in complex acoustic scenarios

Authors: Nils L. Westhausen, Hendrik Kayser, Theresa Jansen, Bernd T. Meyer

Abstract: Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids. Deep models suited for real-world application should feature a low computational complexity and low processing delay of only a few milliseconds. In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorit… ▽ More Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids. Deep models suited for real-world application should feature a low computational complexity and low processing delay of only a few milliseconds. In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorithms in two complex acoustic scenes. Both algorithms are evaluated with objective metrics and in experiments with hearing-impaired listeners performing a speech-in-noise test. Results are compared to two traditional enhancement strategies, i.e., adaptive differential microphone processing and binaural beamforming. While in diffuse noise, all algorithms perform similarly, the binaural deep learning approach performs best in the presence of spatial interferers. Through a post-analysis, this can be attributed to improvements at low SNRs and to precise spatial filtering. △ Less

Submitted 3 May, 2024; originally announced May 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2405.01309 [pdf, other]

Execution-free Program Repair

Authors: Li Huang, Bertrand Meyer, Ilgiz Mustafin, Manuel Oriol

Abstract: Automatic program repair usually relies heavily on test cases for both bug identification and fix validation. The issue is that writing test cases is tedious, running them takes much time, and validating a fix through tests does not guarantee its correctness. The novel idea in the Proof2Fix methodology and tool presented here is to rely instead on a program prover, without the need to run tests or… ▽ More Automatic program repair usually relies heavily on test cases for both bug identification and fix validation. The issue is that writing test cases is tedious, running them takes much time, and validating a fix through tests does not guarantee its correctness. The novel idea in the Proof2Fix methodology and tool presented here is to rely instead on a program prover, without the need to run tests or to run the program at all. Results show that Proof2Fix finds and fixes significant historical bugs. △ Less

Submitted 9 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2403.08923 [pdf]

Loop unrolling (for test coverage): formal definition

Authors: Bertrand Meyer

Abstract: Techniques to achieve various forms of test coverage, such as branch coverage, typically do not iterate loops; in other words, they treat a loop as a conditional, executed zero or one time. Existing work by the author and collaborators produces test suites guaranteeing full branch coverage. More recent work has shown that by unrolling loops the approach can find significantly more bugs. The presen… ▽ More Techniques to achieve various forms of test coverage, such as branch coverage, typically do not iterate loops; in other words, they treat a loop as a conditional, executed zero or one time. Existing work by the author and collaborators produces test suites guaranteeing full branch coverage. More recent work has shown that by unrolling loops the approach can find significantly more bugs. The present discussion provides the theoretical basis and precise definition for this concept of unrolling. △ Less

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.14471 [pdf, ps, other]

BUGFIX: towards a common language and framework for the AutomaticProgram Repair community

Authors: Bertrand Meyer, Viktoryia Kananchuk, Li Huang

Abstract: Techniques of Automatic Program Repair (APR) have the potential of thoroughly facilitating the task of producing quality software. After a promising start, however, progress in making APR practical has been hindered by the lack of a common framework to support the multiplicity of APR ideas and tools, and of target programming languages and environments. In this position paper we outline a genera… ▽ More Techniques of Automatic Program Repair (APR) have the potential of thoroughly facilitating the task of producing quality software. After a promising start, however, progress in making APR practical has been hindered by the lack of a common framework to support the multiplicity of APR ideas and tools, and of target programming languages and environments. In this position paper we outline a general framework to enable the APR community to benefit from each otherś advances, in particular through a standard language for describing bugs and their fixes. Such a common framework (which is also applicable to work on fault seeding) could be a tremendous benefit to researchers and developers of Interactive Development Environments (IDEs) who are working to make APR an effective part of the practical experience of software developers. △ Less

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.01169 [pdf, ps, other]

Faster Inference of Integer SWIN Transformer by Removing the GELU Activation

Authors: Mohammadreza Tayaranian, Seyyed Hasan Mozafari, James J. Clark, Brett Meyer, Warren Gross

Abstract: SWIN transformer is a prominent vision transformer model that has state-of-the-art accuracy in image classification tasks. Despite this success, its unique architecture causes slower inference compared with similar deep neural networks. Integer quantization of the model is one of the methods used to improve its inference latency. However, state-of-the-art has not been able to fully quantize the mo… ▽ More SWIN transformer is a prominent vision transformer model that has state-of-the-art accuracy in image classification tasks. Despite this success, its unique architecture causes slower inference compared with similar deep neural networks. Integer quantization of the model is one of the methods used to improve its inference latency. However, state-of-the-art has not been able to fully quantize the model. In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer. While previous work proposed to replace the non-integer operations with linear approximation functions, we propose to replace GELU with ReLU activation. The advantage of ReLU over previous methods is its low memory and computation complexity. We use iterative knowledge distillation to compensate for the lost accuracy due to replacing GELU with ReLU. We quantize our GELU-less SWIN transformer and show that on an RTX 4090 NVIDIA GPU we can improve the inference latency of the quantized SWIN transformer by at least $11\%$ while maintaining an accuracy drop of under $0.5\%$ on the ImageNet evaluation dataset. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: 5 pages, 1 figure. Submitted to Edge Intelligence Workshop III, an AAAI 2024 workshop

arXiv:2401.13212 [pdf, other]

AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation

Authors: Lulan Shen, Ali Edalati, Brett Meyer, Warren Gross, James J. Clark

Abstract: This paper describes a simple yet effective technique for refining a pretrained classifier network. The proposed AdCorDA method is based on modification of the training set and making use of the duality between network weights and layer inputs. We call this input space training. The method consists of two stages - adversarial correction followed by domain adaptation. Adversarial correction uses ad… ▽ More This paper describes a simple yet effective technique for refining a pretrained classifier network. The proposed AdCorDA method is based on modification of the training set and making use of the duality between network weights and layer inputs. We call this input space training. The method consists of two stages - adversarial correction followed by domain adaptation. Adversarial correction uses adversarial attacks to correct incorrect training-set classifications. The incorrectly classified samples of the training set are removed and replaced with the adversarially corrected samples to form a new training set, and then, in the second stage, domain adaptation is performed back to the original training set. Extensive experimental validations show significant accuracy boosts of over 5% on the CIFAR-100 dataset. The technique can be straightforwardly applied to refinement of weight-quantized neural networks, where experiments show substantial enhancement in performance over the baseline. The adversarial correction technique also results in enhanced robustness to adversarial attacks. △ Less

Submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.12014 [pdf, other]

Robustness to distribution shifts of compressed networks for edge devices

Authors: Lulan Shen, Ali Edalati, Brett Meyer, Warren Gross, James J. Clark

Abstract: It is necessary to develop efficient DNNs deployed on edge devices with limited computation resources. However, the compressed networks often execute new tasks in the target domain, which is different from the source domain where the original network is trained. It is important to investigate the robustness of compressed networks in two types of data distribution shifts: domain shifts and adversar… ▽ More It is necessary to develop efficient DNNs deployed on edge devices with limited computation resources. However, the compressed networks often execute new tasks in the target domain, which is different from the source domain where the original network is trained. It is important to investigate the robustness of compressed networks in two types of data distribution shifts: domain shifts and adversarial perturbations. In this study, we discover that compressed models are less robust to distribution shifts than their original networks. Interestingly, larger networks are more vulnerable to losing robustness than smaller ones, even when they are compressed to a similar size as the smaller networks. Furthermore, compact networks obtained by knowledge distillation are much more robust to distribution shifts than pruned networks. Finally, post-training quantization is a reliable method for achieving significant robustness to distribution shifts, and it outperforms both pruned and distilled models in terms of robustness. △ Less

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2312.05173 [pdf, other]

Binaural multichannel blind speaker separation with a causal low-latency and low-complexity approach

Authors: Nils L. Westhausen, Bernd T. Meyer

Abstract: In this paper, we introduce a causal low-latency low-complexity approach for binaural multichannel blind speaker separation in noisy reverberant conditions. The model, referred to as Group Communication Binaural Filter and Sum Network (GCBFSnet) predicts complex filters for filter-and-sum beamforming in the time-frequency domain. We apply Group Communication (GC), i.e., latent model variable… ▽ More In this paper, we introduce a causal low-latency low-complexity approach for binaural multichannel blind speaker separation in noisy reverberant conditions. The model, referred to as Group Communication Binaural Filter and Sum Network (GCBFSnet) predicts complex filters for filter-and-sum beamforming in the time-frequency domain. We apply Group Communication (GC), i.e., latent model variables are split into groups and processed with a shared sequence model with the aim of reducing the complexity of a simple model only containing one convolutional and one recurrent module. With GC we are able to reduce the size of the model by up to 83 % and the complexity up to 73 % compared to the model without GC, while mostly retaining performance. Even for the smallest model configuration, GCBFSnet matches the performance of a low-complexity TasNet baseline in most metrics despite the larger size and higher number of required operations of the baseline. △ Less

Submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted for publication at IEEE ICASSP 2024 OJSP track

arXiv:2311.15971 [pdf, other]

Supply Chain Due Diligence Risk Assessment for the EU: A Network Approach to estimate expected effectiveness of the planned EU directive

Authors: Jan Hurt, Katharina Ledebur, Birgit Meyer, Klaus Friesenbichler, Markus Gerschberger, Stefan Thurner, Peter Klimek

Abstract: Globalization has had undesirable effects on the labor standards embedded in the products we consume. This paper proposes an ex-ante evaluation of supply chain due diligence regulations, such as the EU Corporate Sustainable Due Diligence Directive (CSDDD). We construct a full-scale network model derived from structural business statistics of 30 million EU firms to quantify the likelihood of links… ▽ More Globalization has had undesirable effects on the labor standards embedded in the products we consume. This paper proposes an ex-ante evaluation of supply chain due diligence regulations, such as the EU Corporate Sustainable Due Diligence Directive (CSDDD). We construct a full-scale network model derived from structural business statistics of 30 million EU firms to quantify the likelihood of links to firms potentially involved in human rights abuses in the European supply chain. The 900 million supply links of these firms are modeled in a way that is consistent with multiregional input-output data, EU import data, and stylized facts of firm-level production networks. We find that this network exhibits a small world effect with three degrees of separation, meaning that most firms are no more than three steps away from each other in the network. Consequently we find that about 8.5% of EU companies are at risk of having child or forced labor in the first tier of their supply chains, about 82.4% are likely to have such offenders at the second tier and more than 99.1% have such offenders at the third tier. We also profile companies by country, sector, and size for the likelihood of having human rights violations or child and forced labor violations at a given tier in their supply chain, revealing considerable heterogeneity across EU companies. Our results show that supply chain due diligence regulations that focus on monitoring individual buyer-supplier links, as currently proposed in the CSDDD, are likely to be ineffective due to a high degree of redundancy and the fact that individual company value chains cannot be properly isolated from the global supply network. Rather, to maximize cost-effectiveness without compromising due diligence coverage, we suggest that regulations should focus on monitoring individual suppliers. △ Less

Submitted 12 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: see also the visualization under https://vis.csh.ac.at/scdd-exposure-indicator/

arXiv:2309.05827 [pdf, ps, other]

Digraph Branchings and Matrix Determinants

Authors: Sayani Ghosh, Bradley S. Meyer

Abstract: We present a version of the matrix-tree theorem, which relates the determinant of a matrix to sums of weights of arborescences of its directed graph representation. Our treatment allows for non-zero column sums in the parent matrix by adding a root vertex to the usually considered matrix directed graph. We use our result to prove a version of the matrix-forest, or all-minors, theorem, which relate… ▽ More We present a version of the matrix-tree theorem, which relates the determinant of a matrix to sums of weights of arborescences of its directed graph representation. Our treatment allows for non-zero column sums in the parent matrix by adding a root vertex to the usually considered matrix directed graph. We use our result to prove a version of the matrix-forest, or all-minors, theorem, which relates minors of the matrix to forests of arborescences of the matrix digraph. We then show that it is possible, when the source and target vertices of an arc are not strongly connected, to move the source of the arc in the matrix directed graph and leave the resulting matrix determinant unchanged, as long as the source and target vertices are not strongly connected after the move. This result enables graphical strategies for factoring matrix determinants. △ Less

Submitted 13 September, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2309.04232 [pdf, other]

Seeding Contradiction: a fast method for generating full-coverage test suites

Authors: Li Huang, Bertrand Meyer, Manuel Oriol

Abstract: The regression test suite, a key resource for managing program evolution, needs to achieve 100% coverage, or very close, to be useful. Devising a test suite manually is unacceptably tedious, but existing automated methods are often inefficient. The method described in this article, ``Seeding Contradiction'', inserts incorrect instructions into every basic block of the program, enabling an SMT-base… ▽ More The regression test suite, a key resource for managing program evolution, needs to achieve 100% coverage, or very close, to be useful. Devising a test suite manually is unacceptably tedious, but existing automated methods are often inefficient. The method described in this article, ``Seeding Contradiction'', inserts incorrect instructions into every basic block of the program, enabling an SMT-based Hoare-style prover to generate a counterexample for every branch of the program and, from the collection of all such counterexamples, a test suite. The method is static, works fast, and achieves excellent coverage. △ Less

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2308.13201 [pdf, other]

doi 10.1109/TASLP.2024.3416697

Deep Active Audio Feature Learning in Resource-Constrained Environments

Authors: Md Mohaimenuzzaman, Christoph Bergmeir, Bernd Meyer

Abstract: The scarcity of labelled data makes training Deep Neural Network (DNN) models in bioacoustic applications challenging. In typical bioacoustics applications, manually labelling the required amount of data can be prohibitively expensive. To effectively identify both new and current classes, DNN models must continue to learn new features from a modest amount of fresh data. Active Learning (AL) is an… ▽ More The scarcity of labelled data makes training Deep Neural Network (DNN) models in bioacoustic applications challenging. In typical bioacoustics applications, manually labelling the required amount of data can be prohibitively expensive. To effectively identify both new and current classes, DNN models must continue to learn new features from a modest amount of fresh data. Active Learning (AL) is an approach that can help with this learning while requiring little labelling effort. Nevertheless, the use of fixed feature extraction approaches limits feature quality, resulting in underutilization of the benefits of AL. We describe an AL framework that addresses this issue by incorporating feature extraction into the AL loop and refining the feature extractor after each round of manual annotation. In addition, we use raw audio processing rather than spectrograms, which is a novel approach. Experiments reveal that the proposed AL framework requires 14.3%, 66.7%, and 47.4% less labelling effort on benchmark audio datasets ESC-50, UrbanSound8k, and InsectWingBeat, respectively, for a large DNN model and similar savings on a microcontroller-based counterpart. Furthermore, we showcase the practical relevance of our study by incorporating data from conservation biology projects. All codes are publicly available on GitHub. △ Less

Submitted 30 June, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

Journal ref: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 32, 2024

arXiv:2304.11207 [pdf, other]

SSS3D: Fast Neural Architecture Search For Efficient Three-Dimensional Semantic Segmentation

Authors: Olivier Therrien, Marihan Amein, Zhuoran Xiong, Warren J. Gross, Brett H. Meyer

Abstract: We present SSS3D, a fast multi-objective NAS framework designed to find computationally efficient 3D semantic scene segmentation networks. It uses RandLA-Net, an off-the-shelf point-based network, as a super-network to enable weight sharing and reduce search time by 99.67% for single-stage searches. SSS3D has a complex search space composed of sampling and architectural parameters that can form 2.… ▽ More We present SSS3D, a fast multi-objective NAS framework designed to find computationally efficient 3D semantic scene segmentation networks. It uses RandLA-Net, an off-the-shelf point-based network, as a super-network to enable weight sharing and reduce search time by 99.67% for single-stage searches. SSS3D has a complex search space composed of sampling and architectural parameters that can form 2.88 * 10^17 possible networks. To further reduce search time, SSS3D splits the complete search space and introduces a two-stage search that finds optimal subnetworks in 54% of the time required by single-stage searches. △ Less

Submitted 21 April, 2023; originally announced April 2023.

Comments: Accepted as a full paper by the TinyML Research Symposium 2023

arXiv:2303.16322 [pdf, other]

FMAS: Fast Multi-Objective SuperNet Architecture Search for Semantic Segmentation

Authors: Zhuoran Xiong, Marihan Amein, Olivier Therrien, Warren J. Gross, Brett H. Meyer

Abstract: We present FMAS, a fast multi-objective neural architecture search framework for semantic segmentation. FMAS subsamples the structure and pre-trained parameters of DeepLabV3+, without fine-tuning, dramatically reducing training time during search. To further reduce candidate evaluation time, we use a subset of the validation dataset during the search. Only the final, Pareto non-dominated, candidat… ▽ More We present FMAS, a fast multi-objective neural architecture search framework for semantic segmentation. FMAS subsamples the structure and pre-trained parameters of DeepLabV3+, without fine-tuning, dramatically reducing training time during search. To further reduce candidate evaluation time, we use a subset of the validation dataset during the search. Only the final, Pareto non-dominated, candidates are ultimately fine-tuned using the complete training set. We evaluate FMAS by searching for models that effectively trade accuracy and computational cost on the PASCAL VOC 2012 dataset. FMAS finds competitive designs quickly, e.g., taking just 0.5 GPU days to discover a DeepLabV3+ variant that reduces FLOPs and parameters by 10$\%$ and 20$\%$ respectively, for less than 3$\%$ increased error. We also search on an edge device called GAP8 and use its latency as the metric. FMAS is capable of finding 2.2$\times$ faster network with 7.61$\%$ MIoU loss. △ Less

Submitted 28 March, 2023; originally announced March 2023.

Comments: Accepted as a full paper by the TinyML Research Symposium 2023

arXiv:2301.10559 [pdf, other]

Tracking Different Ant Species: An Unsupervised Domain Adaptation Framework and a Dataset for Multi-object Tracking

Authors: Chamath Abeysinghe, Chris Reid, Hamid Rezatofighi, Bernd Meyer

Abstract: Tracking individuals is a vital part of many experiments conducted to understand collective behaviour. Ants are the paradigmatic model system for such experiments but their lack of individually distinguishing visual features and their high colony densities make it extremely difficult to perform reliable tracking automatically. Additionally, the wide diversity of their species' appearances makes a… ▽ More Tracking individuals is a vital part of many experiments conducted to understand collective behaviour. Ants are the paradigmatic model system for such experiments but their lack of individually distinguishing visual features and their high colony densities make it extremely difficult to perform reliable tracking automatically. Additionally, the wide diversity of their species' appearances makes a generalized approach even harder. In this paper, we propose a data-driven multi-object tracker that, for the first time, employs domain adaptation to achieve the required generalisation. This approach is built upon a joint-detection-and-tracking framework that is extended by a set of domain discriminator modules integrating an adversarial training strategy in addition to the tracking loss. In addition to this novel domain-adaptive tracking framework, we present a new dataset and a benchmark for the ant tracking problem. The dataset contains 57 video sequences with full trajectory annotation, including 30k frames captured from two different ant species moving on different background patterns. It comprises 33 and 24 sequences for source and target domains, respectively. We compare our proposed framework against other domain-adaptive and non-domain-adaptive multi-object tracking baselines using this dataset and show that incorporating domain adaptation at multiple levels of the tracking pipeline yields significant improvements. The code and the dataset are available at https://github.com/chamathabeysinghe/da-tracker. △ Less

Submitted 16 May, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.02206 [pdf, other]

Lessons from Formally Verified Deployed Software Systems (Extended version)

Authors: Li Huang, Sophie Ebersold, Alexander Kogtenkov, Bertrand Meyer, Yinling Liu

Abstract: The technology of formal software verification has made spectacular advances, but how much does it actually benefit the development of practical software? Considerable disagreement remains about the practicality of building systems with mechanically-checked proofs of correctness. Is this prospect confined to a few expensive, life-critical projects, or can the idea be applied to a wide segment of t… ▽ More The technology of formal software verification has made spectacular advances, but how much does it actually benefit the development of practical software? Considerable disagreement remains about the practicality of building systems with mechanically-checked proofs of correctness. Is this prospect confined to a few expensive, life-critical projects, or can the idea be applied to a wide segment of the software industry? To help answer this question, the present survey examines a range of projects, in various application areas, that have produced formally verified systems and deployed them for actual use. It considers the technologies used, the form of verification applied, the results obtained, and the lessons that the software industry should draw regarding its ability to benefit from formal verification techniques and tools. Note: this version is the extended article, covering all the systems identified as relevant. A shorter version, covering only a selection, is also available. △ Less

Submitted 28 March, 2024; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: 94 pages, 26 figures, 8 previous surveys, 32 systems descriptions

ACM Class: A.1; D.0; D.2; F.0; F.3; K.2; K.6

arXiv:2212.12965 [pdf, other]

BD-KD: Balancing the Divergences for Online Knowledge Distillation

Authors: Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark

Abstract: Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large powerful networks into smaller lower-capacity models. Online distillation, in which both the teacher and the student are learning collaboratively, has also gained much interest due to its ability to improve on the performance of the networks… ▽ More Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large powerful networks into smaller lower-capacity models. Online distillation, in which both the teacher and the student are learning collaboratively, has also gained much interest due to its ability to improve on the performance of the networks involved. The Kullback-Leibler (KL) divergence ensures the proper knowledge transfer between the teacher and student. However, most online KD techniques present some bottlenecks under the network capacity gap. By cooperatively and simultaneously training, the models the KL distance becomes incapable of properly minimizing the teacher's and student's distributions. Alongside accuracy, critical edge device applications are in need of well-calibrated compact networks. Confidence calibration provides a sensible way of getting trustworthy predictions. We propose BD-KD: Balancing of Divergences for online Knowledge Distillation. We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process. We demonstrate that, by performing this balancing design at the level of the student distillation loss, we improve upon both performance accuracy and calibration of the compact student network. We conducted extensive experiments using a variety of network architectures and show improvements on multiple datasets including CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet. We illustrate the effectiveness of our approach through comprehensive comparisons and ablations with current state-of-the-art online and offline KD techniques. △ Less

Submitted 25 December, 2022; originally announced December 2022.

arXiv:2211.16597 [pdf]

Right and wrong: ten choices in language design

Authors: Bertrand Meyer

Abstract: A description of language design choices that have profound effects on software quality, criticism of how ordinary OO languages address them, and explanation of the thinking behind Eiffel's corresponding mechanisms. A description of language design choices that have profound effects on software quality, criticism of how ordinary OO languages address them, and explanation of the thinking behind Eiffel's corresponding mechanisms. △ Less

Submitted 14 June, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

arXiv:2211.12185 [pdf, other]

Multimorbidity Content-Based Medical Image Retrieval Using Proxies

Authors: Yunyan Xing, Benjamin J. Meyer, Mehrtash Harandi, Tom Drummond, Zongyuan Ge

Abstract: Content-based medical image retrieval is an important diagnostic tool that improves the explainability of computer-aided diagnosis systems and provides decision making support to healthcare professionals. Medical imaging data, such as radiology images, are often multimorbidity; a single sample may have more than one pathology present. As such, image retrieval systems for the medical domain must be… ▽ More Content-based medical image retrieval is an important diagnostic tool that improves the explainability of computer-aided diagnosis systems and provides decision making support to healthcare professionals. Medical imaging data, such as radiology images, are often multimorbidity; a single sample may have more than one pathology present. As such, image retrieval systems for the medical domain must be designed for the multi-label scenario. In this paper, we propose a novel multi-label metric learning method that can be used for both classification and content-based image retrieval. In this way, our model is able to support diagnosis by predicting the presence of diseases and provide evidence for these predictions by returning samples with similar pathological content to the user. In practice, the retrieved images may also be accompanied by pathology reports, further assisting in the diagnostic process. Our method leverages proxy feature vectors, enabling the efficient learning of a robust feature space in which the distance between feature vectors can be used as a measure of the similarity of those samples. Unlike existing proxy-based methods, training samples are able to assign to multiple proxies that span multiple class labels. This multi-label proxy assignment results in a feature space that encodes the complex relationships between diseases present in medical imaging data. Our method outperforms state-of-the-art image retrieval systems and a set of baseline approaches. We demonstrate the efficacy of our approach to both classification and content-based image retrieval on two multimorbidity radiology datasets. △ Less

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2209.07606 [pdf, other]

CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation

Authors: Ibtihel Amara, Maryam Ziaeefard, Brett H. Meyer, Warren Gross, James J. Clark

Abstract: Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices. However, the performance of KD is affected by the large capacity gap between the teacher and student networks. Recent methods have resorted to a multiple teacher assistant (TA) setting for KD, which sequentially decreases the size of the teacher model to relatively bridge the size gap betw… ▽ More Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices. However, the performance of KD is affected by the large capacity gap between the teacher and student networks. Recent methods have resorted to a multiple teacher assistant (TA) setting for KD, which sequentially decreases the size of the teacher model to relatively bridge the size gap between these models. This paper proposes a new technique called Curriculum Expert Selection for Knowledge Distillation (CES-KD) to efficiently enhance the learning of a compact student under the capacity gap problem. This technique is built upon the hypothesis that a student network should be guided gradually using stratified teaching curriculum as it learns easy (hard) data samples better and faster from a lower (higher) capacity teacher network. Specifically, our method is a gradual TA-based KD technique that selects a single teacher per input image based on a curriculum driven by the difficulty in classifying the image. In this work, we empirically verify our hypothesis and rigorously experiment with CIFAR-10, CIFAR-100, CINIC-10, and ImageNet datasets and show improved accuracy on VGG-like models, ResNets, and WideResNets architectures. △ Less

Submitted 15 September, 2022; originally announced September 2022.

Comments: ICPR2022

arXiv:2209.02189 [pdf, other]

Object-Oriented Requirements: a Unified Framework for Specifications, Scenarios and Tests

Authors: Maria Naumcheva, Sophie Ebersold, Alexandr Naumchev, Jean-Michel Bruel, Florian Galinier, Bertrand Meyer

Abstract: A paradox of requirements specifications as dominantly practiced in the industry is that they often claim to be object-oriented (OO) but largely rely on procedural (non-OO) techniques. Use cases and user stories describe functional flows, not object types. To gain the benefits provided by object technology (such as extendibility, reusability, reliability), requirements should instead take advantag… ▽ More A paradox of requirements specifications as dominantly practiced in the industry is that they often claim to be object-oriented (OO) but largely rely on procedural (non-OO) techniques. Use cases and user stories describe functional flows, not object types. To gain the benefits provided by object technology (such as extendibility, reusability, reliability), requirements should instead take advantage of the same data abstraction concepts - classes, inheritance, information hiding - as OO design and OO programs. Many people find use cases and user stories appealing because of the simplicity and practicality of the concepts. Can we reconcile requirements with object-oriented principles and get the best of both worlds? This article proposes a unified framework. It shows that the concept of class is general enough to describe not only "objects" in a narrow sense but also scenarios such as use cases and user stories and other important artifacts such as test cases and oracles. Having a single framework opens the way to requirements that enjoy the benefits of both approaches: like use cases and user stories, they reflect the practical views of stakeholders; like object-oriented requirements, they lend themselves to evolution and reuse. △ Less

Submitted 10 May, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

arXiv:2208.10492 [pdf, other]

Improving Counterexample Quality from Failed Program Verification

Authors: Li Huang, Bertrand Meyer, Manuel Oriol

Abstract: In software verification, a successful automated program proof is the ultimate triumph. The road to such success is, however, paved with many failed proof attempts. The message produced by the prover when a proof fails is often obscure, making it very hard to know how to proceed further. The work reported here attempts to help in such cases by providing immediately understandable counterexamples.… ▽ More In software verification, a successful automated program proof is the ultimate triumph. The road to such success is, however, paved with many failed proof attempts. The message produced by the prover when a proof fails is often obscure, making it very hard to know how to proceed further. The work reported here attempts to help in such cases by providing immediately understandable counterexamples. To this end, it introduces an approach called Counterexample Extraction and Minimization (CEAM). When a proof fails, CEAM turns the counterexample model generated by the prover into a a clearly understandable version; it can in addition simplify the counterexamples further by minimizing the integer values they contain. We have implemented the CEAM approach as an extension to the AutoProof verifier and demonstrate its application to a collection of examples. △ Less

Submitted 26 August, 2022; v1 submitted 21 August, 2022; originally announced August 2022.

Comments: arXiv admin note: text overlap with arXiv:2208.09873

ACM Class: D.2.4; D.2.5; D.2.1

arXiv:2208.09873 [pdf, other]

A Failed Proof Can Yield a Useful Test

Authors: Li Huang, Bertrand Meyer

Abstract: A successful automated program proof is, in software verification, the ultimate triumph. In practice, however, the road to such success is paved with many failed proof attempts. Unlike a failed test, which provides concrete evidence of an actual bug in the program, a failed proof leaves the programmer in the dark. Can we instead learn something useful from it? The work reported here takes advant… ▽ More A successful automated program proof is, in software verification, the ultimate triumph. In practice, however, the road to such success is paved with many failed proof attempts. Unlike a failed test, which provides concrete evidence of an actual bug in the program, a failed proof leaves the programmer in the dark. Can we instead learn something useful from it? The work reported here takes advantage of the rich internal information that some automatic provers collect about the program when attempting a proof. If the proof fails, the Proof2Test tool presented in this article uses the counterexample generated by the prover (specifically, the SMT solver underlying the proof environment Boogie, used in the AutoProof system to perform correctness proofs of contract-equipped Eiffel programs) to produce a failed test, which provides the programmer with immediately exploitable information to correct the program. The discussion presents the Proof2Test tool and demonstrates the application of the ideas and tool to a collection of representative examples. △ Less

Submitted 24 April, 2023; v1 submitted 21 August, 2022; originally announced August 2022.

ACM Class: D.2.4; D.2.5; D.2.1

arXiv:2208.02070 [pdf, other]

Efficient Fine-Tuning of Compressed Language Models with Learners

Authors: Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

Abstract: Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the computational challenges of training to downstream tasks. We introduce Learner modules and priming, novel methods for fine-tuning that exploit the overparameterization of… ▽ More Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the computational challenges of training to downstream tasks. We introduce Learner modules and priming, novel methods for fine-tuning that exploit the overparameterization of pre-trained language models to gain benefits in convergence speed and resource utilization. Learner modules navigate the double bind of 1) training efficiently by fine-tuning a subset of parameters, and 2) training effectively by ensuring quick convergence and high metric scores. Our results on DistilBERT demonstrate that learners perform on par with or surpass the baselines. Learners train 7x fewer parameters than state-of-the-art methods on GLUE. On CoLA, learners fine-tune 20% faster, and have significantly lower resource utilization. △ Less

Submitted 3 August, 2022; originally announced August 2022.

Comments: 8 pages, 9 figures, 2 tables, presented at ICML 2022 workshop on Hardware-Aware Efficient Training (HAET 2022)

arXiv:2205.01541 [pdf, other]

doi 10.1109/ISCAS48785.2022.9937567

Efficient Fine-Tuning of BERT Models on the Edge

Authors: Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

Abstract: Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. With the increasing size of deep neural networks, as noted with the likes of BERT and other natural language processing models, comes increased reso… ▽ More Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. With the increasing size of deep neural networks, as noted with the likes of BERT and other natural language processing models, comes increased resource requirements, namely memory, computation, energy, and time. Furthermore, training is far more resource intensive than inference. Resource-constrained on-device learning is thus doubly difficult, especially with large BERT-like models. By reducing the memory usage of fine-tuning, pre-trained BERT models can become efficient enough to fine-tune on resource-constrained devices. We propose Freeze And Reconfigure (FAR), a memory-efficient training regime for BERT-like models that reduces the memory usage of activation maps during fine-tuning by avoiding unnecessary parameter updates. FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, and time spent on memory operations by 47%. More broadly, reductions in metric performance on the GLUE and SQuAD datasets are around 1% on average. △ Less

Submitted 3 May, 2022; originally announced May 2022.

Comments: 4 pages, 2 figures, 3 tables. To be published in ISCAS 2022 and made available on IEEE Xplore

arXiv:2204.01300 [pdf, other]

tPLCnet: Real-time Deep Packet Loss Concealment in the Time Domain Using a Short Temporal Context

Authors: Nils L. Westhausen, Bernd T. Meyer

Abstract: This paper introduces a real-time time-domain packet loss concealment (PLC) neural-network (tPLCnet). It efficiently predicts lost frames from a short context buffer in a sequence-to-one (seq2one) fashion. Because of its seq2one structure, a continuous inference of the model is not required since it can be triggered when packet loss is actually detected. It is trained on 64h of open-source speech… ▽ More This paper introduces a real-time time-domain packet loss concealment (PLC) neural-network (tPLCnet). It efficiently predicts lost frames from a short context buffer in a sequence-to-one (seq2one) fashion. Because of its seq2one structure, a continuous inference of the model is not required since it can be triggered when packet loss is actually detected. It is trained on 64h of open-source speech data and packet-loss traces of real calls provided by the Audio PLC Challenge. The model with the lowest complexity described in this paper reaches a robust PLC performance and consistent improvements over the zero-filling baseline for all metrics. A configuration with higher complexity is submitted to the PLC Challenge and shows a performance increase of 1.07 compared to the zero-filling baseline in terms of PLC-MOS on the blind test set and reaches a competitive 3rd place in the challenge ranking. △ Less

Submitted 4 April, 2022; originally announced April 2022.

Comments: Submitted to Interspeech 2022

arXiv:2203.09148 [pdf, other]

doi 10.1016/j.csl.2021.101329

Prediction of speech intelligibility with DNN-based performance measures

Authors: Angel Mario Castro Martinez, Constantin Spille, Jana Roßbach, Birger Kollmeier, Bernd T. Meyer

Abstract: This paper presents a speech intelligibility model based on automatic speech recognition (ASR), combining phoneme probabilities from deep neural networks (DNN) and a performance measure that estimates the word error rate from these probabilities. This model does not require the clean speech reference nor the word labels during testing as the ASR decoding step, which finds the most likely sequence… ▽ More This paper presents a speech intelligibility model based on automatic speech recognition (ASR), combining phoneme probabilities from deep neural networks (DNN) and a performance measure that estimates the word error rate from these probabilities. This model does not require the clean speech reference nor the word labels during testing as the ASR decoding step, which finds the most likely sequence of words given phoneme posterior probabilities, is omitted. The model is evaluated via the root-mean-squared error between the predicted and observed speech reception thresholds from eight normal-hearing listeners. The recognition task consists of identifying noisy words from a German matrix sentence test. The speech material was mixed with eight noise maskers covering different modulation types, from speech-shaped stationary noise to a single-talker masker. The prediction performance is compared to five established models and an ASR-model using word labels. Two combinations of features and networks were tested. Both include temporal information either at the feature level (amplitude modulation filterbanks and a feed-forward network) or captured by the architecture (mel-spectrograms and a time-delay deep neural network, TDNN). The TDNN model is on par with the DNN while reducing the number of parameters by a factor of 37; this optimization allows parallel streams on dedicated hearing aid hardware as a forward-pass can be computed within the 10ms of each frame. The proposed model performs almost as well as the label-based model and produces more accurate predictions than the baseline models. △ Less

Submitted 17 March, 2022; originally announced March 2022.

Journal ref: Computer Speech & Language, 74, p.101329 (2022)

arXiv:2202.12422 [pdf, other]

Standard Deviation-Based Quantization for Deep Neural Networks

Authors: Amir Ardakani, Arash Ardakani, Brett Meyer, James J. Clark, Warren J. Gross

Abstract: Quantization of deep neural networks is a promising approach that reduces the inference cost, making it feasible to run deep networks on resource-restricted devices. Inspired by existing methods, we propose a new framework to learn the quantization intervals (discrete values) using the knowledge of the network's weight and activation distributions, i.e., standard deviation. Furthermore, we propose… ▽ More Quantization of deep neural networks is a promising approach that reduces the inference cost, making it feasible to run deep networks on resource-restricted devices. Inspired by existing methods, we propose a new framework to learn the quantization intervals (discrete values) using the knowledge of the network's weight and activation distributions, i.e., standard deviation. Furthermore, we propose a novel base-2 logarithmic quantization scheme to quantize weights to power-of-two discrete values. Our proposed scheme allows us to replace resource-hungry high-precision multipliers with simple shift-add operations. According to our evaluations, our method outperforms existing work on CIFAR10 and ImageNet datasets and even achieves better accuracy performance with 3-bit weights and activations when compared to the full-precision models. Moreover, our scheme simultaneously prunes the network's parameters and allows us to flexibly adjust the pruning ratio during the quantization process. △ Less

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2111.01914 [pdf, other]

Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks

Authors: Nils L. Westhausen, Rainer Huber, Hannah Baumgartner, Ragini Sinha, Jan Rennies, Bernd T. Meyer

Abstract: Listening to the audio of TV broadcast signals can be challenging for hearing-impaired as well as normal-hearing listeners, especially when background sounds are prominent or too loud compared to the speech signal. This can result in a reduced satisfaction and increased listening effort of the listeners. Since the broadcast sound is usually premixed, we perform a subjective evaluation for quantify… ▽ More Listening to the audio of TV broadcast signals can be challenging for hearing-impaired as well as normal-hearing listeners, especially when background sounds are prominent or too loud compared to the speech signal. This can result in a reduced satisfaction and increased listening effort of the listeners. Since the broadcast sound is usually premixed, we perform a subjective evaluation for quantifying the potential of speech enhancement systems based on audio source separation and recurrent neural networks (RNN). Recently, RNNs have shown promising results in the context of sound source separation and real-time signal processing. In this paper, we separate the speech from the background signals and remix the separated sounds at a higher signal-to-noise ratio. This differs from classic speech enhancement, where usually only the extracted speech signal is exploited. The subjective evaluation with 20 normal-hearing subjects on real TV-broadcast material shows that our proposed enhancement system is able to reduce the listening effort by around 2 points on a 13-point listening effort rating scale and increases the perceived sound quality compared to the original mixture. △ Less

Submitted 2 November, 2021; originally announced November 2021.

Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing. This version is the authors' version and may vary from the final publication in details

arXiv:2109.06557 [pdf, other]

The concept of class invariant in object-oriented programming

Authors: Bertrand Meyer, Alisa Arkadova, Alexander Kogtenkov

Abstract: Class invariants -- consistency constraints preserved by every operation on objects of a given type -- are fundamental to building, understanding and verifying object-oriented programs. For verification, however, they raise difficulties, which have not yet received a generally accepted solution. The present work introduces a proof rule meant to address these issues and allow verification tools to… ▽ More Class invariants -- consistency constraints preserved by every operation on objects of a given type -- are fundamental to building, understanding and verifying object-oriented programs. For verification, however, they raise difficulties, which have not yet received a generally accepted solution. The present work introduces a proof rule meant to address these issues and allow verification tools to benefit from invariants. It clarifies the notion of invariant and identifies the three associated problems: callbacks, furtive access and reference leak. As an example, the 2016 Ethereum DAO bug, in which $50 million were stolen, resulted from a callback invalidating an invariant. The discussion starts with a simplified model of computation and an associated proof rule, demonstrating its soundness. It then removes one by one the three simplifying assumptions, each removal raising one of the three issues, and leading to a corresponding adaptation to the proof rule. The final version of the rule can tackle tricky examples, including "challenge problems" listed in the literature. △ Less

Submitted 20 January, 2024; v1 submitted 14 September, 2021; originally announced September 2021.

Comments: Revised version following refereeing

ACM Class: D.2.4; F.3.1; D.1.5; D.3.3; F.3.3

arXiv:2108.06128 [pdf, other]

doi 10.1109/ACCESS.2022.3140807

Pruning vs XNOR-Net: A Comprehensive Study of Deep Learning for Audio Classification on Edge-devices

Authors: Md Mohaimenuzzaman, Christoph Bergmeir, Bernd Meyer

Abstract: Deep learning has celebrated resounding successes in many application areas of relevance to the Internet of Things (IoT), such as computer vision and machine listening. These technologies must ultimately be brought directly to the edge to fully harness the power of deep learning for the IoT. The obvious challenge is that deep learning techniques can only be implemented on strictly resource-constra… ▽ More Deep learning has celebrated resounding successes in many application areas of relevance to the Internet of Things (IoT), such as computer vision and machine listening. These technologies must ultimately be brought directly to the edge to fully harness the power of deep learning for the IoT. The obvious challenge is that deep learning techniques can only be implemented on strictly resource-constrained edge devices if the models are radically downsized. This task relies on different model compression techniques, such as network pruning, quantization, and the recent advancement of XNOR-Net. This study examines the suitability of these techniques for audio classification on microcontrollers. We present an application of XNOR-Net for end-to-end raw audio classification and a comprehensive empirical study comparing this approach with pruning-and-quantization methods. We show that raw audio classification with XNOR yields comparable performance to regular full precision networks for small numbers of classes while reducing memory requirements 32-fold and computation requirements 58-fold. However, as the number of classes increases significantly, performance degrades, and pruning-and-quantization based compression techniques take over as the preferred technique being able to satisfy the same space constraints but requiring approximately 8x more computation. We show that these insights are consistent between raw audio classification and image classification using standard benchmark sets. To the best of our knowledge, this is the first study to apply XNOR to end-to-end audio classification and evaluate it in the context of alternative techniques. All codes are publicly available on GitHub. △ Less

Submitted 17 January, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: 13 pages

Journal ref: in IEEE Access, vol. 10, 2022

arXiv:2103.03483 [pdf]

doi 10.1016/j.patcog.2022.109025

Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices

Authors: Md Mohaimenuzzaman, Christoph Bergmeir, Ian Thomas West, Bernd Meyer

Abstract: Significant efforts are being invested to bring state-of-the-art classification and recognition to edge devices with extreme resource constraints (memory, speed, and lack of GPU support). Here, we demonstrate the first deep network for acoustic recognition that is small, flexible and compression-friendly yet achieves state-of-the-art performance for raw audio classification. Rather than handcrafti… ▽ More Significant efforts are being invested to bring state-of-the-art classification and recognition to edge devices with extreme resource constraints (memory, speed, and lack of GPU support). Here, we demonstrate the first deep network for acoustic recognition that is small, flexible and compression-friendly yet achieves state-of-the-art performance for raw audio classification. Rather than handcrafting a once-off solution, we present a generic pipeline that automatically converts a large deep convolutional network via compression and quantization into a network for resource-impoverished edge devices. After introducing ACDNet, which produces above state-of-the-art accuracy on ESC-10 (96.65%), ESC-50 (87.10%), UrbanSound8K (84.45%) and AudioEvent (92.57%), we describe the compression pipeline and show that it allows us to achieve 97.22% size reduction and 97.28% FLOP reduction while maintaining close to state-of-the-art accuracy 96.25%, 83.65%, 78.27% and 89.69% on these datasets. We describe a successful implementation on a standard off-the-shelf microcontroller and, beyond laboratory benchmarks, report successful tests on real-world datasets. △ Less

Submitted 20 September, 2022; v1 submitted 5 March, 2021; originally announced March 2021.

Journal ref: Pattern Recognition, p.109025 (2022)

arXiv:2102.03116 [pdf, other]

doi 10.1145/3471904

Worst-Case Execution Time Calculation for Query-Based Monitors by Witness Generation

Authors: Márton Búr, Kristóf Marussy, Brett H. Meyer, Dániel Varró

Abstract: Runtime monitoring plays a key role in the assurance of modern intelligent cyber-physical systems, which are frequently data-intensive and safety-critical. While graph queries can serve as an expressive yet formally precise specification language to capture the safety properties of interest, there are no timeliness guarantees for such auto-generated runtime monitoring programs, which prevents thei… ▽ More Runtime monitoring plays a key role in the assurance of modern intelligent cyber-physical systems, which are frequently data-intensive and safety-critical. While graph queries can serve as an expressive yet formally precise specification language to capture the safety properties of interest, there are no timeliness guarantees for such auto-generated runtime monitoring programs, which prevents their use in a real-time setting. While worst-case execution time (WCET) bounds derived by existing static WCET estimation techniques are safe, they may not be tight as they are unable to exploit domain-specific (semantic) information about the input models. This paper presents a semantic-aware WCET analysis method for data-driven monitoring programs derived from graph queries. The method incorporates results obtained from low-level timing analysis into the objective function of a modern graph solver. This allows the systematic generation of input graph models up to a specified size (referred to as witness models) for which the monitor is expected to take the most time to complete. Hence the estimated execution time of the monitors on these graphs can be considered as safe and tight WCET. Additionally, we perform a set of experiments with query-based programs running on a real-time platform over a set of generated models to investigate the relationship between execution times and their estimates, and compare WCET estimates produced by our approach with results from two well-known timing analyzers, aiT and OTAWA. △ Less

Submitted 3 November, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

Comments: 36 pages, 11 figures, submitted to ACM Transactions on Embedded Computing Systems (accepted version)

ACM Class: C.3.3

Journal ref: ACM Transactions on Embedded Computing Systems, Volume 20, Issue 6, 2021 November

arXiv:2006.01659 [pdf, other]

Surprisal-Triggered Conditional Computation with Neural Networks

Authors: Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer

Abstract: Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the inp… ▽ More Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network, or a big, slow network, is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used with 15% fewer FLOPs. △ Less

Submitted 2 June, 2020; originally announced June 2020.

arXiv:2005.07551 [pdf, other]

Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression

Authors: Nils L. Westhausen, Bernd T. Meyer

Abstract: This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge). This approach combines a short-time Fourier transform (STFT) and a learned analysis and synthesis basis in a stacked-network approach with less than one million parameters. The model was trained on 500 h of noisy speech provided by… ▽ More This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge). This approach combines a short-time Fourier transform (STFT) and a learned analysis and synthesis basis in a stacked-network approach with less than one million parameters. The model was trained on 500 h of noisy speech provided by the challenge organizers. The network is capable of real-time processing (one frame in, one frame out) and reaches competitive results. Combining these two types of signal transformations enables the DTLN to robustly extract information from magnitude spectra and incorporate phase information from the learned feature basis. The method shows state-of-the-art performance and outperforms the DNS-Challenge baseline by 0.24 points absolute in terms of the mean opinion score (MOS). △ Less

Submitted 22 October, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

Comments: Accepted by Interspeech 2020

arXiv:2003.08477 [pdf, other]

Integrating State of the Art Compute, Communication, and Autotuning Strategies to Multiply the Performance of the Application Programm CPMD for Ab Initio Molecular Dynamics Simulations

Authors: Tobias Klöffel, Gerald Mathias, Bernd Meyer

Abstract: We present our recent code modernizations of the of the ab initio molecular dynamics program CPMD (www.cpmd.org) with a special focus on the ultra-soft pseudopotential (USPP) code path. Following the internal instrumentation of CPMD, all time critical routines have been revised to maximize the computational throughput and to minimize the communication overhead for optimal performance. Throughout t… ▽ More We present our recent code modernizations of the of the ab initio molecular dynamics program CPMD (www.cpmd.org) with a special focus on the ultra-soft pseudopotential (USPP) code path. Following the internal instrumentation of CPMD, all time critical routines have been revised to maximize the computational throughput and to minimize the communication overhead for optimal performance. Throughout the program missing hybrid MPI+OpenMP parallelization has been added to optimize scaling. For communication intensive routines, as the multiple distributed 3d FFTs of the electronic states and distributed matrix-matrix multiplications related to the $β$-projectors of the pseudopotentials, this MPI+OpenMP parallelization now overlaps computation and communication. The necessary partitioning of the workload is optimized by an auto-tuning algorithm. In addition, the largest global MPI_Allreduce operation has been replaced by highly tuned node-local parallelized operations using MPI shared-memory windows to avoid inter-node communication. A batched algorithm for the multiple 3d FFTs improves the throughput of the MPI_Alltoall communication and, thus, the scalability of the implementation, both for USPP and for the frequently used norm-conserving pseudopotential code path. The enhanced performance and scalability is demonstrated on a mid-sized benchmark system of 256 water molecules and further water systems of from 32 up to 2048 molecules. △ Less

Submitted 18 March, 2020; originally announced March 2020.

Comments: 12 pages, 10 figures

arXiv:2003.08074 [pdf, other]

OpenGAN: Open Set Generative Adversarial Networks

Authors: Luke Ditria, Benjamin J. Meyer, Tom Drummond

Abstract: Many existing conditional Generative Adversarial Networks (cGANs) are limited to conditioning on pre-defined and fixed class-level semantic labels or attributes. We propose an open set GAN architecture (OpenGAN) that is conditioned per-input sample with a feature embedding drawn from a metric space. Using a state-of-the-art metric learning model that encodes both class-level and fine-grained seman… ▽ More Many existing conditional Generative Adversarial Networks (cGANs) are limited to conditioning on pre-defined and fixed class-level semantic labels or attributes. We propose an open set GAN architecture (OpenGAN) that is conditioned per-input sample with a feature embedding drawn from a metric space. Using a state-of-the-art metric learning model that encodes both class-level and fine-grained semantic information, we are able to generate samples that are semantically similar to a given source image. The semantic information extracted by the metric learning model transfers to out-of-distribution novel classes, allowing the generative model to produce samples that are outside of the training distribution. We show that our proposed method is able to generate 256$\times$256 resolution images from novel classes that are of similar visual quality to those from the training classes. In lieu of a source image, we demonstrate that random sampling of the metric space also results in high-quality samples. We show that interpolation in the feature space and latent space results in semantically and visually plausible transformations in the image space. Finally, the usefulness of the generated samples to the downstream task of data augmentation is demonstrated. We show that classifier performance can be significantly improved by augmenting the training data with OpenGAN samples on classes that are outside of the GAN training distribution. △ Less

Submitted 18 March, 2020; originally announced March 2020.

arXiv:2003.06185 [pdf, other]

doi 10.1109/ENERGYCon48941.2020.9236523

Methods for Actors in the Electric Power System to Prevent, Detect and React to ICT Attacks and Failures

Authors: Dennis van der Velde, Martin Henze, Philipp Kathmann, Erik Wassermann, Michael Andres, Detert Bracht, Raphael Ernst, George Hallak, Benedikt Klaer, Philipp Linnartz, Benjamin Meyer, Simon Ofner, Tobias Pletzer, Richard Sethmann

Abstract: The fundamental changes in power supply and increasing decentralization require more active grid operation and an increased integration of ICT at all power system actors. This trend raises complexity and increasingly leads to interactions between primary grid operation and ICT as well as different power system actors. For example, virtual power plants control various assets in the distribution gri… ▽ More The fundamental changes in power supply and increasing decentralization require more active grid operation and an increased integration of ICT at all power system actors. This trend raises complexity and increasingly leads to interactions between primary grid operation and ICT as well as different power system actors. For example, virtual power plants control various assets in the distribution grid via ICT to jointly market existing flexibilities. Failures of ICT or targeted attacks can thus have serious effects on security of supply and system stability. This paper presents a holistic approach to providing methods specifically for actors in the power system for prevention, detection, and reaction to ICT attacks and failures. The focus of our measures are solutions for ICT monitoring, systems for the detection of ICT attacks and intrusions in the process network, and the provision of actionable guidelines as well as a practice environment for the response to potential ICT security incidents. △ Less

Submitted 13 March, 2020; originally announced March 2020.

Comments: 6 pages, 4 figures, to be published in Proceedings of the 2020 6th IEEE International Energy Conference (ENERGYCon)

arXiv:1911.02564 [pdf, other]

The role of formalism in system requirements (full version)

Authors: Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer

Abstract: A major determinant of the quality of software systems is the quality of their requirements, which should be both understandable and precise. Most requirements are written in natural language, good for understandability but lacking in precision. To make requirements precise, researchers have for years advocated the use of mathematics-based notations and methods, known as "formal". Many exist, diff… ▽ More A major determinant of the quality of software systems is the quality of their requirements, which should be both understandable and precise. Most requirements are written in natural language, good for understandability but lacking in precision. To make requirements precise, researchers have for years advocated the use of mathematics-based notations and methods, known as "formal". Many exist, differing in their style, scope and applicability. The present survey discusses some of the main formal approaches and compares them to informal methods. The analysis uses a set of 9 complementary criteria, such as level of abstraction, tool availability, traceability support. It classifies the approaches into five categories: general-purpose, natural-language, graph/automata, other mathematical notations, seamless (programming-language-based). It presents approaches in all of these categories, altogether 22 different ones, including for example SysML, Relax, Eiffel, Event-B, Alloy. The review discusses a number of open questions, including seamlessness, the role of tools and education, and how to make industrial applications benefit more from the contributions of formal approaches. (This is the full version of the survey, including some sections and two appendices which, because of length restrictions, do not appear in the submitted version.) △ Less

Submitted 16 April, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: Fourth version (15 April 2020). This is the full version, including some sections and 2 appendices not appearing in the short version

arXiv:1906.06614 [pdf]

The Anatomy of Requirements

Authors: Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold, Florian Galinier, Alexandr Naumchev

Abstract: Requirements engineering is crucial to software development but lacks a precise definition of its fundamental concepts. Even the basic definitions in the literature and in industry standards are often vague and verbose. To remedy this situation and provide a solid basis for discussions of requirements, this work provides precise definitions of the fundamental requirements concepts and two systemat… ▽ More Requirements engineering is crucial to software development but lacks a precise definition of its fundamental concepts. Even the basic definitions in the literature and in industry standards are often vague and verbose. To remedy this situation and provide a solid basis for discussions of requirements, this work provides precise definitions of the fundamental requirements concepts and two systematic classifications: a taxonomy of requirement elements (such as components, goals, constraints...) ; and a taxonomy of possible relations between these elements (such as "extends", "excepts", "belongs"...). The discussion evaluates the taxonomies on published requirements documents; readers can test the concepts in two online quizzes. The intended result of this work is to spur new advances in the study and practice of software requirements by clarifying the fundamental concepts. △ Less

Submitted 11 July, 2019; v1 submitted 15 June, 2019; originally announced June 2019.

Comments: Version 2, 28 June 2019

arXiv:1905.08492 [pdf, other]

doi 10.1109/ICASSP40776.2020.9054196

DNN-Based Speech Presence Probability Estimation for Multi-Frame Single-Microphone Speech Enhancement

Authors: Marvin Tammen, Dörte Fischer, Bernd T. Meyer, Simon Doclo

Abstract: Multi-frame approaches for single-microphone speech enhancement, e.g., the multi-frame minimum-power-distortionless-response (MFMPDR) filter, are able to exploit speech correlations across neighboring time frames. In contrast to single-frame approaches such as the Wiener gain, it has been shown that multi-frame approaches achieve a substantial noise reduction with hardly any speech distortion, pro… ▽ More Multi-frame approaches for single-microphone speech enhancement, e.g., the multi-frame minimum-power-distortionless-response (MFMPDR) filter, are able to exploit speech correlations across neighboring time frames. In contrast to single-frame approaches such as the Wiener gain, it has been shown that multi-frame approaches achieve a substantial noise reduction with hardly any speech distortion, provided that an accurate estimate of the correlation matrices and especially the speech interframe correlation (IFC) vector is available. Typical estimation procedures of the IFC vector require an estimate of the speech presence probability (SPP) in each time-frequency (TF) bin. In this paper, we propose to use a bi-directional long short-term memory deep neural network (DNN) to estimate the SPP for each TF bin. Aiming at achieving a robust performance, the DNN is trained for various noise types and within a large signal-to-noise-ratio range. Experimental results show that the MFMPDR in combination with the proposed data-driven SPP estimator yields an increased speech quality compared to a state-of-the-art model-based SPP estimator. Furthermore, it is confirmed that exploiting interframe correlations in the MFMPDR is beneficial when compared to the Wiener gain especially in adverse scenarios. △ Less

Submitted 14 November, 2022; v1 submitted 21 May, 2019; originally announced May 2019.

arXiv:1902.10363 [pdf, other]

The Importance of Metric Learning for Robotic Vision: Open Set Recognition and Active Learning

Authors: Benjamin J. Meyer, Tom Drummond

Abstract: State-of-the-art deep neural network recognition systems are designed for a static and closed world. It is usually assumed that the distribution at test time will be the same as the distribution during training. As a result, classifiers are forced to categorise observations into one out of a set of predefined semantic classes. Robotic problems are dynamic and open world; a robot will likely observ… ▽ More State-of-the-art deep neural network recognition systems are designed for a static and closed world. It is usually assumed that the distribution at test time will be the same as the distribution during training. As a result, classifiers are forced to categorise observations into one out of a set of predefined semantic classes. Robotic problems are dynamic and open world; a robot will likely observe objects that are from outside of the training set distribution. Classifier outputs in robotic applications can lead to real-world robotic action and as such, a practical recognition system should not silently fail by confidently misclassifying novel observations. We show how a deep metric learning classification system can be applied to such open set recognition problems, allowing the classifier to label novel observations as unknown. Further to detecting novel examples, we propose an open set active learning approach that allows a robot to efficiently query a user about unknown observations. Our approach enables a robot to improve its understanding of the true distribution of data in the environment, from a small number of label queries. Experimental results show that our approach significantly outperforms comparable methods in both the open set recognition and active learning problems. △ Less

Submitted 27 February, 2019; originally announced February 2019.

Comments: Accepted to the 2019 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:1809.11086 [pdf, other]

Learning Recurrent Binary/Ternary Weights

Authors: Arash Ardakani, Zhengyun Ji, Sean C. Smithson, Brett H. Meyer, Warren J. Gross

Abstract: Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. However, they are both complex and memory intensive due to their recursive nature. These limitations make RNNs difficult to embed on mobile devices requiring real-time processes with limited hardware resources. To address the above issues, we introduce a method that can learn binary and ternary weights d… ▽ More Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. However, they are both complex and memory intensive due to their recursive nature. These limitations make RNNs difficult to embed on mobile devices requiring real-time processes with limited hardware resources. To address the above issues, we introduce a method that can learn binary and ternary weights during the training phase to facilitate hardware implementations of RNNs. As a result, using this approach replaces all multiply-accumulate operations by simple accumulations, bringing significant benefits to custom hardware in terms of silicon area and power consumption. On the software side, we evaluate the performance (in terms of accuracy) of our method using long short-term memories (LSTMs) on various sequential models including sequence classification and language modeling. We demonstrate that our method achieves competitive results on the aforementioned tasks while using binary/ternary weights during the runtime. On the hardware side, we present custom hardware for accelerating the recurrent computations of LSTMs with binary/ternary weights. Ultimately, we show that LSTMs with binary/ternary weights can achieve up to 12x memory saving and 10x inference speedup compared to the full-precision implementation on an ASIC platform. △ Less

Submitted 24 January, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: Published as a conference paper at ICLR 2019

arXiv:1808.08751 [pdf, other]

AutoFrame: Automatic Frame Inference for Object-Oriented Languages

Authors: Victor Rivera, Bertrand Meyer

Abstract: Automatic program verification has made tremendous strides, but is not yet for the masses. How do we make it less painful? This article addresses one of the obstacles: the need to specify explicit "frame clauses", expressing what properties are left unchanged by an operation. It is fair enough to ask the would-be (human) prover to state what each operation changes, and how, but the (mechanical) pr… ▽ More Automatic program verification has made tremendous strides, but is not yet for the masses. How do we make it less painful? This article addresses one of the obstacles: the need to specify explicit "frame clauses", expressing what properties are left unchanged by an operation. It is fair enough to ask the would-be (human) prover to state what each operation changes, and how, but the (mechanical) prover also requires knowledge of what it does not change. The process of specifying and verifying these properties is tedious and error-prone, and must be repeated whenever the software evolves. it is also hard to justify, since all the information about what the code changes is in the code itself. The AutoFrame tool presented here performs this analysis entirely automatically. It applies to object-oriented programming, where the issue is compounded by aliasing: if x is aliased to y, any update to x.a also affects y.a, even though the updating instruction usually does not even mention y. This aspect turns out to be the most delicate, and is addressed in AutoFrame by taking advantage of a companion tool, AutoAlias, which performs sound and sufficiently precise alias analysis, also in an entirely automatic way. Some practical results of AutoFrame so far are: (1) the automatic reconstruction (in about 25 seconds on an ordinary laptop) of the exact frame clauses, a total of 169 clauses, for an 8,000-line data structures and algorithms library which was previously (with the manually written frame clauses) verified for functional correctness using a mechanical program prover; and (2) the automatic generation (in less than 4 minutes) of frame conditions for a 150,000-line graphical and GUI library. The source code of AutoFrame and these examples are available for download. △ Less

Submitted 3 August, 2019; v1 submitted 27 August, 2018; originally announced August 2018.

arXiv:1808.08748 [pdf, other]

AutoAlias: Automatic Variable-Precision Alias Analysis for Object-Oriented Programs

Authors: Victor Rivera, Bertrand Meyer

Abstract: The aliasing question (can two reference expressions point, during an execution, to the same object?) is both one of the most critical in practice, for applications ranging from compiler optimization to programmer verification, and one of the most heavily researched, with many hundreds of publications over several decades. One might then expect that good off-the-shelf solutions are widely availabl… ▽ More The aliasing question (can two reference expressions point, during an execution, to the same object?) is both one of the most critical in practice, for applications ranging from compiler optimization to programmer verification, and one of the most heavily researched, with many hundreds of publications over several decades. One might then expect that good off-the-shelf solutions are widely available, ready to be plugged into a compiler or verifier. This is not the case. In practice, efficient and precise alias analysis remains an open problem. We present a practical tool, AutoAlias, which can be used to perform automatic alias analysis for object-oriented programs. Based on the theory of "duality semantics", an application of Abstract Interpretation ideas, it is directed at object-oriented languages and has been implemented for Eiffel as an addition to the EiffelStudio environment. It offers variable-precision analysis, controllable through the choice of a constant that governs the number of fix point iterations: a higher number means better precision and higher computation time. All the source code of AutoAlias, as well as detailed results of analyses reported in this article, are publicly available. Practical applications so far have covered a library of data structures and algorithms and a library for GUI creation. For the former, AutoAlias achieves a precision appropriate for practical purposes and execution times in the order of 25 seconds for about 8000 lines of intricate code. For the GUI library, AutoAlias produces the alias analysis in around 232 seconds for about 150000 lines of intricate code. △ Less

Submitted 19 April, 2019; v1 submitted 27 August, 2018; originally announced August 2018.

arXiv:1712.05078 [pdf]

Fourteen Years of Software Engineering at ETH Zurich

Authors: Bertrand Meyer

Abstract: A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Insti-tute of Technology, from 1 October 2001 to 31 January 2016, under my leader-ship. Our work, summarized here, covered a wide range of theoretical and practi-cal topics, with object technology in the Eiffel method as the unifying thread . A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Insti-tute of Technology, from 1 October 2001 to 31 January 2016, under my leader-ship. Our work, summarized here, covered a wide range of theoretical and practi-cal topics, with object technology in the Eiffel method as the unifying thread . △ Less

Submitted 16 December, 2017; v1 submitted 13 December, 2017; originally announced December 2017.

Comments: 118 pages, numerous figures and photographs

arXiv:1710.02801 [pdf, other]

AutoReq: expressing and verifying requirements for control systems

Authors: Alexandr Naumchev, Bertrand Meyer, Manuel Mazzara, Florian Galinier, Jean-Michel Bruel, Sophie Ebersold

Abstract: The considerable effort of writing requirements is only worthwhile if the result meets two conditions: the requirements reflect stakeholders' needs, and the implementation satisfies them. In usual approaches, the use of different notations for requirements (often natural language) and implementations (a programming language) makes both conditions elusive. AutoReq, presented in this article, takes… ▽ More The considerable effort of writing requirements is only worthwhile if the result meets two conditions: the requirements reflect stakeholders' needs, and the implementation satisfies them. In usual approaches, the use of different notations for requirements (often natural language) and implementations (a programming language) makes both conditions elusive. AutoReq, presented in this article, takes a different approach to both the writing of requirements and their verification. Applying the approach to a well-documented example, a landing gear system, allowed for a mechanical proof of consistency and uncovered an error in a published discussion of the problem. △ Less

Submitted 4 February, 2019; v1 submitted 8 October, 2017; originally announced October 2017.

arXiv:1705.09780 [pdf, other]

Deep Metric Learning and Image Classification with Nearest Neighbour Gaussian Kernels

Authors: Benjamin J. Meyer, Ben Harwood, Tom Drummond

Abstract: We present a Gaussian kernel loss function and training algorithm for convolutional neural networks that can be directly applied to both distance metric learning and image classification problems. Our method treats all training features from a deep neural network as Gaussian kernel centres and computes loss by summing the influence of a feature's nearby centres in the feature embedding space. Our… ▽ More We present a Gaussian kernel loss function and training algorithm for convolutional neural networks that can be directly applied to both distance metric learning and image classification problems. Our method treats all training features from a deep neural network as Gaussian kernel centres and computes loss by summing the influence of a feature's nearby centres in the feature embedding space. Our approach is made scalable by treating it as an approximate nearest neighbour search problem. We show how to make end-to-end learning feasible, resulting in a well formed embedding space, in which semantically related instances are likely to be located near one another, regardless of whether or not the network was trained on those classes. Our approach outperforms state-of-the-art deep metric learning approaches on embedding learning challenges, as well as conventional softmax classification on several datasets. △ Less

Submitted 2 July, 2018; v1 submitted 27 May, 2017; originally announced May 2017.

Comments: Accepted in the International Conference on Image Processing (ICIP) 2018. Formerly titled Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks

arXiv:1704.04905 [pdf, other]

A contract-based method to specify stimulus-response requirements

Authors: Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer, Jean-Michel Bruel, Florian Galinier, Sophie Ebersold

Abstract: A number of formal methods exist for capturing stimulus-response requirements in a declarative form. Someone yet needs to translate the resulting declarative statements into imperative programs. The present article describes a method for specification and verification of stimulus-response requirements in the form of imperative program routines with conditionals and assertions. A program prover the… ▽ More A number of formal methods exist for capturing stimulus-response requirements in a declarative form. Someone yet needs to translate the resulting declarative statements into imperative programs. The present article describes a method for specification and verification of stimulus-response requirements in the form of imperative program routines with conditionals and assertions. A program prover then checks a candidate program directly against the stated requirements. The article illustrates the approach by applying it to an ASM model of the Landing Gear System, a widely used realistic example proposed for evaluating specification and verification techniques. △ Less

Submitted 17 April, 2017; originally announced April 2017.

arXiv:1704.04189 [pdf, other]

Seamless Requirements

Authors: Alexandr Naumchev, Bertrand Meyer

Abstract: Popular notations for functional requirements specifications frequently ignore developers' needs, target specific development models, or require translation of requirements into tests for verification; the results can give out-of-sync or downright incompatible artifacts. Seamless Requirements, a new approach to specifying functional requirements, contributes to developers' understanding of require… ▽ More Popular notations for functional requirements specifications frequently ignore developers' needs, target specific development models, or require translation of requirements into tests for verification; the results can give out-of-sync or downright incompatible artifacts. Seamless Requirements, a new approach to specifying functional requirements, contributes to developers' understanding of requirements and to software quality regardless of the process, while the process itself becomes lighter due to the absence of tests in the presence of formal verification. A development case illustrates these benefits, and a discussion compares seamless requirements to other approaches. △ Less

Submitted 13 April, 2017; originally announced April 2017.

Showing 1–50 of 88 results for author: Meyer, B