-
Airborne Sound Analysis for the Detection of Bearing Faults in Railway Vehicles with Real-World Data
Authors:
Matthias Kreuzer,
David Schmidt,
Simon Wokusch,
Walter Kellermann
Abstract:
In this paper, we address the challenging problem of detecting bearing faults in railway vehicles by analyzing acoustic signals recorded during regular operation. For this, we introduce Mel Frequency Cepstral Coefficients (MFCCs) as features, which form the input to a simple Multi-Layer Perceptron classifier. The proposed method is evaluated with real-world data that was obtained for state-of-the-art commuter railway vehicles in a measurement campaign. The experiments show that, with the chosen MFCC features, bearing faults can be reliably detected, even for types of bearing damage that were not included in training.
Submitted 24 May, 2023; v1 submitted 14 April, 2023;
originally announced April 2023.
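The abstract above names MFCCs as the input features but gives no extraction settings. As a minimal sketch of one ingredient of MFCC computation, the snippet below converts between Hz and the mel scale using one commonly used formula, 2595 * log10(1 + f/700), and lists mel-spaced filter center frequencies. The function names, the number of filters, and the frequency range are illustrative assumptions, not values taken from the paper.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mels (one common convention among several)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list:
    """Center frequencies (Hz) of n_filters bands spaced evenly on the mel scale.

    Hypothetical helper: the two band edges f_min and f_max are excluded,
    so only the interior points of the mel grid are returned.
    """
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]


# Assumed example settings: 10 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0)
for c in centers:
    print(f"{c:8.1f} Hz")
```

Because the bands are equally spaced in mels, their spacing in Hz widens toward high frequencies, which is the property that gives mel-based features finer resolution at low frequencies.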
-
SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System
Authors:
Zhongwei Teng,
Quchen Fu,
Jules White,
Maria E. Powell,
Douglas C. Schmidt
Abstract:
Research in the past several years has boosted the performance of automatic speaker verification systems and countermeasure systems to deliver low Equal Error Rates (EERs) on each system. However, research on joint optimization of both systems is still limited. The Spoofing-Aware Speaker Verification (SASV) 2022 challenge was proposed to encourage the development of integrated SASV systems with new metrics to evaluate joint model performance. This paper proposes an ensemble-free end-to-end solution, known as Spoof-Aggregated-SASV (SA-SASV), to build a SASV system with multi-task classifiers that are optimized by multiple losses; the solution also has more flexible requirements for the training set. The proposed system is trained on the ASVSpoof 2019 LA dataset, a spoof verification dataset with a small number of bonafide speakers. Results of SASV-EER indicate that the model performance can be further improved by training on complete automatic speaker verification and countermeasure datasets.
Submitted 24 March, 2022; v1 submitted 12 March, 2022;
originally announced March 2022.
-
FastAudio: A Learnable Audio Front-End for Spoof Speech Detection
Authors:
Quchen Fu,
Zhongwei Teng,
Jules White,
Maria Powell,
Douglas C. Schmidt
Abstract:
Voice assistants, such as smart speakers, have exploded in popularity. It is currently estimated that the smart speaker adoption rate has exceeded 35% in the US adult population. Manufacturers have integrated speaker identification technology, which attempts to determine the identity of the person speaking, to provide personalized services to different members of the same family. Speaker identification can also play an important role in controlling how the smart speaker is used. For example, it is not critical to correctly identify the user when playing music. However, when reading the user's email out loud, it is critical to verify that the speaker making the request is the authorized user. Speaker verification systems, which authenticate the speaker identity, are therefore needed as a gatekeeper to protect against various spoofing attacks that aim to impersonate the enrolled user. This paper compares popular learnable front-ends which learn the representations of audio by joint training with downstream tasks (End-to-End). We categorize the front-ends by defining two generic architectures and then analyze the filtering stages of both types in terms of learning constraints. We propose replacing fixed filterbanks with a learnable layer that can better adapt to anti-spoofing tasks. The proposed FastAudio front-end is then tested with two popular back-ends to measure the performance on the LA track of the ASVspoof 2019 dataset. The FastAudio front-end achieves a relative improvement of 27% when compared with fixed front-ends, outperforming all other learnable front-ends on this task.
Submitted 6 September, 2021;
originally announced September 2021.
-
Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model
Authors:
Zhongwei Teng,
Quchen Fu,
Jules White,
Maria Powell,
Douglas C. Schmidt
Abstract:
An emerging trend in audio processing is capturing low-level speech representations from raw waveforms. These representations have shown promising results on a variety of tasks, such as speech recognition and speech separation. Compared to handcrafted features, learning speech features via backpropagation theoretically gives the model greater flexibility in how it represents data for different tasks. However, empirical results show that, in some tasks, such as voice spoof detection, handcrafted features are more competitive than learned features. Instead of evaluating handcrafted features and raw waveforms independently, this paper proposes an Auxiliary Rawnet model to complement handcrafted features with features learned from raw waveforms. A key benefit of the approach is that it can improve accuracy at a relatively low computational cost. The proposed Auxiliary Rawnet model is tested using the ASVspoof 2019 dataset, and the results from this dataset indicate that a light-weight waveform encoder can potentially boost the performance of handcrafted-features-based encoders in exchange for a small amount of additional computational work.
Submitted 6 September, 2021;
originally announced September 2021.
-
Evaluation of GPS/GLONASS patch versus RF GPS (L1) patch antenna performance parameter
Authors:
Gholam Aghashirin,
Hoda S. Abdel Aty Zohdy,
Mohamed A. Zohdy,
Darrell Schmidt,
Adam Timmons
Abstract:
In any wireless communication network or system, an antenna is an important element along the propagation path of an electrical signal. The antenna module is a vital component of automated driving systems; it must function as needed in dGPS, HD map correction services, and radio and navigation systems. The main scope of this engineering research work is to evaluate and determine the performance parameters and characteristics of a GPS/GLONASS patch antenna versus an RF GPS L1 (1.57542 GHz) patch antenna. FEKO simulation studies are carried out to compare and assess these characteristics and performance parameters, such as the average/passive gain of the proposed antenna in the presence of background noise. Prior to the FEKO simulation studies, physical dimensions were measured with a digital instrument for the following: the radiating element size, i.e., the actual length (L) and width (W); and the substrate material size, i.e., the substrate length (Lsub), width (Wsub), and height (h). Antenna models are developed for the GPS-only patch antenna operating at 1.57542 GHz and the dual-band patch antenna resonating at 1.5925 GHz. Specifically, this work presents the design, modeling, and determination of the passive gain of the RF GPS L1 patch versus the dual-band patch antenna, with intended applications within the automotive systems space. Simulations are undertaken to generate the RF GPS L1 patch and dual-band patch antenna structures, respectively, for the sole purpose of evaluating the performance of the proposed dual-band antenna. Simulations are performed rather than mathematical modelling. The emphasis of this paper is on how to obtain an equivalent amount of total passive gain in a GPS antenna versus that of a dual-band antenna.
Submitted 7 September, 2020;
originally announced September 2020.
-
Noisier2Noise: Learning to Denoise from Unpaired Noisy Data
Authors:
Nick Moran,
Dan Schmidt,
Yu Zhong,
Patrick Coady
Abstract:
We present a method for training a neural network to perform image denoising without access to clean training examples or access to paired noisy training examples. Our method requires only a single noisy realization of each training example and a statistical model of the noise distribution, and is applicable to a wide variety of noise models, including spatially structured noise. Our model produces results which are competitive with other learned methods which require richer training data, and outperforms traditional non-learned denoising methods. We present derivations of our method for arbitrary additive noise, an improvement specific to Gaussian additive noise, and an extension to multiplicative Bernoulli noise.
Submitted 25 October, 2019;
originally announced October 2019.
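The abstract above says that a single noisy realization plus a noise model can suffice for training. One way to see why this is plausible for additive noise is a toy scalar experiment, sketched below under stated assumptions: the clean signal X and the noise are Gaussian, a second synthetic noise draw M from the same distribution is added to the observation Y = X + N to form Z = Y + M, and a linear least-squares fit stands in for the neural network. Since N and M are identically distributed given Z, E[N | Z] = E[M | Z], and from Z = X + N + M it follows that E[X | Z] = 2 E[Y | Z] - Z. The variable names and the linear predictor are illustrative choices, not the paper's implementation.

```python
import random

random.seed(0)

SIGMA = 1.0   # assumed known noise standard deviation (the "noise model")
N = 20000     # number of toy training samples

# Clean values are generated only to score the result; they are never
# used when fitting the predictor.
x = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [xi + random.gauss(0.0, SIGMA) for xi in x]   # single noisy observation
z = [yi + random.gauss(0.0, SIGMA) for yi in y]   # observation plus synthetic noise

# Stand-in for the network: least-squares slope predicting y from z.
# This approximates E[Y | Z], which is exactly linear in this Gaussian toy case.
slope = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)

# Correction step from the identity E[X | Z] = 2 E[Y | Z] - Z.
x_hat = [2.0 * slope * zi - zi for zi in z]


def mse(u, v):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


mse_noisy = mse(y, x)        # error of the raw noisy observation
mse_estimate = mse(x_hat, x) # error after predict-then-correct

print(f"fitted slope        : {slope:.3f}")
print(f"MSE of noisy input  : {mse_noisy:.3f}")
print(f"MSE of estimate     : {mse_estimate:.3f}")
```

In this toy setting the theoretical slope is (1 + σ²)/(1 + 2σ²) = 2/3 and the theoretical error of the corrected estimate is 1 - 1/(1 + 2σ²) = 2/3, compared with σ² = 1 for the raw observation. The estimate is computed from the doubly noisy Z, so for small noise levels it can be worse than the raw observation Y.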