
High-Fidelity Noise Reduction with Differentiable Signal Processing

Authors: Christian J. Steinmetz, Thomas Walther, Joshua D. Reiss

Abstract: Noise reduction techniques based on deep learning have demonstrated impressive performance in enhancing the overall quality of recorded speech. While these approaches are highly performant, their application in audio engineering can be limited due to a number of factors. These include operation only on speech without support for music, lack of real-time capability, lack of interpretable control parameters, operation at lower sample rates, and a tendency to introduce artifacts. On the other hand, signal processing-based noise reduction algorithms offer fine-grained control and operation on a broad range of content, however, they often require manual operation to achieve the best results. To address the limitations of both approaches, in this work we introduce a method that leverages a signal processing-based denoiser that when combined with a neural network controller, enables fully automatic and high-fidelity noise reduction on both speech and music signals. We evaluate our proposed method with objective metrics and a perceptual listening test. Our evaluation reveals that speech enhancement models can be extended to music, however training the model to remove only stationary noise is critical. Furthermore, our proposed approach achieves performance on par with the deep learning models, while being significantly more efficient and introducing fewer artifacts in some cases. Listening examples are available online at https://tape.it/research/denoiser.

Submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted for publication at the 155th Convention of the Audio Engineering Society

arXiv:1711.09793 [pdf, other]

The Status of Quantum-Based Long-Term Secure Communication over the Internet

Authors: Matthias Geihs, Oleg Nikiforov, Denise Demirel, Alexander Sauer, Denis Butin, Felix Günther, Gernot Alber, Thomas Walther, Johannes Buchmann

Abstract: Sensitive digital data, such as health information or governmental archives, are often stored for decades or centuries. The processing of such data calls for long-term security. Secure channels on the Internet require robust key establishment methods. Currently used key distribution protocols are either vulnerable to future attacks based on Shor's algorithm, or vulnerable in principle due to their reliance on computational problems. Quantum-based key distribution protocols are information-theoretically secure and offer long-term security. However, significant obstacles to their real-world use remain. This paper, which results from a multidisciplinary project involving computer scientists and physicists, systematizes knowledge about obstacles to and strategies for the realization of long-term secure Internet communication from quantum-based key distribution. We discuss performance and security particulars, consider the specific challenges arising from multi-user network settings, and identify key challenges for actual deployment.

Submitted 27 November, 2017; originally announced November 2017.

arXiv:1704.03724 [pdf, other]

Unsupervised Construction of Human Body Models Using Principles of Organic Computing

Authors: Thomas Walther, Rolf P. Würtz

Abstract: Unsupervised learning of a generalizable model of the visual appearance of humans from video data is of major importance for computing systems interacting naturally with their users and others. We propose a step towards automatic behavior understanding by integrating principles of Organic Computing into the posture estimation cycle, thereby relegating the need for human intervention while simultaneously raising the level of system autonomy. The system extracts coherent motion from moving upper bodies and autonomously decides about limbs and their possible spatial relationships. The models from many videos are integrated into meta-models, which show good generalization to different individuals, backgrounds, and attire. These models allow robust interpretation of single video frames without temporal continuity and posture mimicking by an android robot.

Submitted 12 April, 2017; originally announced April 2017.

ACM Class: I.2.10; I.5.4

arXiv:1606.07598 [pdf, ps, other]

An Active Machine Hearing System for Auditory Stream Segregation

Authors: Christopher Schymura, Thomas Walther, Dorothea Kolossa

Abstract: This study describes a binaural machine hearing system that is capable of performing auditory stream segregation in scenarios where multiple sound sources are present. The process of stream segregation refers to the capability of human listeners to group acoustic signals into sets of distinct auditory streams, corresponding to individual sound sources. The proposed computational framework mimics this ability via a probabilistic clustering scheme for joint localization and segregation. This scheme is based on mixtures of von Mises distributions to model the angular positions of the sound sources surrounding the listener. The distribution parameters are estimated using block-wise processing of auditory cues extracted from binaural signals. Additionally, the proposed system can conduct rotational head movements to improve localization and stream segregation performance. Evaluation of the system is conducted in scenarios containing multiple simultaneously active speech and non-speech sounds placed at different positions relative to the listener.

Submitted 24 June, 2016; originally announced June 2016.

Showing 1–4 of 4 results for author: Walther, T