Showing 1–7 of 7 results for author: Strauss, M

Searching in archive eess.
  1. SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement

    Authors: Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler

    Abstract: This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE). For this, a DNN is trained to synthesize the enhanced speech conditioned on noisy speech using a Normalizing Flow (NF) as generator in a GAN framework. While the combination of likelihood models and GANs is not trivial, SEFG…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Preprint. Accepted to IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2023

  2. arXiv:2305.19100

    eess.AS cs.SD

    Predicting Preferred Dialogue-to-Background Loudness Difference in Dialogue-Separated Audio

    Authors: Luca Resti, Martin Strauss, Matteo Torcoli, Emanuël Habets, Bernd Edler

    Abstract: Dialogue Enhancement (DE) enables the rebalancing of dialogue and background sounds to fit personal preferences and needs in the context of broadcast audio. When individual audio stems are unavailable from production, Dialogue Separation (DS) can be applied to the final audio mixture to obtain estimates of these stems. This work focuses on Preferred Loudness Differences (PLDs) between dialogue and…

    Submitted 31 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Paper accepted at the 15th International Conference on Quality of Multimedia Experience (QoMEX), 4 pages, 2 figures

  3. arXiv:2305.08812

    cs.LO cs.SE eess.SY

    Slow Down, Move Over: A Case Study in Formal Verification, Refinement, and Testing of the Responsibility-Sensitive Safety Model for Self-Driving Cars

    Authors: Megan Strauss, Stefan Mitsch

    Abstract: Technology advances give us the hope of driving without human error, reducing vehicle emissions and simplifying an everyday task with the future of self-driving cars. Making sure these vehicles are safe is very important to the continuation of this field. In this paper, we formalize the Responsibility-Sensitive Safety model (RSS) for self-driving cars and prove the safety and optimality of this mo…

    Submitted 15 May, 2023; originally announced May 2023.

  4. arXiv:2210.11654

    eess.AS cs.SD

    Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation

    Authors: Martin Strauss, Matteo Torcoli, Bernd Edler

    Abstract: Deep generative models for Speech Enhancement (SE) received increasing attention in recent years. The most prominent example are Generative Adversarial Networks (GANs), while normalizing flows (NF) received less attention despite their potential. Building on previous work, architectural modifications are proposed, along with an investigation of different conditional input representations. Despite…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted for Presentation at IEEE SLT 2022

  5. arXiv:2106.09093

    eess.AS cs.SD

    A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

    Authors: Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

    Abstract: This paper describes a hands-on comparison on using state-of-the-art music source separation deep neural networks (DNNs) before and after task-specific fine-tuning for separating speech content from non-speech content in broadcast audio (i.e., dialog separation). The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the consid…

    Submitted 22 June, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: accepted in INTERSPEECH 2021

  6. A Flow-Based Neural Network for Time Domain Speech Enhancement

    Authors: Martin Strauss, Bernd Edler

    Abstract: Speech enhancement involves the distinction of a target speech signal from an intrusive background. Although generative approaches using Variational Autoencoders or Generative Adversarial Networks (GANs) have increasingly been used in recent years, normalizing flow (NF) based systems are still scarce, despite their success in related fields. Thus, in this paper we propose a NF framework to directl…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: Accepted to ICASSP 2021

  7. arXiv:1907.04655

    eess.SP cs.SD eess.AS

    Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition

    Authors: Antoine Deleforge, Diego Di Carlo, Martin Strauss, Romain Serizel, Lucio Marcenaro

    Abstract: Unmanned aerial vehicles (UAV), commonly referred to as drones, have raised increasing interest in recent years. Search and rescue scenarios where humans in emergency situations need to be quickly found in areas difficult to access constitute an important field of application for this technology. While research efforts have mostly focused on developing video-based solutions for this task \cite{lop…

    Submitted 3 July, 2019; originally announced July 2019.

    Journal ref: IEEE Signal Processing Magazine, Institute of Electrical and Electronics Engineers, In press