Showing 1–8 of 8 results for author: Paulus, J

  1. arXiv:2206.02125  [pdf, other]

    eess.AS cs.SD

    Geometrically-Motivated Primary-Ambient Decomposition With Center-Channel Extraction

    Authors: Jouni Paulus, Matteo Torcoli

    Abstract: A geometrically-motivated method for primary-ambient decomposition is proposed and evaluated in an up-mixing application. The method consists of two steps, accommodating a particularly intuitive explanation. The first step consists of signal-adaptive rotations applied on the input stereo scene, which translate the primary sound sources into the center of the rotated scene. The second step applies…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: accepted into EUSIPCO 2022

  2. arXiv:2206.02124  [pdf, other]

    eess.AS cs.SD

    Sampling Frequency Independent Dialogue Separation

    Authors: Jouni Paulus, Matteo Torcoli

    Abstract: In some DNNs for audio source separation, the relevant model parameters are independent of the sampling frequency of the audio used for training. Considering the application of dialogue separation, this is shown for two DNN architectures: a U-Net and a fully-convolutional model. The models are trained with audio sampled at 8 kHz. The learned parameters are transferred to models for processing audi…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: accepted into EUSIPCO 2022

  3. arXiv:2204.02661  [pdf, other]

    cs.LG cs.CV eess.IV

    CAIPI in Practice: Towards Explainable Interactive Medical Image Classification

    Authors: Emanuel Slany, Yannik Ott, Stephan Scheele, Jan Paulus, Ute Schmid

    Abstract: Would you trust physicians if they cannot explain their decisions to you? Medical diagnostics using machine learning gained enormously in importance within the last decade. However, without further enhancements many state-of-the-art machine learning methods are not suitable for medical application. The most important reasons are insufficient data set quality and the black-box behavior of machine l…

    Submitted 31 May, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: Manuscript accepted at IFIP AIAI 2022, correct typo in Discussion

  4. arXiv:2112.09494  [pdf]

    eess.AS eess.SP

    Dialog+ in Broadcasting: First Field Tests Using Deep-Learning-Based Dialogue Enhancement

    Authors: Matteo Torcoli, Christian Simon, Jouni Paulus, Davide Straninger, Alfred Riedel, Volker Koch, Stefan Wits, Daniela Rieger, Harald Fuchs, Christian Uhle, Stefan Meltzer, Adrian Murtaza

    Abstract: Difficulties in following speech due to loud background sounds are common in broadcasting. Object-based audio, e.g., MPEG-H Audio, solves this problem by providing a user-adjustable speech level. While object-based audio is gaining momentum, transitioning to it requires time and effort. Also, lots of content exists, produced and archived outside the object-based workflows. To address this, Fraunhof…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: Presented at IBC 2021 (International Broadcasting Convention)

  5. Controlling the Perceived Sound Quality for Dialogue Enhancement with Deep Learning

    Authors: Christian Uhle, Matteo Torcoli, Jouni Paulus

    Abstract: Speech enhancement attenuates interfering sounds in speech signals but may introduce artifacts that perceivably deteriorate the output signal. We propose a method for controlling the trade-off between the attenuation of the interfering background signal and the loss of sound quality. A deep neural network estimates the attenuation of the separated background signal such that the sound quality, qua…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted paper at ICASSP 2020

    Journal ref: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  6. Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate

    Authors: Matteo Torcoli, Jouni Paulus, Thorsten Kastner, Christian Uhle

    Abstract: Remixing separated audio sources trades off interferer attenuation against the amount of audible deteriorations. This paper proposes a non-intrusive audio quality estimation method for controlling this trade-off in a signal-adaptive manner. The recently proposed 2f-model is adopted as the underlying quality measure, since it has been shown to correlate strongly with basic audio quality in source s…

    Submitted 21 July, 2021; originally announced July 2021.

    Comments: Manuscript accepted for the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

  7. arXiv:2106.09093  [pdf, other]

    eess.AS cs.SD

    A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

    Authors: Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

    Abstract: This paper describes a hands-on comparison on using state-of-the-art music source separation deep neural networks (DNNs) before and after task-specific fine-tuning for separating speech content from non-speech content in broadcast audio (i.e., dialog separation). The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the consid…

    Submitted 22 June, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: accepted in INTERSPEECH 2021

  8. arXiv:1909.11549  [pdf, other]

    eess.AS cs.SD

    MPEG-H Audio for Improving Accessibility in Broadcasting and Streaming

    Authors: Christian Simon, Matteo Torcoli, Jouni Paulus

    Abstract: Broadcasting and streaming services still suffer from various levels of accessibility barriers for a significant portion of the population, limiting the access to information and culture, and in the most severe cases limiting the empowerment of people. This paper provides a brief overview of some of the most common accessibility barriers encountered. It then gives a short introduction to object-ba…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: White Paper