Showing 1–7 of 7 results for author: Mack, W

  1. arXiv:2303.07143

    eess.AS cs.LG cs.SD

    Multi-Microphone Speaker Separation by Spatial Regions

    Authors: Julian Wechsler, Srikanth Raj Chetupalli, Wolfgang Mack, Emanuël A. P. Habets

    Abstract: We consider the task of region-based source separation of reverberant multi-microphone recordings. We assume pre-defined spatial regions with a single active source per region. The objective is to estimate the signals from the individual spatial regions as captured by a reference microphone while retaining a correspondence between signals and spatial regions. We propose a data-driven approach usin…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Submitted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing

  2. arXiv:2206.13808

    eess.AS cs.SD

    Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion

    Authors: Ahmad Aloradi, Wolfgang Mack, Mohamed Elminshawi, Emanuël A. P. Habets

    Abstract: Verifying the identity of a speaker is crucial in modern human-machine interfaces, e.g., to ensure privacy protection or to enable biometric authentication. Classical speaker verification (SV) approaches estimate a fixed-dimensional embedding from a speech utterance that encodes the speaker's voice characteristics. A speaker is verified if his/her voice embedding is sufficiently similar to the emb…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: To be presented at EUSIPCO 2022

  3. arXiv:2202.00733

    eess.AS cs.SD

    New Insights on Target Speaker Extraction

    Authors: Mohamed Elminshawi, Wolfgang Mack, Srikanth Raj Chetupalli, Soumitro Chakrabarty, Emanuël A. P. Habets

    Abstract: Speaker extraction (SE) aims to segregate the speech of a target speaker from a mixture of interfering speakers with the help of auxiliary information. Several forms of auxiliary information have been employed in single-channel SE, such as a speech snippet enrolled from the target speaker or visual information corresponding to the spoken utterance. The effectiveness of the auxiliary information in…

    Submitted 15 September, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

  4. Signal-Aware Direction-of-Arrival Estimation Using Attention Mechanisms

    Authors: Wolfgang Mack, Julian Wechsler, Emanuël A. P. Habets

    Abstract: The direction-of-arrival (DOA) of sound sources is an essential acoustic parameter used, e.g., for multi-channel speech enhancement or source tracking. Complex acoustic scenarios consisting of sources-of-interest, interfering sources, reverberation, and noise make the estimation of the DOAs corresponding to the sources-of-interest a challenging task. Recently proposed attention mechanisms allow DO…

    Submitted 3 January, 2022; originally announced January 2022.

  5. arXiv:2011.04569

    eess.AS cs.AI cs.SD

    Informed Source Extraction With Application to Acoustic Echo Reduction

    Authors: Mohamed Elminshawi, Wolfgang Mack, Emanuël A. P. Habets

    Abstract: Informed speaker extraction aims to extract a target speech signal from a mixture of sources given prior knowledge about the desired speaker. Recent deep learning-based methods leverage a speaker discriminative model that maps a reference snippet uttered by the target speaker into a single embedding vector that encapsulates the characteristics of the target speaker. However, such modeling delibera…

    Submitted 26 October, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Published at ITG 2021

    Report number: 978-3-8007-5627-8

  6. Efficient Training Data Generation for Phase-Based DOA Estimation

    Authors: Fabian Hübner, Wolfgang Mack, Emanuël A. P. Habets

    Abstract: Deep learning (DL) based direction of arrival (DOA) estimation is an active research topic and currently represents the state-of-the-art. Usually, DL-based DOA estimators are trained with recorded data or computationally expensive generated data. Both data types require significant storage and excessive time to, respectively, record or generate. We propose a low complexity online data generation m…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Submitted to ICASSP 2021

  7. Deep Filtering: Signal Extraction and Reconstruction Using Complex Time-Frequency Filters

    Authors: Wolfgang Mack, Emanuël A. P. Habets

    Abstract: Signal extraction from a single-channel mixture with additional undesired signals is most commonly performed using time-frequency (TF) masks. Typically, the mask is estimated with a deep neural network (DNN), and element-wise applied to the complex mixture short-time Fourier transform (STFT) representation to perform the extraction. Ideal mask magnitudes are zero for solely undesired signals in a…

    Submitted 9 December, 2019; v1 submitted 17 April, 2019; originally announced April 2019.