Showing 1–22 of 22 results for author: Magron, P

  1. arXiv:2401.13548  [pdf]

    cs.SD eess.AS

    A Phoneme-Scale Assessment of Multichannel Speech Enhancement Algorithms

    Authors: Nasser-Eddine Monir, Paul Magron, Romain Serizel

    Abstract: In the intricate acoustic landscapes where speech intelligibility is challenged by noise and reverberation, multichannel speech enhancement emerges as a promising solution for individuals with hearing loss. Such algorithms are commonly evaluated at the utterance level. However, this approach overlooks the granular acoustic nuances revealed by phoneme-specific analysis, potentially obscuring key in…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: This is the preprint of the paper that we submitted to the Trends in Hearing Journal

  2. arXiv:2303.01864  [pdf, ps, other]

    cs.SD eess.AS

    Spectrogram Inversion for Audio Source Separation via Consistency, Mixing, and Magnitude Constraints

    Authors: Paul Magron, Tuomas Virtanen

    Abstract: Audio source separation is often achieved by estimating the magnitude spectrogram of each source, and then applying a phase recovery (or spectrogram inversion) algorithm to retrieve time-domain signals. Typically, spectrogram inversion is treated as an optimization problem involving one or several terms in order to promote estimates that comply with a consistency property, a mixing constraint, and…

    Submitted 30 June, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

  3. Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization

    Authors: Ondřej Mokrý, Paul Magron, Thomas Oberlin, Cédric Févotte

    Abstract: Audio inpainting, i.e., the task of restoring missing or occluded audio signal samples, usually relies on sparse representations or autoregressive modeling. In this paper, we propose to structure the spectrogram with nonnegative matrix factorization (NMF) in a probabilistic framework. First, we treat the missing samples as latent variables, and derive two expectation-maximization algorithms for es…

    Submitted 5 January, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

  4. A majorization-minimization algorithm for nonnegative binary matrix factorization

    Authors: Paul Magron, Cédric Févotte

    Abstract: This paper tackles the problem of decomposing binary data using matrix factorization. We consider the family of mean-parametrized Bernoulli models, a class of generative models that are well suited for modeling binary data and enables interpretability of the factors. We factorize the Bernoulli parameter and consider an additional Beta prior on one of the factors to further improve the model's expr…

    Submitted 20 April, 2022; originally announced April 2022.

  5. arXiv:2204.01360  [pdf, other]

    cs.SD eess.AS eess.SP

    Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval

    Authors: Pierre-Hugo Vial, Paul Magron, Thomas Oberlin, Cédric Févotte

    Abstract: This paper considers the phase retrieval (PR) problem, which aims to reconstruct a signal from phaseless measurements such as magnitude or power spectrograms. PR is generally handled as a minimization problem involving a quadratic loss. Recent works have considered alternative discrepancy measures, such as the Bregman divergences, but it is still challenging to tailor the optimal loss for a given…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: 10 pages, 5 figures, submitted to IEEE SPL

  6. arXiv:2203.15758  [pdf, other]

    cs.LG cs.SD eess.AS

    A Sparsity-promoting Dictionary Model for Variational Autoencoders

    Authors: Mostafa Sadeghi, Paul Magron

    Abstract: Structuring the latent space in probabilistic deep generative models, e.g., variational autoencoders (VAEs), is important to yield more expressive models and interpretable representations, and to avoid overfitting. One way to achieve this objective is to impose a sparsity constraint on the latent variables, e.g., via a Laplace prior. However, such approaches usually complicate the training phase,…

    Submitted 17 June, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Proc. of Interspeech 2022

  7. arXiv:2102.12369  [pdf, ps, other]

    cs.IR

    Neural content-aware collaborative filtering for cold-start music recommendation

    Authors: Paul Magron, Cédric Févotte

    Abstract: State-of-the-art music recommender systems are based on collaborative filtering, which builds upon learning similarities between users and songs from the available listening data. These approaches inherently face the cold-start problem, as they cannot recommend novel songs with no listening history. Content-aware recommendation addresses this issue by incorporating content information about the so…

    Submitted 20 July, 2022; v1 submitted 24 February, 2021; originally announced February 2021.

  8. arXiv:2011.12818  [pdf, other]

    cs.SD eess.AS

    Phase retrieval with Bregman divergences: Application to audio signal recovery

    Authors: Pierre-Hugo Vial, Paul Magron, Thomas Oberlin, Cédric Févotte

    Abstract: Phase retrieval aims to recover a signal from magnitude or power spectra measurements. It is often addressed by considering a minimization problem involving a quadratic cost function. We propose a different formulation based on Bregman divergences, which encompass divergences that are appropriate for audio signal processing applications. We derive a fast gradient algorithm to solve this problem.

    Submitted 25 November, 2020; originally announced November 2020.

    Comments: in Proceedings of iTWIST'20, Paper-ID: 16, Nantes, France, December, 2-4, 2020

  9. arXiv:2010.10276  [pdf, ps, other]

    cs.IR cs.SD eess.AS

    Leveraging the structure of musical preference in content-aware music recommendation

    Authors: Paul Magron, Cédric Févotte

    Abstract: State-of-the-art music recommendation systems are based on collaborative filtering, which predicts a user's interest from his listening habits and similarities with other users' profiles. These approaches are agnostic to the song content, and therefore face the cold-start problem: they cannot recommend novel songs without listening history. To tackle this issue, content-aware recommendation incorp…

    Submitted 9 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

  10. arXiv:2010.10255  [pdf, ps, other]

    cs.SD eess.AS

    Phase recovery with Bregman divergences for audio source separation

    Authors: Paul Magron, Pierre-Hugo Vial, Thomas Oberlin, Cédric Févotte

    Abstract: Time-frequency audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a phase recovery algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has shown good performance in several recent works. This algorithm minimizes a quadratic reconstruction error…

    Submitted 9 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

  11. Phase retrieval with Bregman divergences and application to audio signal recovery

    Authors: Pierre-Hugo Vial, Paul Magron, Thomas Oberlin, Cédric Févotte

    Abstract: Phase retrieval (PR) aims to recover a signal from the magnitudes of a set of inner products. This problem arises in many audio signal processing applications which operate on a short-time Fourier transform magnitude or power spectrogram, and discard the phase information. Recovering the missing phase from the resulting modified spectrogram is indeed necessary in order to synthesize time-domain si…

    Submitted 13 January, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: 23 pages, 3 figures, accepted for publication in the IEEE Journal of Selected Topics in Signal Processing

  12. Online Spectrogram Inversion for Low-Latency Audio Source Separation

    Authors: Paul Magron, Tuomas Virtanen

    Abstract: Audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a spectrogram inversion algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has been exploited successfully in several recent works. However, this algorithm suffers from two drawbacks, which we…

    Submitted 24 February, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

  13. arXiv:1907.08506  [pdf, other]

    cs.SD cs.LG eess.AS

    Language Modelling for Sound Event Detection with Teacher Forcing and Scheduled Sampling

    Authors: Konstantinos Drossos, Shayan Gharib, Paul Magron, Tuomas Virtanen

    Abstract: A sound event detection (SED) method typically takes as an input a sequence of audio frames and predicts the activities of sound events in each frame. In real-life recordings, the sound events exhibit some temporal structure: for instance, a "car horn" will likely be followed by a "car passing by". While this temporal structure is widely exploited in sequence prediction tasks (e.g., in machine tra…

    Submitted 6 November, 2019; v1 submitted 19 July, 2019; originally announced July 2019.

    Comments: Fixed the display of URLs at footnote, updated the results

  14. arXiv:1904.10678  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Unsupervised Adversarial Domain Adaptation Based On The Wasserstein Distance For Acoustic Scene Classification

    Authors: Konstantinos Drossos, Paul Magron, Tuomas Virtanen

    Abstract: A challenging problem in deep learning-based machine listening field is the degradation of the performance when using data from unseen conditions. In this paper we focus on the acoustic scene classification (ASC) task and propose an adversarial deep learning method to allow adapting an acoustic scene classification system to deal with a new acoustic channel resulting from data captured with a diff…

    Submitted 6 November, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

    Comments: Updated indices at Eq 6

  15. arXiv:1807.11298  [pdf, other]

    cs.SD eess.AS

    Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

    Authors: Konstantinos Drossos, Paul Magron, Stylianos Ioannis Mimilakis, Tuomas Virtanen

    Abstract: Harmonic/percussive source separation (HPSS) consists in separating the pitched instruments from the percussive parts in a music mixture. In this paper, we propose to apply the recently introduced Masker-Denoiser with twin networks (MaD TwinNet) system to this task. MaD TwinNet is a deep learning architecture that has reached state-of-the-art results in monaural singing voice separation. Herein, w…

    Submitted 30 July, 2018; originally announced July 2018.

  16. arXiv:1802.03156  [pdf, ps, other]

    cs.SD eess.AS

    Complex ISNMF: a Phase-Aware Model for Monaural Audio Source Separation

    Authors: Paul Magron, Tuomas Virtanen

    Abstract: This paper introduces a phase-aware probabilistic model for audio source separation. Classical source models in the short-term Fourier transform domain use circularly-symmetric Gaussian or Poisson random variables. This is equivalent to assuming that the phase of each source is uniformly distributed, which is not suitable for exploiting the underlying structure of the phase. Drawing on preliminary…

    Submitted 30 September, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

  17. arXiv:1608.01953  [pdf, ps, other]

    cs.SD

    Model-based STFT phase recovery for audio source separation

    Authors: Paul Magron, Roland Badeau, Bertrand David

    Abstract: For audio source separation applications, it is common to estimate the magnitude of the short-time Fourier transform (STFT) of each source. In order to further synthesizing time-domain signals, it is necessary to recover the phase of the corresponding complex-valued STFT. Most authors in this field choose a Wiener-like filtering approach which boils down to using the phase of the original mixture.…

    Submitted 27 February, 2018; v1 submitted 5 August, 2016; originally announced August 2016.

  18. arXiv:1608.01844  [pdf, ps, other]

    cs.SD

    Lévy NMF for robust nonnegative source separation

    Authors: Paul Magron, Roland Badeau, Antoine Liutkus

    Abstract: Source separation, which consists in decomposing data into meaningful structured components, is an active research topic in many areas, such as music and image signal processing, applied physics and text mining. In this paper, we introduce the Positive $α$-stable (P$α$S) distributions to model the latent sources, which are a subclass of the stable distributions family. They notably permit us to mo…

    Submitted 8 November, 2016; v1 submitted 5 August, 2016; originally announced August 2016.

  19. Phase recovery in NMF for audio source separation: an insightful benchmark

    Authors: Paul Magron, Roland Badeau, Bertrand David

    Abstract: Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In applications such as source separation, the phase recovery for each extracted component is a major issue since it often leads to audible artifacts. In this paper, we present a methodology for evaluating various NMF-based source separation techniques involving ph…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

  20. Phase reconstruction of spectrograms based on a model of repeated audio events

    Authors: Paul Magron, Roland Badeau, Bertrand David

    Abstract: Phase recovery of modified spectrograms is a major issue in audio signal processing applications, such as source separation. This paper introduces a novel technique for estimating the phases of components in complex mixtures within onset frames in the Time-Frequency (TF) domain. We propose to exploit the phase repetitions from one onset frame to another. We introduce a reference phase which charac…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2015

  21. arXiv:1605.07467  [pdf, ps, other]

    cs.SD

    Phase reconstruction of spectrograms with linear unwrapping: application to audio signal restoration

    Authors: Paul Magron, Roland Badeau, Bertrand David

    Abstract: This paper introduces a novel technique for reconstructing the phase of modified spectrograms of audio signals. From the analysis of mixtures of sinusoids we obtain relationships between phases of successive time frames in the Time-Frequency (TF) domain. To obtain similar relationships over frequencies, in particular within onset frames, we study an impulse model. Instantaneous frequencies and att…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: European Signal Processing Conference (EUSIPCO) 2015

  22. Complex NMF under phase constraints based on signal modeling: application to audio source separation

    Authors: Paul Magron, Roland Badeau, Bertrand David

    Abstract: Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In the source separation framework, the phase recovery for each extracted component is necessary for synthesizing time-domain signals. The Complex NMF (CNMF) model aims to jointly estimate the spectrogram and the phase of the sources, but requires to constrain the…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016