Showing 1–30 of 30 results for author: Deleforge, A

Searching in archive cs.
  1. arXiv:2405.03385  [pdf, other]

    cs.SD eess.AS eess.SP physics.class-ph

    Fully Reversing the Shoebox Image Source Method: From Impulse Responses to Room Parameters

    Authors: Tom Sprunck, Antoine Deleforge, Yannick Privat, Cédric Foy

    Abstract: We present an algorithm that fully reverses the shoebox image source method (ISM), a popular and widely used room impulse response (RIR) simulator for cuboid rooms introduced by Allen and Berkley in 1979. More precisely, given a discrete multichannel RIR generated by the shoebox ISM for a microphone array of known geometry, the algorithm reliably recovers the 18 input parameters. These are the 3D…

    Submitted 6 May, 2024; originally announced May 2024.

  2. arXiv:2308.02560  [pdf, other]

    cs.SD cs.LG eess.AS

    From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion

    Authors: Robin San Roman, Yossi Adi, Antoine Deleforge, Romain Serizel, Gabriel Synnaeve, Alexandre Défossez

    Abstract: Deep generative models can generate high-fidelity audio conditioned on various types of representations (e.g., mel-spectrograms, Mel-frequency Cepstral Coefficients (MFCC)). Recently, such models have been used to synthesize audio waveforms conditioned on highly compressed representations. Although such methods produce impressive results, they are prone to generate audible artifacts when the condi…

    Submitted 8 November, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: 10 pages

    Journal ref: Thirty-seventh Conference on Neural Information Processing Systems (2023)

  3. arXiv:2211.16958  [pdf, ps, other]

    cs.SD eess.AS

    How to (virtually) train your speaker localizer

    Authors: Prerak Srivastava, Antoine Deleforge, Archontis Politis, Emmanuel Vincent

    Abstract: Learning-based methods have become ubiquitous in speaker localization. Existing systems rely on simulated training sets for the lack of sufficiently large, diverse and annotated real datasets. Most room acoustics simulators used for this purpose rely on the image source method (ISM) because of its computational efficiency. This paper argues that carefully extending the ISM to incorporate more real…

    Submitted 25 May, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Published in INTERSPEECH 2023

  4. arXiv:2208.14017  [pdf, ps, other]

    cs.SD eess.AS eess.SP physics.class-ph

    Gridless 3D Recovery of Image Sources from Room Impulse Responses

    Authors: Tom Sprunck, Yannick Privat, Cédric Foy, Antoine Deleforge

    Abstract: Given a sound field generated by a sparse distribution of impulse image sources, can the continuous 3D positions and amplitudes of these sources be recovered from discrete, bandlimited measurements of the field at a finite set of locations, e.g., a multichannel room impulse response? Borrowing from recent advances in super-resolution imaging, it is shown that this nonlinear, non-convex inverse pro…

    Submitted 7 December, 2022; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: IEEE Signal Processing Letters, 2022

  5. arXiv:2207.09133  [pdf, other]

    cs.SD eess.AS

    Realistic sources, receivers and walls improve the generalisability of virtually-supervised blind acoustic parameter estimators

    Authors: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

    Abstract: Blind acoustic parameter estimation consists in inferring the acoustic properties of an environment from recordings of unknown sound sources. Recent works in this area have utilized deep neural networks trained either partially or exclusively on simulated data, due to the limited availability of real annotated measurements. In this paper, we study whether a model purely trained using a fast image-…

    Submitted 19 July, 2022; originally announced July 2022.

  6. arXiv:2111.08327  [pdf, other]

    cs.SD eess.AS

    Detecting acoustic reflectors using a robot's ego-noise

    Authors: Usama Saqib, Antoine Deleforge, Jesper Jensen

    Abstract: In this paper, we propose a method to estimate the proximity of an acoustic reflector, e.g., a wall, using ego-noise, i.e., the noise produced by the moving parts of a listening robot. This is achieved by estimating the times of arrival of acoustic echoes reflected from the surface. Simulated experiments show that the proposed nonintrusive approach is capable of accurately estimating the distance…

    Submitted 16 November, 2021; originally announced November 2021.

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2021, Toronto, Canada

  7. arXiv:2109.00393  [pdf, other]

    cs.NE cs.SD eess.AS physics.class-ph

    Mean absorption estimation from room impulse responses using virtually supervised learning

    Authors: Cédric Foy, Antoine Deleforge, Diego Di Carlo

    Abstract: In the context of building acoustics and the acoustic diagnosis of an existing room, this paper introduces and investigates a new approach to estimate mean absorption coefficients solely from a room impulse response (RIR). This inverse problem is tackled via virtually-supervised learning, namely, the RIR-to-absorption mapping is implicitly learned by regression on a simulated dataset using artific…

    Submitted 1 September, 2021; originally announced September 2021.

    Journal ref: Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (2), pp.1286-1299

  8. arXiv:2107.13832  [pdf, other]

    cs.SD cs.LG eess.AS

    Blind Room Parameter Estimation Using Multiple-Multichannel Speech Recordings

    Authors: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

    Abstract: Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation or audio forensics. In this paper, we study the problem of jointly estimating the total surface area, the volume, as well as the frequency-dependent reverberation time and mean surface absorption of a room in a blind fashion, based on two-channel noisy speech…

    Submitted 29 July, 2021; originally announced July 2021.

    Comments: Accepted in WASPAA 2021 (IEEE Workshop on Applications of Signal Processing to Audio and Acoustics)

  9. arXiv:2106.06999  [pdf, other]

    eess.AS cs.SD

    A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection

    Authors: Archontis Politis, Sharath Adavanne, Daniel Krause, Antoine Deleforge, Prerak Srivastava, Tuomas Virtanen

    Abstract: This report presents the dataset and baseline of Task 3 of the DCASE2021 Challenge on Sound Event Localization and Detection (SELD). The dataset is based on emulation of real recordings of static or moving sound events under real conditions of reverberation and ambient noise, using spatial room impulse responses captured in a variety of rooms and delivered in two spatial formats. The acoustical sy…

    Submitted 4 July, 2021; v1 submitted 13 June, 2021; originally announced June 2021.

  10. arXiv:2104.13168  [pdf, other]

    eess.AS cs.SD

    dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing

    Authors: Diego Di Carlo, Pinchas Tandeitnik, Cédric Foy, Antoine Deleforge, Nancy Bertin, Sharon Gannot

    Abstract: This paper presents dEchorate: a new database of measured multichannel Room Impulse Responses (RIRs) including annotations of early echo timings and 3D positions of microphones, real sources and image sources under different wall configurations in a cuboid room. These data provide a tool for benchmarking recent methods in echo-aware speech enhancement, room geometry estimation, RIR estimation, aco…

    Submitted 27 April, 2021; originally announced April 2021.

  11. arXiv:2005.04132  [pdf, other]

    eess.AS cs.SD

    Asteroid: the PyTorch-based audio source separation toolkit for researchers

    Authors: Manuel Pariente, Samuele Cornell, Joris Cosentino, Sunit Sivasankaran, Efthymios Tzinis, Jens Heitkaemper, Michel Olvera, Fabian-Robert Stöter, Mathieu Hu, Juan M. Martín-Doñas, David Ditter, Ariel Frank, Antoine Deleforge, Emmanuel Vincent

    Abstract: This paper describes Asteroid, the PyTorch-based audio source separation toolkit for researchers. Inspired by the most successful neural source separation systems, it provides all neural building blocks required to build such a system. To improve reproducibility, Kaldi-style recipes on common audio source separation datasets are also provided. This paper describes the software architecture of Aste…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  12. arXiv:1910.10400  [pdf, other]

    cs.SD cs.LG eess.AS eess.SP

    Filterbank design for end-to-end speech separation

    Authors: Manuel Pariente, Samuele Cornell, Antoine Deleforge, Emmanuel Vincent

    Abstract: Single-channel speech separation has recently made great progress thanks to learned filterbanks as used in ConvTasNet. In parallel, parameterized filterbanks have been proposed for speaker recognition where only center frequencies and bandwidths are learned. In this work, we extend real-valued learned and parameterized filterbanks into complex-valued analytic filterbanks and define a set of corres…

    Submitted 28 February, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: ICASSP 2020

  13. arXiv:1907.04655  [pdf, other]

    eess.SP cs.SD eess.AS

    Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition

    Authors: Antoine Deleforge, Diego Di Carlo, Martin Strauss, Romain Serizel, Lucio Marcenaro

    Abstract: Unmanned aerial vehicles (UAV), commonly referred to as drones, have raised increasing interest in recent years. Search and rescue scenarios where humans in emergency situations need to be quickly found in areas difficult to access constitute an important field of application for this technology. While research efforts have mostly focused on developing video-based solutions for this task \cite{lop…

    Submitted 3 July, 2019; originally announced July 2019.

    Journal ref: IEEE Signal Processing Magazine, Institute of Electrical and Electronics Engineers, In press

  14. arXiv:1905.01209  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders

    Authors: Manuel Pariente, Antoine Deleforge, Emmanuel Vincent

    Abstract: Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement. These studies developed iterative algorithms involving either Gibbs sampling or gradient descent at each step, making them computationally expensive. This paper proposes a variational inference method to i…

    Submitted 14 May, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: Submitted to INTERSPEECH 2019

  15. arXiv:1812.05901  [pdf, ps, other]

    cs.SD eess.AS

    Evaluation of an open-source implementation of the SRP-PHAT algorithm within the 2018 LOCATA challenge

    Authors: Romain Lebarbenchon, Ewen Camberlein, Diego di Carlo, Clément Gaultier, Antoine Deleforge, Nancy Bertin

    Abstract: This short paper presents an efficient, flexible implementation of the SRP-PHAT multichannel sound source localization method. The method is evaluated on the single-source tasks of the LOCATA 2018 development dataset, and an associated Matlab toolbox is made available online.

    Submitted 14 December, 2018; originally announced December 2018.

    Comments: In Proceedings of the LOCATA Challenge Workshop - a satellite event of IWAENC 2018 (arXiv:1811.08482)

    Report number: LOCATAchallenge/2018/01

  16. arXiv:1810.13338  [pdf, other]

    cs.SD cs.AI eess.AS

    MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval

    Authors: Helena Peic Tukuljac, Antoine Deleforge, Rémi Gribonval

    Abstract: This paper addresses the general problem of blind echo retrieval, i.e., given M sensors measuring in the discrete-time domain M mixtures of K delayed and attenuated copies of an unknown source signal, can the echo locations and weights be recovered? This problem has broad applications in fields such as sonars, seismology, ultrasounds or room acoustics. It belongs to the broader class of blind cha…

    Submitted 31 October, 2018; originally announced October 2018.

    Journal ref: Thirty-second Conference on Neural Information Processing Systems (NIPS 2018), Dec 2018, Montréal, Canada

  17. Separake: Source Separation with a Little Help From Echoes

    Authors: Robin Scheibler, Diego Di Carlo, Antoine Deleforge, Ivan Dokmanić

    Abstract: It is commonly believed that multipath hurts various audio processing algorithms. At odds with this belief, we show that multipath in fact helps sound source separation, even with very simple propagation models. Unlike most existing methods, we neither ignore the room impulse responses, nor do we attempt to estimate them fully. We rather assume that we know the positions of a few virtual microphones…

    Submitted 17 November, 2017; originally announced November 2017.

  18. arXiv:1711.04460  [pdf, other]

    stat.ML cs.SD eess.AS

    Blind Source Separation Using Mixtures of Alpha-Stable Distributions

    Authors: Nicolas Keriven, Antoine Deleforge, Antoine Liutkus

    Abstract: We propose a new blind source separation algorithm based on mixtures of alpha-stable distributions. Complex symmetric alpha-stable distributions have recently been shown to better model audio signals in the time-frequency domain than classical Gaussian distributions thanks to their larger dynamic range. However, inference of these models is notoriously hard to perform because their probability de…

    Submitted 12 February, 2018; v1 submitted 13 November, 2017; originally announced November 2017.

    Comments: International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Canada

  19. arXiv:1704.08972  [pdf, ps, other]

    cs.IT

    Phase retrieval with a multivariate Von Mises prior: from a Bayesian formulation to a lifting solution

    Authors: Angelique Dremeau, Antoine Deleforge

    Abstract: In this paper, we investigate a new method for phase recovery when prior information on the missing phases is available. In particular, we propose to take into account this information in a generic fashion by means of a multivariate Von Mises distribution. Building on a Bayesian formulation (a Maximum A Posteriori estimation), we show that the problem can be expressed using a Mahalanobis distanc…

    Submitted 28 April, 2017; originally announced April 2017.

    Comments: Preprint of the paper published in the proc. of ICASSP'17

  20. arXiv:1612.06287  [pdf, other]

    cs.SD cs.LG

    VAST: The Virtual Acoustic Space Traveler Dataset

    Authors: Clément Gaultier, Saurabh Kataria, Antoine Deleforge

    Abstract: This paper introduces a new paradigm for sound source localization referred to as virtual acoustic space traveling (VAST) and presents a first dataset designed for this purpose. Existing sound source localization methods are either based on an approximate physical model (physics-driven) or on a specific-purpose calibration set (data-driven). With VAST, the idea is to learn a mapping from audio fe…

    Submitted 14 December, 2016; originally announced December 2016.

    Comments: International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Feb 2017, Grenoble, France.

  21. arXiv:1609.09747  [pdf, other]

    cs.SD

    Hearing in a shoe-box: binaural source position and wall absorption estimation using virtually supervised learning

    Authors: Saurabh Kataria, Clément Gaultier, Antoine Deleforge

    Abstract: This paper introduces a new framework for supervised sound source localization referred to as virtually-supervised learning. An acoustic shoe-box room simulator is used to generate a large number of binaural single-source audio scenes. These scenes are used to build a dataset of spatial binaural features annotated with acoustic properties such as the 3D source position and the walls' absorption co…

    Submitted 20 March, 2017; v1 submitted 30 September, 2016; originally announced September 2016.

    Comments: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States

    Report number: hal-01372435

  22. arXiv:1609.09744  [pdf, other]

    cs.SD stat.ML

    Phase Unmixing: Multichannel Source Separation with Magnitude Constraints

    Authors: Antoine Deleforge, Yann Traonmilin

    Abstract: We consider the problem of estimating the phases of K mixed complex signals from a multichannel observation, when the mixing matrix and signal magnitudes are known. This problem can be cast as a non-convex quadratically constrained quadratic program which is known to be NP-hard in general. We propose three approaches to tackle it: a heuristic method, an alternate minimization method, and a convex…

    Submitted 20 March, 2017; v1 submitted 30 September, 2016; originally announced September 2016.

    Comments: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States

    Report number: hal-01372418

  23. arXiv:1609.09743  [pdf, other]

    cs.SD stat.AP

    Rectified binaural ratio: A complex T-distributed feature for robust sound localization

    Authors: Antoine Deleforge, Florence Forbes

    Abstract: Most existing methods in binaural sound source localization rely on some kind of aggregation of phase- and level-difference cues in the time-frequency plane. While different aggregation schemes exist, they are often heuristic and suffer in adverse noise conditions. In this paper, we introduce the rectified binaural ratio as a new feature for sound source localization. We show that for Gaussian-pr…

    Submitted 30 September, 2016; originally announced September 2016.

    Comments: European Signal Processing Conference, Aug 2016, Budapest, Hungary. Proceedings of the 24th European Signal Processing Conference (EUSIPCO), 2016

  24. Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions

    Authors: Vincent Drouard, Radu Horaud, Antoine Deleforge, Silèye Ba, Georgios Evangelidis

    Abstract: Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging because it must cope with changing illumination conditions, variabilities in face orientation and in appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propo…

    Submitted 6 March, 2017; v1 submitted 31 March, 2016; originally announced March 2016.

    Comments: 12 pages, 5 figures, 3 tables

    Journal ref: IEEE Transactions on Image Processing, volume 26, Issue 3, 1428-1440, 2017

  25. arXiv:1507.00201  [pdf, ps, other]

    cs.SD

    Towards a Generalization of Relative Transfer Functions to More Than One Source

    Authors: Antoine Deleforge, Sharon Gannot, Walter Kellermann

    Abstract: We propose a natural way to generalize relative transfer functions (RTFs) to more than one source. We first prove that such a generalization is not possible using a single multichannel spectro-temporal observation, regardless of the number of microphones. We then introduce a new transform for multichannel multi-frame spectrograms, i.e., containing several channels and time frames in each time-freq…

    Submitted 1 July, 2015; originally announced July 2015.

  26. arXiv:1410.2430  [pdf, ps, other]

    cs.SD

    Phase-Optimized K-SVD for Signal Extraction from Underdetermined Multichannel Sparse Mixtures

    Authors: Antoine Deleforge, Walter Kellermann

    Abstract: We propose a novel sparse representation for heavily underdetermined multichannel sound mixtures, i.e., with many more sources than microphones. The proposed approach operates in the complex Fourier domain, thus preserving spatial characteristics carried by phase differences. We derive a generalization of K-SVD which jointly estimates a dictionary capturing both spectral and spatial features, a sp…

    Submitted 9 October, 2014; originally announced October 2014.

  27. Hyper-Spectral Image Analysis with Partially-Latent Regression and Spatial Markov Dependencies

    Authors: Antoine Deleforge, Florence Forbes, Sileye Ba, Radu Horaud

    Abstract: Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-sp…

    Submitted 27 March, 2015; v1 submitted 30 September, 2014; originally announced September 2014.

    Comments: 12 pages, 4 figures, 3 tables

    Journal ref: IEEE Journal on Selected Topics in Signal Processing, volume 9, number 6, 1037-1048, 2015

  28. arXiv:1408.2700  [pdf, other]

    cs.SD cs.MM stat.AP stat.ML

    Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression

    Authors: Antoine Deleforge, Radu Horaud, Yoav Schechner, Laurent Girin

    Abstract: This paper addresses the problem of localizing audio sources using binaural measurements. We propose a supervised formulation that simultaneously localizes multiple sources at different locations. The approach is intrinsically efficient because, contrary to prior work, it relies neither on source separation, nor on monaural segregation. The method starts with a training stage that establishes a lo…

    Submitted 15 April, 2016; v1 submitted 12 August, 2014; originally announced August 2014.

    Comments: 15 pages, 8 figures

    Journal ref: IEEE Transactions on Audio, Speech, and Language Processing 23(4), 718-731, April, 2015

  29. Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

    Authors: Antoine Deleforge, Florence Forbes, Radu Horaud

    Abstract: In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimens…

    Submitted 20 March, 2014; v1 submitted 11 February, 2014; originally announced February 2014.

    Comments: 19 pages, 9 figures, 3 tables

    Journal ref: International Journal of Neural Systems 25(1) 2015

  30. High-Dimensional Regression with Gaussian Mixtures and Partially-Latent Response Variables

    Authors: Antoine Deleforge, Florence Forbes, Radu Horaud

    Abstract: In this work we address the problem of approximating high-dimensional data with a low-dimensional representation. We make the following contributions. We propose an inverse regression method which exchanges the roles of input and response, such that the low-dimensional variable becomes the regressor, and which is tractable. We introduce a mixture of locally-linear probabilistic mapping model that…

    Submitted 20 December, 2013; v1 submitted 10 August, 2013; originally announced August 2013.

    Journal ref: Statistics and Computing, 25(5), 893-911, 2015