Fast Cross-Correlation for TDoA Estimation on Small Aperture Microphone Arrays

Authors: François Grondin, Marc-Antoine Maheux, Jean-Samuel Lauzon, Jonathan Vincent, François Michaud

Abstract: This paper introduces the Fast Cross-Correlation (FCC) method for Time Difference of Arrival (TDoA) Estimation for pairs of microphones on a small aperture microphone array. FCC relies on low-rank decomposition and exploits symmetry in even and odd bases to speed up computation while preserving TDoA accuracy. FCC reduces the number of flops by a factor of 4.5 and the execution time by factors between 3.5 and 8.3 on embedded hardware, compared to the state-of-the-art Generalized Cross-Correlation (GCC) method that relies on the Fast Fourier Transform (FFT). This improvement can provide portable microphone arrays with extended battery life and allow real-time processing on low-cost hardware.

Submitted 10 March, 2023; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Submitted to IEEE ICASSP 2023
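
For context, the FFT-based GCC baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the FCC method and not the authors' code: the PHAT weighting, the function name `gcc_phat_tdoa`, and the `max_lag` parameter are assumptions made for the example, and NumPy is assumed to be available.

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, max_lag):
    """Estimate the delay of x2 relative to x1, in samples (hypothetical helper).

    Assumes PHAT weighting; the listing only names GCC with the FFT.
    """
    n = len(x1) + len(x2)              # zero-pad so circular correlation acts like linear
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross-spectrum of the microphone pair
    cross /= np.abs(cross) + 1e-12     # PHAT: keep the phase, discard the magnitude
    cc = np.fft.irfft(cross, n=n)      # cross-correlation as a function of lag
    # Reorder so that index 0 corresponds to lag -max_lag and the last to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag
```

On a small aperture array only a few lags are physically possible, so `max_lag` can be small; the two FFTs and the inverse FFT are the cost that a faster method would aim to reduce.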

arXiv:2203.16494

S-OPT: A Points Selection Algorithm for Hyper-Reduction in Reduced Order Models

Authors: Jessica T. Lauzon, Siu Wun Cheung, Yeonjong Shin, Youngsoo Choi, Dylan Matthew Copeland, Kevin Huynh

Abstract: While projection-based reduced order models can reduce the dimension of full order solutions, the resulting reduced models may still contain terms that scale with the full order dimension. Hyper-reduction techniques are sampling-based methods that further reduce this computational complexity by approximating such terms with a much smaller dimension. The goal of this work is to introduce a points selection algorithm developed by Shin and Xiu [SIAM J. Sci. Comput., 38 (2016), pp. A385--A411], as a hyper-reduction method. The selection algorithm was originally proposed as a stochastic collocation method for uncertainty quantification. Since the algorithm aims at maximizing a quantity S that measures both the column orthogonality and the determinant, we refer to the algorithm as S-OPT. Numerical examples are provided to demonstrate the performance of S-OPT and to compare its performance with an over-sampled Discrete Empirical Interpolation (DEIM) algorithm. We found that the S-OPT algorithm predicts the full order solutions with higher accuracy for a given number of indices.

Submitted 29 March, 2022; originally announced March 2022.

Comments: 26 pages, 15 figures, submitted to SIAM Journal on Scientific Computing

MSC Class: 37M99; 65M99; 76D05; 67Q05

arXiv:2203.14409

SMP-PHAT: Lightweight DoA Estimation by Merging Microphone Pairs

Authors: François Grondin, Marc-Antoine Maheux, Jean-Samuel Lauzon, Jonathan Vincent, François Michaud

Abstract: This paper introduces SMP-PHAT, which performs sound direction of arrival (DoA) estimation with a microphone array by merging pairs of microphones that are parallel in space. This approach reduces the number of pairwise cross-correlation computations, and brings down the number of flops and memory lookups when searching for DoA. Experiments on low-cost hardware with commonly used microphone arrays show that the proposed method provides the same accuracy as the former SRP-PHAT approach, while reducing the computational load by 39% in some cases.

Submitted 27 March, 2022; originally announced March 2022.

Comments: Submitted to Interspeech 2022

arXiv:2103.03954

ODAS: Open embeddeD Audition System

Authors: François Grondin, Dominic Létourneau, Cédric Godin, Jean-Samuel Lauzon, Jonathan Vincent, Simon Michaud, Samuel Faucher, François Michaud

Abstract: Artificial audition aims at providing hearing capabilities to machines, computers and robots. Existing frameworks in robot audition offer interesting sound source localization, tracking and separation performance, but involve a significant amount of computation that limits their use on robots with embedded computing capabilities. This paper presents ODAS, the Open embeddeD Audition System framework, which includes strategies to reduce the computational load and perform robot audition tasks on low-cost embedded computing systems. It presents key features of ODAS, along with cases illustrating its uses in different robots and artificial audition applications.

Submitted 11 May, 2022; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: This paper was published in Frontiers Robotics and AI

arXiv:2010.09930

BIRD: Big Impulse Response Dataset

Authors: François Grondin, Jean-Samuel Lauzon, Simon Michaud, Mirco Ravanelli, François Michaud

Abstract: This paper introduces BIRD, the Big Impulse Response Dataset. This open dataset consists of 100,000 multichannel room impulse responses (RIRs) generated from simulations using the Image Method, making it the largest multichannel open dataset currently available. These RIRs can be used to perform efficient online data augmentation for scenarios that involve two microphones and multiple sound sources. The paper also introduces use cases to illustrate how BIRD can perform data augmentation with existing speech corpora.

Submitted 19 October, 2020; originally announced October 2020.

arXiv:2008.00072

Dynamic Object Tracking and Masking for Visual SLAM

Authors: Jonathan Vincent, Mathieu Labbé, Jean-Samuel Lauzon, François Grondin, Pier-Marc Comtois-Rivet, François Michaud

Abstract: In dynamic environments, performance of visual SLAM techniques can be impaired by visual features taken from moving objects. One solution is to identify those objects so that their visual features can be removed for localization and mapping. This paper presents a simple and fast pipeline that uses deep neural networks, extended Kalman filters and visual SLAM to improve both localization and mapping in dynamic environments (around 14 fps on a GTX 1080). Results on the dynamic sequences from the TUM dataset using RTAB-Map as visual SLAM suggest that the approach achieves similar localization performance compared to other state-of-the-art methods, while also providing the position of the tracked dynamic objects, a 3D map free of those dynamic objects, and better loop closure detection, with the whole pipeline able to run on a robot moving at moderate speed.

Submitted 31 July, 2020; originally announced August 2020.

arXiv:2007.11079

3D Localization of a Sound Source Using Mobile Microphone Arrays Referenced by SLAM

Authors: Simon Michaud, Samuel Faucher, François Grondin, Jean-Samuel Lauzon, Mathieu Labbé, Dominic Létourneau, François Ferland, François Michaud

Abstract: A microphone array can provide a mobile robot with the capability of localizing, tracking and separating distant sound sources in 2D, i.e., estimating their relative elevation and azimuth. To combine acoustic data with visual information in real world settings, spatial correlation must be established. The approach explored in this paper consists of having two robots, each equipped with a microphone array, localizing themselves in a shared reference map using SLAM. Based on their locations, data from the microphone arrays are used to triangulate in 3D the location of a sound source in relation to the same map. This strategy results in a novel cooperative sound mapping approach using mobile microphone arrays. Trials are conducted using two mobile robots localizing a static or a moving sound source to examine in which conditions this is possible. Results suggest that errors under 0.3 m are observed when the relative angle between the two robots is above 30 degrees for a static sound source, while errors under 0.3 m for angles between 40 degrees and 140 degrees are observed with a moving sound source.

Submitted 21 July, 2020; originally announced July 2020.
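
The triangulation step described in this abstract can be illustrated with a small geometric sketch. It assumes each robot contributes a position in the shared map and a bearing (azimuth and elevation) already expressed in that map's frame, and it takes the midpoint of the shortest segment between the two bearing lines as the source estimate. The helper names `bearing` and `triangulate` are hypothetical, and this is one common construction rather than the paper's confirmed procedure.

```python
import math

def bearing(azimuth, elevation):
    """Unit direction vector from azimuth and elevation in radians (hypothetical helper)."""
    ce = math.cos(elevation)
    return (ce * math.cos(azimuth), ce * math.sin(azimuth), math.sin(elevation))

def triangulate(p1, d1, p2, d2, eps=1e-9):
    """Midpoint of the closest points of lines p1 + t1*d1 and p2 + t2*d2.

    Returns None when the bearings are (nearly) parallel, in which case
    the intersection is ill-conditioned and no estimate is produced.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    t1 = (b * e - c * d) / denom       # parameter of the closest point on line 1
    t2 = (a * e - b * d) / denom       # parameter of the closest point on line 2
    q1 = tuple(p + t1 * u for p, u in zip(p1, d1))
    q2 = tuple(p + t2 * u for p, u in zip(p2, d2))
    return tuple((u + v) / 2 for u, v in zip(q1, q2))
```

The denominator shrinks as the two bearings become parallel, which is consistent with the abstract's observation that accuracy depends on the relative angle between the two robots.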

Showing 1–7 of 7 results for author: Lauzon, J