On the Integration of Acoustics and LiDAR: a Multi-Modal Approach to Acoustic Reflector Estimation

Authors: Ellen Riemens, Pablo Martínez-Nuevo, Jorge Martinez, Martin Møller, Richard C. Hendriks

Abstract: Having knowledge on the room acoustic properties, e.g., the location of acoustic reflectors, allows to better reproduce the sound field as intended. Current state-of-the-art methods for room boundary detection using microphone measurements typically focus on a two-dimensional setting, causing a model mismatch when employed in real-life scenarios. Detection of arbitrary reflectors in three dimensions encounters practical limitations, e.g., the need for a spherical array and the increased computational complexity. Moreover, loudspeakers may not have an omnidirectional directivity pattern, as usually assumed in the literature, making the detection of acoustic reflectors in some directions more challenging. In the proposed method, a LiDAR sensor is added to a loudspeaker to improve wall detection accuracy and robustness. This is done in two ways. First, the model mismatch introduced by horizontal reflectors can be resolved by detecting reflectors with the LiDAR sensor to enable elimination of their detrimental influence from the 2D problem in pre-processing. Second, a LiDAR-based method is proposed to compensate for the challenging directions where the directive loudspeaker emits little energy. We show via simulations that this multi-modal approach, i.e., combining microphone and LiDAR sensors, improves the robustness and accuracy of wall detection.

Submitted 8 June, 2022; originally announced June 2022.

Comments: 5 pages, 9 figures, to be published in EUSIPCO 2022

arXiv:2202.08183 [pdf, other]

Increasing loudness in audio signals: a perceptually motivated approach to preserve audio quality

Authors: A. Jeannerot, N. de Koeijer, P. Martínez-Nuevo, M. B. Møller, J. Dyreby, P. Prandoni

Abstract: We present a method to maintain the subjective perception of volume of audio signals and, at the same time, reduce their absolute peak value. We focus on achieving this without compromising the perceived audio quality. This is specially useful, for example, to maximize the perceived reproduction level of loudspeakers where simply amplifying the signal amplitude, and hence their peak value, is limited due to already constrained physical designs. In particular, we minimize the absolute peak value subject to a constraint based on auditory masking. This limits the perceptual difference between the original and the modified signals. Moreover, this constraint can be tuned and allows to control the resulting audio quality. We show results comparing loudness and audio quality as a function of peak reduction. These results suggest that our method presents the best trade-off between loudness and audio quality when compared against classical methods based on compression and clipping.

Submitted 16 February, 2022; originally announced February 2022.

arXiv:2105.06700 [pdf, other]

doi 10.1109/TSP.2021.3079802

Nonuniform Sampling Rate Conversion: An Efficient Approach

Authors: Pablo Martínez-Nuevo

Abstract: We present a discrete-time algorithm for nonuniform sampling rate conversion that presents low computational complexity and memory requirements. It generalizes arbitrary sampling rate conversion by accommodating time-varying conversion ratios, i.e., it can efficiently adapt to instantaneous changes of the input and output sampling rates. This approach is based on appropriately factorizing the time-varying discrete-time filter used for the conversion. Common filters that satisfy this factorization property are those where the underlying continuous-time filter consists of linear combinations of exponentials, e.g., those described by linear constant-coefficient differential equations. This factorization separates the computation into two parts: one consisting of a factor solely depending on the output sampling instants and the other consists of a summation -- that can be computed recursively -- whose terms depend solely on the input sampling instants and its number of terms is given by a relationship between input and output sampling instants. Thus, nonuniform sampling rates can be accommodated by updating the factors involved and adjusting the number of terms added. When the impulse response consists of exponentials, computing the factors can be done recursively in an efficient manner.

Submitted 14 May, 2021; originally announced May 2021.

Comments: 10 pages, 10 figures, journal

arXiv:2102.06455 [pdf, other]

Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset

Authors: Miklas Strøm Kristoffersen, Martin Bo Møller, Pablo Martínez-Nuevo, Jan Østergaard

Abstract: Knowledge of loudspeaker responses are useful in a number of applications, where a sound system is located inside a room that alters the listening experience depending on position within the room. Acquisition of sound fields for sound sources located in reverberant rooms can be achieved through labor intensive measurements of impulse response functions covering the room, or alternatively by means of reconstruction methods which can potentially require significantly fewer measurements. This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms. The ISOBEL Sound Field dataset is publicly available, and aims to bridge the gap between synthetic and real-world sound fields in rectangular rooms. Moreover, the paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones, and proposes an approach for modeling both magnitude and phase response in a U-Net-like neural network architecture. The complex-valued sound field reconstruction demonstrates that the estimated room transfer functions are of high enough accuracy to allow for personalized sound zones with contrast ratios comparable to ideal room transfer functions using 15 microphones below 150 Hz.

Submitted 12 February, 2021; originally announced February 2021.

arXiv:2008.00026 [pdf, ps, other]

Extrapolation of Bandlimited Multidimensional Signals from Continuous Measurements

Authors: Cornelius Frankenbach, Pablo Martínez-Nuevo, Martin Møller, Walter Kellermann

Abstract: Conventional sampling and interpolation commonly rely on discrete measurements. In this paper, we develop a theoretical framework for extrapolation of signals in higher dimensions from knowledge of the continuous waveform on bounded high-dimensional regions. In particular, we propose an iterative method to reconstruct bandlimited multidimensional signals based on truncated versions of the original signal to bounded regions---herein referred to as continuous measurements. In the proposed method, the reconstruction is performed by iterating on a convex combination of region-limiting and bandlimiting operations. We show that this iteration consists of a firmly nonexpansive operator and prove strong convergence for multidimensional bandlimited signals. In order to improve numerical stability, we introduce a regularized iteration and show its connection to Tikhonov regularization. The method is illustrated numerically for two-dimensional signals.

Submitted 31 July, 2020; originally announced August 2020.

Comments: Submitted to EUSIPCO 2020

arXiv:2003.01117 [pdf, other]

Inferring the location of reflecting surfaces exploiting loudspeaker directivity

Authors: Vincenzo Zaccà, Pablo Martinez-Nuevo, Martin Møller, Jorge Martínez, Richard Heusdens

Abstract: Accurate sound field reproduction in rooms is often limited by the lack of knowledge of the room characteristics. Information about the room shape or nearby reflecting boundaries can, in principle, be used to improve the accuracy of the reproduction. In this paper, we propose a method to infer the location of nearby reflecting boundaries from measurements on a microphone array. As opposed to traditional methods, we explicitly exploit the loudspeaker directivity model (beyond omnidirectional radiation) and the microphone array geometry. This approach does not require noiseless timing information of the echoes as input, nor a tailored loudspeaker-wall-microphone measurement step. Simulations show the proposed model outperforms current methods that disregard directivity in reverberant environments.

Submitted 2 March, 2020; originally announced March 2020.

Comments: Submitted to EUSIPCO 2020

arXiv:2001.11263 [pdf, other]

doi 10.1121/10.0001687

Sound field reconstruction in rooms: inpainting meets super-resolution

Authors: Francesc Lluís, Pablo Martínez-Nuevo, Martin Bo Møller, Sven Ewan Shepstone

Abstract: In this paper, a deep-learning-based method for sound field reconstruction is proposed. It is shown the possibility to reconstruct the magnitude of the sound pressure in the frequency band 30-300 Hz for an entire room by using a very low number of irregularly distributed microphones arbitrarily arranged. Moreover, the approach is agnostic to the location of the measurements in the Euclidean space. In particular, the presented approach uses a limited number of arbitrary discrete measurements of the magnitude of the sound field pressure in order to extrapolate this field to a higher-resolution grid of discrete points in space with a low computational complexity. The method is based on a U-net-like neural network with partial convolutions trained solely on simulated data, which itself is constructed from numerical simulations of Green's function across thousands of common rectangular rooms. Although extensible to three dimensions and different room shapes, the method focuses on reconstructing a two-dimensional plane of a rectangular room from measurements of the three-dimensional sound field. Experiments using simulated data together with an experimental validation in a real listening room are shown. The results suggest a performance which may exceed conventional reconstruction techniques for a low number of microphones and computational requirements.

Submitted 6 August, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

Comments: Code: https://github.com/francesclluis/sound-field-neural-network

Journal ref: The Journal of the Acoustical Society of America 148, 649 (2020)

arXiv:1906.12227 [pdf, other]

An Image Source Method Framework for Arbitrary Reflecting Boundaries

Authors: Pierre Quinton, Pablo Martínez-Nuevo, Martin B. Møller

Abstract: We propose a theoretical framework for the image source method that generalizes to arbitrary reflecting boundaries, e.g. boundaries that are curved or even with certain openings. Furthermore, it can seamlessly incorporate boundary absorption, source directivity, and nonspecular reflections. This framework is based on the notion of reflection paths that allows the introduction of the concepts of validity and visibility of virtual sources. These definitions facilitate the determination, for a given source and receiver location, of the distribution of virtual sources that explain the boundary effects of a wide range of reflecting surfaces. The structure of the set of virtual sources is then more general than just punctual virtual sources. Due to this more diverse configuration of image sources, we represent the room impulse response as an integral involving the temporal excitation signal against a measure determined by the source and receiver locations, and the original boundary. The latter smoothly enables, in an analytically tractable manner, the incorporation of more general boundary shapes as well as directivity of sources and boundary absorption while, at the same time, maintaining the conceptual benefits of the image source method.

Submitted 28 June, 2019; originally announced June 2019.

Comments: 9 pages, 6 figures

MSC Class: 28A78

arXiv:1802.04672 [pdf, other]

doi 10.1109/TSP.2019.2904027

Delta-Ramp Encoder for Amplitude Sampling and its Interpretation as Time Encoding

Authors: Pablo Martínez-Nuevo, Hsin-Yu Lai, Alan V. Oppenheim

Abstract: The theoretical basis for conventional acquisition of bandlimited signals typically relies on uniform time sampling and assumes infinite-precision amplitude values. In this paper, we explore signal representation and recovery based on uniform amplitude sampling with assumed infinite precision timing information. The approach is based on the delta-ramp encoder which consists of applying a one-level level-crossing detector to the result of adding an appropriate sawtooth-like waveform to the input signal. The output samples are the time instants of these level crossings, thus representing a time-encoded version of the input signal. For theoretical purposes, this system can be equivalently analyzed by reversibly transforming through ramp addition a nonmonotonic input signal into a monotonic one which is then uniformly sampled in amplitude. The monotonic function is then represented by the times at which the signal crosses a predefined and equally-spaced set of amplitude values. We refer to this technique as amplitude sampling. The time sequence generated can be interpreted alternatively as nonuniform time sampling of the original source signal. We derive duality and frequency-domain properties for the functions involved in the transformation. Iterative algorithms are proposed and implemented for recovery of the original source signal. As indicated in the simulations, the proposed iterative amplitude-sampling algorithm achieves a faster convergence rate than frame-based reconstruction for nonuniform sampling. The performance can also be improved by appropriate choice of the parameters while maintaining the same sampling density.

Submitted 7 February, 2020; v1 submitted 13 February, 2018; originally announced February 2018.

Comments: 12 pages, 11 figures, journal paper

MSC Class: 30D20 (Primary) 30D15 (Secondary) ACM Class: H.1.0; H.1.1; E.4

arXiv:1802.04634 [pdf, other]

doi 10.1109/TIT.2019.2907996

Lattice Functions for the Analysis of Analog-to-Digital Conversion

Authors: Pablo Martínez-Nuevo, Alan V. Oppenheim

Abstract: Analog-to-digital (A/D) converters are the common interface between analog signals and the domain of digital discrete-time signal processing. In essence, this domain simultaneously incorporates quantization both in amplitude and time, i.e. amplitude quantization and uniform time sampling. Thus, we view A/D conversion as a sampling process in both the time and amplitude domains based on the observation that the underlying continuous-time signals representing digital sequences can be sampled in a lattice---i.e. at points restricted to lie on a uniform grid both in time and amplitude. We refer to them as lattice functions. This is in contrast with the traditional approach based on the classical sampling theorem and quantization error analysis. The latter has been mainly addressed with the help of probabilistic models, or deterministic ones either confined to very particular scenarios or considering worst-case assumptions. In this paper, we provide a deterministic theoretical analysis and framework for the functions involved in digital discrete-time processing. We show that lattice functions possess a rich analytic structure in the context of integral-valued entire functions of exponential type. We derive set and spectral properties of this class of functions. This allows us to prove in a deterministic way and for general bandlimited functions a fundamental lower bound on the maximum frequency component introduced by quantization that is independent of the resolution of the quantizer.

Submitted 28 March, 2019; v1 submitted 13 February, 2018; originally announced February 2018.

Comments: 9 pages, 5 figures, journal paper

MSC Class: 30D20 (Primary) 30D15 (Secondary) ACM Class: H.1.0

Showing 1–10 of 10 results for author: Martinez-Nuevo, P