Efficient Sound Field Reconstruction with Conditional Invertible Neural Networks

Authors: Xenofon Karakonstantis, Efren Fernandez-Grande, Peter Gerstoft

Abstract: In this study, we introduce a method for estimating sound fields in reverberant environments using a conditional invertible neural network (CINN). Sound field reconstruction can be hindered by experimental errors, limited spatial data, model mismatches, and long inference times, leading to potentially flawed and prolonged characterizations. Further, the complexity of managing inherent uncertainties often escalates computational demands or is neglected in models. Our approach seeks to balance accuracy and computational efficiency, while incorporating uncertainty estimates to tailor reconstructions to specific needs. By training a CINN with Monte Carlo simulations of random wave fields, our method reduces the dependency on extensive datasets and enables inference from sparse experimental data. The CINN proves versatile at reconstructing Room Impulse Responses (RIRs), by acting either as a likelihood model for maximum a posteriori estimation or as an approximate posterior distribution through amortized Bayesian inference. Compared to traditional Bayesian methods, the CINN achieves similar accuracy with greater efficiency and without requiring its adaptation to distinct sound field conditions.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2401.06313 [pdf, other]

Non-uniform Array and Frequency Spacing for Regularization-free Gridless DOA

Authors: Yifan Wu, Michael B. Wakin, Peter Gerstoft

Abstract: Gridless direction-of-arrival (DOA) estimation with multiple frequencies can be applied in acoustic source localization problems. We formulate this as an atomic norm minimization (ANM) problem and derive an equivalent regularization-free semi-definite program (SDP), thereby avoiding regularization bias. The DOA is retrieved using a Vandermonde decomposition on the Toeplitz matrix obtained from the solution of the SDP. We also propose a fast SDP program to deal with non-uniform array and frequency spacing. For non-uniform spacings, the Toeplitz structure will not exist, but the DOA is retrieved via irregular Vandermonde decomposition (IVD), and we theoretically guarantee the existence of the IVD. We extend ANM to the multiple measurement vector (MMV) cases and derive its equivalent regularization-free SDP. Using multiple frequencies and the MMV model, we can resolve more sources than the number of physical sensors for a uniform linear array. Numerical results demonstrate that the regularization-free framework is robust to noise and aliasing, and it overcomes the regularization bias.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2310.11043 [pdf, other]

Spoofing Attack Detection in the Physical Layer with Robustness to User Movement

Authors: Daniel Romero, Tien Ngoc Ha, Peter Gerstoft

Abstract: In a spoofing attack, an attacker impersonates a legitimate user to access or modify data belonging to the latter. Typical approaches for spoofing detection in the physical layer declare an attack when a change is observed in certain channel features, such as the received signal strength (RSS) measured by spatially distributed receivers. However, since channels change over time, for example due to user movement, such approaches are impractical. To sidestep this limitation, this paper proposes a scheme that combines the decisions of a position-change detector based on a deep neural network to distinguish spoofing from movement. Building upon community detection on graphs, the sequence of received frames is partitioned into subsequences to detect concurrent transmissions from distinct locations. The scheme can be easily deployed in practice since it just involves collecting a small dataset of measurements at a few tens of locations that need not even be computed or recorded. The scheme is evaluated on real data collected for this purpose.

Submitted 17 October, 2023; originally announced October 2023.

Comments: WCNC. arXiv admin note: text overlap with arXiv:2211.04269

arXiv:2310.10970 [pdf, other]

Deep Learning based Spatially Dependent Acoustical Properties Recovery

Authors: Ruixian Liu, Peter Gerstoft

Abstract: The physics-informed neural network (PINN) is capable of recovering partial differential equation (PDE) coefficients that remain constant throughout the spatial domain directly from physical measurements. In this work, we propose a spatially dependent physics-informed neural network (SD-PINN), which enables the recovery of coefficients in spatially-dependent PDEs using a single neural network, eliminating the requirement for domain-specific physical expertise. We apply the SD-PINN to spatially-dependent wave equation coefficients recovery to reveal the spatial distribution of acoustical properties in the inhomogeneous medium. The proposed method exhibits robustness to noise owing to the incorporation of a loss function for the physical constraint that the assumed PDE must be satisfied. For the coefficients recovery of spatially two-dimensional PDEs, we store the PDE coefficients at all locations in the 2D region of interest into a matrix and incorporate the low-rank assumption for such a matrix to recover the coefficients at locations without available measurements.

Submitted 22 November, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: 19 pages, 15 figures

arXiv:2210.16173 [pdf, other]

Deep Learning Object Detection Approaches to Signal Identification

Authors: Luke Wood, Kevin Anderson, Peter Gerstoft, Richard Bell, Raghab Subbaraman, Dinesh Bharadia

Abstract: Traditionally, source identification is solved using threshold-based energy detection algorithms. These algorithms frequently sum up the activity in regions, and consider regions above a specific activity threshold to be sources. While these algorithms work for the majority of cases, they often fail to detect signals that occupy small frequency bands, fail to distinguish sources with overlapping frequency bands, and cannot detect any signals under a specified signal-to-noise ratio. Through the conversion of raw signal data to spectrograms, source identification can be framed as an object detection problem. By leveraging modern advancements in deep-learning-based object detection, we propose a system that manages to alleviate the failure cases encountered when using traditional source identification algorithms. Our contributions include framing source identification as an object detection problem, the publication of a spectrogram object detection dataset, and evaluation of the RetinaNet and YOLOv5 object detection models trained on the dataset. Our final models achieve Mean Average Precisions of up to 0.906. With such a high Mean Average Precision, these models are sufficiently robust for use in real world applications.

Submitted 1 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.
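
The threshold-based energy detection that the abstract above describes as the traditional baseline can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the function name `energy_detect`, the fixed-size regions, and the sample powers are all assumptions made for the example.

```python
def energy_detect(bin_powers, region_size, threshold):
    """Return indices of regions whose summed power exceeds the threshold.

    bin_powers is a hypothetical list of per-frequency-bin power values;
    regions are consecutive, non-overlapping groups of region_size bins.
    """
    detected = []
    for start in range(0, len(bin_powers), region_size):
        region = bin_powers[start:start + region_size]
        if sum(region) > threshold:
            detected.append(start // region_size)
    return detected


# Region 1 holds a wide-band signal; region 2 holds a strong but
# single-bin signal whose summed power stays below the threshold.
powers = [1, 1, 1, 1, 5, 5, 5, 5, 1, 9, 1, 1]
print(energy_detect(powers, region_size=4, threshold=15))
```

With these made-up numbers the narrow-band peak in the last region is not reported, which is the kind of failure case the abstract attributes to summing activity over regions.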

arXiv:2208.12472 [pdf]

doi 10.1121/10.0016876

Graph-based sequential beamforming

Authors: Yongsung Park, Florian Meyer, Peter Gerstoft

Abstract: This paper presents a Bayesian estimation method for sequential direction finding. The proposed method estimates the number of directions of arrivals (DOAs) and their DOAs performing operations on the factor graph. The graph represents a statistical model for sequential beamforming. At each time step, belief propagation predicts the number of DOAs and their DOAs using posterior probability density functions (pdfs) from the previous time and a different Bernoulli-von Mises state transition model. Variational Bayesian inference then updates the number of DOAs and their DOAs. The method promotes sparse solutions through a Bernoulli-Gaussian amplitude model, is gridless, and provides marginal posterior pdfs from which DOA estimates and their uncertainties can be extracted. Compared to nonsequential approaches, the method can reduce DOA estimation errors in scenarios involving multiple time steps and time-varying DOAs. Simulation results demonstrate performance improvements compared to state-of-the-art methods. The proposed method is evaluated using ocean acoustic experimental data.

Submitted 3 February, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

Comments: 15 pages, 12 figures

Journal ref: J. Acoust. Soc. Am. 153(1) (2023) 723-737

arXiv:2207.06159 [pdf, other]

doi 10.1109/TSP.2023.3244091

Gridless DOA Estimation with Multiple Frequencies

Authors: Yifan Wu, Michael B. Wakin, Peter Gerstoft

Abstract: Direction-of-arrival (DOA) estimation is widely applied in acoustic source localization. A multi-frequency model is suitable for characterizing the broadband structure in acoustic signals. In this paper, the continuous (gridless) DOA estimation problem with multiple frequencies is considered. This problem is formulated as an atomic norm minimization (ANM) problem. The ANM problem is equivalent to a semi-definite program (SDP) which can be solved by an off-the-shelf SDP solver. The dual certificate condition is provided to certify the optimality of the SDP solution so that the sources can be localized by finding the roots of a polynomial. We also construct the dual polynomial to satisfy the dual certificate condition and show that such a construction exists when the source amplitude has a uniform magnitude. In multi-frequency ANM, spatial aliasing of DOAs at higher frequencies can cause challenges. We discuss this issue extensively and propose a robust solution to combat aliasing. Numerical results support our theoretical findings and demonstrate the effectiveness of the proposed method.

Submitted 6 February, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: This work has been accepted by IEEE Transactions on Signal Processing
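
The spatial aliasing mentioned in the abstract above can be illustrated with the standard uniform-linear-array relation: two directions produce the same inter-sensor phase pattern when their sines differ by an integer multiple of λ/d. The sketch below only enumerates those ambiguous directions; it is not the paper's anti-aliasing method, and the function name and parameters are assumptions chosen for the example.

```python
import math


def ambiguous_doas(theta_deg, spacing_over_wavelength):
    """DOAs (degrees) giving the same ULA phase pattern as theta_deg.

    spacing_over_wavelength is d / lambda; for a fixed array it grows
    linearly with frequency, so higher frequencies alias more.
    """
    s = math.sin(math.radians(theta_deg))
    shift = 1.0 / spacing_over_wavelength  # lambda / d
    k_max = int(2.0 / shift) + 1           # sin values span at most 2
    angles = []
    for k in range(-k_max, k_max + 1):
        s_k = s + k * shift
        if -1.0 <= s_k <= 1.0:
            angles.append(math.degrees(math.asin(s_k)))
    return sorted(angles)


# Half-wavelength spacing: the 30 degree source is unambiguous.
print([round(a) for a in ambiguous_doas(30.0, 0.5)])
# Same array at twice the frequency (d = lambda): an alias appears.
print([round(a) for a in ambiguous_doas(30.0, 1.0)])
```

The returned list always contains the true direction itself; any additional entries are directions that a single-frequency ULA cannot tell apart from it.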

arXiv:2103.01830 [pdf, other]

doi 10.1109/JIOT.2021.3103523

Audio scene monitoring using redundant ad-hoc microphone array networks

Authors: Peter Gerstoft, Yihan Hu, Michael J. Bianco, Chaitanya Patil, Ardel Alegre, Yoav Freund, Francois Grondin

Abstract: We present a system for localizing sound sources in a room with several ad-hoc microphone arrays. Each circular array performs direction of arrival (DOA) estimation independently using commercial software. The DOAs are fed to a fusion center, concatenated, and used to perform the localization based on two proposed methods, which require only a few labeled source locations (anchor points) for training. The first proposed method is based on principal component analysis (PCA) of the observed DOA and does not require any knowledge of anchor points. The array cluster can then perform localization on a manifold defined by the PCA of concatenated DOAs over time. The second proposed method performs localization using an affine transformation between the DOA vectors and the room manifold. The PCA has fewer requirements on the training sequence, but is less robust to missing DOAs from one of the arrays. The methods are demonstrated with five IoT 8-microphone circular arrays, placed at unspecified fixed locations in an office. Both the PCA and the affine method can easily map out a rectangle based on a few anchor points with similar accuracy. The proposed methods provide a step towards monitoring activities in a smart home and require little installation effort as the array locations are not needed.

Submitted 23 August, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

Comments: In press, IEEE Internet of Things Journal

arXiv:2102.06372 [pdf, other]

doi 10.1109/ICASSP39728.2021.9414972

Alternating projections gridless covariance-based estimation for DOA

Authors: Yongsung Park, Peter Gerstoft

Abstract: We present a gridless sparse iterative covariance-based estimation method based on alternating projections for direction-of-arrival (DOA) estimation. The gridless DOA estimation is formulated as the reconstruction of a Toeplitz-structured low-rank matrix, and is solved efficiently with alternating projections. The method improves resolution by achieving sparsity, deals with single-snapshot data and coherent arrivals, and, with co-prime arrays, estimates more DOAs than the number of sensors. We evaluate the proposed method using simulation results focusing on co-prime arrays.

Submitted 12 February, 2021; originally announced February 2021.

Comments: 5 pages, accepted by (ICASSP 2021) 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing

Journal ref: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2101.10636 [pdf, other]

Semi-supervised source localization in reverberant environments with deep generative modeling

Authors: Michael J. Bianco, Sharon Gannot, Efren Fernandez-Grande, Peter Gerstoft

Abstract: We propose a semi-supervised approach to acoustic source localization in reverberant environments based on deep generative modeling. Localization in reverberant environments remains an open challenge. Even with large data volumes, the number of labels available for supervised learning in reverberant environments is usually small. We address this issue by performing semi-supervised learning (SSL) with convolutional variational autoencoders (VAEs) on reverberant speech signals recorded with microphone arrays. The VAE is trained to generate the phase of relative transfer functions (RTFs) between microphones, in parallel with a direction of arrival (DOA) classifier based on RTF-phase. These models are trained using both labeled and unlabeled RTF-phase sequences. In learning to perform these tasks, the VAE-SSL explicitly learns to separate the physical causes of the RTF-phase (i.e., source location) from distracting signal characteristics such as noise and speech activity. Relative to existing semi-supervised localization methods in acoustics, VAE-SSL is effectively an end-to-end processing approach which relies on minimal preprocessing of RTF-phase features. As far as we are aware, our paper presents the first approach to modeling the physics of acoustic propagation using deep generative modeling. The VAE-SSL approach is compared with two signal processing-based approaches, steered response power with phase transform (SRP-PHAT) and MUltiple SIgnal Classification (MUSIC), as well as fully supervised CNNs. We find that VAE-SSL can outperform the conventional approaches and the CNN in label-limited scenarios. Further, the trained VAE-SSL system can generate new RTF-phase samples, which shows the VAE-SSL approach learns the physics of the acoustic environment. The generative modeling in VAE-SSL thus provides a means of interpreting the learned representations.

Submitted 1 April, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

Comments: Revision, submitted to IEEE Access

arXiv:2012.09982 [pdf, other]

doi 10.1121/10.0004221

Deep embedded clustering of coral reef bioacoustics

Authors: Emma Ozanich, Aaron Thode, Peter Gerstoft, Lauren A. Freeman, Simon Freeman

Abstract: Deep clustering was applied to unlabeled, automatically detected signals in a coral reef soundscape to distinguish fish pulse calls from segments of whale song. Deep embedded clustering (DEC) learned latent features and formed classification clusters using fixed-length power spectrograms of the signals. Handpicked spectral and temporal features were also extracted and clustered with Gaussian mixture models (GMM) and conventional clustering. DEC, GMM, and conventional clustering were tested on simulated datasets of fish pulse calls (fish) and whale song units (whale) with randomized bandwidth, duration, and SNR. Both GMM and DEC achieved high accuracy and identified clusters with fish, whale, and overlapping fish and whale signals. Conventional clustering methods had low accuracy in scenarios with unequal-sized clusters or overlapping signals. Fish and whale signals recorded near Hawaii in February-March 2020 were clustered with DEC, GMM, and conventional clustering. DEC features demonstrated the highest accuracy of 77.5% on a small, manually labeled dataset for classifying signals into fish and whale clusters.

Submitted 21 March, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: to appear in Journal of the Acoustical Society of America, April 2021

Journal ref: Journal of the Acoustical Society of America 149 (2021) 2587-2601

arXiv:2010.14420 [pdf, other]

SSLIDE: Sound Source Localization for Indoors based on Deep Learning

Authors: Yifan Wu, Roshan Ayyalasomayajula, Michael J. Bianco, Dinesh Bharadia, Peter Gerstoft

Abstract: This paper presents SSLIDE, Sound Source Localization for Indoors using DEep learning, which applies deep neural networks (DNNs) with encoder-decoder structure to localize sound sources with random positions in a continuous space. The spatial features of sound signals received by each microphone are extracted and represented as likelihood surfaces for the sound source locations in each point. Our DNN consists of an encoder network followed by two decoders. The encoder obtains a compressed representation of the input likelihoods. One decoder resolves the multipath caused by reverberation, and the other decoder estimates the source location. Experiments based on both the simulated and experimental data show that our method can not only outperform multiple signal classification (MUSIC), steered response power with phase transform (SRP-PHAT), sparse Bayesian learning (SBL), and a competing convolutional neural network (CNN) approach in the reverberant environment but also achieve a good generalization performance.

Submitted 15 February, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: This paper has been accepted by ICASSP 2021

arXiv:2005.13163 [pdf, other]

doi 10.1109/MLSP49062.2020.9231825

Semi-supervised source localization with deep generative modeling

Authors: Michael J. Bianco, Sharon Gannot, Peter Gerstoft

Abstract: We propose a semi-supervised localization approach based on deep generative modeling with variational autoencoders (VAEs). Localization in reverberant environments remains a challenge, which machine learning (ML) has shown promise in addressing. Even with large data volumes, the number of labels available for supervised learning in reverberant environments is usually small. We address this issue by performing semi-supervised learning (SSL) with convolutional VAEs. The VAE is trained to generate the phase of relative transfer functions (RTFs), in parallel with a DOA classifier, on both labeled and unlabeled RTF samples. The VAE-SSL approach is compared with SRP-PHAT and fully-supervised CNNs. We find that VAE-SSL can outperform both SRP-PHAT and CNN in label-limited scenarios.

Submitted 11 February, 2021; v1 submitted 27 May, 2020; originally announced May 2020.

Comments: Published in proceedings of IEEE International Workshop on Machine Learning for Signal Processing. arXiv admin note: substantial text overlap with arXiv:2101.10636

arXiv:2003.04457 [pdf, other]

doi 10.1109/TSP.2021.3068353

Gridless DOA Estimation and Root-MUSIC for Non-Uniform Arrays

Authors: Mark Wagner, Yongsung Park, Peter Gerstoft

Abstract: The problem of gridless direction of arrival (DOA) estimation is addressed in the non-uniform array (NUA) case. Traditionally, gridless DOA estimation and root-MUSIC are only applicable for measurements from a uniform linear array (ULA). This is because the sample covariance matrix of ULA measurements has Toeplitz structure, and both algorithms are based on the Vandermonde decomposition of a Toeplitz matrix. The Vandermonde decomposition breaks a Toeplitz matrix into its harmonic components, from which the DOAs are estimated. First, we present the 'irregular' Toeplitz matrix and irregular Vandermonde decomposition (IVD), which generalizes the Vandermonde decomposition to apply to a more general set of matrices. It is shown that the IVD is related to the MUSIC and root-MUSIC algorithms. Next, gridless DOA is generalized to the NUA case using IVD. The resulting non-convex optimization problem is solved using alternating projections (AP). A numerical analysis is performed on the AP based solution which shows that the generalization to NUAs has similar performance to traditional gridless DOA.

Submitted 9 March, 2020; originally announced March 2020.
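
The Toeplitz property the abstract above relies on can be checked numerically with a small sketch. It builds the noise-free covariance R = Σ_k p_k a(θ_k) a(θ_k)^H for uncorrelated sources, which is an idealization of the sample covariance, and tests whether it is constant along diagonals. The function names, sensor positions (in half-wavelength units), angles, and powers are assumptions for illustration; the IVD itself is not implemented here.

```python
import cmath
import math


def steering_vector(positions, theta_deg):
    """Array response for sensor positions given in half-wavelength units."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * p) for p in positions]


def covariance(positions, thetas_deg, powers):
    """Noise-free covariance of uncorrelated sources (idealized model)."""
    n = len(positions)
    R = [[0j] * n for _ in range(n)]
    for theta, power in zip(thetas_deg, powers):
        a = steering_vector(positions, theta)
        for i in range(n):
            for j in range(n):
                R[i][j] += power * a[i] * a[j].conjugate()
    return R


def is_toeplitz(R, tol=1e-9):
    """True if every entry equals its upper-left diagonal neighbour."""
    n = len(R)
    return all(abs(R[i][j] - R[i - 1][j - 1]) < tol
               for i in range(1, n) for j in range(1, n))


thetas, powers = [-20.0, 35.0], [1.0, 0.5]
print(is_toeplitz(covariance([0, 1, 2, 3], thetas, powers)))  # uniform array
print(is_toeplitz(covariance([0, 1, 3, 6], thetas, powers)))  # non-uniform array
```

For the uniform positions each entry depends only on the index difference, so the matrix is Toeplitz; for the non-uniform positions it depends on the position difference, and the structure is lost.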

arXiv:1905.04418 [pdf, other]

doi 10.1121/1.5133944

Machine learning in acoustics: theory and applications

Authors: Michael J. Bianco, Peter Gerstoft, James Traer, Emma Ozanich, Marie A. Roch, Sharon Gannot, Charles-Alban Deledalle

Abstract: Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of techniques, which are often based in statistics, for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.

Submitted 1 December, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: Published with free access in Journal of the Acoustical Society of America, 27 Nov. 2019

Journal ref: Journal of the Acoustical Society of America, 146(5), pp. 3590-3628, 2019

arXiv:1904.00583 [pdf, other]

doi 10.1121/1.5126115

Sound source ranging using a feed-forward neural network with fitting-based early stopping

Authors: Jing Chi, Xiaolei Li, Haozhong Wang, Dazhi Gao, Peter Gerstoft

Abstract: When a feed-forward neural network (FNN) is trained for source ranging in an ocean waveguide, it is difficult to evaluate the range accuracy of the FNN on unlabeled test data. A fitting-based early stopping (FEAST) method is introduced to evaluate the range error of the FNN on test data where the distance of the source is unknown. Based on FEAST, training is stopped when the evaluated range error of the FNN reaches its minimum on the test data, which helps to improve the ranging accuracy of the FNN on the test data. FEAST is demonstrated on simulated and experimental data.

Submitted 1 April, 2019; originally announced April 2019.
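
The stopping rule in the abstract above (stop when the evaluated error reaches its minimum) resembles generic early stopping, which can be sketched as below. The fitting-based error estimate that FEAST computes on unlabeled data is not reproduced here; the per-epoch errors are simply assumed to be given, and the function name and `patience` parameter are hypothetical.

```python
def best_stopping_epoch(evaluated_errors, patience=3):
    """Return (epoch, error) of the minimum seen before patience runs out.

    evaluated_errors is a hypothetical list of per-epoch error estimates;
    training is considered stopped once `patience` consecutive epochs
    pass without a new minimum.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(evaluated_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_error


# Made-up error curve: the minimum before patience expires is at epoch 2.
errors = [4.0, 2.5, 1.8, 2.0, 2.2, 2.6, 1.0]
print(best_stopping_epoch(errors, patience=3))
```

Because training halts once patience is exhausted, the later value at the end of this made-up curve is never evaluated; a larger `patience` trades extra training time for a chance to find it.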

arXiv:1903.12319 [pdf, other]

doi 10.1121/1.5116016

Deep-learning source localization using multi-frequency magnitude-only data

Authors: Haiqiang Niu, Zaixiao Gong, Emma Ozanich, Peter Gerstoft, Haibin Wang, Zhenglin Li

Abstract: A deep learning approach based on big data is proposed to locate broadband acoustic sources using a single hydrophone in ocean waveguides with uncertain bottom parameters. Several 50-layer residual neural networks, trained on a huge number of sound field replicas generated by an acoustic propagation model, are used to handle the bottom uncertainty in source localization. A two-step training strategy is presented to improve the training of the deep models. First, the range is discretized in a coarse (5 km) grid. Subsequently, the source range within the selected interval and source depth are discretized on a finer (0.1 km and 2 m) grid. The deep learning methods were demonstrated for simulated magnitude-only multi-frequency data in uncertain environments. Experimental data from the China Yellow Sea also validated the approach.

Submitted 17 July, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

Comments: Published in the Journal of the Acoustical Society of America

Journal ref: J. Acoust. Soc. Am. 146(1), 211-222 (2019)
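
The coarse-then-fine discretization described in the abstract above can be sketched as a mapping from a source position to class indices. Only the grid steps (5 km, 0.1 km, 2 m) come from the abstract; the function name, the index conventions, and rounding to the nearest grid point are assumptions for illustration and may differ from the paper's labeling.

```python
def two_step_labels(range_km, depth_m, coarse_km=5.0, fine_km=0.1,
                    depth_step_m=2.0):
    """Map a source position to (coarse, fine, depth) grid indices.

    Step 1 picks the 5 km interval containing the range; step 2 locates
    the range inside that interval on the 0.1 km grid and the depth on
    the 2 m grid, both rounded to the nearest grid point.
    """
    coarse_idx = int(range_km // coarse_km)
    offset_km = range_km - coarse_idx * coarse_km
    fine_idx = round(offset_km / fine_km)
    depth_idx = round(depth_m / depth_step_m)
    return coarse_idx, fine_idx, depth_idx


# A source at 7.3 km range and 36 m depth lies in the second coarse
# interval (5 to 10 km), 2.3 km into it.
print(two_step_labels(7.3, 36.0))
```

Splitting the range this way keeps each classifier's output space small: one class per 5 km interval in the first step, then a fine grid restricted to the selected interval in the second.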

arXiv:1811.12215 [pdf, other]

Gridless Line Spectral Estimation with Multiple Measurement Vector via Variational Bayesian Inference

Authors: Qi Zhang, Jiang Zhu, Peter Gerstoft, Mihai-Alin Badiu, Zhiwei Xu

Abstract: Line spectral estimation (LSE) from multi snapshot samples is studied utilizing the variational Bayesian methods. Motivated by the recently proposed variational line spectral estimation (VALSE) method for a single snapshot, we develop the multisnapshot VALSE (MVALSE) for multi snapshot scenarios, which is important for array processing. The MVALSE shares the advantages of the VALSE method, such as automatically estimating the model order, noise variance and weight variance, and closed-form updates of the posterior probability density function (PDF) of the frequencies. By using multiple snapshots, MVALSE improves the recovery performance and it encodes the prior distribution naturally. Finally, numerical results demonstrate the competitive performance of the MVALSE compared to state-of-the-art methods.

Submitted 28 November, 2018; originally announced November 2018.

Comments: 5 pages. arXiv admin note: substantial text overlap with arXiv:1803.06497

arXiv:1712.08655 [pdf, other]

Travel time tomography with adaptive dictionaries

Authors: Michael Bianco, Peter Gerstoft

Abstract: We develop a 2D travel time tomography method which regularizes the inversion by modeling groups of slowness pixels from discrete slowness maps, called patches, as sparse linear combinations of atoms from a dictionary. We propose to use dictionary learning during the inversion to adapt dictionaries to specific slowness maps. This patch regularization, called the local model, is integrated into the overall slowness map, called the global model. The local model considers small-scale variations using a sparsity constraint and the global model considers larger-scale features constrained using $\ell_2$ regularization. This strategy in a locally-sparse travel time tomography (LST) approach enables simultaneous modeling of smooth and discontinuous slowness features. This is in contrast to conventional tomography methods, which constrain models to be exclusively smooth or discontinuous. We develop a $\textit{maximum a posteriori}$ formulation for LST and exploit the sparsity of slowness patches using dictionary learning. The LST approach compares favorably with smoothness and total variation regularization methods on densely, but irregularly sampled synthetic slowness maps.

Submitted 16 May, 2018; v1 submitted 15 December, 2017; originally announced December 2017.

Comments: Submitted to IEEE Transactions on Computational Imaging (1st revision)

arXiv:1711.03847 [pdf, other]

Sparse Bayesian Learning for DOA Estimation in Heteroscedastic Noise

Authors: Peter Gerstoft, Santosh Nannuru, Christoph F. Mecklenbräuker, Geert Leus

Abstract: The paper considers direction of arrival (DOA) estimation from long-term observations in a noisy environment. In such an environment the noise source might evolve, causing stationary models to fail. Therefore, a heteroscedastic Gaussian noise model is introduced where the variance can vary across observations and sensors. The source amplitudes are assumed independent zero-mean complex Gaussian distributed with unknown variances (i.e., the source powers), inspiring stochastic maximum likelihood DOA estimation. The DOAs of plane waves are estimated from multi-snapshot sensor array data using sparse Bayesian learning (SBL) where the noise is estimated across both sensors and snapshots. This SBL approach is more flexible and performs better than high-resolution methods, since the latter cannot estimate the heteroscedastic noise process. An alternative to SBL is simple data normalization, whereby only the phase across the array is utilized. Simulations demonstrate that taking the heteroscedastic noise into account improves DOA estimation.

Submitted 8 November, 2017; originally announced November 2017.

Comments: Submitted to IEEE TSP

Showing 1–20 of 20 results for author: Gerstoft, P