
Multi-Modal Learning-based Reconstruction of High-Resolution Spatial Wind Speed Fields

Authors: Matteo Zambra, Nicolas Farrugia, Dorian Cazau, Alexandre Gensse, Ronan Fablet

Abstract: Wind speed at sea surface is a key quantity for a variety of scientific applications and human activities. Due to the non-linearity of the phenomenon, a complete description of such a variable is infeasible on both the small scale and large spatial extents. Methods relying on Data Assimilation techniques, despite being the state-of-the-art for Numerical Weather Prediction, cannot provide reconstructions with a spatial resolution that can compete with satellite imagery. In this work we propose a framework based on Variational Data Assimilation and Deep Learning concepts. This framework is applied to recover rich-in-time, high-resolution information on sea surface wind speed. We design our experiments using synthetic wind data and different sampling schemes for high-resolution and low-resolution versions of original data to emulate the real-world scenario of spatio-temporally heterogeneous observations. Extensive numerical experiments are performed to assess systematically the impact of low and high-resolution wind fields and in-situ observations on the model reconstruction performance. We show that in-situ observations with richer temporal resolution represent an added value in terms of the model reconstruction performance. We show how a multi-modal approach, which explicitly informs the model about the heterogeneity of the available observations, can improve the reconstruction task by exploiting the complementary information in spatial and local point-wise data.
To conclude, we propose an analysis to test the robustness of the chosen framework against phase delay and amplitude biases in low-resolution data and against interruptions of in-situ observations supply at evaluation time.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 22 pages, 13 figures. This work is to be submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2208.08912 [pdf, other]

Learning-based estimation of in-situ wind speed from underwater acoustics

Authors: Matteo Zambra, Dorian Cazau, Nicolas Farrugia, Alexandre Gensse, Sara Pensieri, Roberto Bozzano, Ronan Fablet

Abstract: Wind speed retrieval at sea surface is of primary importance for scientific and operational applications. Besides weather models, in-situ measurements and remote sensing technologies, especially satellite sensors, provide complementary means to monitor wind speed. As sea surface winds produce sounds that propagate underwater, underwater acoustics recordings can also deliver fine-grained wind-related information. Whereas model-driven schemes, especially data assimilation approaches, are the state-of-the-art schemes to address inverse problems in geoscience, machine learning techniques become more and more appealing to fully exploit the potential of observation datasets. Here, we introduce a deep learning approach for the retrieval of wind speed time series from underwater acoustics possibly complemented by other data sources such as weather model reanalyses. Our approach bridges data assimilation and learning-based frameworks to benefit both from prior physical knowledge and computational efficiency. Numerical experiments on real data demonstrate that we outperform the state-of-the-art data-driven methods with a relative gain up to 16% in terms of RMSE. Interestingly, these results support the relevance of the time dynamics of underwater acoustic data to better inform the time evolution of wind speed. They also show that multimodal data, here underwater acoustics data combined with ECMWF reanalysis data, may further improve the reconstruction performance, including the robustness with respect to missing underwater acoustics data.

Submitted 18 August, 2022; originally announced August 2022.

Comments: 13 pages, 5 figures

arXiv:1903.06695 [pdf, other]

Development details and computational benchmarking of DEPAM

Authors: Paul Nguyen Hong Duc, Dorian Cazau

Abstract: In the big data era of observational oceanography, passive acoustics datasets are becoming too high volume to be processed on local computers due to their processor and memory limitations. As a result there is a current need for our community to turn to cloud-based distributed computing. We present a scalable computing system for FFT (Fast Fourier Transform)-based features (e.g., Power Spectral Density) based on the Apache distributed frameworks Hadoop and Spark. These features are at the core of many different types of acoustic analysis where the need to process data at scale with speed is evident, e.g. serving as long-term averaged learning representations of soundscapes to identify periods of acoustic interest. In addition to providing a complete description of our system implementation, we also performed a computational benchmark comparing our system to three other Scala-only, Matlab and Python based systems in standalone executions, and evaluated its scalability using the speed-up metric. Our current results are very promising in terms of computational performance, as we show that our proposed Hadoop/Spark system performs reasonably well on a single-node setup compared to state-of-the-art processing tools used by the PAM community, and that it could also fully leverage more intensive cluster resources with an almost-linear scalability behaviour above a certain dataset volume.

Submitted 7 June, 2019; v1 submitted 3 March, 2019; originally announced March 2019.

arXiv:1902.06659 [pdf, other]

Theory-plus-code documentation of the DEPAM workflow for soundscape description

Authors: D. Cazau

Abstract: In the Big Data era, the PAM community faces strong challenges, including the need for more standardized processing tools across its different applications in oceanography, and for more scalable and high-performance computing systems to process the ever-growing datasets more efficiently. In this work we address both issues jointly by first proposing a detailed theory-plus-code document of a classical analysis workflow to describe the content of PAM data, which hopefully will be reviewed and adopted by as many PAM experts as possible to make it standardized. Second, we transposed this workflow into the Scala language within the Spark/Hadoop frameworks so it can be directly scaled out on a cluster of several nodes.

Submitted 14 February, 2019; originally announced February 2019.

arXiv:1704.08729 [pdf, other]

Calibration of a two-state pitch-wise HMM method for note segmentation in Automatic Music Transcription systems

Authors: Dorian Cazau, Yuancheng Wang, Olivier Adam, Qiao Wang, Grégory Nuel

Abstract: Many methods for automatic music transcription involve a multi-pitch estimation method that estimates an activity score for each pitch. A second processing step, called note segmentation, has to be performed for each pitch in order to identify the time intervals when the notes are played. In this study, a pitch-wise two-state on/off first-order Hidden Markov Model (HMM) is developed for note segmentation. A complete parametrization of the HMM sigmoid function is proposed, based on its original regression formulation, including a parameter alpha of slope smoothing and a parameter beta of thresholding contrast. A comparative evaluation of different note segmentation strategies was performed, differentiated according to whether they use a fixed threshold, called "Hard Thresholding" (HT), or an HMM-based thresholding method, called "Soft Thresholding" (ST). This evaluation was done following MIREX standards and using the MAPS dataset. Also, different transcription scenarios and recording natures were tested using three units of the Degradation toolbox. Results show that note segmentation through an HMM soft thresholding with a data-based optimization of the {alpha,beta} parameter couple significantly enhances transcription performance.

Submitted 27 April, 2017; originally announced April 2017.

arXiv:1704.03711 [pdf, other]

Investigation on the use of Hidden-Markov Models in automatic transcription of music

Authors: D. Cazau, G. Nuel

Abstract: Hidden Markov Models (HMMs) are a ubiquitous tool to model time series data, and have been widely used in two main tasks of Automatic Music Transcription (AMT): note segmentation, i.e. identifying the played notes after a multi-pitch estimation, and sequential post-processing, i.e. correcting note segmentation using training data. In this paper, we employ the multi-pitch estimation method called Probabilistic Latent Component Analysis (PLCA), and develop AMT systems by integrating different HMM-based modules in this framework. For note segmentation, we use two different two-state on/off HMMs, including a higher-order one for duration modeling. For sequential post-processing, we focused on a musicological modeling of polyphonic harmonic transitions, using first- and second-order HMMs whose states are defined through candidate note mixtures. These different PLCA plus HMM systems have been evaluated comparatively on two different instrument repertoires, namely the piano (using the MAPS database) and the marovany zither. Our results show that the use of HMMs could bring noticeable improvements to transcription results, depending on the instrument repertoire.

Submitted 12 April, 2017; originally announced April 2017.

Comments: arXiv admin note: text overlap with arXiv:1703.09772

arXiv:1703.09775 [pdf, other]

Deep scattering transform applied to note onset detection and instrument recognition

Authors: D. Cazau, G. Revillon, O. Adam

Abstract: Automatic Music Transcription (AMT) is one of the oldest and most well-studied problems in the field of music information retrieval. Within this challenging research field, onset detection and instrument recognition take important places in transcription systems, as they respectively help to determine exact onset times of notes and to recognize the corresponding instrument sources. The aim of this study is to explore the usefulness of multiscale scattering operators for these two tasks on plucked string instrument and piano music. After summarizing the theoretical background and illustrating the key features of this sound representation method, we evaluate its performance in comparison with other classical sound representations. Using both MIDI-driven datasets with real instrument samples and real musical pieces, scattering is shown to outperform other sound representations for these AMT subtasks, putting forward its richer sound representation and invariance properties.

Submitted 28 March, 2017; originally announced March 2017.

arXiv:1703.09772 [pdf, other]

Particle Filtering for PLCA model with Application to Music Transcription

Authors: D. Cazau, G. Revillon, W. Yuancheng, O. Adam

Abstract: Automatic Music Transcription (AMT) consists in automatically estimating the notes in an audio recording, through three attributes: onset time, duration and pitch. Probabilistic Latent Component Analysis (PLCA) has become very popular for this task. PLCA is a spectrogram factorization method, able to model a magnitude spectrogram as a linear combination of spectral vectors from a dictionary. Such methods use the Expectation-Maximization (EM) algorithm to estimate the parameters of the acoustic model. This algorithm presents well-known inherent drawbacks (local convergence, initialization dependency), making EM-based systems limited in their applications to AMT, particularly with regard to the mathematical form and number of priors. To overcome such limits, we propose in this paper to employ a different estimation framework based on Particle Filtering (PF), which consists in sampling the posterior distribution over larger parameter ranges. This framework proves to be more robust in parameter estimation, and more flexible and unifying in the integration of prior knowledge in the system. Note-level transcription accuracies of 61.8% and 59.5% were achieved on evaluation sound datasets of two different instrument repertoires, including the classical piano (from MAPS dataset) and the marovany zither, and direct comparisons to previous PLCA-based approaches are provided. Steps for further development are also outlined.

Submitted 28 March, 2017; originally announced March 2017.

Showing 1–8 of 8 results for author: Cazau, D