
Reinforcement Twinning: from digital twins to model-based reinforcement learning

Authors: Lorenzo Schena, Pedro Marques, Romain Poletti, Samuel Ahizi, Jan Van den Berghe, Miguel A. Mendez

Abstract: We propose a novel framework for simultaneously training the digital twin of an engineering system and an associated control agent. The training of the twin combines methods from adjoint-based data assimilation and system identification, while the training of the control agent combines model-based optimal control and model-free reinforcement learning. The training of the control agent is achieved by letting it evolve independently along two paths: one driven by a model-based optimal control and another driven by reinforcement learning. The virtual environment offered by the digital twin is used as a playground for confrontation and indirect interaction. This interaction occurs as an "expert demonstrator", where the best policy is selected for the interaction with the real environment and "cloned" to the other if the independent training stagnates. We refer to this framework as Reinforcement Twinning (RT). The framework is tested on three vastly different engineering systems and control tasks, namely (1) the control of a wind turbine subject to time-varying wind speed, (2) the trajectory control of flapping-wing micro air vehicles (FWMAVs) subject to wind gusts, and (3) the mitigation of thermal loads in the management of cryogenic storage tanks. The test cases are implemented using simplified models for which the ground truth on the closure law is available. The results show that the adjoint-based training of the digital twin is remarkably sample-efficient and completed within a few iterations. Concerning the control agent training, the results show that the model-based and the model-free control training benefit from the learning experience and the complementary learning approach of each other. The encouraging results open the path towards implementing the RT framework on real systems.

Submitted 25 February, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: submitted to the Journal of Computational Science

arXiv:2208.11891 [pdf, other]

Continuous and Discrete LTI Systems

Authors: Miguel A. Mendez

Abstract: This chapter reviews the fundamentals of continuous and discrete Linear Time-Invariant (LTI) systems with Single Input-Single Output (SISO). We start from the general notions of signals and systems, the signal representation problem and the related orthogonal bases in discrete and continuous forms. We then move to the key properties of LTI systems and discuss their eigenfunctions, the input-output relations in the time and frequency domains, the conformal mapping linking the continuous and the discrete formulations, and the modeling via differential and difference equations. Finally, we close with two important applications: (linear) models for time series analysis and forecasting and (linear) digital filters for multi-resolution analysis. This chapter contains seven exercises, the solution of which is provided in the book's webpage.

Submitted 25 August, 2022; originally announced August 2022.

Comments: Chapter 4 in the book "Data Driven Fluid Mechanics", originating from the lecture series "Machine Learning in Fluid Mechanics" organized by the von Karman Institute in 2020

arXiv:2103.01146 [pdf, other]

doi 10.1109/EMBC46164.2021.9630211

Assessing deep learning methods for the identification of kidney stones in endoscopic images

Authors: Francisco Lopez, Andres Varela, Oscar Hinojosa, Mauricio Mendez, Dinh-Hoan Trinh, Jonathan ElBeze, Jacques Hubert, Vincent Estrade, Miguel Gonzalez, Gilberto Ochoa, Christian Daul

Abstract: Knowing the type (i.e., the biochemical composition) of kidney stones is crucial to prevent relapses with an appropriate treatment. During ureteroscopies, kidney stones are fragmented, extracted from the urinary tract, and their composition is determined using a morpho-constitutional analysis. This procedure is time consuming (the morpho-constitutional analysis results are only available after some days) and tedious (the fragment extraction lasts up to an hour). Identifying the kidney stone type only with the in vivo endoscopic images would allow for the dusting of the fragments, while the morpho-constitutional analysis could be avoided. Only a few contributions dealing with the in vivo identification of kidney stones have been published. This paper discusses and compares five classification methods including deep convolutional neural networks (DCNN)-based approaches and traditional (non DCNN-based) ones. Even if the best method is a DCNN approach with a precision and recall of 98% and 97% over four classes, this contribution shows that an XGBoost classifier exploiting well-chosen feature vectors can closely approach the performances of DCNN classifiers for a medical application with a limited number of annotated data.

Submitted 1 March, 2021; originally announced March 2021.

Comments: This paper is currently under review for the IEEE Engineering in Medicine and Biology Conference (EMBC 2021)

arXiv:2009.05188 [pdf, other]

SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Authors: Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, Juan Pablo Bello

Abstract: We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. This dataset is aimed at the development and evaluation of machine listening systems for real-world urban noise monitoring. While datasets of urban recordings are available, this dataset provides the opportunity to investigate how spatiotemporal metadata can aid in the prediction of urban sound tags. SONYC-UST-V2 consists of 18510 audio recordings from the "Sounds of New York City" (SONYC) acoustic sensor network, including the timestamp of audio acquisition and location of the sensor. The dataset contains annotations by volunteers from the Zooniverse citizen science platform, as well as a two-stage verification with our team. In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags. We report the results of a simple baseline model that exploits spatiotemporal information.

Submitted 10 September, 2020; originally announced September 2020.

arXiv:2002.09026 [pdf]

Multi-label Sound Event Retrieval Using a Deep Learning-based Siamese Structure with a Pairwise Presence Matrix

Authors: Jianyu Fan, Eric Nichols, Daniel Tompkins, Ana Elisa Mendez Mendez, Benjamin Elizalde, Philippe Pasquier

Abstract: Realistic recordings of soundscapes often have multiple sound events co-occurring, such as car horns, engines, and human voices. Sound event retrieval is a type of content-based search aiming at finding audio samples similar to an audio query based on their acoustic or semantic content. State-of-the-art sound event retrieval models have focused on single-label audio recordings, with only one sound event occurring, rather than on multi-label audio recordings (i.e., multiple sound events occur in one recording). To address this latter problem, we propose different Deep Learning architectures with a Siamese structure and a Pairwise Presence Matrix. The networks are trained and evaluated using the SONYC-UST dataset containing both single- and multi-label soundscape recordings. The performance results show the effectiveness of our proposed model.

Submitted 20 February, 2020; originally announced February 2020.

Comments: Paper accepted for 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020)

arXiv:1907.09296 [pdf, other]

A-Phase classification using convolutional neural networks

Authors: Edgar R. Arce-Santana, Alfonso Alba, Martin O. Mendez, Valdemar Arce-Guevara

Abstract: A series of short events, called A-phases, can be observed in the human electroencephalogram during NREM sleep. These events can be classified into three groups (A1, A2 and A3) according to their spectral contents, and are thought to play a role in the transitions between the different sleep stages. A-phase detection and classification is usually performed manually by a trained expert, but it is a tedious and time-consuming task. In the past two decades, various researchers have designed algorithms to automatically detect and classify the A-phases with varying degrees of success, but the problem remains open. In this paper, a different approach is proposed: instead of attempting to design a general classifier for all subjects, we propose to train ad-hoc classifiers for each subject using as little data as possible, in order to drastically reduce the amount of time required from the expert. The proposed classifiers are based on deep convolutional neural networks using the log-spectrogram of the EEG signal as input data. Results are encouraging, achieving average accuracies of 80.31% when discriminating between A-phases and non A-phases, and 71.87% when classifying among A-phase sub-types, with only 25% of the total A-phases used for training. When additional expert-validated data is considered, the sub-type classification accuracy increases to 78.92%. These results show that a semi-automatic annotation system with assistance from an expert could provide a better alternative to fully automatic classifiers.

Submitted 22 July, 2019; originally announced July 2019.

Comments: 19 pages, 5 figures, 4 tables

Showing 1–6 of 6 results for author: Mendez, M