Search | arXiv e-print repository

Model order reduction of deep structured state-space models: A system-theoretic approach

Authors: Marco Forgione, Manas Mejari, Dario Piga

Abstract: With a specific emphasis on control design objectives, achieving accurate system modeling with limited complexity is crucial in parametric system identification. The recently introduced deep structured state-space models (SSM), which feature linear dynamical blocks as key constituent components, offer high predictive performance. However, the learned representations often suffer from excessively l… ▽ More With a specific emphasis on control design objectives, achieving accurate system modeling with limited complexity is crucial in parametric system identification. The recently introduced deep structured state-space models (SSM), which feature linear dynamical blocks as key constituent components, offer high predictive performance. However, the learned representations often suffer from excessively large model orders, which render them unsuitable for control design purposes. The current paper addresses this challenge by means of system-theoretic model order reduction techniques that target the linear dynamical blocks of SSMs. We introduce two regularization terms which can be incorporated into the training loss for improved model order reduction. In particular, we consider modal $\ell_1$ and Hankel nuclear norm regularization to promote sparsity, allowing one to retain only the relevant states without sacrificing accuracy. The presented regularizers lead to advantages in terms of parsimonious representations and faster inference resulting from the reduced order models. The effectiveness of the proposed methodology is demonstrated using real-world ground vibration data from an aircraft. △ Less

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.05164 [pdf, other]

Synthetic data generation for system identification: leveraging knowledge transfer from similar systems

Authors: Dario Piga, Matteo Rufolo, Gabriele Maroni, Manas Mejari, Marco Forgione

Abstract: This paper addresses the challenge of overfitting in the learning of dynamical systems by introducing a novel approach for the generation of synthetic data, aimed at enhancing model generalization and robustness in scenarios characterized by data scarcity. Central to the proposed methodology is the concept of knowledge transfer from systems within the same class. Specifically, synthetic data is ge… ▽ More This paper addresses the challenge of overfitting in the learning of dynamical systems by introducing a novel approach for the generation of synthetic data, aimed at enhancing model generalization and robustness in scenarios characterized by data scarcity. Central to the proposed methodology is the concept of knowledge transfer from systems within the same class. Specifically, synthetic data is generated through a pre-trained meta-model that describes a broad class of systems to which the system of interest is assumed to belong. Training data serves a dual purpose: firstly, as input to the pre-trained meta model to discern the system's dynamics, enabling the prediction of its behavior and thereby generating synthetic output sequences for new input sequences; secondly, in conjunction with synthetic data, to define the loss function used for model estimation. A validation dataset is used to tune a scalar hyper-parameter balancing the relative importance of training and synthetic data in the definition of the loss function. The same validation set can be also used for other purposes, such as early stop** during the training, fundamental to avoid overfitting in case of small-size training datasets. The efficacy of the approach is shown through a numerical example that highlights the advantages of integrating synthetic data into the system identification process. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2312.04083 [pdf, other]

On the adaptation of in-context learners for system identification

Authors: Dario Piga, Filippo Pura, Marco Forgione

Abstract: In-context system identification aims at constructing meta-models to describe classes of systems, differently from traditional approaches that model single systems. This paradigm facilitates the leveraging of knowledge acquired from observing the behaviour of different, yet related dynamics. This paper discusses the role of meta-model adaptation. Through numerical examples, we demonstrate how meta… ▽ More In-context system identification aims at constructing meta-models to describe classes of systems, differently from traditional approaches that model single systems. This paradigm facilitates the leveraging of knowledge acquired from observing the behaviour of different, yet related dynamics. This paper discusses the role of meta-model adaptation. Through numerical examples, we demonstrate how meta-model adaptation can enhance predictive performance in three realistic scenarios: tailoring the meta-model to describe a specific system rather than a class; extending the meta-model to capture the behaviour of systems beyond the initial training class; and recalibrating the model for new prediction tasks. Results highlight the effectiveness of meta-model adaptation to achieve a more robust and versatile meta-learning framework for system identification. △ Less

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2308.13380 [pdf, other]

From system models to class models: An in-context learning paradigm

Authors: Marco Forgione, Filippo Pura, Dario Piga

Abstract: Is it possible to understand the intricacies of a dynamical system not solely from its input/output pattern, but also by observing the behavior of other systems within the same class? This central question drives the study presented in this paper. In response to this query, we introduce a novel paradigm for system identification, addressing two primary tasks: one-step-ahead prediction and multi-… ▽ More Is it possible to understand the intricacies of a dynamical system not solely from its input/output pattern, but also by observing the behavior of other systems within the same class? This central question drives the study presented in this paper. In response to this query, we introduce a novel paradigm for system identification, addressing two primary tasks: one-step-ahead prediction and multi-step simulation. Unlike conventional methods, we do not directly estimate a model for the specific system. Instead, we learn a meta model that represents a class of dynamical systems. This meta model is trained on a potentially infinite stream of synthetic data, generated by simulators whose settings are randomly extracted from a probability distribution. When provided with a context from a new system-specifically, an input/output sequence-the meta model implicitly discerns its dynamics, enabling predictions of its behavior. The proposed approach harnesses the power of Transformers, renowned for their \emph{in-context learning} capabilities. For one-step prediction, a GPT-like decoder-only architecture is utilized, whereas the simulation problem employs an encoder-decoder structure. Initial experimental results affirmatively answer our foundational question, opening doors to fresh research avenues in system identification. △ Less

Submitted 20 December, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

arXiv:2304.06349 [pdf, other]

Neural State-Space Models: Empirical Evaluation of Uncertainty Quantification

Authors: Marco Forgione, Dario Piga

Abstract: Effective quantification of uncertainty is an essential and still missing step towards a greater adoption of deep-learning approaches in different applications, including mission-critical ones. In particular, investigations on the predictive uncertainty of deep-learning models describing non-linear dynamical systems are very limited to date. This paper is aimed at filling this gap and presents pre… ▽ More Effective quantification of uncertainty is an essential and still missing step towards a greater adoption of deep-learning approaches in different applications, including mission-critical ones. In particular, investigations on the predictive uncertainty of deep-learning models describing non-linear dynamical systems are very limited to date. This paper is aimed at filling this gap and presents preliminary results on uncertainty quantification for system identification with neural state-space models. We frame the learning problem in a Bayesian probabilistic setting and obtain posterior distributions for the neural network's weights and outputs through approximate inference techniques. Based on the posterior, we construct credible intervals on the outputs and define a surprise index which can effectively diagnose usage of the model in a potentially dangerous out-of-distribution regime, where predictions cannot be trusted. △ Less

Submitted 13 April, 2023; originally announced April 2023.

arXiv:2206.12928 [pdf, other]

Learning neural state-space models: do we need a state estimator?

Authors: Marco Forgione, Manas Mejari, Dario Piga

Abstract: In recent years, several algorithms for system identification with neural state-space models have been introduced. Most of the proposed approaches are aimed at reducing the computational complexity of the learning problem, by splitting the optimization over short sub-sequences extracted from a longer training dataset. Different sequences are then processed simultaneously within a minibatch, taking… ▽ More In recent years, several algorithms for system identification with neural state-space models have been introduced. Most of the proposed approaches are aimed at reducing the computational complexity of the learning problem, by splitting the optimization over short sub-sequences extracted from a longer training dataset. Different sequences are then processed simultaneously within a minibatch, taking advantage of modern parallel hardware for deep learning. An issue arising in these methods is the need to assign an initial state for each of the sub-sequences, which is required to run simulations and thus to evaluate the fitting loss. In this paper, we provide insights for calibration of neural state-space training algorithms based on extensive experimentation and analyses performed on two recognized system identification benchmarks. Particular focus is given to the choice and the role of the initial state estimation. We demonstrate that advanced initial state estimation techniques are really required to achieve high performance on certain classes of dynamical systems, while for asymptotically stable ones basic procedures such as zero or random initialization already yield competitive performance. △ Less

Submitted 26 June, 2022; originally announced June 2022.

arXiv:2204.05892 [pdf, other]

NARX Identification using Derivative-Based Regularized Neural Networks

Authors: L. H. Peeters, G. I. Beintema, M. Forgione, M. Schoukens

Abstract: This work presents a novel regularization method for the identification of Nonlinear Autoregressive eXogenous (NARX) models. The regularization method promotes the exponential decay of the influence of past input samples on the current model output. This is done by penalizing the sensitivity of the NARX model simulated output with respect to the past inputs. This promotes the stability of the esti… ▽ More This work presents a novel regularization method for the identification of Nonlinear Autoregressive eXogenous (NARX) models. The regularization method promotes the exponential decay of the influence of past input samples on the current model output. This is done by penalizing the sensitivity of the NARX model simulated output with respect to the past inputs. This promotes the stability of the estimated models and improves the obtained model quality. The effectiveness of the approach is demonstrated through a simulation example, where a neural network NARX model is identified with this novel method. Moreover, it is shown that the proposed regularization approach improves the model accuracy in terms of simulation error performance compared to that of other regularization methods and model classes. △ Less

Submitted 19 August, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted for presentation at the 61st IEEE Conference on Decision and Control

arXiv:2201.08660 [pdf, other]

On the adaptation of recurrent neural networks for system identification

Authors: Marco Forgione, Aneri Muni, Dario Piga, Marco Gallieri

Abstract: This paper presents a transfer learning approach which enables fast and efficient adaptation of Recurrent Neural Network (RNN) models of dynamical systems. A nominal RNN model is first identified using available measurements. The system dynamics are then assumed to change, leading to an unacceptable degradation of the nominal model performance on the perturbed system. To cope with the mismatch, th… ▽ More This paper presents a transfer learning approach which enables fast and efficient adaptation of Recurrent Neural Network (RNN) models of dynamical systems. A nominal RNN model is first identified using available measurements. The system dynamics are then assumed to change, leading to an unacceptable degradation of the nominal model performance on the perturbed system. To cope with the mismatch, the model is augmented with an additive correction term trained on fresh data from the new dynamic regime. The correction term is learned through a Jacobian Feature Regression (JFR) method defined in terms of the features spanned by the model's Jacobian with respect to its nominal parameters. A non-parametric view of the approach is also proposed, which extends recent work on Gaussian Process (GP) with Neural Tangent Kernel (NTK-GP) to the RNN case (RNTK-GP). This can be more efficient for very large networks or when only few data points are available. Implementation aspects for fast and efficient computation of the correction term, as well as the initial state estimation for the RNN model are described. Numerical examples show the effectiveness of the proposed methodology in presence of significant system variations. △ Less

Submitted 21 January, 2022; originally announced January 2022.

arXiv:2104.09839 [pdf, other]

Deep learning with transfer functions: new applications in system identification

Authors: Dario Piga, Marco Forgione, Manas Mejari

Abstract: This paper presents a linear dynamical operator described in terms of a rational transfer function, endowed with a well-defined and efficient back-propagation behavior for automatic derivatives computation. The operator enables end-to-end training of structured networks containing linear transfer functions and other differentiable units {by} exploiting standard deep learning software. Two releva… ▽ More This paper presents a linear dynamical operator described in terms of a rational transfer function, endowed with a well-defined and efficient back-propagation behavior for automatic derivatives computation. The operator enables end-to-end training of structured networks containing linear transfer functions and other differentiable units {by} exploiting standard deep learning software. Two relevant applications of the operator in system identification are presented. The first one consists in the integration of {prediction error methods} in deep learning. The dynamical operator is included as {the} last layer of a neural network in order to obtain the optimal one-step-ahead prediction error. The second one considers identification of general block-oriented models from quantized data. These block-oriented models are constructed by combining linear dynamical operators with static nonlinearities described as standard feed-forward neural networks. A custom loss function corresponding to the log-likelihood of quantized output observations is defined. For gradient-based optimization, the derivatives of the log-likelihood are computed by applying the back-propagation algorithm through the whole network. Two system identification benchmarks are used to show the effectiveness of the proposed methodologies. △ Less

Submitted 20 April, 2021; originally announced April 2021.

arXiv:2006.02915 [pdf, other]

doi 10.1016/j.ejcon.2021.01.008

Continuous-time system identification with neural networks: Model structures and fitting criteria

Authors: Marco Forgione, Dario Piga

Abstract: This paper presents tailor-made neural model structures and two custom fitting criteria for learning dynamical systems. The proposed framework is based on a representation of the system behavior in terms of continuous-time state-space models. The sequence of hidden states is optimized along with the neural network parameters in order to minimize the difference between measured and estimated output… ▽ More This paper presents tailor-made neural model structures and two custom fitting criteria for learning dynamical systems. The proposed framework is based on a representation of the system behavior in terms of continuous-time state-space models. The sequence of hidden states is optimized along with the neural network parameters in order to minimize the difference between measured and estimated outputs, and at the same time to guarantee that the optimized state sequence is consistent with the estimated system dynamics. The effectiveness of the approach is demonstrated through three case studies, including two public system identification benchmarks based on experimental data. △ Less

Submitted 31 August, 2021; v1 submitted 3 June, 2020; originally announced June 2020.

Comments: arXiv admin note: text overlap with arXiv:1911.13034

arXiv:2006.02250 [pdf, other]

dynoNet: a neural network architecture for learning dynamical systems

Authors: Marco Forgione, Dario Piga

Abstract: This paper introduces a network architecture, called dynoNet, utilizing linear dynamical operators as elementary building blocks. Owing to the dynamical nature of these blocks, dynoNet networks are tailored for sequence modeling and system identification purposes. The back-propagation behavior of the linear dynamical operator with respect to both its parameters and its input sequence is defined. T… ▽ More This paper introduces a network architecture, called dynoNet, utilizing linear dynamical operators as elementary building blocks. Owing to the dynamical nature of these blocks, dynoNet networks are tailored for sequence modeling and system identification purposes. The back-propagation behavior of the linear dynamical operator with respect to both its parameters and its input sequence is defined. This enables end-to-end training of structured networks containing linear dynamical operators and other differentiable units, exploiting existing deep learning software. Examples show the effectiveness of the proposed approach on well-known system identification benchmarks. Examples show the effectiveness of the proposed approach against well-known system identification benchmarks. △ Less

Submitted 20 April, 2021; v1 submitted 3 June, 2020; originally announced June 2020.

arXiv:1911.13034 [pdf, other]

Model structures and fitting criteria for system identification with neural networks

Authors: Marco Forgione, Dario Piga

Abstract: This paper focuses on the identification of dynamical systems with tailor-made model structures, where neural networks are used to approximate uncertain components and domain knowledge is retained, if available. These model structures are fitted to measured data using different criteria including a computationally efficient approach minimizing a regularized multi-step ahead simulation error. In th… ▽ More This paper focuses on the identification of dynamical systems with tailor-made model structures, where neural networks are used to approximate uncertain components and domain knowledge is retained, if available. These model structures are fitted to measured data using different criteria including a computationally efficient approach minimizing a regularized multi-step ahead simulation error. In this approach, the neural network parameters are estimated along with the initial conditions used to simulate the output signal in small-size subsequences. A regularization term is included in the fitting cost in order to enforce these initial conditions to be consistent with the estimated system dynamics. Pitfalls and limitations of naive one-step prediction and simulation error minimization are also discussed. △ Less

Submitted 28 October, 2021; v1 submitted 29 November, 2019; originally announced November 2019.

Comments: Source code generating the results of the paper available at https://github.com/forgi86/sysid-neural-structures-fitting

Showing 1–12 of 12 results for author: Forgione, M