-
A Weighted Least-Squares Method for Non-Asymptotic Identification of Markov Parameters from Multiple Trajectories
Authors:
Jiabao He,
Cristian R. Rojas,
Håkan Hjalmarsson
Abstract:
Markov parameters play a key role in system identification. There exist many algorithms where these parameters are estimated using least-squares in a first, pre-processing, step, including subspace identification and multi-step least-squares algorithms, such as Weighted Null-Space Fitting. Recently, there has been an increasing interest in non-asymptotic analysis of estimation algorithms. In this contribution we identify the Markov parameters using weighted least-squares and present a non-asymptotic analysis for such an estimator. To cover both stable and unstable systems, multiple trajectories are collected. We show that with the optimal weighting matrix, weighted least-squares gives a tighter error bound than ordinary least-squares for the case of non-uniformly distributed measurement errors. Moreover, as the optimal weighting matrix depends on the system's true parameters, we introduce two methods to consistently estimate the optimal weighting matrix, where the convergence rate of these estimates is also provided. Numerical experiments demonstrate improvements of weighted least-squares over ordinary least-squares in finite sample settings.
Submitted 7 May, 2024;
originally announced May 2024.
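The weighted versus ordinary least-squares comparison in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's Markov-parameter estimator: it assumes a scalar regression y_i = theta * x_i + e_i with independent errors of known variance sigma2[i], and all function names are hypothetical. It evaluates the exact estimator variances, so no random data is needed.

```python
def ols_variance(x, sigma2):
    """Exact variance of the ordinary least-squares estimate of theta."""
    sxx = sum(xi * xi for xi in x)
    return sum(xi * xi * s for xi, s in zip(x, sigma2)) / (sxx * sxx)


def wls_variance(x, sigma2):
    """Exact variance of the weighted estimate with weights 1 / sigma2[i]."""
    return 1.0 / sum(xi * xi / s for xi, s in zip(x, sigma2))


def wls_estimate(x, y, weights):
    """Weighted least-squares estimate of the scalar parameter theta."""
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


x = [1.0, 2.0, 3.0, 4.0]
sigma2 = [0.1, 0.1, 4.0, 4.0]  # non-uniform measurement error variances

# With unequal variances the weighted estimator has the smaller variance.
print(ols_variance(x, sigma2), wls_variance(x, sigma2))
```

When all variances are equal the two variances coincide, which matches the abstract's point that the gain appears for non-uniformly distributed measurement errors.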
-
Weighted Least-Squares PARSIM
Authors:
Jiabao He,
Cristian R. Rojas,
Håkan Hjalmarsson
Abstract:
Subspace identification methods (SIMs) have proven very powerful for estimating linear state-space models. To overcome the deficiencies of classical SIMs, a significant number of algorithms has appeared over the last two decades, where most of them involve a common intermediate step, that is to estimate the range space of the extended observability matrix. In this contribution, an optimized version of the parallel and parsimonious SIM (PARSIM), PARSIM\textsubscript{opt}, is proposed by using weighted least-squares. It not only inherits all the benefits of PARSIM but also attains the best linear unbiased estimator for the above intermediate step. Furthermore, inspired by SIMs based on the predictor form, consistent estimates of the optimal weighting matrix for weighted least-squares are derived. Essential similarities, differences and simulated comparisons of some key SIMs related to our method are also presented.
Submitted 7 May, 2024;
originally announced May 2024.
-
Finite Sample Analysis for a Class of Subspace Identification Methods
Authors:
Jiabao He,
Ingvar Ziemann,
Cristian R. Rojas,
Håkan Hjalmarsson
Abstract:
While subspace identification methods (SIMs) are appealing due to their simple parameterization for MIMO systems and robust numerical realizations, a comprehensive statistical analysis of SIMs remains an open problem, especially in the non-asymptotic regime. In this work, we provide a finite sample analysis for a class of SIMs, which reveals that the convergence rates for estimating Markov parameters and system matrices are $\mathcal{O}(1/\sqrt{N})$, in line with classical asymptotic results. Based on the observation that the model format in classical SIMs becomes non-causal because of a projection step, we choose a parsimonious SIM that bypasses the projection step and strictly enforces a causal model to facilitate the analysis, in which a bank of ARX models is estimated in parallel. Leveraging recent results from finite sample analysis of an individual ARX model, we obtain an overall error bound for an array of ARX models and proceed to derive error bounds for system matrices via robustness results for the singular value decomposition.
Submitted 26 April, 2024;
originally announced April 2024.
-
Kernel-based learning with guarantees for multi-agent applications
Authors:
Krzysztof Kowalczyk,
Paweł Wachel,
Cristian R. Rojas
Abstract:
This paper addresses a kernel-based learning problem for a network of agents locally observing a latent multidimensional, nonlinear phenomenon in a noisy environment. We propose a learning algorithm that requires only mild a priori knowledge about the phenomenon under investigation and delivers a model with corresponding non-asymptotic high probability error bounds. Both non-asymptotic analysis of the method and numerical simulation results are presented and discussed in the paper.
Submitted 15 April, 2024;
originally announced April 2024.
-
Statistical Analysis of Block Coordinate Descent Algorithms for Linear Continuous-time System Identification
Authors:
Rodrigo A. González,
Koen Classens,
Cristian R. Rojas,
James S. Welsh,
Tom Oomen
Abstract:
Block coordinate descent is an optimization technique that is used for estimating multi-input single-output (MISO) continuous-time models, as well as single-input single-output (SISO) models in additive form. Despite its widespread use in various optimization contexts, the statistical properties of block coordinate descent in continuous-time system identification have not been covered in the literature. The aim of this paper is to formally analyze the bias properties of the block coordinate descent approach for the identification of MISO and additive SISO systems. We characterize the asymptotic bias at each iteration, and provide sufficient conditions for the consistency of the estimator for each identification setting. The theoretical results are supported by simulation examples.
Submitted 13 April, 2024;
originally announced April 2024.
-
Consistency analysis of refined instrumental variable methods for continuous-time system identification in closed-loop
Authors:
Rodrigo A. González,
Siqi Pan,
Cristian R. Rojas,
James S. Welsh
Abstract:
Refined instrumental variable methods have been broadly used for identification of continuous-time systems in both open and closed-loop settings. However, the theoretical properties of these methods are yet to be fully understood when operating in closed-loop. In this paper, we address the consistency of the simplified refined instrumental variable method for continuous-time systems (SRIVC) and its closed-loop variant CLSRIVC when they are applied to data generated from a feedback loop. In particular, we consider feedback loops consisting of continuous-time controllers, as well as the discrete-time control case. This paper proves that the SRIVC and CLSRIVC estimators are not generically consistent when there is a continuous-time controller in the loop, and that generic consistency can be achieved when the controller is implemented in discrete-time. Numerical simulations are presented to support the theoretical results.
Submitted 13 April, 2024;
originally announced April 2024.
-
Coherence-based Input Design for Sparse System Identification
Authors:
Javad Parsa,
Cristian R. Rojas,
Håkan Hjalmarsson
Abstract:
The maximum absolute correlation between regressors, which is called mutual coherence, plays an essential role in sparse estimation. A regressor matrix whose columns are highly correlated may result from optimal input design, since there is no constraint on the mutual coherence, so when this regressor is used to estimate sparse parameter vectors of a system, it may yield a large estimation error. This paper aims to tackle this issue for fixed denominator models, which include Laguerre, Kautz, and generalized orthonormal basis function expansion models, for example. The paper proposes an optimal input design method where the achieved Fisher information matrix is fitted to the desired Fisher matrix, together with a coordinate transformation designed to make the regressors in the transformed coordinates have low mutual coherence. The method can be used together with any sparse estimation method and in a numerical study we show its potential for alleviating the problem of model order selection when used in conjunction with, for example, classical methods such as AIC and BIC.
Submitted 8 February, 2024;
originally announced February 2024.
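To make the notion of mutual coherence in the abstract above concrete, here is a minimal sketch. It assumes the common sparse-estimation convention in which "correlation" means the normalized (uncentered) inner product between two distinct regressor columns; the function name `mutual_coherence` and the column-list input layout are illustrative choices, not taken from the paper.

```python
import math


def mutual_coherence(columns):
    """Largest absolute normalized inner product between two distinct columns.

    `columns` is a list of equal-length lists, one per regressor.
    """
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            inner = sum(a * b for a, b in zip(columns[i], columns[j]))
            best = max(best, abs(inner) / (norms[i] * norms[j]))
    return best


# Orthogonal regressors have coherence 0; parallel regressors have coherence 1.
print(mutual_coherence([[1.0, 0.0], [0.0, 1.0]]))
print(mutual_coherence([[1.0, 0.0], [1.0, 1.0]]))
```

A value close to 1 signals nearly collinear regressors, which is the situation the abstract describes as harmful for sparse estimation.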
-
Identification of Additive Continuous-time Systems in Open and Closed-loop
Authors:
Rodrigo A. González,
Koen Classens,
Cristian R. Rojas,
James S. Welsh,
Tom Oomen
Abstract:
When identifying electrical, mechanical, or biological systems, parametric continuous-time identification methods can lead to interpretable and parsimonious models when the model structure aligns with the physical properties of the system. Traditional linear system identification may not consider the most parsimonious model when relying solely on unfactored transfer functions, which typically result from standard direct approaches. This paper presents a novel identification method that delivers additive models for both open and closed-loop setups. The estimators that are derived are shown to be generically consistent, and can admit the identification of marginally stable additive systems. Numerical simulations show the efficacy of the proposed approach, and its performance in identifying a modal representation of a flexible beam is verified using experimental data.
Submitted 2 January, 2024;
originally announced January 2024.
-
Reset-Free Data-Driven Gain Estimation: Power Iteration using Reversed-Circulant Matrices
Authors:
Tom Oomen,
Cristian R. Rojas
Abstract:
A direct data-driven iterative algorithm is developed to accurately estimate the $H_\infty$ norm of a linear time-invariant system from continuous operation, i.e., without resetting the system. The main technical step involves a reversed-circulant matrix that can be evaluated in a model-free setting by performing experiments on the real system.
Submitted 21 November, 2023;
originally announced November 2023.
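For background on power iteration for gain estimation, the sketch below shows the classical reset-based variant, not the reset-free reversed-circulant scheme of the abstract above. It assumes a finite impulse response system started from rest in every experiment, and uses the fact that for a lower-triangular Toeplitz matrix T the product with the transpose of T can be obtained by time-reversing the input, applying the system, and time-reversing the output. The names `simulate` and `estimate_gain` are hypothetical, and `simulate` stands in for an experiment on the real system.

```python
import math


def simulate(g, u):
    """Response of an FIR system with impulse response g to input u, from rest."""
    return [
        sum(g[k] * u[t - k] for k in range(min(t, len(g) - 1) + 1))
        for t in range(len(u))
    ]


def estimate_gain(g, n=200, iters=100):
    """Power iteration on T^T T using only input-output experiments."""
    u = [1.0] + [0.0] * (n - 1)  # initial input: a unit impulse
    gain = 0.0
    for _ in range(iters):
        norm_u = math.sqrt(sum(v * v for v in u))
        u = [v / norm_u for v in u]
        y = simulate(g, u)                       # first experiment: y = T u
        gain = math.sqrt(sum(v * v for v in y))  # ||T u|| with ||u|| = 1
        u = simulate(g, y[::-1])[::-1]           # second experiment: T^T y
    return gain


# G(z) = 1 + 0.5 z^{-1} has H-infinity norm 1.5, attained at frequency zero.
print(estimate_gain([1.0, 0.5]))
```

Each iterate is a lower bound on the true gain, because the finite-horizon induced norm never exceeds the $H_\infty$ norm; the need to restart from rest before each experiment is exactly what a reset-free method avoids.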
-
Minimax Two-Stage Gradient Boosting for Parameter Estimation
Authors:
Braghadeesh Lakshminarayanan,
Cristian R. Rojas
Abstract:
Parameter estimation is an important sub-field in statistics and system identification. Various methods for parameter estimation have been proposed in the literature, among which the Two-Stage (TS) approach is particularly promising, due to its ease of implementation and reliable estimates. Among the different statistical frameworks used to derive TS estimators, the minimax framework is attractive due to its mild dependence on prior knowledge about the parameters to be estimated. However, the existing implementation of the minimax TS approach currently has limited applicability, due to its heavy computational load. In this paper, we overcome this difficulty by using a gradient boosting machine (GBM) in the second stage of the TS approach. We call the resulting algorithm the Two-Stage Gradient Boosting Machine (TSGBM) estimator. Finally, we test our proposed TSGBM estimator on several numerical examples including models of dynamical systems.
Submitted 20 November, 2023;
originally announced November 2023.
-
Unraveling the Control Engineer's Craft with Neural Networks
Authors:
Braghadeesh Lakshminarayanan,
Federico Dettù,
Cristian R. Rojas,
Simone Formentin
Abstract:
Many industrial processes require suitable controllers to meet their performance requirements. More often than not, a sophisticated digital twin is available, which is a highly complex model that is a virtual representation of a given physical process, whose parameters may not be properly tuned to capture the variations in the physical process. In this paper, we present a sim2real, direct data-driven controller tuning approach, where the digital twin is used to generate input-output data and suitable controllers for several perturbations in its parameters. State-of-the-art neural-network architectures are then used to learn the controller tuning rule that maps input-output data onto the controller parameters, based on artificially generated data from perturbed versions of the digital twin. In this way, as far as we are aware, we tackle for the first time the problem of re-calibrating the controller by meta-learning the tuning rule directly from data, thus practically replacing the control engineer with a machine learning model. The benefits of this methodology are illustrated via numerical simulations for several choices of neural-network architectures.
Submitted 20 November, 2023;
originally announced November 2023.
-
DRCFS: Doubly Robust Causal Feature Selection
Authors:
Francesco Quinzan,
Ashkan Soleymani,
Patrick Jaillet,
Cristian R. Rojas,
Stefan Bauer
Abstract:
Knowing the features of a complex system that are highly relevant to a particular target variable is of fundamental interest in many areas of science. Existing approaches are often limited to linear settings, sometimes lack guarantees, and in most cases, do not scale to the problem at hand, in particular to images. We propose DRCFS, a doubly robust feature selection method for identifying the causal features even in nonlinear and high dimensional settings. We provide theoretical guarantees, illustrate necessary conditions for our assumptions, and perform extensive experiments across a wide range of simulated and semi-synthetic datasets. DRCFS significantly outperforms existing state-of-the-art methods, selecting robust features even in challenging highly non-linear and high-dimensional problems.
Submitted 5 July, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
On the Relation between Discrete and Continuous-time Refined Instrumental Variable Methods
Authors:
Rodrigo A. González,
Cristian R. Rojas,
Siqi Pan,
James S. Welsh
Abstract:
The Refined Instrumental Variable method for discrete-time systems (RIV) and its variant for continuous-time systems (RIVC) are popular methods for the identification of linear systems in open-loop. The continuous-time equivalent of the transfer function estimate given by the RIV method is commonly used as an initialization point for the RIVC estimator. In this paper, we prove that these estimators share the same converging points for finite sample size when the continuous-time model has relative degree zero or one. This relation does not hold for higher relative degrees. Then, we propose a modification of the RIV method whose continuous-time equivalent is equal to the RIVC estimator for any non-negative relative degree. The implications of the theoretical results are illustrated via a simulation example.
Submitted 31 May, 2023;
originally announced May 2023.
-
Decentralized diffusion-based learning under non-parametric limited prior knowledge
Authors:
Paweł Wachel,
Krzysztof Kowalczyk,
Cristian R. Rojas
Abstract:
We study the problem of diffusion-based network learning of a nonlinear phenomenon, $m$, from local agents' measurements collected in a noisy environment. For a decentralized network and information spreading merely between directly neighboring nodes, we propose a non-parametric learning algorithm that avoids raw data exchange and requires only mild a priori knowledge about $m$. Non-asymptotic estimation error bounds are derived for the proposed method. Its potential applications are illustrated through simulation experiments.
Submitted 5 May, 2023;
originally announced May 2023.
-
Diagnosing and Augmenting Feature Representations in Correctional Inverse Reinforcement Learning
Authors:
Inês Lourenço,
Andreea Bobu,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
Robots have become increasingly better at doing tasks for humans by learning from their feedback, but still often suffer from model misalignment due to missing or incorrectly learned features. When the features the robot needs to learn to perform its task are missing or do not generalize well to new settings, the robot will not be able to learn the task the human wants and, even worse, may learn a completely different and undesired behavior. Prior work shows how the robot can detect when its representation is missing some feature and can, thus, ask the human to be taught about the new feature; however, these works do not differentiate between features that are completely missing and those that exist but do not generalize to new environments. In the latter case, the robot would detect misalignment and simply learn a new feature, leading to an arbitrarily growing feature representation that can, in turn, lead to spurious correlations and incorrect learning down the line. In this work, we propose separating the two sources of misalignment: we propose a framework for determining whether a feature the robot needs is incorrectly learned and does not generalize to new environment setups, or is entirely missing from the robot's representation. Once we detect the source of error, we show how the human can initiate the realignment process for the model: if the feature is missing, we follow prior work for learning new features; however, if the feature exists but does not generalize, we use data augmentation to expand its training and, thus, complete the correction. We demonstrate the proposed approach in experiments with a simulated 7DoF robot manipulator and physical human corrections.
Submitted 13 April, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
An EM Algorithm for Lebesgue-sampled State-space Continuous-time System Identification
Authors:
Rodrigo A. González,
Angel L. Cedeño,
María Coronel,
Juan C. Agüero,
Cristian R. Rojas
Abstract:
This paper concerns the identification of continuous-time systems in state-space form that are subject to Lebesgue sampling. Contrary to equidistant (Riemann) sampling, Lebesgue sampling consists of taking measurements of a continuous-time signal whenever it crosses fixed and regularly partitioned thresholds. The knowledge of the intersample behavior of the output data is exploited in this work to derive an expectation-maximization (EM) algorithm for parameter estimation of the state-space and noise covariance matrices. For this purpose, we use the incremental discrete-time equivalent of the system, which leads to EM iterations of the continuous-time state-space matrices that can be computed by standard filtering and smoothing procedures. The effectiveness of the identification method is tested via Monte Carlo simulations.
Submitted 6 April, 2023;
originally announced April 2023.
-
Parsimonious Identification of Continuous-Time Systems: A Block-Coordinate Descent Approach
Authors:
Rodrigo A. González,
Cristian R. Rojas,
Siqi Pan,
James S. Welsh
Abstract:
The identification of electrical, mechanical, and biological systems using data can benefit greatly from prior knowledge extracted from physical modeling. Parametric continuous-time identification methods can naturally incorporate this knowledge, which leads to interpretable and parsimonious models. However, some applications lead to model structures that lack parsimonious descriptions using unfactored transfer functions, which are commonly used in standard direct approaches for continuous-time system identification. In this paper we characterize this parsimony problem, and develop a block-coordinate descent algorithm that delivers parsimonious models by sequentially estimating an additive decomposition of the transfer function of interest. Numerical simulations show the efficacy of the proposed approach.
Submitted 6 April, 2023;
originally announced April 2023.
-
Optimal Transport for Correctional Learning
Authors:
Rebecka Winqvist,
Inês Lourenco,
Francesco Quinzan,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
The contribution of this paper is a generalized formulation of correctional learning using optimal transport, which is about how to optimally transport one mass distribution to another. Correctional learning is a framework developed to enhance the accuracy of parameter estimation processes by means of a teacher-student approach. In this framework, an expert agent, referred to as the teacher, modifies the data used by a learning agent, known as the student, to improve its estimation process. The objective of the teacher is to alter the data such that the student's estimation error is minimized, subject to a fixed intervention budget. Compared to existing formulations of correctional learning, our novel optimal transport approach provides several benefits. It allows for the estimation of more complex characteristics as well as the consideration of multiple intervention policies for the teacher. We evaluate our approach on two theoretical examples, and on a human-robot interaction application in which the teacher's role is to improve the robot's performance in an inverse reinforcement learning setting.
Submitted 4 April, 2023;
originally announced April 2023.
-
A Unified Approach to Differentially Private Bayes Point Estimation
Authors:
Braghadeesh Lakshminarayanan,
Cristian R. Rojas
Abstract:
Parameter estimation in statistics and system identification relies on data that may contain sensitive information. To protect this sensitive information, the notion of differential privacy (DP) has been proposed, which enforces confidentiality by introducing randomization in the estimates. Standard algorithms for differentially private estimation are based on adding an appropriate amount of noise to the output of a traditional point estimation method. This leads to an accuracy-privacy trade-off, as adding more noise reduces the accuracy while increasing privacy. In this paper, we propose a new Unified Bayes Private Point (UBaPP) approach to Bayes point estimation of the unknown parameters of a data generating mechanism under a DP constraint, which achieves a better accuracy-privacy trade-off than traditional approaches. We verify the performance of our approach on a simple numerical example.
Submitted 18 November, 2022;
originally announced November 2022.
-
On state-space representations of general discrete-time dynamical systems
Authors:
Cristian R. Rojas,
Pawel Wachel
Abstract:
In this paper we establish that every (deterministic) non-autonomous, discrete-time, causal, time invariant system has a state-space representation, and discuss its minimality.
Submitted 6 May, 2022;
originally announced May 2022.
-
A Statistical Decision-Theoretical Perspective on the Two-Stage Approach to Parameter Estimation
Authors:
Braghadeesh Lakshminarayanan,
Cristian R. Rojas
Abstract:
One of the most important problems in system identification and statistics is how to estimate the unknown parameters of a given model. Optimization methods and specialized procedures, such as Empirical Minimization (EM), can be used when the likelihood function can be computed. For situations where one can only simulate from a parametric model, but the likelihood is difficult or impossible to evaluate, a technique known as the Two-Stage (TS) Approach can be applied to obtain reliable parametric estimates. Unfortunately, there is currently a lack of theoretical justification for TS. In this paper, we propose a statistical decision-theoretical derivation of TS, which leads to Bayesian and Minimax estimators. We also show how to apply the TS approach to models for independent and identically distributed samples, by computing quantiles of the data as a first step, and using a linear function as the second stage. The proposed method is illustrated via numerical simulations.
Submitted 15 April, 2022; v1 submitted 31 March, 2022;
originally announced April 2022.
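The two-stage idea in the abstract above (compress simulated data into quantiles, then map them to the parameter with a linear function) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's estimator: the model is assumed to be i.i.d. Gaussian samples with unknown mean theta and unit variance, a single quantile (the median) is used as the first-stage feature, the second stage is an ordinary least-squares line, and the names `simulate` and `fit_two_stage` are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw n i.i.d. samples from the assumed model N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def fit_two_stage(n_train=200, n=50, seed=0):
    """Learn a map from the sample median to theta using simulated data only."""
    rng = random.Random(seed)
    thetas = [rng.uniform(0.0, 10.0) for _ in range(n_train)]
    # Stage 1: compress each simulated data set into one quantile feature.
    feats = [statistics.median(simulate(th, n, rng)) for th in thetas]
    # Stage 2: least-squares line from the feature to the parameter.
    mean_f = sum(feats) / len(feats)
    mean_t = sum(thetas) / len(thetas)
    slope = sum((f - mean_f) * (t - mean_t) for f, t in zip(feats, thetas))
    slope /= sum((f - mean_f) ** 2 for f in feats)
    intercept = mean_t - slope * mean_f
    return lambda data: slope * statistics.median(data) + intercept


estimator = fit_two_stage()
observed = simulate(3.0, 50, random.Random(1))  # stand-in for measured data
print(estimator(observed))
```

The likelihood is never evaluated: the estimator is trained purely on simulations and then applied to the observed samples, which is the setting the abstract targets.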
-
A Teacher-Student Markov Decision Process-based Framework for Online Correctional Learning
Authors:
Inês Lourenço,
Rebecka Winqvist,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
A classical learning setting typically concerns an agent/student who collects data, or observations, from a system in order to estimate a certain property of interest. Correctional learning is a type of cooperative teacher-student framework where a teacher, who has partial knowledge about the system, has the ability to observe and alter (correct) the observations received by the student in order to improve the accuracy of its estimate. In this paper, we show how the variance of the estimate of the student can be reduced with the help of the teacher. We formulate the corresponding online problem - where the teacher has to decide, at each time instant, whether or not to change the observations due to a limited budget - as a Markov decision process, from which the optimal policy is derived using dynamic programming. We validate the framework in numerical experiments, and compare the optimal online policy with the one from the batch setting.
Submitted 29 March, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
Asymptotically Optimal Bandits under Weighted Information
Authors:
Matias I. Müller,
Cristian R. Rojas
Abstract:
We study the problem of regret minimization in a multi-armed bandit setup where the agent is allowed to play multiple arms at each round by spreading the resources usually allocated to only one arm. At each iteration the agent selects a normalized power profile and receives a Gaussian vector as outcome, where the unknown variance of each sample is inversely proportional to the power allocated to that arm. The reward corresponds to a linear combination of the power profile and the outcomes, resembling a linear bandit. By spreading the power, the agent can choose to collect information much faster than in a traditional multi-armed bandit at the price of reducing the accuracy of the samples. This setup is fundamentally different from that of a linear bandit -- the regret is known to scale as $\Theta(\sqrt{T})$ for linear bandits, while in this setup the agent receives much more detailed feedback, for which we derive a tight $\log(T)$ problem-dependent lower bound. We propose a Thompson-Sampling-based strategy, called Weighted Thompson Sampling (WTS), that designs the power profile as its posterior belief of each arm being the best arm, and show that its upper bound matches the derived logarithmic lower bound. Finally, we apply this strategy to a problem of control and system identification, where the goal is to estimate the maximum gain (also called $\mathcal{H}_\infty$-norm) of a linear dynamical system based on batches of input-output samples.
Submitted 28 May, 2021;
originally announced May 2021.
-
Consistency Analysis of the Closed-loop SRIVC Estimator
Authors:
Siqi Pan,
James S. Welsh,
Rodrigo A. Gonzalez,
Cristian R. Rojas
Abstract:
The consistency of the Closed-Loop Simplified Refined Instrumental Variable method for Continuous-time systems (CLSRIVC) is analysed based on sampled data. It is proven that the CLSRIVC estimator is not consistent when a continuous-time controller is used in the closed loop.
Submitted 23 March, 2021;
originally announced March 2021.
-
Non-causal regularized least-squares for continuous-time system identification with band-limited input excitations
Authors:
Rodrigo A. González,
Cristian R. Rojas,
Håkan Hjalmarsson
Abstract:
In continuous-time system identification, the intersample behavior of the input signal is known to play a crucial role in the performance of estimation methods. One common input behavior assumption is that the spectrum of the input is band-limited. The sinc interpolation property of these input signals yields equivalent discrete-time representations that are non-causal. This observation, often overlooked in the literature, is exploited in this work to study non-parametric frequency response estimators of linear continuous-time systems. We study the properties of non-causal least-squares estimators for continuous-time system identification, and propose a kernel-based non-causal regularized least-squares approach for estimating the band-limited equivalent impulse response. The proposed methods are tested via extensive numerical simulations.
Submitted 19 March, 2021;
originally announced March 2021.
-
Cooperative System Identification via Correctional Learning
Authors:
Inês Lourenço,
Robert Mattila,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
We consider a cooperative system identification scenario in which an expert agent (teacher) knows a correct, or at least a good, model of the system and aims to assist a learner-agent (student), but cannot directly transfer its knowledge to the student. For example, the teacher's knowledge of the system might be abstract or the teacher and student might be employing different model classes, which renders the teacher's parameters uninformative to the student. In this paper, we propose correctional learning as an approach to the above problem: Suppose that in order to assist the student, the teacher can intercept the observations collected from the system and modify them to maximize the amount of information the student receives about the system. We formulate a general solution as an optimization problem, which for a multinomial system instantiates itself as an integer program. Furthermore, we obtain finite-sample results on the improvement that the assistance from the teacher results in (as measured by the reduction in the variance of the estimator) for a binomial system.
Submitted 9 December, 2020;
originally announced December 2020.
-
Consistent identification of continuous-time systems under multisine input signal excitation
Authors:
Rodrigo A. González,
Cristian R. Rojas,
Siqi Pan,
James S. Welsh
Abstract:
For many years, the Simplified Refined Instrumental Variable method for Continuous-time systems (SRIVC) has been widely used for identification. The intersample behaviour of the input plays an important role in this method, and it has been shown recently that the SRIVC estimator is not consistent if an incorrect assumption on the intersample behaviour is considered. In this paper, we present an extension of the SRIVC algorithm that is able to deal with continuous-time multisine signals, which cannot be interpolated exactly through hold reconstructions. The proposed estimator is generically consistent for any input reconstructed through zero or first-order-hold devices, and we show that it is generically consistent for continuous-time multisine inputs as well. The statistical performance of the proposed estimator is compared to the standard SRIVC estimator through extensive simulations.
Submitted 12 March, 2021; v1 submitted 6 May, 2020;
originally announced May 2020.
-
How to Protect Your Privacy? A Framework for Counter-Adversarial Decision Making
Authors:
Inês Lourenço,
Robert Mattila,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
We consider a counter-adversarial sequential decision-making problem where an agent computes its private belief (posterior distribution) of the current state of the world, by filtering private information. According to its private belief, the agent performs an action, which is observed by an adversarial agent. We have recently shown how the adversarial agent can reconstruct the private belief of the decision-making agent via inverse optimization. The main contribution of this paper is a method to obfuscate the private belief of the agent from the adversary, by performing a suboptimal action. The proposed method optimizes the trade-off between obfuscating the private belief and limiting the increase in cost accrued due to taking a suboptimal action. We propose a probabilistic relaxation to obtain a linear optimization problem for solving the trade-off. In numerical examples, we show that the proposed methods enable the agent to obfuscate its private belief without compromising its cost budget.
Submitted 8 April, 2020;
originally announced April 2020.
-
Efficiency Analysis of the Simplified Refined Instrumental Variable Method for Continuous-time Systems
Authors:
Siqi Pan,
James S. Welsh,
Rodrigo A. González,
Cristian R. Rojas
Abstract:
In this paper, we derive the asymptotic Cramér-Rao lower bound for the continuous-time output error model structure and provide an analysis of the statistical efficiency of the Simplified Refined Instrumental Variable method for Continuous-time systems (SRIVC) based on sampled data. It is shown that the asymptotic Cramér-Rao lower bound is independent of the intersample behaviour of the noise-free system output and hence only depends on the intersample behaviour of the system input. We have also shown that, at the converging point of the SRIVC algorithm, the estimates do not depend on the intersample behaviour of the measured output. It is then proven that the SRIVC estimator is asymptotically efficient for the output error model structure under mild conditions. Monte Carlo simulations are performed to verify the asymptotic Cramér-Rao lower bound and the asymptotic covariance of the SRIVC estimates.
Submitted 17 July, 2020; v1 submitted 2 February, 2020;
originally announced February 2020.
-
Inverse Filtering for Hidden Markov Models with Applications to Counter-Adversarial Autonomous Systems
Authors:
Robert Mattila,
Cristian R. Rojas,
Vikram Krishnamurthy,
Bo Wahlberg
Abstract:
Bayesian filtering deals with computing the posterior distribution of the state of a stochastic dynamic system given noisy observations. In this paper, motivated by applications in counter-adversarial systems, we consider the following inverse filtering problem: Given a sequence of posterior distributions from a Bayesian filter, what can be inferred about the transition kernel of the state, the observation likelihoods of the sensor and the measured observations? For finite-state Markov chains observed in noise (hidden Markov models), we show that a least-squares fit for estimating the parameters and observations amounts to a combinatorial optimization problem with non-convex objective. Instead, by exploiting the algebraic structure of the corresponding Bayesian filter, we propose an algorithm based on convex optimization for reconstructing the transition kernel, the observation likelihoods and the observations. We discuss and derive conditions for identifiability. As an application of our results, we illustrate the design of counter-adversarial systems: By observing the actions of an autonomous enemy, we estimate the accuracy of its sensors and the observations it has received. The proposed algorithms are evaluated in numerical examples.
Submitted 31 January, 2020;
originally announced January 2020.
-
A Finite-Sample Deviation Bound for Stable Autoregressive Processes
Authors:
Rodrigo A. González,
Cristian R. Rojas
Abstract:
In this paper, we study non-asymptotic deviation bounds of the least squares estimator in Gaussian AR($n$) processes. By relying on martingale concentration inequalities and a tail bound for $\chi^2$ distributed variables, we provide a concentration bound for the sample covariance matrix of the process output. With this, we present a problem-dependent finite-time bound on the deviation probability of any fixed linear combination of the estimated parameters of the AR($n$) process. We discuss extensions and limitations of our approach.
Submitted 25 May, 2020; v1 submitted 17 December, 2019;
originally announced December 2019.
-
Bayesian Model Selection for Change Point Detection and Clustering
Authors:
Othmane Mazhar,
Cristian R. Rojas,
Carlo Fischione,
Mohammad R. Hesamzadeh
Abstract:
We address the new problem of estimating a piece-wise constant signal with the purpose of detecting its change points and the levels of clusters. Our approach is to model it as a nonparametric penalized least-squares model selection on a family of models indexed over the collection of partitions of the design points and propose a computationally efficient algorithm to approximately solve it. Statistically, minimizing such a penalized criterion yields an approximation to the maximum a posteriori probability (MAP) estimator. The criterion is then analyzed and an oracle inequality is derived using a Gaussian concentration inequality. The oracle inequality is used to derive, on the one hand, conditions for consistency and, on the other hand, an adaptive upper bound on the expected square risk of the estimator, which statistically motivates our approximation. Finally, we apply our algorithm to simulated data to experimentally validate the statistical guarantees and illustrate its behavior.
Submitted 3 December, 2019;
originally announced December 2019.
-
Finite impulse response models: A non-asymptotic analysis of the least squares estimator
Authors:
Boualem Djehiche,
Othmane Mazhar,
Cristian R. Rojas
Abstract:
We consider a finite impulse response system with centered independent sub-Gaussian design covariates and noise components that are not necessarily identically distributed. We derive non-asymptotic near-optimal estimation and prediction bounds for the least-squares estimator of the parameters. Our results are based on two concentration inequalities, one on the norm of sums of dependent covariate vectors and one on the singular values of their covariance operator, where the dependence arises from the time-shift structure of the time series. Both inequalities are of independent interest. These results generalize the known bounds for the independent case.
Submitted 28 November, 2019;
originally announced November 2019.
-
Finite sample deviation and variance bounds for first order autoregressive processes
Authors:
Rodrigo A. González,
Cristian R. Rojas
Abstract:
In this paper, we study finite-sample properties of the least squares estimator in first order autoregressive processes. By leveraging a result from decoupling theory, we derive upper bounds on the probability that the estimate deviates by at least a positive $\varepsilon$ from its true value. Our results consider both stable and unstable processes. Afterwards, we obtain problem-dependent non-asymptotic bounds on the variance of this estimator, valid for sample sizes greater than or equal to seven. Via simulations we analyze the conservatism of our bounds, and show that they reliably capture the true behavior of the quantities of interest.
Submitted 25 May, 2020; v1 submitted 17 October, 2019;
originally announced October 2019.
-
What Did Your Adversary Believe? Optimal Filtering and Smoothing in Counter-Adversarial Autonomous Systems
Authors:
Robert Mattila,
Inês Lourenço,
Vikram Krishnamurthy,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
We consider fixed-interval smoothing problems for counter-adversarial autonomous systems. An adversary deploys an autonomous filtering and control system that i) measures our current state via a noisy sensor, ii) computes a posterior estimate (belief) and iii) takes an action that we can observe. Based on such observed actions and our knowledge of our state sequence, we aim to estimate the adversary's past and current beliefs -- this forms a foundation for predicting, and counteracting against, future actions. We derive the optimal smoother for the adversary's beliefs (we treat the problem in a Bayesian framework). Moreover, we demonstrate how the smoother can be computed for discrete systems even though the corresponding backward variables do not admit a finite-dimensional characterization. Finally, we illustrate our results in numerical simulations.
Submitted 16 October, 2019;
originally announced October 2019.
-
Consistency Analysis of the Simplified Refined Instrumental Variable Method for Continuous-time Systems
Authors:
Siqi Pan,
Rodrigo A. González,
James S. Welsh,
Cristian R. Rojas
Abstract:
In this paper, we analyse the consistency of the Simplified Refined Instrumental Variable method for Continuous-time systems (SRIVC). It is well known that the intersample behaviour of the input signal influences the quality and accuracy of the results when estimating and simulating continuous-time models. Here, we present a comprehensive analysis on the consistency of the SRIVC estimator while taking into account the intersample behaviour of the input signal. The main result of the paper shows that, under some mild conditions, the SRIVC estimator is generically consistent. We also describe some conditions when consistency is not achieved, which is important from a practical standpoint. The theoretical results are supported by simulation examples.
Submitted 30 September, 2019;
originally announced October 2019.
-
An asymptotically optimal indirect approach to continuous-time system identification
Authors:
Rodrigo A. González,
Cristian R. Rojas,
James S. Welsh
Abstract:
The indirect approach to continuous-time system identification consists in estimating continuous-time models by first determining an appropriate discrete-time model. For a zero-order hold sampling mechanism, this approach usually leads to a transfer function estimate with relative degree 1, independent of the relative degree of the strictly proper real system. In this paper, a refinement of these methods is developed. Inspired by indirect PEM, we propose a method that enforces a fixed relative degree in the continuous-time transfer function estimate, and show that the resulting estimator is consistent and asymptotically efficient. Extensive numerical simulations are put forward to show the performance of this estimator when contrasted with other indirect and direct methods for continuous-time system identification.
Submitted 22 March, 2018;
originally announced March 2018.
-
Estimating Models with High-Order Noise Dynamics Using Semi-Parametric Weighted Null-Space Fitting
Authors:
Miguel Galrinho,
Cristian R. Rojas,
Hakan Hjalmarsson
Abstract:
Standard system identification methods often provide inconsistent estimates with closed-loop data. With the prediction error method (PEM), this issue is solved by using a noise model that is flexible enough to capture the noise spectrum. However, a too flexible noise model (i.e., too many parameters) increases the model complexity, which can cause additional numerical problems for PEM. In this paper, we consider the weighted null-space fitting (WNSF) method. With this method, the system is first modeled using a non-parametric ARX model, which is then reduced to a parametric model of interest using weighted least squares. In the reduction step, a parametric noise model does not need to be estimated if it is not of interest. Because the flexibility of the noise model is increased with the sample size, this will still provide consistent estimates in closed loop and asymptotically efficient estimates in open loop. In this paper, we prove these results, and we derive the asymptotic covariance for the estimation error obtained in closed loop, which is optimal for an infinite-order noise model. For this purpose, we also derive a new technical result for geometric variance analysis, instrumental to our end. Finally, we perform a simulation study to illustrate the benefits of the method when the noise model cannot be parametrized by a low-order model.
Submitted 6 September, 2018; v1 submitted 13 August, 2017;
originally announced August 2017.
-
Parametric Identification Using Weighted Null-Space Fitting
Authors:
Miguel Galrinho,
Cristian R. Rojas,
Hakan Hjalmarsson
Abstract:
In identification of dynamical systems, the prediction error method using a quadratic cost function provides asymptotically efficient estimates under Gaussian noise and additional mild assumptions, but in general it requires solving a non-convex optimization problem. An alternative class of methods uses a non-parametric model as intermediate step to obtain the model of interest. Weighted null-space fitting (WNSF) belongs to this class. It is a weighted least-squares method consisting of three steps. In the first step, a high-order ARX model is estimated. In a second least-squares step, this high-order estimate is reduced to a parametric estimate. In the third step, weighted least squares is used to reduce the variance of the estimates. The method is flexible in parametrization and suitable for both open- and closed-loop data. In this paper, we show that WNSF provides estimates with the same asymptotic properties as PEM with a quadratic cost function when the model orders are chosen according to the true system. Also, simulation studies indicate that WNSF may be competitive with state-of-the-art methods.
Submitted 26 March, 2018; v1 submitted 13 August, 2017;
originally announced August 2017.
-
Sparse Iterative Learning Control with Application to a Wafer Stage: Achieving Performance, Resource Efficiency, and Task Flexibility
Authors:
Tom Oomen,
Cristian R. Rojas
Abstract:
Trial-varying disturbances are a key concern in Iterative Learning Control (ILC) and may lead to inefficient and expensive implementations and severe performance deterioration. The aim of this paper is to develop a general framework for optimization-based ILC that allows for enforcing additional structure, including sparsity. The proposed method enforces sparsity in a generalized setting through convex relaxations using $\ell_1$ norms. The proposed ILC framework is applied to the optimization of sampling sequences for resource efficient implementation, trial-varying disturbance attenuation, and basis function selection. The framework has a large potential in control applications such as mechatronics, as is confirmed through an application on a wafer stage.
Submitted 6 June, 2017;
originally announced June 2017.
-
Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach
Authors:
Robert Mattila,
Cristian R. Rojas,
Vikram Krishnamurthy,
Bo Wahlberg
Abstract:
This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative methods can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating direction method of multipliers (ADMM) can be significantly accelerated using the regularization step.
Submitted 3 April, 2017;
originally announced April 2017.
-
An analysis of the SPARSEVA estimate for the finite sample data case
Authors:
Huong Ha,
James S. Welsh,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
In this paper, we develop an upper bound for the SPARSEVA (SPARSe Estimation based on a VAlidation criterion) estimation error in a general scheme, i.e., when the cost function is strongly convex and the regularized norm is decomposable for a pair of subspaces. We show how this general bound can be applied to a sparse regression problem to obtain an upper bound for the traditional SPARSEVA problem. Numerical results are used to illustrate the effectiveness of the suggested bound.
Submitted 20 July, 2018; v1 submitted 27 March, 2017;
originally announced March 2017.
-
Asymptotically Efficient Identification of Known-Sensor Hidden Markov Models
Authors:
Robert Mattila,
Cristian R. Rojas,
Vikram Krishnamurthy,
Bo Wahlberg
Abstract:
We consider estimating the transition probability matrix of a finite-state finite-observation alphabet hidden Markov model with known observation probabilities. The main contribution is a two-step algorithm: a method of moments estimator (formulated as a convex optimization problem) followed by a single iteration of a Newton-Raphson maximum likelihood estimator. The two-fold contribution of this letter is, firstly, to theoretically show that the proposed estimator is consistent and asymptotically efficient, and secondly, to numerically show that the method is computationally less demanding than conventional methods, in particular for large data sets.
Submitted 1 February, 2017;
originally announced February 2017.
-
A Class of Nonconvex Penalties Preserving Overall Convexity in Optimization-Based Mean Filtering
Authors:
Mohammadreza Malek-Mohammadi,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
$\ell_1$ mean filtering is a conventional, optimization-based method to estimate the positions of jumps in a piecewise constant signal perturbed by additive noise. In this method, the $\ell_1$ norm penalizes sparsity of the first-order derivative of the signal. Theoretical results, however, show that in some situations, which can occur frequently in practice, even when the jump amplitudes tend to $\infty$, the conventional method identifies false change points. This issue is referred to as the stair-casing problem and limits the practical importance of $\ell_1$ mean filtering. In this paper, sparsity is penalized more tightly than with the $\ell_1$ norm by exploiting a certain class of nonconvex functions, while the strict convexity of the resulting optimization problem is preserved. This results in higher performance in detecting change points. To theoretically justify the performance improvements over $\ell_1$ mean filtering, deterministic and stochastic sufficient conditions for exact change point recovery are derived. In particular, theoretical results show that in the stair-casing problem, our approach might be able to exclude the false change points, while $\ell_1$ mean filtering may fail. A number of numerical simulations help to show the superiority of our method over $\ell_1$ mean filtering and another state-of-the-art algorithm that promotes sparsity more tightly than the $\ell_1$ norm. Specifically, it is shown that our approach can consistently detect change points when the jump amplitudes become sufficiently large, while the two other competitors cannot.
Submitted 22 April, 2016;
originally announced April 2016.
-
Particle-based Gaussian process optimization for input design in nonlinear dynamical models
Authors:
Patricio E. Valenzuela,
Johan Dahlin,
Cristian R. Rojas,
Thomas B. Schön
Abstract:
We propose a novel approach to input design for identification of nonlinear state space models. The optimal input sequence is obtained by maximizing a scalar cost function of the Fisher information matrix. Since the Fisher information matrix is unavailable in closed form, it is estimated using particle methods. In addition, we make use of Gaussian process optimization to find the optimal input and to mitigate the problem of a large computational cost incurred by the particle filter, as the method reduces the number of functional evaluations. Numerical examples are provided to illustrate the performance of the resulting algorithm.
Submitted 17 March, 2016;
originally announced March 2016.
-
Estimator Selection: End-Performance Metric Aspects
Authors:
Dimitrios Katselis,
Cristian R. Rojas,
Carolyn L. Beck
Abstract:
Recently, a framework for application-oriented optimal experiment design has been introduced. In this context, the distance of the estimated system from the true one is measured in terms of a particular end-performance metric. This treatment leads to system estimates that are superior to those obtained with classical experiment designs based on the usual pointwise functional distances of the estimated system from the true one. The separation of the system estimator from the experiment design is done within this new framework by choosing and fixing the estimation method to either a maximum likelihood (ML) approach or a Bayesian estimator such as the minimum mean square error (MMSE) estimator. Since the MMSE estimator delivers a system estimate with lower mean square error (MSE) than the ML estimator for finite-length experiments, it is usually considered the best choice in practice in signal processing and control applications. Within the application-oriented framework, a related meaningful question is: Are there end-performance metrics for which the ML estimator outperforms the MMSE estimator when the experiment is finite-length? In this paper, we affirmatively answer this question based on a simple linear Gaussian regression example.
Submitted 26 July, 2015;
originally announced July 2015.
-
Evaluation of Spectral Learning for the Identification of Hidden Markov Models
Authors:
Robert Mattila,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
Hidden Markov models have successfully been applied as models of discrete time series in many fields. Often, when applied in practice, the parameters of these models have to be estimated. The currently predominant identification methods, such as maximum-likelihood estimation and especially expectation-maximization, are iterative and prone to problems with local minima. A non-iterative method employing a spectral subspace-like approach has recently been proposed in the machine learning literature. This paper evaluates the performance of this algorithm, and compares it to the performance of the expectation-maximization algorithm, on a number of numerical examples. We find that the performance is mixed; it successfully identifies some systems with relatively few available observations, but fails completely for some systems even when a large number of observations is available. An open question is how this discrepancy can be explained. We provide some indications that it could be related to how well-conditioned some system parameters are.
Submitted 22 July, 2015;
originally announced July 2015.
-
Reweighted nuclear norm regularization: A SPARSEVA approach
Authors:
Huong Ha,
James S. Welsh,
Niclas Blomberg,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
The aim of this paper is to develop a method to estimate high-order FIR and ARX models using least squares with re-weighted nuclear norm regularization. Typically, the choice of the tuning parameter in the reweighting scheme is computationally expensive, hence we propose the use of the SPARSEVA (SPARSe Estimation based on a VAlidation criterion) framework to overcome this problem. Furthermore, we suggest the use of the prediction error criterion (PEC) to select the tuning parameter in the SPARSEVA algorithm. Numerical examples demonstrate the veracity of this method, which has close ties with the traditional technique of cross validation but requires far fewer computations.
Submitted 21 July, 2015;
originally announced July 2015.
-
Successive Concave Sparsity Approximation for Compressed Sensing
Authors:
Mohammadreza Malek-Mohammadi,
Ali Koochakzadeh,
Massoud Babaie-Zadeh,
Magnus Jansson,
Cristian R. Rojas
Abstract:
In this paper, based on a successively accuracy-increasing approximation of the $\ell_0$ norm, we propose a new algorithm for recovery of sparse vectors from underdetermined measurements. The approximations are realized with a certain class of concave functions that aggressively induce sparsity and their closeness to the $\ell_0$ norm can be controlled. We prove that the series of the approximations asymptotically coincides with the $\ell_1$ and $\ell_0$ norms when the approximation accuracy changes from the worst fitting to the best fitting. When measurements are noise-free, an optimization scheme is proposed which leads to a number of weighted $\ell_1$ minimization programs, whereas, in the presence of noise, we propose two iterative thresholding methods that are computationally appealing. A convergence guarantee for the iterative thresholding method is provided, and, for a particular function in the class of the approximating functions, we derive the closed-form thresholding operator. We further present some theoretical analyses via the restricted isometry, null space, and spherical section properties. Our extensive numerical simulations indicate that the proposed algorithm closely follows the performance of the oracle estimator for a range of sparsity levels wider than those of the state-of-the-art algorithms.
Submitted 26 April, 2016; v1 submitted 26 May, 2015;
originally announced May 2015.
-
Approximate Regularization Paths for Nuclear Norm Minimization Using Singular Value Bounds -- With Implementation and Extended Appendix
Authors:
Niclas Blomberg,
Cristian R. Rojas,
Bo Wahlberg
Abstract:
The widely used nuclear norm heuristic for rank minimization problems introduces a regularization parameter which is difficult to tune. We have recently proposed a method to approximate the regularization path, i.e., the optimal solution as a function of the parameter, which requires solving the problem only for a sparse set of points. In this paper, we extend the algorithm to provide error bounds for the singular values of the approximation. We exemplify the algorithms on large scale benchmark examples in model order reduction. Here, the order of a dynamical system is reduced by means of constrained minimization of the nuclear norm of a Hankel matrix.
Submitted 20 April, 2015;
originally announced April 2015.