Joint State and Parameter Estimation Using the Partial Errors-in-Variables Principle

Peng Liu, Kailai Li, Gustaf Hendeby, and Fredrik Gustafsson

The work is funded in part by the Swedish Research Council under the grant Scalable Kalman Filters and in part by ZENITH of Linköping University under the grant Computational Agile Sensing and Inference for Intelligent Systems. Peng Liu, Gustaf Hendeby, and Fredrik Gustafsson are with the Division of Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden. Email: {peng.liu, gustaf.hendeby, fredrik.gustafsson}@liu.se. Kailai Li is with the Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, Groningen, the Netherlands. Email: [email protected]

Abstract

This letter proposes a new method for joint state and parameter estimation in uncertain dynamical systems. We exploit the partial errors-in-variables (PEIV) principle and formulate a regression problem in the sense of weighted total least squares, where the uncertainty in the parameter prior is explicitly considered. The resulting PEIV regression is solved iteratively, using Kalman smoothing to estimate the state and regularized least squares to estimate the parameter. Simulations demonstrate improved accuracy of the proposed method compared to existing approaches, including the joint maximum a posteriori-maximum likelihood, the expectation maximisation, and the augmented state extended Kalman smoother.

Index Terms:

Joint state and parameter estimation, partial errors-in-variables model, iterative estimation

I Introduction

Estimating the states of uncertain dynamical systems plays a fundamental role in statistical signal processing and has various application scenarios, such as localization, tracking, energy, and robotics [1, 2, 3, 4]. Conventionally, state estimation problems are solved recursively, either online using the Kalman filter and its derivatives, such as the extended Kalman filter (EKF) [5], or offline using smoothing techniques, such as the extended Kalman smoother (EKS), for enhanced estimation accuracy [6, 7].

However, standard filtering and smoothing algorithms assume complete knowledge of the model, which is hard to attain in practice. A more realistic but more challenging scenario involves state-space modeling with unknown or uncertain parameters. One strategy to mitigate this issue is to augment the state with the parameter for joint estimation within the framework of the EKF or EKS [8]. While the resulting augmented state EKF and EKS have become popular owing to their simplicity, they may suffer from poor estimation accuracy due to observability degradation [9]. Alternatively, iterative estimation methods have been investigated for joint state and parameter estimation, such as the maximum likelihood (ML) method [10], which demonstrates favorable asymptotic properties and has been applied to state-space models together with the expectation maximisation (EM) algorithm [11]. However, the ML method may deliver biased parameter estimates and fail to reach the Cramér–Rao bound given small datasets [12]. This issue can be mitigated by the joint maximum a posteriori-maximum likelihood (JMAP-ML) method involving numerical optimisation, such as the coordinate descent algorithm [13]. This method has been widely exploited in many tasks including sensor calibration, epidemic modeling, and robust localization [14, 15, 16]. However, the JMAP-ML method disregards the uncertainty in the parameter prior, which may lead to insufficient accuracy [12].

In this paper, we investigate the possibility of explicitly incorporating the uncertainty in the parameter prior for joint state and parameter estimation of linear models, where the partial errors-in-variables (PEIV) model is exploited for regression. The standard errors-in-variables (EIV) model contains a regressor matrix that is subject to noise corruption [17, 18], which can be handled by total least squares (TLS) for i.i.d. regressor and measurement noises [19] or by weighted total least squares (WTLS) for more general noise patterns [20, 21]. For the PEIV model, where the regressor matrix is only partially uncertain, it is possible to reformulate the model w.r.t. the uncertain part and apply WTLS for regression [22]. To the best of the authors’ knowledge, no existing literature investigates the joint state and parameter estimation problem from an errors-in-variables perspective.

Contribution

We propose a novel iterative framework for joint state and parameter estimation based on the partial errors-in-variables (PEIV) principle, which explicitly addresses the uncertainty in parameter prior. The joint estimation problem is formulated in the sense of WTLS and solved iteratively through the Kalman smoothing and the regularized least squares for estimating the state and parameter, respectively. The proposed method is evaluated through Monte Carlo simulation. Numerical results show its improved parameter estimation accuracy in comparison with the JMAP-ML, the EM, and the augmented state extended Kalman smoother (ASEKS).

The remainder of the paper is organized as follows. Sec. II provides the signal model, followed by an overview of existing methods in Sec. III and IV. Sec. V introduces the proposed PEIV-based framework, and Sec. VI presents the numerical simulation. Finally, conclusions are drawn in Sec. VII.

II Signal Model

To make the derivations explicit, we will assume a state-space model that is linear in both the state and parameters

	$\displaystyle x_{k+1}$	$\displaystyle=F(\theta^{\text{o}})x_{k}+v_{k}\,,$		(1)
	$\displaystyle y_{k}$	$\displaystyle=H(\theta^{\text{o}})x_{k}+e_{k}\,,$		(1)

where the state-space matrices $F(\theta^{{\text{o}}})$ and $H(\theta^{{\text{o}}})$ are linear functions of the true parameter value $\theta^{{\text{o}}}\in\mathbbm{R}^{d}$ given as

	$\displaystyle F(\theta^{{\text{o}}})$	$\displaystyle=F_{0}+\sum_{i=1}^{d}\theta^{{\text{o}}}_{i}F_{i}\quad\text{and}$		(2)
	$\displaystyle H(\theta^{{\text{o}}})$	$\displaystyle=H_{0}+\sum_{i=1}^{d}\theta^{{\text{o}}}_{i}H_{i}\,,$		(2)

respectively. The matrices $F_{i}$ and $H_{i}$ are assumed to be known. $x_{k}\in\mathbbm{R}^{n}$ denotes the state vector, and $y_{k}\in\mathbbm{R}^{m}$ is the measurement. $v_{k}$ and $e_{k}$ are the white Gaussian-distributed process and measurement noises of covariance matrices $Q$ and $R$ , respectively. $\theta^{{\text{o}}}_{i}$ denotes the $i$ -th element in the parameter vector. Further, $x_{k}$ , $v_{k}$ , and $e_{k}$ are assumed to be mutually independent. The initial state and the parameter priors are assumed to be Gaussian-distributed with

	$\displaystyle x^{{\text{o}}}_{0}$	$\displaystyle\sim\mathcal{N}(m_{0},P_{0})\quad\text{and}$		(3)
	$\displaystyle\hat{\theta}$	$\displaystyle\sim\mathcal{N}(\theta^{\text{o}},\Sigma_{\theta})\,,$		(3)

respectively. Let $X^{\text{o}}=[\,(x^{\text{o}}_{0})^{\top},\dots,(x^{\text{o}}_{N})^{\top}\,]^{\top}$ contain all the state vectors and $Y=[\,y_{1}^{\top},\dots,y_{N}^{\top}\,]^{\top}$ all the measurements, where $x^{{\text{o}}}_{k}$ denotes the true state. Using a prior on the state and parameter allows the MAP approach maximising $P(X,\theta|Y)$, but we compare to the EM approach that maximises $P(Y|\theta)$ and the JMAP-ML method that maximises $P(Y,X|\theta)$ (MAP and ML for estimating $X$ and $\theta$, respectively).
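The affine parametrisation in (2) can be sketched in a few lines of plain Python. This is only an illustration: the helper name `affine_matrix` and the example matrices are assumptions made here and are not taken from the letter.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def affine_matrix(base: Matrix, terms: Sequence[Matrix], theta: Sequence[float]) -> Matrix:
    """Return base + sum_i theta[i] * terms[i], mirroring the structure of (2)."""
    if len(terms) != len(theta):
        raise ValueError("need one basis matrix per parameter element")
    result = [row[:] for row in base]  # copy so the caller's base matrix is untouched
    for coeff, term in zip(theta, terms):
        for r, row in enumerate(term):
            for c, value in enumerate(row):
                result[r][c] += coeff * value
    return result


# Hypothetical example: F_0 is the identity and a single uncertain
# parameter scales the upper-right entry of the transition matrix.
F0 = [[1.0, 0.0], [0.0, 1.0]]
F1 = [[0.0, 1.0], [0.0, 0.0]]
F = affine_matrix(F0, [F1], [0.5])
```

The same helper would apply to $H(\theta)$ by passing $H_{0}$ and the matrices $H_{i}$ instead.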

III Separate state and parameter estimation

The proposed PEIV method as well as the EM and JMAP-ML methods all lead to algorithms that iteratively estimate the state and parameter. For that purpose, we derive the fundamental estimation modules in this section. These are rather straightforward to derive, and the main issue is to reformulate the model given by (1) and (2) as one of the following linear regression models

	$\displaystyle\bar{Y}$	$\displaystyle=\Psi(\theta^{{\text{o}}})X^{{\text{o}}}+\eta\,\quad\text{or}$		(4)
	$\displaystyle\quad\bar{Y}$	$\displaystyle=\Phi(X^{{\text{o}}})\theta^{{\text{o}}}+c(X^{{\text{o}}})+\eta\,.$		(4)

Here, $c(X^{\text{o}})$ represents the component that is independent of $\theta^{{\text{o}}}$ . The interpretation of it will be provided later. The first model formulation leads to the Kalman smoother for a given parameter, and the second one induces the least squares estimate of the parameters $\theta^{{\text{o}}}$ , given the state sequence $X^{{\text{o}}}$ .

III-A Kalman Smoother

The Kalman smoother (KS) can be formulated as a MAP problem given by

	$\displaystyle\hat{X}$	$\displaystyle=\arg\max_{X}\log P(X|Y)$		(5)
		$\displaystyle=\arg\max_{X}\sum_{i=1}^{N}\log P(y_{i}|x_{i})+\sum_{j=1}^{N}\log P(x_{j}|x_{j-1})+\log P(x_{0})\,.$

By exploiting the model (1), (5) can be expressed as

	$\displaystyle\hat{x}_{0:N}=\arg\min_{X}\Big\{\|Y-C(\theta^{{\text{o}}})X\|^{2}_{\mathbf{R}^{-1}}+\|A(\theta^{{\text{o}}})X\|^{2}_{\mathbf{Q}^{-1}}+\|x_{0}-m_{0}\|^{2}_{P^{-1}_{0}}\Big\}\,,$		(6)

where $m_{0}$ and $P_{0}$ denote the mean and covariance of the prior on the initial state $x_{0}$, respectively. For a more concise formulation, we introduce $\mathbf{R}=\mathbf{diag}(R,\dots,R)$, $\mathbf{Q}=\mathbf{diag}(Q,\dots,Q)$, and $\|(\cdot)\|^{2}_{W}=(\cdot)^{\top}W(\cdot)$. Here, $\mathbf{diag}$ denotes the block-diagonal matrix, and $A(\theta^{{\text{o}}})$ and $C(\theta^{{\text{o}}})$ are defined by

	$\displaystyle A(\theta^{{\text{o}}})$	$\displaystyle=\begin{bmatrix}F(\theta^{{\text{o}}})&-\mathbf{I}&0&\dots&0\\ 0&F(\theta^{{\text{o}}})&-\mathbf{I}&\dots&0\\ \dots&\dots&\dots&\dots&\dots\\ 0&0&\dots&F(\theta^{{\text{o}}})&-\mathbf{I}\end{bmatrix}\,,$		(7)
	$\displaystyle C(\theta^{{\text{o}}})$	$\displaystyle=\begin{bmatrix}0&H(\theta^{{\text{o}}})&0&\dots&0\\ 0&0&H(\theta^{{\text{o}}})&\dots&0\\ \dots&\dots&\dots&\dots&\dots\\ 0&0&\dots&0&H(\theta^{{\text{o}}})\end{bmatrix}\,,$		(7)

respectively. With these definitions, (6) can be formulated as the solution to the following linear regression models

	$\displaystyle Y$	$\displaystyle=C(\theta^{{\text{o}}})X^{{\text{o}}}+E\,,$
	$\displaystyle 0$	$\displaystyle=A(\theta^{{\text{o}}})X^{{\text{o}}}+V\,,$
	$\displaystyle m_{0}$	$\displaystyle=x^{{\text{o}}}_{0}+\epsilon\,,$

where $\mathbf{cov}(E)=\mathbf{R}$ , $\mathbf{cov}(V)=\mathbf{Q}$ , and $\mathbf{cov}(\epsilon)=P_{0}$ . $0$ denotes zero vector. These equations can be summarized as follows

	$\displaystyle\bar{Y}$	$\displaystyle=\begin{bmatrix}Y\\ 0\\ m_{0}\end{bmatrix}=\begin{bmatrix}C(\theta^{\text{o}})\\ A(\theta^{\text{o}})\\ [\,\mathbf{I},\mathbf{0}\,]\end{bmatrix}X^{{\text{o}}}+\begin{bmatrix}E\\ V\\ \epsilon\end{bmatrix}$		(8)
		$\displaystyle=\Psi(\theta^{{\text{o}}})X^{{\text{o}}}+\eta\,.$		(8)

Given the assumption of mutual independence of the initial state, the process noise, and the measurement noise, we have $\mathbf{cov}(\eta)=\mathbf{cov}([\,E^{\top},V^{\top},\epsilon^{\top}]^{\top})=\mathbf{diag}(\mathbf{R},\mathbf{Q},P_{0})\eqqcolon\Sigma_{\eta}$. $\bar{Y}$ serves as an augmented measurement based on the prior of $x^{{\text{o}}}_{0}$. The state estimate can be determined by least squares (LS), assuming that the parameter $\theta^{{\text{o}}}$ is known, namely,

	$\displaystyle\hat{X}$	$\displaystyle=(\Psi(\theta^{{\text{o}}})^{\top}\Sigma^{-1}_{\eta}\Psi(\theta^{{\text{o}}}))^{-1}\Psi(\theta^{{\text{o}}})^{\top}\Sigma^{-1}_{\eta}\bar{Y}\,.$		(9)

The covariance matrix of the estimation error is given by

	$\displaystyle\Sigma_{X}=(\Psi(\theta^{{\text{o}}})^{\top}\Sigma^{-1}_{\eta}\Psi(\theta^{{\text{o}}}))^{-1}\,.$		(10)

For implementing the Kalman smoother in practice, the recursive forward-backward version is preferred, as it runs much faster than the batch-wise solution [6]. We give the batch-wise formulation here for the sake of clarity, which also helps to introduce the JMAP-ML method in Sec. IV-A.
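As a rough illustration of the recursive forward-backward version, the following sketch implements a Rauch-Tung-Striebel smoother for a scalar special case. The model $x_{k+1}=\theta x_{k}+v_{k}$, $y_{k}=x_{k}+e_{k}$ and the function name `rts_smoother` are assumptions chosen for brevity; the general matrix case follows the same pattern but is not shown.

```python
from typing import List, Sequence, Tuple


def rts_smoother(y: Sequence[float], theta: float, q: float, r: float,
                 m0: float, p0: float) -> Tuple[List[float], List[float]]:
    """Smoothed means and variances of x_0..x_N given y_1..y_N.

    Assumed scalar model: x_{k+1} = theta * x_k + v_k, y_k = x_k + e_k,
    with var(v_k) = q, var(e_k) = r and prior x_0 ~ N(m0, p0).
    """
    n = len(y)
    m_filt, p_filt = [m0], [p0]      # filtered moments, index 0..N
    m_pred, p_pred = [m0], [p0]      # predicted moments (index 0 is a placeholder)
    for k in range(1, n + 1):        # forward Kalman filter
        mp = theta * m_filt[k - 1]
        pp = theta ** 2 * p_filt[k - 1] + q
        gain = pp / (pp + r)
        m_pred.append(mp)
        p_pred.append(pp)
        m_filt.append(mp + gain * (y[k - 1] - mp))
        p_filt.append((1.0 - gain) * pp)
    m_s, p_s = m_filt[:], p_filt[:]  # the last filtered moments are already smoothed
    for k in range(n - 1, -1, -1):   # backward smoothing pass
        g = p_filt[k] * theta / p_pred[k + 1]
        m_s[k] = m_filt[k] + g * (m_s[k + 1] - m_pred[k + 1])
        p_s[k] = p_filt[k] + g ** 2 * (p_s[k + 1] - p_pred[k + 1])
    return m_s, p_s
```

For a linear Gaussian model, the smoothed means returned here coincide with the minimiser of the batch cost in (6), so the recursion can be checked against the batch solution on small examples.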

III-B Parameter Estimation

To derive the parameter estimation solution, we first need to rewrite (8) as a linear regression in $\theta^{{\text{o}}}$ , and not in $X^{{\text{o}}}$ . It is straightforward to show that $\Psi(\theta^{{\text{o}}})X^{{\text{o}}}$ can be written as

	$\displaystyle\Psi(\theta^{{\text{o}}})X^{{\text{o}}}$	$\displaystyle=D(X^{{\text{o}}})\mathbf{vec}(\Psi(\theta^{{\text{o}}}))\,,\quad\text{with}$		(11)
	$\displaystyle D(X^{{\text{o}}})$	$\displaystyle=(X^{{\text{o}}})^{\top}\otimes\mathbf{I}\,.$		(11)

$\otimes$ is the Kronecker product, and $\mathbf{vec}(\cdot)$ denotes the matrix vectorisation. (7) and (8) show that only a portion of the elements in $\Psi(\theta^{{\text{o}}})$ is a function of $\theta^{{\text{o}}}$ , whereas the others are independent of $\theta^{{\text{o}}}$ . Accordingly, $\mathbf{vec}(\Psi(\theta^{{\text{o}}}))$ can be reformulated into

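	$\displaystyle\mathbf{vec}(\Psi(\theta^{{\text{o}}}))=h+B\theta^{{\text{o}}}\,,$		(12)

where the constant vector $h$ collects the entries that are independent of $\theta^{{\text{o}}}$ and the constant matrix $B$ maps $\theta^{{\text{o}}}$ to the remaining entries; both are determined by the known matrices in (2). Combining (11) and (12) leads to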

	$\displaystyle\Psi(\theta^{{\text{o}}})X^{{\text{o}}}$	$\displaystyle=D(X^{{\text{o}}})\mathbf{vec}(\Psi(\theta^{{\text{o}}}))$
		$\displaystyle=D(X^{{\text{o}}})h+D(X^{{\text{o}}})B\theta^{{\text{o}}}$
		$\displaystyle=\Phi(X^{{\text{o}}})\theta^{{\text{o}}}+c(X^{{\text{o}}}).$

Here, $c(X^{{\text{o}}})=D(X^{{\text{o}}})h$ . The solution to the parameter estimation problem in the sense of LS

	$\displaystyle\hat{\theta}=\arg\min_{\theta}\|\bar{Y}-\Psi(\theta)X^{{\text{o}}}\|^{2}_{\Sigma^{-1}_{\eta}}$		(13)

can then be derived as

	$\displaystyle\hat{\theta}=$	$\displaystyle\,(B^{\top}D({X^{{\text{o}}}})^{\top}\Sigma^{-1}_{\eta}D({X}^{{\text{o}}})B)^{-1}$
		$\displaystyle(B^{\top}D({X}^{{\text{o}}})^{\top}\Sigma^{-1}_{\eta}(\bar{Y}-D({X}^{{\text{o}}})h))\,,$

with covariance estimate

	$\displaystyle\Sigma_{{\theta}}=(B^{\top}D({X}^{{\text{o}}})^{\top}\Sigma^{-1}_{\eta}D({X}^{{\text{o}}})B)^{-1}\,.$
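To make the decomposition $\Psi(\theta)X=\Phi(X)\theta+c(X)$ and the resulting LS estimate concrete, the sketch below specialises them to an assumed scalar model $x_{k+1}=\theta x_{k}+v_{k}$, $y_{k}=x_{k}+e_{k}$, for which $\Sigma_{\eta}$ is diagonal and $\Phi(X)$ reduces to a vector. The function names and the noise-free trajectory are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence, Tuple


def regressor_and_offset(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Phi(X) and c(X) for the assumed scalar model, with x = [x_0, ..., x_N].

    Rows follow the stacking in (8): N measurement rows (x_k),
    N process rows (theta * x_{k-1} - x_k), and one prior row (x_0).
    """
    n = len(x) - 1
    phi = [0.0] * n + [x[k] for k in range(n)] + [0.0]
    c = [x[k] for k in range(1, n + 1)] + [-x[k] for k in range(1, n + 1)] + [x[0]]
    return phi, c


def ls_parameter(x: Sequence[float], y_bar: Sequence[float],
                 weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted LS estimate of theta and its variance for a diagonal weight matrix."""
    phi, c = regressor_and_offset(x)
    info = sum(w * p * p for w, p in zip(weights, phi))
    rhs = sum(w * p * (yb - ci) for w, p, yb, ci in zip(weights, phi, y_bar, c))
    return rhs / info, 1.0 / info


# Hypothetical noise-free trajectory generated with theta = 0.9 and N = 3.
x = [1.0, 0.9, 0.81, 0.729]
q, r, p0, m0 = 0.2, 0.09, 1.0, 1.0
y_bar = x[1:] + [0.0] * 3 + [m0]                 # [Y; 0; m_0] as in (8)
weights = [1 / r] * 3 + [1 / q] * 3 + [1 / p0]   # diagonal of the inverse of Sigma_eta
theta_hat, var_hat = ls_parameter(x, y_bar, weights)
```

Only the process rows carry information about $\theta$ in this special case, so the estimate reduces to $\sum_{k}x_{k-1}x_{k}/\sum_{k}x_{k-1}^{2}$ with variance $Q/\sum_{k}x_{k-1}^{2}$.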

IV Joint State and Parameter Estimation

State and parameter estimation can be iterated in different ways. This section gives an overview of well-known methods before we introduce the PEIV method in the next section.

IV-A Joint Maximum A Posteriori-Maximum Likelihood

In this subsection, we explore the JMAP-ML method for estimating both the state and the model parameter iteratively. It aims to solve the optimisation problem given by

	$\displaystyle\{\hat{X},\hat{\theta}\}=$	$\displaystyle\arg\max_{X,\theta}\,\,\log P(Y,X|\theta)$		(14)
	$\displaystyle=$	$\displaystyle\arg\max_{\theta}\{\arg\max_{X}\{\log P(X|Y,\theta)\}+\log P(Y|\theta)\}\,,$		(14)

where the parameter $\theta$ is a deterministic parameter, and the state $X$ is a random vector. This problem can be iteratively computed with an initialisation $\hat{\theta}^{1}$ following [12]

	$\displaystyle\hat{X}^{i+1}$	$\displaystyle=(\Psi(\hat{\theta}^{i})^{\top}\Sigma^{-1}_{\eta}\Psi(\hat{\theta}^{i}))^{-1}\Psi(\hat{\theta}^{i})^{\top}\Sigma^{-1}_{\eta}\bar{Y}\,,$		(15)
	$\displaystyle\hat{\theta}^{i+1}$	$\displaystyle=(B^{\top}D(\hat{X}^{i+1})^{\top}\Sigma^{-1}_{\eta}D(\hat{X}^{i+1})B)^{-1}$
		$\displaystyle(B^{\top}D(\hat{X}^{i+1})^{\top}\Sigma^{-1}_{\eta}(\bar{Y}-D(\hat{X}^{i+1})h))\,,$

where the KS and the LS are exploited for updating the state and parameter, respectively.

IV-B Expectation Maximisation

The JMAP-ML method discussed in Sec. IV-A only utilizes the mean of the state estimate, and the covariance estimate is ignored. To fully exploit the information from state estimation, the EM method can be deployed. It optimises for the parameter and state iteratively in the following ML problem

	$\displaystyle\hat{\theta}$	$\displaystyle=\arg\max_{\theta}\log P(Y|\theta)\,.$		(16)

The absence of state $X$ makes (16) difficult to solve directly. The EM algorithm tackles this in two steps, namely, the $E$ step and $M$ step. The $E$ step estimates the state according to

	$\displaystyle\mathcal{Q}(\theta,\hat{\theta}^{i})=\mathbf{E}_{P(X|Y,\hat{\theta}^{i})}(\log P(Y,X|\theta))\,,$		(17)

where $\hat{\theta}^{i}$ denotes the parameter estimate in the $i$ -th iteration. The posterior distribution $P(X|Y,\hat{\theta}^{i})$ can be solved using KS introduced in Sec. III-A, with $\theta^{{\text{o}}}$ substituted by its estimate $\hat{\theta}^{i}$ . After the $E$ step, the $M$ step updates $\hat{\theta}^{i}$ following

	$\displaystyle\hat{\theta}^{i+1}=\arg\max_{\theta}\mathcal{Q}(\theta,\hat{\theta}^{i})\,.$		(18)

After updating the parameter in (18), we go back to the $E$ step in (17) and repeat until convergence. In summary, the method resembles the one in (15). The only difference lies in iterating the parameter, where the uncertainty in the state estimate is considered as follows

	$\displaystyle\hat{\theta}^{i+1}=$	$\displaystyle(\mathbf{E}(B^{\top}D(\hat{X}^{i+1})^{\top}\Sigma^{-1}_{\eta}D(\hat{X}^{i+1})B))^{-1}$		(19)
		$\displaystyle\mathbf{E}(B^{\top}D(\hat{X}^{i+1})^{\top}\Sigma^{-1}_{\eta}(\bar{Y}-D(\hat{X}^{i+1})h))\,.$		(19)

Here, the expectation is computed with respect to $P(X|Y,\hat{\theta}^{i})$ .
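For intuition, the expectations in (19) can be written out for an assumed scalar model $x_{k+1}=\theta x_{k}+v_{k}$, $y_{k}=x_{k}+e_{k}$, where only second moments of neighbouring states are needed and $Q$ cancels. The sketch below assumes that the smoothed means, variances, and smoother gains come from a Rauch-Tung-Striebel pass, and it uses the standard identity that the smoothed lag-one covariance equals the smoother gain times the next smoothed variance. The function name `em_m_step` is hypothetical.

```python
from typing import Sequence


def em_m_step(m_s: Sequence[float], p_s: Sequence[float], gains: Sequence[float]) -> float:
    """M-step for theta in the assumed scalar model x_{k+1} = theta * x_k + v_k.

    m_s, p_s: smoothed means and variances of x_0..x_N.
    gains[k]: smoother gain G_k for k = 0..N-1, so that
              cov(x_k, x_{k+1} | Y) = gains[k] * p_s[k + 1].
    """
    n = len(m_s) - 1
    # E[x_k * x_{k+1}] = mean product + lag-one smoothed covariance
    cross = sum(m_s[k] * m_s[k + 1] + gains[k] * p_s[k + 1] for k in range(n))
    # E[x_k^2] = squared mean + smoothed variance
    power = sum(m_s[k] ** 2 + p_s[k] for k in range(n))
    return cross / power
```

Setting all variances and gains to zero recovers the plug-in update of (15) for this model, which shows where the state uncertainty enters the EM iteration.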

V PEIV-based State and Parameter Estimation

We now introduce how to exploit the partial errors-in-variables (PEIV) modeling to facilitate joint state and parameter estimation. The regression on states in (8) contains a partially unknown regressor matrix $\Psi(\theta^{{\text{o}}})$ due to the uncertainty when estimating parameter $\theta^{{\text{o}}}$ in (7). Based on (3) and (8), we formulate the following WTLS problem to jointly estimate the state $X^{{\text{o}}}$ and the parameter $\theta^{{\text{o}}}$

	$\displaystyle\{\hat{\theta},\hat{X}\}$	$\displaystyle=\arg\min_{\theta,\eta}\bigg\|\begin{bmatrix}\theta-\hat{\theta}^{1}\\ \eta\end{bmatrix}\bigg\|^{2}_{\Sigma^{-1}},\,$
		$\displaystyle\text{s.t.}\quad\eta=\bar{Y}-\Psi(\theta)X\,,$

where $\Sigma=\mathbf{diag}(\Sigma_{\theta},\Sigma_{\eta})$ . It is straightforward to reformulate the objective by replacing $\eta$ with the constraint. This leads to

	$\displaystyle J(\theta,X)=\|\theta-\hat{\theta}^{1}\|^{2}_{\Sigma^{-1}_{\theta}}+\|\bar{Y}-\Psi(\theta)X\|^{2}_{\Sigma^{-1}_{\eta}}\,,$		(20)

where the first term can be seen as a generalized Tikhonov regularizer, with $\hat{\theta}^{1}$ being the initialised parameter estimate [23]. At each iteration, we can update the state following the first equation in (15). Afterward, the parameter estimate can be updated via the regularized least squares given the iterated state estimate $\hat{X}$ . For that, we derive the closed-form derivative of $J(\theta,\hat{X})$ w.r.t. $\theta$ and set it to $0$ , yielding

	$\displaystyle\hat{\theta}=N^{-1}(\Sigma^{-1}_{\theta}\hat{\theta}^{1}+B^{\top}D(\hat{X})^{\top}\Sigma^{-1}_{\eta}(\bar{Y}-D(\hat{X})h))\,.$		(21)

The notation $N$ in (21) follows

	$\displaystyle N=\Sigma^{-1}_{\theta}+B^{\top}D(\hat{X})^{\top}\Sigma^{-1}_{\eta}D(\hat{X})B\,.$		(22)

The state update in the first equation of (15) and the parameter update in (21) are applied alternately until convergence. Once converged, we can compute the estimation covariance of the state $X$ according to (10) with the parameter estimate $\hat{\theta}$. The estimation covariance of $\hat{\theta}$ can be obtained by reformulating (20) given the state estimate $\hat{X}$ as follows

	$\displaystyle\begin{bmatrix}\hat{\theta}^{1}\\ \bar{Y}-D(\hat{X})h\end{bmatrix}=\begin{bmatrix}\mathbf{I}\\ D(\hat{X})B\end{bmatrix}\theta^{{\text{o}}}+\begin{bmatrix}\tilde{\theta}^{1}\\ \eta\end{bmatrix}\,.$		(23)

Here, $\tilde{\theta}^{1}$ is the initialisation error. This leads to the covariance matrix $\mathbf{cov}(\hat{\theta})=N^{-1}$ .
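The alternation between the state update and the regularized parameter update in (21) and (22) can be sketched for an assumed scalar model $x_{k+1}=\theta x_{k}+v_{k}$, $y_{k}=x_{k}+e_{k}$. The function names, the stopping rule, and the example data below are illustrative choices rather than the implementation used in the letter; the state step uses a recursive smoother in place of the batch solution.

```python
from typing import List, Sequence, Tuple


def smooth(y: Sequence[float], theta: float, q: float, r: float,
           m0: float, p0: float) -> List[float]:
    """State step: smoothed means of x_0..x_N for a fixed theta."""
    m_f, p_f, m_p, p_p = [m0], [p0], [m0], [p0]
    for yk in y:                                   # forward Kalman filter
        mp, pp = theta * m_f[-1], theta ** 2 * p_f[-1] + q
        gain = pp / (pp + r)
        m_p.append(mp)
        p_p.append(pp)
        m_f.append(mp + gain * (yk - mp))
        p_f.append((1.0 - gain) * pp)
    m_s = m_f[:]
    for k in range(len(y) - 1, -1, -1):            # backward smoothing pass
        g = p_f[k] * theta / p_p[k + 1]
        m_s[k] = m_f[k] + g * (m_s[k + 1] - m_p[k + 1])
    return m_s


def peiv_cost(theta: float, x: Sequence[float], y: Sequence[float], theta1: float,
              var_theta: float, q: float, r: float, m0: float, p0: float) -> float:
    """Objective (20) written out for the scalar model."""
    n = len(y)
    fit = sum((y[k - 1] - x[k]) ** 2 for k in range(1, n + 1)) / r
    dyn = sum((x[k] - theta * x[k - 1]) ** 2 for k in range(1, n + 1)) / q
    return (theta - theta1) ** 2 / var_theta + fit + dyn + (x[0] - m0) ** 2 / p0


def peiv_estimate(y: Sequence[float], theta1: float, var_theta: float, q: float,
                  r: float, m0: float, p0: float, max_iter: int = 100,
                  tol: float = 1e-10) -> Tuple[float, List[float], float, List[float]]:
    """Alternate the smoother and the regularized LS update; return
    (theta, smoothed states, variance of theta, cost after each iteration)."""
    theta, costs = theta1, []
    for _ in range(max_iter):
        x = smooth(y, theta, q, r, m0, p0)
        n_mat = 1.0 / var_theta + sum(xk ** 2 for xk in x[:-1]) / q        # scalar form of (22)
        rhs = theta1 / var_theta + sum(a * b for a, b in zip(x[:-1], x[1:])) / q
        step = abs(rhs / n_mat - theta)
        theta = rhs / n_mat                                                # scalar form of (21)
        costs.append(peiv_cost(theta, x, y, theta1, var_theta, q, r, m0, p0))
        if step < tol:
            break
    return theta, x, 1.0 / n_mat, costs


# Hypothetical measurements and prior, for illustration only.
y_demo = [0.5, 0.7, 0.2, -0.1, 0.3, 0.6, 0.4, 0.1, -0.2, 0.0]
theta_hat, x_hat, var_hat, costs = peiv_estimate(y_demo, 0.8, 0.04, 0.2, 0.09, 0.0, 1.0)
```

Because each step minimises the same objective over one block of variables, the recorded costs should not increase from one iteration to the next. Letting `var_theta` grow large makes the prior term vanish, so the parameter step approaches the plain LS update in (15).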

Figure 1: RMSE of parameter estimation w.r.t. batch size. The dashed lines in each color bound the 5% and 95% quantiles given by each method.

VI Numerical Simulation

To demonstrate the merit of the PEIV principle in joint state and parameter estimation, we synthesize a numerical example with Monte Carlo simulation. We consider the following state-space model with scalar-valued state and parameter

	$\displaystyle x_{k+1}$	$\displaystyle=\theta^{{\text{o}}}x_{k}+v_{k}\,,$		(24)
	$\displaystyle y_{k}$	$\displaystyle=x_{k}+e_{k}\,.$		(24)

The process noise $v_{k}$ , the measurement noise $e_{k}$ , and the initial state $x_{0}$ are assumed to be independent of each other, and we assume $v_{k}\sim\mathcal{N}(0,0.2)$ and $e_{k}\sim\mathcal{N}(0,0.09)$ . We assume a stationary process with $x^{{\text{o}}}_{k}\sim\mathcal{N}(0,P)$ , where

	$\displaystyle P=0.2/(1-(\theta^{{\text{o}}})^{2})\,.$

The state estimate is initialised as $\hat{x}_{0}\sim\mathcal{N}(y_{1},2P)$, which implies that it is not necessary to know the prior. The true value of the parameter in the model is $\theta^{{\text{o}}}=0.9$. To quantify the estimation accuracy, we employ the root mean square error (RMSE) criterion given by

	$\displaystyle\texttt{RMSE}_{\theta}=\sqrt{\frac{1}{M}\sum_{i=1}^{M}(\hat{\theta}_{i}-\theta^{{\text{o}}})^{2}}\,,$

where $\hat{\theta}_{i}$ denotes the parameter estimate from the $i$-th simulation run (the equation is shown for the parameter). We set the number of simulation runs to $M=1000$.
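The simulation setup can be sketched as follows. The sketch assumes that the second argument of $\mathcal{N}(\cdot,\cdot)$ is a variance, so the standard deviations passed to the generator are square roots; the function names `simulate` and `rmse` and the seeded generator are illustrative choices.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def simulate(theta: float, n: int, q: float = 0.2, r: float = 0.09,
             rng: Optional[random.Random] = None) -> Tuple[List[float], List[float]]:
    """Draw x_0..x_n and y_1..y_n from (24) with a stationary initial state."""
    rng = rng or random.Random()
    p = q / (1.0 - theta ** 2)               # stationary state variance P
    x = [rng.gauss(0.0, math.sqrt(p))]       # x_0 drawn from N(0, P)
    y = []
    for _ in range(n):
        x.append(theta * x[-1] + rng.gauss(0.0, math.sqrt(q)))
        y.append(x[-1] + rng.gauss(0.0, math.sqrt(r)))
    return x, y


def rmse(estimates: Sequence[float], truth: float) -> float:
    """Root mean square error of a list of estimates against the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


# One run with a batch of 30 measurements; the seed is arbitrary.
x_true, y_meas = simulate(0.9, 30, rng=random.Random(0))
```

In a Monte Carlo study, `simulate` would be called once per run, an estimator applied to each `y_meas`, and `rmse` evaluated over the collected parameter estimates.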

We evaluate our PEIV-based method with a focus on joint estimation using small batches of data, with batch sizes taken from $\{10,15,20,25,30,35,40,45,50,100,150,200\}$ time steps. Three other state-of-the-art methods are included for comparison, namely the expectation maximisation (EM), the joint maximum a posteriori-maximum likelihood (JMAP-ML), and the augmented state extended Kalman smoother (ASEKS) methods.

As shown in Fig. 1, the proposed PEIV-based method outperforms all the other methods for small batch sizes ($\leq 100$) in terms of RMSE and $95\%$ quantile. Additionally, we depict the error ellipses given by the simulations with a batch size of $30$ data points in Fig. 2, where $\tilde{x}_{0}$ and $\tilde{\theta}$ denote the estimation errors of the initial state and the parameter, respectively. The proposed PEIV-based method delivers the best result in the benchmark, with the smallest error ellipse.

VII Conclusion

In this letter, a novel principle for joint state and parameter estimation is proposed through the partial errors-in-variables modeling, where the uncertainty in the parameter prior is explicitly considered. Based thereon, we formulate the regression problem in the sense of WTLS, which is solved iteratively by the Kalman smoothing and the regularized least squares for updating the state and parameter, respectively. Numerical results based on simulations demonstrate that the proposed method outperforms state-of-the-art methods, including the EM, the JMAP-ML, and the ASEKS methods, in terms of estimation accuracy.

For future investigation, we look forward to incorporating the uncertainty of state estimates into parameter estimation. Another possibility for extending the PEIV-based framework can be focused on tackling non-Gaussian noise patterns in state-space modeling.

References

[1] Fredrik Gustafsson, Fredrik Gunnarsson, Niclas Bergman, Urban Forssell, Jonas Jansson, Rickard Karlsson, and P-J Nordlund. Particle filters for positioning, navigation, and tracking. IEEE Transactions on signal processing, 50(2):425–437, 2002.
[2] Michael Roth, Gustaf Hendeby, and Fredrik Gustafsson. EKF/UKF maneuvering target tracking using coordinated turn models with polar/Cartesian velocity. In 17th International Conference on Information Fusion (FUSION), pages 1–8. IEEE, 2014.
[3] Esmaeil Ghahremani and Innocent Kamwa. Dynamic state estimation in power system by applying the extended Kalman filter with unknown inputs to phasor measurements. IEEE Transactions on Power Systems, 26(4):2556–2566, 2011.
[4] Jakub Simanek, Michal Reinstein, and Vladimir Kubelka. Evaluation of the EKF-based estimation architectures for data fusion in mobile robots. IEEE/ASME transactions on mechatronics, 20(2):985–990, 2014.
[5] Brian DO Anderson and John B Moore. Optimal filtering. Courier Corporation, 2012.
[6] Herbert E Rauch, F Tung, and Charlotte T Striebel. Maximum likelihood estimates of linear dynamic systems. AIAA journal, 3(8):1445–1450, 1965.
[7] Simo Särkkä and Lennart Svensson. Bayesian filtering and smoothing, volume 17. Cambridge university press, 2023.
[8] Lennart Ljung. Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Transactions on Automatic Control, 24(1):36–50, 1979.
[9] Anxi Yu, Ye Liu, Jubo Zhu, and Zhen Dong. An improved dual unscented Kalman filter for state and parameter estimation. Asian Journal of Control, 18(4):1427–1440, 2016.
[10] Steven M Kay. Fundamentals of statistical signal processing: estimation theory. Prentice-Hall, Inc., 1993.
[11] Arthur P Dempster, Nan M Laird, and Donald B Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the royal statistical society: series B (methodological), 39(1):1–22, 1977.
[12] Arie Yeredor. The joint MAP-ML criterion and its relation to ML and to extended least-squares. IEEE Transactions on Signal Processing, 48(12):3484–3492, 2000.
[13] Stephen J Wright. Coordinate descent algorithms. Mathematical programming, 151(1):3–34, 2015.
[14] Manon Kok and Thomas B Schön. Maximum likelihood calibration of a magnetometer using inertial sensors. IFAC Proceedings Volumes, 47(3):92–97, 2014.
[15] Peng Liu, Gustaf Hendeby, and Fredrik Gustafsson. Joint estimation of states and parameters in stochastic SIR model. In 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), pages 1–6. IEEE, 2022.
[16] Feng Yin, Carsten Fritsche, Fredrik Gustafsson, and Abdelhak M Zoubir. EM-and JMAP-ML based joint estimation algorithms for robust wireless geolocation in mixed LOS/NLOS environments. IEEE Transactions on Signal Processing, 62(1):168–182, 2013.
[17] Wayne A Fuller. Measurement error models. John Wiley & Sons, 2009.
[18] Peng Liu, Kailai Li, Gustaf Hendeby, and Fredrik Gustafsson. Weighted total least squares for quadratic errors-in-variables regression. In 2023 31st European Signal Processing Conference (EUSIPCO), pages 1893–1897. IEEE, 2023.
[19] Gene H Golub and Charles F Van Loan. An analysis of the total least squares problem. SIAM journal on numerical analysis, 17(6):883–893, 1980.
[20] A Amiri-Simkooei and S Jazaeri. Weighted total least squares formulated by standard least squares theory. Journal of geodetic science, 2(2):113–124, 2012.
[21] Burkhard Schaffrin and Andreas Wieser. On weighted total least-squares adjustment for linear regression. Journal of geodesy, 82:415–421, 2008.
[22] Peiliang Xu, Jingnan Liu, and Chuang Shi. Total least squares adjustment in partial errors-in-variables models: algorithm and statistical analysis. Journal of geodesy, 86:661–675, 2012.
[23] Gene H Golub, Per Christian Hansen, and Dianne P O’Leary. Tikhonov regularization and total least squares. SIAM journal on matrix analysis and applications, 21(1):185–194, 1999.