Disturbance Rejection-Guarded Learning for Vibration Suppression of Two-Inertia Systems

Fan Zhang1, **feng Chen1, Yu Hu1, Zhiqiang Gao1, Ge Lv2, Qin Lin1
1 The authors are with the Center for Advanced Control Technologies (CACT), Cleveland State University, 2121 Euclid Avenue, Cleveland, OH 44115, USA. Corresponding author: Qin Lin, [email protected]
2 Ge Lv is with the Department of Mechanical Engineering, Clemson University, 105 Sikes Hall, Clemson, SC 29634, USA.
Abstract

Model uncertainty presents significant challenges in vibration suppression of multi-inertia systems, as these systems often rely on inaccurate nominal mathematical models due to system identification errors or unmodeled dynamics. An observer, such as an extended state observer (ESO), can estimate the discrepancy between the inaccurate nominal model and the true model, thus improving control performance via disturbance rejection. The conventional observer design is memoryless in the sense that once its estimated disturbance is obtained and sent to the controller, the datum is discarded. In this research, we propose a seamless integration of ESO and machine learning. On one hand, the machine learning model attempts to model the disturbance. With the assistance of this prior information about the disturbance, the observer is expected to achieve faster convergence in disturbance estimation. On the other hand, machine learning benefits from an additional assurance layer provided by the ESO, as any imperfections in the machine learning model can be compensated for by the ESO. We validated the effectiveness of this novel learning-for-control paradigm through simulation and physical tests on two-inertia motion control systems used for vibration studies.

Index Terms:
Machine Learning, Disturbance Rejection, Extended State Observer, Model Uncertainty

I Introduction

Vibration suppression of multi-inertia systems is critical in many engineering applications, including automotive suspensions, series elastic actuators (SEA), and various other motion control systems [1]. These systems often involve multiple inertia components connected by flexible couplings, with a two-inertia subsystem serving as the fundamental building block, which leads to inherent resonance issues. This resonance can cause dynamic stresses, energy waste, and performance degradation, thereby posing significant challenges to the systems' efficiency and stability [2, 3]. Given the fundamental challenge of system identification and the need for real-time performance, it is common practice to employ a simplified or inaccurate nominal dynamic model. Consequently, disturbances become inevitable and must be rejected to achieve robust control. These disturbances are both internal (i.e., unknown or unmodelled parts of the plant dynamics) and external (i.e., perturbations from the outside affecting the dynamics) [4, 5].

The observer-based method has emerged as a promising approach to estimating the disturbance for the subsequent design of a disturbance rejection controller. Among the array of existing disturbance observers, the extended state observer (ESO) [6] is gaining popularity due to its simplicity in implementation. For the formulation of an ESO, the system is modeled as a simple chained integrator with a total disturbance term (also called lumped disturbance, $f$) that includes both internal and external disturbances. The total disturbance is treated as an extended state to be estimated together with other states. The estimated disturbance can be mitigated through various means, including a simple state feedback controller or more advanced control strategies such as sliding mode control [7] and model predictive control [8].

It is worth noting that the traditional ESO operates in a memoryless fashion, i.e., once it estimates a disturbance and transmits it to the controller, the datum used for estimation is then discarded. However, as a control system operates, we can improve our understanding of the disturbance through collected operational data. Prior works [9, 10] show that a model-based ESO (MB-ESO), which utilizes prior model information about the disturbance (such as a detailed dynamic model obtained through system identification), tends to exhibit reduced sensitivity to noise when compared to a model-free ESO (MF-ESO) that assumes a simple chained integrator as a nominal model. In order to circumvent the need for extensive system identification and maximize the utilization of disturbance information, we propose to leverage machine learning (ML), which has powerful capacities for nonlinear optimization, to memorize and generalize the past estimations from the ESO as a feedforward estimation of the disturbance. The learning component is expected to capture the internal dynamics as well as patterns of external disturbances.

References [11, 12, 13] combine ESO with iterative learning control (ILC) for repetitive control tasks. Our approach targets general control tasks rather than only repetitive ones. In addition, we assume that the system dynamics, as well as the disturbances, are unknown and not necessarily repetitive. In [14], a neural network is used to tune the parameters of the ESO rather than to learn the disturbance explicitly. Other learning-for-control approaches such as [15] employ neural networks to capture discrepancies between a nominal model $\hat{F}(x_k,u_k)$ and the true model $F(x_k,u_k)$. Since the state of the true model is unknown, the measured next state $x_{k+1}$ is used to update the error model represented by the neural network. However, these methods assume that full-state information is available. In addition, when the learning performance falls short of expectations, the subsequent model-based controller may perform suboptimally. In contrast, our approach represents a novel paradigm that learns the total disturbance from output measurements instead of true state values. Furthermore, our paradigm includes a correction mechanism for cases where the learning component fails to capture the disturbance accurately. The residual total disturbance, i.e., the remainder after removing the disturbance already estimated by the learning component, is estimated by a conventional ESO in a feedback correction manner. Through this seamless integration, even when the learning-based estimation struggles to converge, the ESO provides feedback correction, adding an extra layer of robustness and assurance to the system.

In our new framework, as visualized in Fig. 1, we refer to the learning-enabled extended state observer as L-ESO. The estimate $\hat{f}$ of the true total disturbance $f$ consists of $\hat{f}_L$ and $\Delta\hat{f}$, which come from the learning component and the ESO, respectively. First, the ESO uses the control $u$ and the observation $y$ to estimate the system states $\hat{x}$ and the residual disturbance $\Delta\hat{f}$. Second, the ESO's state estimate $\hat{x}$ and the control $u$ are fed to the learning component as inputs for learning a regression model. The learning component produces the feedforward estimate $\hat{f}_L$, after which an online optimization iteratively minimizes the difference between $\hat{f}_L$ and $\hat{f}$, allowing the learning component to approximate the total disturbance accurately. When imperfect learning introduces errors, the ESO serves as an additional layer that corrects them.

Figure 1: The proposed framework in this paper, where the red and the blue blocks represent the L-ESO and the disturbance rejection tracking controller, respectively. Once the total disturbance is estimated, the tracking controller will be able to reject disturbance.

The contributions of our work are summarized as follows:

  • We propose a novel framework that combines ML and ESO for feedforward estimation and feedback correction for a general disturbance rejection tracking control task. Compared with existing learning-for-control frameworks, we estimate states and disturbances from output measurements rather than assuming full-state information. We also provide an extra error correction mechanism for the learning component.

  • The learning component serves as an add-on to existing ESO-based control architecture. As shown in Fig. 1, only a learning component and a few connections (in green) are introduced. The advantage of our modular design is two-fold: 1) no need to change the existing framework; 2) users can customize the learning components by choosing any appropriate machine learning model.

  • Our learning and estimation are real-time and online. We showcase the efficacy of our framework through simulations and a real-world two-inertia testbed as a fundamental block for a multi-inertia system.

The remainder of this paper is structured as follows. We first go through the preliminaries in Sec. II. Then, we construct our framework in Sec. III. Simulation results of the two-mass-spring benchmark system are presented in Sec. IV, followed by the hardware experiments of a torsional plant in Sec. V. Finally, we conclude our work and discuss possible future research directions in Sec. VI.

II Preliminary

The multi-inertia system can be represented as the sum of a nominal part and a nonlinear time-varying part:

$$\begin{cases}\dot{\bar{x}}(t)=A_{0}\bar{x}(t)+B_{0}u(t)+E_{0}f(x(t),d(t),t)\\ y=C_{0}\bar{x}\end{cases} \tag{1}$$

where $\bar{x}\in\mathbb{R}^{n}$ is the state vector, $u\in\mathbb{R}$ is the control input, $y\in\mathbb{R}$ is the measured output, and $f:\mathbb{R}^{n+1}\times[0,\infty)\rightarrow\mathbb{R}$ is an unknown function representing the time-varying uncertainty, which contains the external disturbance $d(t)\in\mathbb{R}$, unmodeled dynamics, and parameter uncertainty. The matrices $A_0$, $B_0$, $E_0$, and $C_0$ are real and known, with appropriate dimensions. For the particular case of a two-inertia system, $n=4$: each inertia contributes two states, its position (or angle) and its velocity (or angular velocity); see the example in Sec. IV for details. The justification for classifying (1) as a nonlinear time-varying system can be found in [16, 17].

Traditionally, an ESO is established for a system in a chained integrator form [6]. However, in our most recent work [18], we significantly expanded the scope of applicability of the ESO and rigorously proved that, for a general system (1) satisfying Assumption 1 and Assumption 2, an ESO can be established to estimate $f$ without requiring the chained integrator form.

Assumption 1.

$(A_{0},C_{0})$ is observable.

Assumption 2.

$(A_{0},E_{0},C_{0})$ has no invariant zeros.

For system (1), under Assumptions 1 and 2, there exists a matrix

$$S=\begin{bmatrix}C_{0}\\ C_{0}A_{0}\\ \vdots\\ C_{0}A_{0}^{n-1}\end{bmatrix} \tag{2}$$

such that

$$\begin{split}\bar{A}_{0}=SA_{0}S^{-1}&=\begin{bmatrix}0&1&\dots&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\dots&0&1\\ -a_{0}&-a_{1}&\dots&-a_{n-1}\end{bmatrix}\\ \bar{B}_{0}=SB_{0}&=\begin{bmatrix}0&0&\dots&b\end{bmatrix}^{T}\\ \bar{C}_{0}=C_{0}S^{-1}&=\begin{bmatrix}1&0&\dots&0\end{bmatrix}\\ \bar{E}_{0}=SE_{0}&=\begin{bmatrix}0&0&\dots&1\end{bmatrix}^{T}\end{split} \tag{3}$$

form the following new system

$$\begin{cases}\dot{x}=\bar{A}_{0}x+\bar{B}_{0}u+\bar{E}_{0}f\\ y=\bar{C}_{0}x\end{cases} \tag{4}$$

The readers are referred to [18] for more details on the matrix transformation. The new system (4) has an observable canonical form such that an ESO can be established for estimating $f$.
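As a minimal numerical sketch of the transformation (2)-(4), the snippet below builds $S$ for an assumed two-mass-spring model and prints the transformed matrices. The unit masses and stiffness, the state ordering [p1, v1, p2, v2], the choice $E_0=B_0$, the measured output (position of the second mass), and the helper name `canonical_transform` are illustrative assumptions, not values taken from this paper.

```python
import numpy as np


def canonical_transform(A0, B0, E0, C0):
    """Return S from (2) and the transformed matrices from (3)."""
    n = A0.shape[0]
    # Stack C0, C0 A0, ..., C0 A0^(n-1) row by row.
    S = np.vstack([C0 @ np.linalg.matrix_power(A0, i) for i in range(n)])
    S_inv = np.linalg.inv(S)
    return S, S @ A0 @ S_inv, S @ B0, S @ E0, C0 @ S_inv


# Assumed plant: force acts on mass 1, position of mass 2 is measured.
m1, m2, k = 1.0, 1.0, 1.0
A0 = np.array([[0.0, 1.0, 0.0, 0.0],
               [-k / m1, 0.0, k / m1, 0.0],
               [0.0, 0.0, 0.0, 1.0],
               [k / m2, 0.0, -k / m2, 0.0]])
B0 = np.array([[0.0], [1.0 / m1], [0.0], [0.0]])
E0 = B0.copy()  # assumption: the disturbance enters through the input channel
C0 = np.array([[0.0, 0.0, 1.0, 0.0]])

S, A_bar, B_bar, E_bar, C_bar = canonical_transform(A0, B0, E0, C0)
print(np.round(A_bar, 6))
print(B_bar.ravel(), C_bar.ravel(), E_bar.ravel())
```

With these assumed values, the last row of the transformed state matrix is [0, 0, -2, 0], i.e., $a_2=2$ and the remaining $a_i$ are zero, which agrees with the characteristic polynomial $s^4+2s^2$ of the undamped two-mass-spring model.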

Remark 1.

Assumption 2 is equivalent to the following conditions. The proof can be found in [18].

$$C_{0}E_{0}=0,\quad C_{0}A_{0}E_{0}=0,\quad\dots,\quad C_{0}A_{0}^{n-1}E_{0}\neq 0$$
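Assumption 1 and the conditions in Remark 1 can be checked numerically. The sketch below does so for an assumed two-mass-spring model (unit masses and stiffness, $E_0=B_0$, position of the second mass measured), reading the ellipsis in Remark 1 as "$C_0A_0^{i}E_0=0$ for $i=0,\dots,n-2$". The helper names `is_observable`, `markov_parameters`, and `remark1_holds` are hypothetical.

```python
import numpy as np


def is_observable(A0, C0):
    """Assumption 1: the observability matrix of (A0, C0) has rank n."""
    n = A0.shape[0]
    obsv = np.vstack([C0 @ np.linalg.matrix_power(A0, i) for i in range(n)])
    return bool(np.linalg.matrix_rank(obsv) == n)


def markov_parameters(A0, E0, C0):
    """Scalars C0 A0^i E0 for i = 0, ..., n-1, as used in Remark 1."""
    n = A0.shape[0]
    return np.array([(C0 @ np.linalg.matrix_power(A0, i) @ E0).item()
                     for i in range(n)])


def remark1_holds(A0, E0, C0):
    """True if the first n-1 parameters vanish and the last one does not."""
    h = markov_parameters(A0, E0, C0)
    return bool(np.allclose(h[:-1], 0.0) and not np.isclose(h[-1], 0.0))


# Assumed two-mass-spring plant with states [p1, v1, p2, v2].
A0 = np.array([[0.0, 1.0, 0.0, 0.0],
               [-1.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0],
               [1.0, 0.0, -1.0, 0.0]])
E0 = np.array([[0.0], [1.0], [0.0], [0.0]])   # disturbance acts on mass 1
C0 = np.array([[0.0, 0.0, 1.0, 0.0]])         # position of mass 2 is measured

print(is_observable(A0, C0), markov_parameters(A0, E0, C0), remark1_holds(A0, E0, C0))
```

For this assumed plant the conditions hold. If the position of the first mass is measured instead, i.e., `C0 = [[1, 0, 0, 0]]`, then $C_0A_0E_0\neq 0$ and the conditions fail, which shows that they depend on where the disturbance enters relative to the measured output.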

According to whether or not the system dynamics are available, we have the following two variants of ESO:

II-A MB-ESO

If the model information, i.e., $-a_{0},-a_{1},\cdots,-a_{n-1},b$, in matrix $\bar{A}_{0}$ and $\bar{B}_{0}$ is available, we have

$$\dot{x}=\underbrace{\begin{bmatrix}0&1&\dots&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\dots&0&1\\ -a_{0}&\dots&\dots&-a_{n-1}\end{bmatrix}}_{\bar{A}_{0,MB}}x+\underbrace{\begin{bmatrix}0\\ 0\\ \vdots\\ b\end{bmatrix}}_{\bar{B}_{0,MB}}u+\underbrace{\begin{bmatrix}0\\ 0\\ \vdots\\ 1\end{bmatrix}}_{\bar{E}_{0}}\underbrace{d}_{f} \tag{5}$$

The total disturbance can be represented as:

$$f=d \tag{6}$$

where $d$ is the external disturbance and $b$ is the true control gain.

II-B MF-ESO

If the model information, i.e., $-a_{0},-a_{1},\cdots,-a_{n-1},b$, in matrix $\bar{A}_{0}$ and $\bar{B}_{0}$, is not available, we have

$$\begin{aligned}\dot{x}={}&\underbrace{\begin{bmatrix}0&1&\dots&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\dots&0&1\\ 0&0&\dots&0\end{bmatrix}}_{\bar{A}_{0,MF}}x+\underbrace{\begin{bmatrix}0\\ 0\\ \vdots\\ b_{0}\end{bmatrix}}_{\bar{B}_{0,MF}}u\\ &+\underbrace{\begin{bmatrix}0\\ \vdots\\ 1\end{bmatrix}}_{\bar{E}_{0}}\underbrace{(-a_{0}x_{1}-\dots-a_{n-1}x_{n}+(b-b_{0})u+d)}_{f}\end{aligned} \tag{7}$$

where $-a_{0}x_{1}-\dots-a_{n-1}x_{n}+(b-b_{0})u$ is the internal disturbance (unknown/unmodelled dynamics), $b_{0}$ is the nominal control gain, and $d$ is the external disturbance. In such a case, the total disturbance becomes:

$$f=-a_{0}x_{1}-\dots-a_{n-1}x_{n}+(b-b_{0})u+d \tag{8}$$
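As an illustrative worked case (an assumed example, not taken from this paper), consider $n=2$ and a mass-spring-damper $m\ddot{p}+c\dot{p}+kp=u+w$ with $x_1=p$, $x_2=\dot{p}$, and an external force $w$. Dividing by $m$ identifies $a_0=k/m$, $a_1=c/m$, $b=1/m$, and $d=w/m$, so the MF-ESO plant model and its total disturbance read:

```latex
\begin{aligned}
\dot{x}_1 &= x_2, \\
\dot{x}_2 &= b_0 u + f, \\
f &= -\frac{k}{m}x_1 - \frac{c}{m}x_2 + \left(\frac{1}{m}-b_0\right)u + \frac{w}{m}.
\end{aligned}
```

In this reading, the spring and damper forces, the control gain mismatch, and the external force are all lumped into the single signal $f$ that the observer has to estimate.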

ESO treats the total disturbance $f$ as an extended state, such that a Luenberger observer can be designed to estimate both the original system state $x$ and the total disturbance $f$. The augmented dynamic system is as follows:

$$\begin{cases}\begin{bmatrix}\dot{x}\\ \dot{f}\end{bmatrix}=A\begin{bmatrix}x\\ f\end{bmatrix}+Bu+E\dot{f}\\ y=C\begin{bmatrix}x\\ f\end{bmatrix}\end{cases} \tag{9}$$

where $A=\begin{bmatrix}\bar{A}_{0}&\bar{E}_{0}\\ 0_{1\times n}&0\end{bmatrix}_{(n+1)\times(n+1)}$, $B=\begin{bmatrix}\bar{B}_{0}\\ 0\end{bmatrix}_{(n+1)\times 1}$, $C=[\bar{C}_{0},0]_{1\times(n+1)}$, and $E=[0,\cdots,0,1]^{T}_{(n+1)\times 1}$.

The Luenberger observer has the following form:

$$\begin{bmatrix}\dot{\hat{x}}\\ \dot{\hat{f}}\end{bmatrix}=A\begin{bmatrix}\hat{x}\\ \hat{f}\end{bmatrix}+Bu+L\left(y-C\begin{bmatrix}\hat{x}\\ \hat{f}\end{bmatrix}\right) \tag{10}$$

where $\hat{x}$ and $\hat{f}$ are the estimates of $x$ and $f$, and $L$ is the observer gain. We have the following estimation error dynamics:

$$\dot{e}=(A-LC)e+E\dot{f} \tag{11}$$

where $e=\begin{bmatrix}x-\hat{x}&f-\hat{f}\end{bmatrix}^{T}$.
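The observer (10) and the error dynamics (11) can be illustrated with a short simulation. The sketch below is a toy example under stated assumptions: a second-order plant in MF-ESO form ($n=2$), a constant total disturbance (so $\dot{f}=0$), forward-Euler integration, and gains that place all observer poles at an assumed value of 20 rad/s. None of these numbers come from this paper.

```python
import numpy as np

# Assumed plant in MF-ESO form: x1' = x2, x2' = b0*u + f, y = x1.
b0 = 1.0
f_true = 1.0            # constant total disturbance, so f' = 0 in (11)
omega_o = 20.0          # assumed tuning: all observer poles at -omega_o
dt, steps = 1e-3, 3000

# Augmented matrices from (9) for n = 2, with extended state z = [x1, x2, f].
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
B = np.array([0.0, b0, 0.0])
C = np.array([1.0, 0.0, 0.0])
# Gains chosen so that det(sI - (A - LC)) = (s + omega_o)^3.
L = np.array([3 * omega_o, 3 * omega_o**2, omega_o**3])

z = np.array([0.0, 0.0, f_true])   # true augmented state [x1, x2, f]
z_hat = np.zeros(3)                # observer state [x1_hat, x2_hat, f_hat]

for k in range(steps):
    u = np.sin(0.5 * k * dt)       # arbitrary known control input
    y = C @ z                      # measured output
    # Forward-Euler step of the Luenberger observer (10).
    z_hat = z_hat + dt * (A @ z_hat + B * u + L * (y - C @ z_hat))
    # Forward-Euler step of the plant; the third state (f) stays constant.
    z = z + dt * (A @ z + B * u)

f_hat = z_hat[2]
print(f"f_hat = {f_hat:.4f}, estimation error = {f_true - f_hat:.2e}")
```

Because the disturbance is constant here, the forcing term in (11) vanishes and the estimate converges to the true value; a time-varying disturbance would instead leave a bounded residual error that shrinks as the observer poles are moved further left.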

Theorem 1.

Under Assumption 1 and Assumption 2, the eigenvalues of $A-LC$ can be placed in the open left half of the complex plane so that the estimation converges [18, 17].

All eigenvalues can be placed at $-\omega_{o}$, where $\omega_{o}$ is called the observer bandwidth of the ESO [19].
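For the MF-ESO, where the augmented system is a pure chain of $n+1$ integrators, placing every eigenvalue at the same location has a closed-form solution given by binomial coefficients. The sketch below computes these gains and verifies the characteristic polynomial of the error dynamics at a few test points. The function names, $n=4$, and the bandwidth value of 10 rad/s are illustrative assumptions.

```python
import numpy as np
from math import comb


def bandwidth_gain(order, omega_o):
    """Gains giving det(sI - (A - LC)) = (s + omega_o)^order for an integrator chain."""
    return np.array([comb(order, i) * omega_o**i for i in range(1, order + 1)],
                    dtype=float)


def mf_eso_matrices(n):
    """Augmented A and C from (9) for the MF-ESO: a chain of n + 1 integrators."""
    A = np.diag(np.ones(n), k=1)      # ones on the superdiagonal
    C = np.zeros((1, n + 1))
    C[0, 0] = 1.0
    return A, C


n, omega_o = 4, 10.0                  # assumed values for a two-inertia plant
A, C = mf_eso_matrices(n)
L = bandwidth_gain(n + 1, omega_o).reshape(-1, 1)

# Compare characteristic polynomials at test points rather than calling an
# eigenvalue solver, which is ill-conditioned for repeated eigenvalues.
test_points = (0.0, 1.0, -3.0)
char_poly = [np.linalg.det(s * np.eye(n + 1) - (A - L @ C)) for s in test_points]
target = [(s + omega_o) ** (n + 1) for s in test_points]
print(L.ravel(), np.allclose(char_poly, target))
```

With these assumed values the gains are 50, 1000, 10^4, 5·10^4, and 10^5. The closed form applies only to the integrator-chain case; for the MB-ESO, where the last row of the plant matrix contains the known coefficients, the gains would have to be computed with a general pole-placement routine.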

III Learning-Enabled ESO

The model-based ESO in (5) and the model-free ESO in (7) can be further expanded as follows:

$$\begin{aligned}&\underbrace{\begin{bmatrix}\underbrace{\begin{matrix}0&1&\dots&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\dots&0&1\\ -a_{0}&\dots&\dots&-a_{n-1}\end{matrix}}_{\bar{A}_{0,MB}}&\bar{E}_{0}\\ 0_{1\times n}&0\end{bmatrix}}_{A_{MB}}\begin{bmatrix}\hat{x}\\ \hat{f}\end{bmatrix}+\underbrace{\begin{bmatrix}0\\ \vdots\\ \underbrace{b_{0}+b-b_{0}}_{\bar{B}_{0,MB}}\\ 0\end{bmatrix}}_{B_{MB}}u=\\ &\underbrace{\begin{bmatrix}\underbrace{\begin{matrix}0&1&\dots&0\\ \vdots&\ddots&\ddots&\vdots\\ 0&\dots&0&1\\ 0&\dots&\dots&0\end{matrix}}_{\bar{A}_{0,MF}}&\bar{E}_{0}\\ 0_{1\times n}&0\end{bmatrix}}_{A_{MF}}\begin{bmatrix}\hat{x}\\ \hat{f}\end{bmatrix}+\underbrace{\begin{bmatrix}0\\ \vdots\\ \underbrace{b_{0}}_{\bar{B}_{0,MF}}\\ 0\end{bmatrix}}_{B_{MF}}u+\\ &\begin{bmatrix}\bar{E}_{0}\\ 0\end{bmatrix}(-a_{0}x_{1}-\dots-a_{n-1}x_{n}+(b-b_{0})u)\end{aligned} \tag{12}$$

Remark 2.

By incorporating model information, MF-ESO becomes equivalent to MB-ESO.

Remark 3.

The learning component is motivated by the observation that the model information is learnable, which makes it easier to incorporate that information into the observer.

Remark 4.

The learning component can even learn the external disturbance together with the internal disturbance, so that both are incorporated.

Since the learning component has a feedforward estimation $\hat{f}_{L}$ for the total disturbance, the ESO can serve as a feedback correction to estimate the residual total disturbance as $\Delta\hat{f}$. The combination of the feedforward estimation and the feedback correction is realized as follows:

$$
\begin{bmatrix}\dot{\hat{x}}\\ \dot{\Delta\hat{f}}\end{bmatrix}=A\begin{bmatrix}\hat{x}\\ \Delta\hat{f}\end{bmatrix}+Bu+L\left(y-C\begin{bmatrix}\hat{x}\\ \Delta\hat{f}\end{bmatrix}\right)+\begin{bmatrix}\bar{E}_{0}\\ 0\end{bmatrix}\hat{f}_{L}
\tag{13}
$$
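
As a minimal sketch of how (13) might be stepped in discrete time, the following Python snippet applies a forward-Euler update to the L-ESO. The function name `leso_step`, the step size `dt`, the choice $\bar{E}_{0}=[0\ 0\ 0\ 1]^{T}$, the numerical values, and the toy plant with a constant disturbance are all assumptions made for illustration; none of them are taken from the paper.

```python
import numpy as np

def leso_step(z_hat, u, y, f_L, A, B, C, L, E0, dt):
    """One forward-Euler step of (13); z_hat stacks [x_hat, delta_f_hat]."""
    innovation = y - C @ z_hat                      # y - C [x_hat; delta_f_hat]
    z_dot = A @ z_hat + B * u + L * innovation + E0 * f_L
    return z_hat + dt * z_dot

# Assumed example: fourth-order integrator chain (n = 4) with b0 = 1.
n, b0, w_o, dt = 4, 1.0, 10.0, 1e-3
A = np.diag(np.ones(n), k=1)                        # (n+1) x (n+1) model-free A
B = np.zeros(n + 1); B[n - 1] = b0
C = np.zeros(n + 1); C[0] = 1.0
E0 = np.zeros(n + 1); E0[n - 1] = 1.0               # stacked [E0_bar; 0]
L = np.array([5 * w_o, 10 * w_o**2, 10 * w_o**3, 5 * w_o**4, w_o**5])

# Toy plant y^(4) = f + b0*u with constant f; f_L stands in for the learner output.
f_true, f_L, u = 2.0, 1.5, 0.0
x = np.zeros(n)                                     # true plant state
z_hat = np.zeros(n + 1)                             # observer state
for _ in range(3000):
    y = x[0]
    z_hat = leso_step(z_hat, u, y, f_L, A, B, C, L, E0, dt)
    x_dot = np.append(x[1:], f_true + b0 * u)       # integrator chain dynamics
    x = x + dt * x_dot

delta_f_hat = z_hat[-1]                             # residual disturbance estimate
```

In this toy setting the residual estimate settles near `f_true - f_L`, which illustrates the division of labor: the closer the feedforward estimate is to the true disturbance, the less the observer has to correct.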

Since the learning component is expected to capture the unknown dynamics, we employ a model-free ESO, see Fig. 1. The learning block in Fig. 1 is a function $h_{\theta}(x,u)$ parameterized by $\theta$. To learn the total disturbance (see (8)), we establish a mapping from the input ($\hat{x}$ estimated by the ESO and the control input $u$) to the output $\hat{f}$, where $\hat{f}=\hat{f}_{L}+\Delta\hat{f}$. The total disturbance estimation consists of two parts: 1) the feedforward estimation from the learning component, $\hat{f}_{L}=h_{\theta}(\hat{x},u)$; 2) the feedback correction for the residual disturbance $\Delta\hat{f}$ by an MF-ESO. To optimize the parameters of the machine learning model, a general regression problem is formulated using the following cost function:

$$
J(\theta)=\frac{1}{2}\sum_{i=1}^{n}\left(h_{\theta}(\hat{x}^{i},u^{i})-\hat{f}^{i}\right)^{2}
\tag{14}
$$

where $n$ is the size of the training data. The details are given in Alg. 1. While the batch is not yet filled, we run the MF-ESO (see Lines 7-14; the learning component does not yet return optimized parameters).

Input: Control input $u$, system output $y$, learning rate $\alpha$, batch size $n$, maximum running time $N_{max}$
Output: Total disturbance $\hat{f}$
1  Initialize:
2      machine learning input batch $\mathcal{I}^{0}=\emptyset$
3      disturbance estimation by ESO batch $\Delta\mathcal{F}^{0}=\emptyset$
4      machine learning output batch $\mathcal{F_{L}}^{0}=\emptyset$
5      machine learning model parameter $\theta$
6      machine learning output $\hat{f}_{L}^{0}=0$
7  for $i=1$ to $n$ do
8      Get $\hat{x}^{i}$ and $\Delta\hat{f}^{i}$ by running L-ESO  $\triangleright$ see (13)
9      Compute $u^{i}$  $\triangleright$ see (22)
10     $\mathcal{I}^{i}:=[\mathcal{I}^{i-1},[\hat{x}_{1}^{i},\hat{x}_{2}^{i},\dots,\hat{x}_{n}^{i},u^{i},1]^{T}]$
11     $\Delta\mathcal{F}^{i}:=[\Delta\mathcal{F}^{i-1},\Delta\hat{f}^{i}]$
12     $\mathcal{F_{L}}^{i}:=[\mathcal{F_{L}}^{i-1},0]$  $\triangleright$ append data into three batches
13     $\hat{f}_{L}^{i}=0$
14 end for
16 for $i=n$ to $N_{max}$ do
18     Get $\hat{x}^{i}$ and $\Delta\hat{f}^{i}$ by running L-ESO  $\triangleright$ see (13)
19     Update $\mathcal{I}^{i}$  $\triangleright$ pop oldest datum, push new datum
20     Update $\Delta\mathcal{F}^{i}$  $\triangleright$ pop oldest datum, push new datum
21     Update $\theta^{i}$  $\triangleright$ according to (14)
22     $\mathcal{F_{L}}^{i}=h_{\theta^{i}}(\mathcal{I}^{i})$
23     $\hat{f}_{L}^{i}=h_{\theta^{i}}(x^{i})$
24     $\hat{f}^{i}=\hat{f}_{L}^{i}+\Delta\hat{f}^{i}$  $\triangleright$ compute total disturbance
25     Compute $u^{i}$  $\triangleright$ see (22)
26 end for
Algorithm 1 L-ESO
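
The data flow of Alg. 1 (a warm-up phase that fills the batch, followed by a sliding-window update of $\theta$) can be sketched in Python as follows. This is a simplified illustration under strong assumptions, not the authors' implementation: the observer is idealized so that its residual equals the true disturbance minus the learned estimate, the regressors are random stand-ins for $[\hat{x}, u, 1]$, the true disturbance is linear in those regressors, and one gradient step on (14) is taken per sample. The function name `run_l_eso_sketch` and all numerical values are hypothetical.

```python
from collections import deque
import numpy as np

def run_l_eso_sketch(w_true, n=50, n_max=500, alpha=0.01, seed=0):
    """Warm-up plus sliding-window learning, with an idealized observer."""
    rng = np.random.default_rng(seed)
    dim = len(w_true)
    theta = np.zeros(dim)                  # all-zero initialization of theta
    inputs = deque(maxlen=n)               # input batch: rows [x_hat..., u, 1]
    targets = deque(maxlen=n)              # batch of total estimates f_hat
    residuals = []
    for i in range(1, n_max + 1):
        # Stand-in for the ESO states and control input: random features plus bias.
        z = np.append(rng.uniform(-1.0, 1.0, dim - 1), 1.0)
        f_true = float(w_true @ z)         # assumed linear true disturbance
        # Feedforward estimate: zero until the batch has been filled once.
        f_L = float(theta @ z) if len(inputs) == n else 0.0
        delta_f = f_true - f_L             # idealized ESO residual (assumption)
        f_hat = f_L + delta_f              # total disturbance estimate
        inputs.append(z)                   # a full deque drops its oldest datum
        targets.append(f_hat)
        if len(inputs) == n:
            Z, t = np.array(inputs), np.array(targets)
            theta = theta - alpha * Z.T @ (Z @ theta - t)   # gradient step on (14)
        residuals.append(abs(delta_f))
    return theta, residuals

w_true = np.array([0.5, -1.0, 2.0, 0.3])   # hypothetical disturbance weights
theta, residuals = run_l_eso_sketch(w_true)
```

During warm-up the observer carries the whole disturbance, so the recorded residuals equal the disturbance magnitude; once learning starts, the residual left to the observer shrinks as `theta` approaches `w_true`.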

Our framework is modular. The ESO follows the conventional model-free design, and we only need its estimates to drive the training of the learning component. First, the learning component can serve as an add-on to an existing ESO-based control architecture by adding just a few connections. Second, the learning component is flexible: users can customize it by choosing an appropriate machine learning model, e.g., linear or nonlinear, parametric or non-parametric.

IV Simulation Results

IV-A Two-Mass-Spring Problem Formulation

Fig. 2 depicts a schematic of a two-mass-spring system, taken from a well-known benchmark control problem [20]. The system includes two masses, $m_{1}$ and $m_{2}$, which slide freely over a horizontal surface without friction. Note that the frictionless setting has been shown to be more challenging for controller design [9]. The masses are connected by a light horizontal spring with spring constant $k$. The system is subject to two external disturbance forces $w_{1}$ and $w_{2}$, which act on masses $m_{1}$ and $m_{2}$, respectively. The control signal $u$ is the force applied to mass $m_{1}$. The positions of both masses are measured, and either one can be used as the output to be controlled.

The states of the two-mass-spring system are defined as the displacements and velocities of the two masses. Specifically, the displacement and velocity of mass $m_{1}$ are $x_{1}$ and $x_{3}$, respectively, while the displacement and velocity of mass $m_{2}$ are $x_{2}$ and $x_{4}$, respectively. The dynamics of the system can be represented in the following state-space form:

$$
\begin{aligned}
\begin{bmatrix}\dot{x}_{1}\\ \dot{x}_{2}\\ \dot{x}_{3}\\ \dot{x}_{4}\end{bmatrix}&=\begin{bmatrix}0&0&1&0\\ 0&0&0&1\\ -\frac{k}{m_{1}}&\frac{k}{m_{1}}&0&0\\ \frac{k}{m_{2}}&-\frac{k}{m_{2}}&0&0\end{bmatrix}\begin{bmatrix}x_{1}\\ x_{2}\\ x_{3}\\ x_{4}\end{bmatrix}\\
&\quad+\begin{bmatrix}0\\ 0\\ \frac{1}{m_{1}}\\ 0\end{bmatrix}(u+w_{1})+\begin{bmatrix}0\\ 0\\ 0\\ \frac{1}{m_{2}}\end{bmatrix}w_{2}\\
y&=\begin{bmatrix}c_{1}&c_{2}&0&0\end{bmatrix}\begin{bmatrix}x_{1}&x_{2}&x_{3}&x_{4}\end{bmatrix}^{T}
\end{aligned}
\tag{15}
$$
Figure 2: Two-mass-spring system with uncertain parameters
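
To make the resonance of this plant concrete, the short Python sketch below builds the matrices of (15) and compares the oscillatory eigenvalues of the state matrix with $\sqrt{k(m_{1}+m_{2})/(m_{1}m_{2})}$. The helper name `two_mass_spring` and the unit parameter values are assumptions chosen for illustration and are not the values used in the experiments.

```python
import numpy as np

def two_mass_spring(m1, m2, k):
    """State-space matrices of (15) with state [x1, x2, x3, x4]."""
    A = np.array([[0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [-k / m1, k / m1, 0.0, 0.0],
                  [k / m2, -k / m2, 0.0, 0.0]])
    B_u = np.array([0.0, 0.0, 1.0 / m1, 0.0])     # channel of u + w1
    B_w2 = np.array([0.0, 0.0, 0.0, 1.0 / m2])    # channel of w2
    return A, B_u, B_w2

# Assumed illustrative parameters (not the paper's values).
m1, m2, k = 1.0, 1.0, 1.0
A, B_u, B_w2 = two_mass_spring(m1, m2, k)

eigs = np.linalg.eigvals(A)                        # two rigid-body and two flexible modes
w_res = np.sqrt(k * (m1 + m2) / (m1 * m2))         # expected resonance in rad/s
```

The undamped pair of imaginary eigenvalues at $\pm j\,$`w_res` is the flexible mode responsible for the vibration; the same quantity appears, squared and with a negative sign, as the coefficient of $\ddot{y}$ in the input-output model of the plant.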

A time-varying unknown external disturbance $w_{2}$ acts on mass $m_{2}$, while the control is applied to $m_{1}$ so that $x_{2}$ tracks a desired trajectory. For the output $y$, i.e., $x_{2}$, a chained integrator system is derived by differentiating the output four times. The input and the disturbance appear in the last channel of this fourth-order system, with $b=\frac{k}{m_{1}m_{2}}$:

$$
y^{(4)}=-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}\ddot{y}+\frac{k}{m_{1}m_{2}}w_{2}+\frac{1}{m_{2}}\ddot{w}_{2}+bu
\tag{16}
$$
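
The steps leading to (16) can be sketched as follows, assuming $y=x_{2}$ (i.e., $c_{1}=0$, $c_{2}=1$) and $w_{1}=0$, since only $w_{2}$ is considered here. The fourth row of (15) gives the spring deflection in terms of $\ddot{y}$, which is then substituted into the third row:

$$
\begin{aligned}
\ddot{y}&=\frac{k}{m_{2}}(x_{1}-x_{2})+\frac{1}{m_{2}}w_{2}
\;\Rightarrow\; x_{1}-x_{2}=\frac{m_{2}\ddot{y}-w_{2}}{k},\\
\ddot{x}_{1}&=-\frac{k}{m_{1}}(x_{1}-x_{2})+\frac{1}{m_{1}}u
=-\frac{m_{2}}{m_{1}}\ddot{y}+\frac{1}{m_{1}}w_{2}+\frac{1}{m_{1}}u,\\
y^{(4)}&=\frac{k}{m_{2}}\left(\ddot{x}_{1}-\ddot{y}\right)+\frac{1}{m_{2}}\ddot{w}_{2}
=-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}\ddot{y}+\frac{k}{m_{1}m_{2}}w_{2}+\frac{1}{m_{2}}\ddot{w}_{2}+\frac{k}{m_{1}m_{2}}u.
\end{aligned}
$$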

IV-B ESO design

The states in the system are:

$$
x=\begin{bmatrix}y&\dot{y}&\ddot{y}&\dddot{y}\end{bmatrix}^{T}
\tag{17}
$$

The state-space description of the system is

$$
\begin{cases}
\begin{bmatrix}\dot{x}\\ \dot{f}\end{bmatrix}=A\begin{bmatrix}x\\ f\end{bmatrix}+Bu+E\dot{f}\\
y=Cx
\end{cases}
\tag{18}
$$

IV-B1 Model-free ESO

The state-space model is:

$A_{MF}=\begin{bmatrix}0&1&0&0&0\\ 0&0&1&0&0\\ 0&0&0&1&0\\ 0&0&0&0&1\\ 0&0&0&0&0\end{bmatrix}$, $B=\begin{bmatrix}0\\ 0\\ 0\\ b_{0}\\ 0\end{bmatrix}$, $C=\begin{bmatrix}1&0&0&0&0\end{bmatrix}$, $E=\begin{bmatrix}0&0&0&0&1\end{bmatrix}^{T}$. As we can see, the model-free design assumes unknown dynamics, such that the total disturbance $f$ can be represented as:

$$
f=-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}\ddot{y}+\frac{k}{m_{1}m_{2}}w_{2}+\frac{1}{m_{2}}\ddot{w}_{2}+(b-b_{0})u
\tag{19}
$$

where $-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}$ is the model parameter information and $b_{0}$ is the nominal control gain. We have

$$
y^{(4)}=f+b_{0}u
\tag{20}
$$

where everything besides $b_{0}u$ is considered the total disturbance (see (16)). It can be verified that such a system satisfies Assumptions 1, 2, and 3. Therefore, an ESO can be designed to estimate $f$; see (10).

The observer gain is chosen such that all eigenvalues of $A_{MF}-LC$ are placed at $-\omega_{o}$ [19], i.e., $L_{MF}=\begin{bmatrix}5\omega_{o}&10\omega_{o}^{2}&10\omega_{o}^{3}&5\omega_{o}^{4}&\omega_{o}^{5}\end{bmatrix}^{T}$.
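
The model-free gain above consists of the binomial coefficients of $(s+\omega_{o})^{5}$, which can be checked numerically. The sketch below is illustrative only: the helper `mf_eso_gain` and the bandwidth value are assumptions. It compares characteristic polynomials rather than eigenvalues, because a fivefold repeated eigenvalue is numerically sensitive.

```python
import numpy as np
from math import comb

def mf_eso_gain(order, w_o):
    """Gains placing all eigenvalues of A_MF - L C at -w_o (binomial pattern)."""
    return np.array([comb(order, i) * w_o**i for i in range(1, order + 1)],
                    dtype=float)

w_o = 2.0                                        # assumed observer bandwidth
A_MF = np.diag(np.ones(4), k=1)                  # 5 x 5 integrator chain
C = np.array([[1.0, 0.0, 0.0, 0.0, 0.0]])
L_MF = mf_eso_gain(5, w_o)

# Coefficients of det(sI - (A_MF - L C)), highest power first.
char_poly = np.poly(A_MF - L_MF[:, None] @ C)
# Coefficients of (s + w_o)^5 for comparison.
target_poly = np.poly(-w_o * np.ones(5))
```

With this parameterization a single scalar, the observer bandwidth $\omega_{o}$, tunes all five gains at once.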

IV-B2 Model-based ESO

The model-based design has the following state-space representation:

$A_{MB}=\begin{bmatrix}0&1&0&0&0\\ 0&0&1&0&0\\ 0&0&0&1&0\\ 0&0&-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}&0&1\\ 0&0&0&0&0\end{bmatrix}$, $B=\begin{bmatrix}0\\ 0\\ 0\\ b_{0}\\ 0\end{bmatrix}$, $C=\begin{bmatrix}1&0&0&0&0\end{bmatrix}$, $E=\begin{bmatrix}0&0&0&0&1\end{bmatrix}^{T}$. In contrast to the above-mentioned model-free design, such a system tries to leverage the prior knowledge of the dynamic model by assuming that $-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}$ is known (see (16)). In this case, the total disturbance becomes:

$$
f=\frac{k}{m_{1}m_{2}}w_{2}+\frac{1}{m_{2}}\ddot{w}_{2}+(b-b_{0})u
\tag{21}
$$

such that $y^{(4)}=-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}\ddot{y}+f+b_{0}u$.

The observer gain is chosen such that all eigenvalues of $A_{MB}-LC$ are placed at $-\omega_{o}$ [19]. Letting $a=-k\frac{m_{1}+m_{2}}{m_{1}m_{2}}$, the coefficients of $L_{MB}$ are listed in Table I.

| Parameter | Value |
| --- | --- |
| $L_{MB,1}$ | $5\omega_o$ |
| $L_{MB,2}$ | $a + 10\omega_o^2$ |
| $L_{MB,3}$ | $5a\omega_o + 10\omega_o^3$ |
| $L_{MB,4}$ | $a^2 + 10a\omega_o^2 + 5\omega_o^4$ |
| $L_{MB,5}$ | $5a^2\omega_o + 10a\omega_o^3 + \omega_o^5$ |

TABLE I: Coefficients of $L_{MB}$
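As a quick numerical illustration, the Table I expressions can be evaluated for the benchmark values used in the simulation section ($m_1 = m_2 = 1$ kg, $k = 1$ N/m, $\omega_o = 10$ rad/s). The sketch below is only a direct transcription of the table entries; the helper name `mb_eso_gains` is hypothetical and does not come from the original implementation.

```python
def mb_eso_gains(m1: float, m2: float, k: float, omega_o: float) -> list:
    """Evaluate the Table I expressions for L_MB (hypothetical helper name)."""
    a = -k * (m1 + m2) / (m1 * m2)  # a as defined just before Table I
    w = omega_o
    return [
        5 * w,                                  # L_MB,1
        a + 10 * w**2,                          # L_MB,2
        5 * a * w + 10 * w**3,                  # L_MB,3
        a**2 + 10 * a * w**2 + 5 * w**4,        # L_MB,4
        5 * a**2 * w + 10 * a * w**3 + w**5,    # L_MB,5
    ]


# Benchmark values from the simulation section.
gains = mb_eso_gains(m1=1.0, m2=1.0, k=1.0, omega_o=10.0)
print(gains)
```

Because every gain is a polynomial in $\omega_o$, a single bandwidth parameter scales the whole observer, which is the tuning convenience of the bandwidth parameterization in [19].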

IV-B3 L-ESO

As shown in (19), the internal disturbance is a linearly structured mapping from the input (state and control) to the output (disturbance). Therefore, a linear regression model is a reasonable choice for the learning component, with $h_\theta(\cdot) = \theta^T \begin{bmatrix}\hat{x}_1 & \hat{x}_2 & \hat{x}_3 & \hat{x}_4 & u & 1\end{bmatrix}^T$. As mentioned before, the learning model can be linear or nonlinear, parametric or non-parametric. Our contribution lies not in the complexity of the learning model but in the novel design that seamlessly combines machine learning models with an ESO. A batch gradient descent method is used to optimize the cost function. In our experiments, we initialize $\theta$ with all zeros.
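A minimal sketch of such a learner is given below. It assumes a mean-squared-error cost between $h_\theta$ and a disturbance target supplied by the observer; the exact cost function, the batch construction, and the names `features` and `batch_gd_step` are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def features(x_hat, u):
    """Regressor [x1_hat, x2_hat, x3_hat, x4_hat, u, 1] for one sample."""
    return np.concatenate([x_hat, [u, 1.0]])


def batch_gd_step(theta, Z, f_target, lr):
    """One gradient-descent step on the batch mean squared error.

    Z has shape (N, 6) and f_target has shape (N,). The MSE cost is an
    assumption; the paper's cost function is defined elsewhere.
    """
    residual = Z @ theta - f_target
    grad = Z.T @ residual / len(f_target)
    return theta - lr * grad


# Synthetic data: a disturbance that is exactly linear in the regressor.
rng = np.random.default_rng(0)
N = 200
x_hat = rng.normal(size=(N, 4))
u = rng.normal(size=N)
Z = np.stack([features(x_hat[i], u[i]) for i in range(N)])
theta_true = np.array([0.5, -1.0, 2.0, 0.0, 1.0, 0.3])
f_target = Z @ theta_true

theta = np.zeros(6)  # all-zero initialization, as in the experiments
for _ in range(2000):
    theta = batch_gd_step(theta, Z, f_target, lr=0.1)
```

On this noise-free synthetic data the estimate converges to `theta_true`. With noisy targets, a larger batch averages the gradient over more samples, which is one way to trade adaptation speed for smoothness.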

IV-C Controller Design

The control law for the system (20) can be designed as:

$$u = \frac{-\hat{f} + u_0}{b_0} \tag{22}$$

such that

$$y^{(4)} = u_0 \tag{23}$$

It can be controlled by a state feedback controller

$$u_0 = -K\hat{x} = k_1(r - \hat{x}_1) - k_2\hat{x}_2 - k_3\hat{x}_3 - k_4\hat{x}_4 \tag{24}$$

with a control gain $K = \begin{bmatrix}\omega_c^4 & 4\omega_c^3 & 6\omega_c^2 & 4\omega_c\end{bmatrix}$, where $\omega_c$ is the closed-loop natural frequency [19].
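The entries of $K$ coincide with the coefficients of $(s + \omega_c)^4$, so, when the estimates are exact, the chain of integrators (23) under (24) has all four closed-loop poles at $-\omega_c$. The sketch below illustrates this together with the control law (22); the function names are hypothetical, and `b0` stands for the gain $b_0$ in (22).

```python
import numpy as np


def controller_gains(omega_c: float) -> np.ndarray:
    """Return [k1, k2, k3, k4] matching the coefficients of (s + omega_c)^4."""
    coeffs = np.poly([-omega_c] * 4)  # [1, 4wc, 6wc^2, 4wc^3, wc^4]
    return coeffs[:0:-1]              # drop the leading 1 and reverse


def control_input(x_hat, f_hat, r, omega_c, b0):
    """Control law (22) with the state feedback (24)."""
    k1, k2, k3, k4 = controller_gains(omega_c)
    u0 = k1 * (r - x_hat[0]) - k2 * x_hat[1] - k3 * x_hat[2] - k4 * x_hat[3]
    return (-f_hat + u0) / b0


# At rest with a unit reference and no estimated disturbance.
u = control_input(x_hat=[0.0, 0.0, 0.0, 0.0], f_hat=0.0, r=1.0,
                  omega_c=1.0, b0=1.0)
print(controller_gains(1.0), u)
```

As with the observer, a single parameter $\omega_c$ sets all four feedback gains.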

IV-D Simulation Results

The system parameters are taken from the benchmark problem [20], i.e., $m_1 = m_2 = 1$ kg, $k = 1$ N/m, $c_1 = 0$, $c_2 = 1$. The control objective is to track a desired trajectory for the position of mass $m_2$. A sinusoidal wave with a frequency of 1 rad/s and an amplitude of 1 is applied in the training phase for L-ESO. After 110 seconds, a step reference is given to all three approaches. Band-limited white noise with noise power $10^{-12}$ is added at the system output. A sinusoidal external disturbance with frequency $\pi/10$ rad/s is applied to $m_2$ as $w_2$ starting at 150 s. The learning algorithm runs online. The learning phase is designed to emulate the typical operating scenarios of the machine under general conditions, whereas the step response is used to assess and compare the tracking performance. All the control parameters are set identically for a fair comparison.

The controller bandwidth $\omega_c$ and the observer bandwidth $\omega_o$ are set to 1 rad/s and 10 rad/s, respectively. The control gain is set to 1. All three approaches share these settings for a fair comparison.

Figure 3: Tracking performance of MB-ESO, MF-ESO, and L-ESO, plotted from 120 s.
Figure 4: Control signals of MB-ESO, MF-ESO, and L-ESO, plotted from 120 s.

The tracking performance and the control input are shown in Fig. 3 and Fig. 4, respectively.

  1. MB-ESO and L-ESO perform similarly in step reference tracking after the training phase (see the position of $m_2$ in the zoom-in plot from 126 s to 134 s, Fig. 3). Both outperform MF-ESO in terms of overshoot percentage (0 vs. 5‰) and settling time (12 s vs. 16 s).

  2. For external disturbance rejection (see the zoom-in plot from 170 s to 195 s, Fig. 3), L-ESO performs best. Revisiting (8), if the external disturbance has a linear component, a linear regression model can still capture it, e.g., the upward and downward trends of a sinusoidal external disturbance.

  3. Adding external disturbance information to the observer can help reduce the required bandwidth. In our experiments, we found that MF-ESO and MB-ESO need three times more bandwidth to achieve the same performance as L-ESO.

  4. The control input of L-ESO fluctuates more than those of MF-ESO and MB-ESO, as shown in Fig. 4. This is caused by the noise signal and by the batch gradient descent method chosen to minimize the cost function. In this example, it can be smoothed by increasing the batch size.
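The overshoot and settling-time figures quoted in the first observation can be extracted from a logged step response along the following lines. This is only a sketch: the 2% settling band, the assumption of a positive step, and the helper name `step_metrics` are illustrative choices, not necessarily the definitions used to produce the reported numbers.

```python
import numpy as np


def step_metrics(t, y, y_final, band=0.02):
    """Percent overshoot and settling time of a positive step response.

    band is the settling tolerance as a fraction of y_final (assumed 2%).
    The settling time is measured on the same time axis as t.
    """
    overshoot = max(0.0, (np.max(y) - y_final) / abs(y_final)) * 100.0
    outside = np.abs(y - y_final) > band * abs(y_final)
    if not outside.any():
        return overshoot, t[0]
    last_outside = np.nonzero(outside)[0][-1]
    if last_outside == len(t) - 1:
        return overshoot, float("nan")  # never settles within the record
    return overshoot, t[last_outside + 1]


# Synthetic first-order response, used only to exercise the function.
t = np.linspace(0.0, 10.0, 1001)
y = 1.0 - np.exp(-t)
overshoot, t_settle = step_metrics(t, y, y_final=1.0)
print(overshoot, t_settle)
```

For the comparison in Fig. 3, the time axis would first be shifted so that zero corresponds to the instant the step reference is applied.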

V Hardware Experiment Results

We conduct physical experiments on our ECP Model 205 torsional testbed [21]; see Fig. 5. It is a mechanical system consisting of a flexible vertical shaft that connects two disks, a lower disk and an upper disk. Each disk is equipped with an encoder for position measurement. A DC servo motor drives the lower disk through a belt and pulley system, which provides a 3:1 speed reduction ratio. The system can be used to study the vibration of a torsional two-mass-spring system.

Figure 5: ECP Model 205 torsional testbed

A personal computer with MATLAB® Simulink Desktop Real-Time™ installed is used for computation. The computer is also equipped with a four-channel quadrature encoder input card (NI-PCI6601) and a multi-function analog and digital I/O card (NI-PCI6221). These cards interface with the Model 205 torsional plant for real-time data acquisition and control. The quadrature encoder input card enables the computer to receive position and velocity data from the encoders on the disks of the plant. The multi-function analog and digital I/O card allows the computer to send control signals to the DC servo motor that drives the lower disk.

V-A System Model

Since MB-ESO, as a baseline approach, needs the dynamics information, we first use the MATLAB® System Identification Toolbox to obtain the transfer function: $G(s) = \frac{4.6\times 10^{4}}{s^4 + 1.901s^3 + 1683s^2 + 1812s + 0.1032}$.
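As a quick sanity check on the identified model, its poles can be computed directly from the denominator coefficients. The sketch below uses only the numbers in $G(s)$ above; the variable names are illustrative.

```python
import numpy as np

# Denominator of the identified G(s), highest power of s first.
den = [1.0, 1.901, 1683.0, 1812.0, 0.1032]
poles = np.roots(den)

# Largest imaginary part, i.e. the frequency of the oscillatory pole pair (rad/s).
resonance = np.max(np.abs(poles.imag))

for p in poles:
    print(f"{p.real:+.4e} {p.imag:+.4e}j")
print("oscillatory mode near", resonance, "rad/s")
```

Inspecting the complex-conjugate pair gives a feel for how lightly damped the torsional mode of the identified model is, which is the resonance that the observer and controller have to suppress.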

V-B ESO and Controller Design

As this testbed is again a fourth-order dynamic system, the same ESO design pipeline shown before can be applied.

V-C Experiment Results

The control objective is to track a desired trajectory for the upper disk. A sinusoidal wave with a frequency of $\pi/2$ rad/s and an amplitude of $0.5\pi$ is applied in the training phase of L-ESO. $\omega_c$ and $\omega_o$ are set to 90 rad/s and 40 rad/s, respectively. The control gain is $5.5\times 10^{4}$. A trapezoidal profile reference with a final value of $\pi$ is used.

Figure 6: Upper disk position tracking for MB-ESO, MF-ESO, and L-ESO.
Figure 7: Control signals of MB-ESO, MF-ESO, and L-ESO.

From the results illustrated in Fig. 6 and Fig. 7, we make the following observations: 1) After the training phase, L-ESO has the best performance among all the methods in terms of overshoot percentage and settling time. L-ESO may outperform MB-ESO either because the system identification is imperfect or because our approach can learn external as well as internal disturbances. 2) The fluctuation of the L-ESO control input lies between those of MF-ESO and MB-ESO, as shown in Fig. 7, which differs from the simulation result. This is because the learning rate is chosen conservatively due to the large noise in the hardware. In addition, the trapezoidal profile reference is smoother than the step reference, which is beneficial for learning.

VI CONCLUSIONS

This paper proposes a novel learning-enabled extended state observer (L-ESO) with the capacity to memorize and generalize from past estimated disturbances. The machine learning model is seamlessly integrated into the existing disturbance rejection control architecture as a flexible add-on that boosts robustness against unknown and time-varying disturbances. Compared with existing learning-for-control frameworks, our new paradigm does not rely on access to full states. In addition, the learning is guarded by disturbance rejection, which provides an extra assurance layer to compensate for the imperfections of the machine learning model. The efficacy of the proposed approach has been supported by simulation and hardware experiments. In the future, we will further validate the approach on real robotic testbeds.

References

  • [1] Y. Hori, H. Iseki, and K. Sugiura, “Basic consideration of vibration suppression and disturbance rejection control of multi-inertia system using SFLAC (state feedback and load acceleration control),” IEEE Transactions on Industry Applications, vol. 30, no. 4, pp. 889–896, 1994.
  • [2] S. Zhao and Z. Gao, “An active disturbance rejection based approach to vibration suppression in two-inertia systems,” Asian Journal of Control, vol. 15, no. 2, pp. 350–362, 2013.
  • [3] Y. Wang, L. Dong, Z. Chen, M. Sun, and X. Long, “Integrated skyhook vibration reduction control with active disturbance rejection decoupling for automotive semi-active suspension systems,” Nonlinear Dynamics, pp. 1–16, 2024.
  • [4] J. Chen, Y. Hu, and Z. Gao, “On practical solutions of series elastic actuator control in the context of active disturbance rejection,” Advanced Control for Applications: Engineering and Industrial Systems, vol. 3, no. 2, p. e69, 2021.
  • [5] Q. Zheng, Z. **, S. Soares, Y. Hu, and Z. Gao, “An active disturbance rejection control approach to fan control in servers,” in 2018 IEEE Conference on Control Technology and Applications (CCTA).   IEEE, 2018, pp. 294–299.
  • [6] J. Han, “From PID to active disturbance rejection control,” IEEE Transactions on Industrial Electronics, vol. 56, no. 3, pp. 900–906, 2009.
  • [7] R. Cui, L. Chen, C. Yang, and M. Chen, “Extended state observer-based integral sliding mode control for an underwater robot with unknown disturbances and uncertain nonlinearities,” IEEE Transactions on Industrial Electronics, vol. 64, no. 8, pp. 6785–6795, 2017.
  • [8] H. Zhang, Y. Li, Z. Li, C. Zhao, F. Gao, F. Xu, and P. Wang, “Extended-state-observer based model predictive control of a hybrid modular DC transformer,” IEEE Transactions on Industrial Electronics, vol. 69, no. 2, pp. 1561–1572, 2021.
  • [9] H. Zhang, S. Zhao, and Z. Gao, “An active disturbance rejection control solution for the two-mass-spring benchmark problem,” in 2016 American Control Conference (ACC).   IEEE, 2016, pp. 1566–1571.
  • [10] C. Fu and W. Tan, “Tuning of linear ADRC with known plant information,” ISA Transactions, vol. 65, pp. 384–393, 2016.
  • [11] Y. Hui, R. Chi, B. Huang, and Z. Hou, “Extended state observer-based data-driven iterative learning control for permanent magnet linear motor with initial shifts and disturbances,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 3, pp. 1881–1891, 2021.
  • [12] J. Wang, D. Huang, S. Fang, Y. Wang, and W. Xu, “Model predictive control for ARC motors using extended state observer and iterative learning methods,” IEEE Transactions on Energy Conversion, vol. 37, no. 3, pp. 2217–2226, 2022.
  • [13] J. Zhang and D. Meng, “Improving tracking accuracy for repetitive learning systems by high-order extended state observers,” IEEE Transactions on Neural Networks and Learning Systems, 2022.
  • [14] P. Kicki, K. Łakomy, and K. M. B. Lee, “Tuning of extended state observer with neural network-based control performance assessment,” European Journal of Control, vol. 64, p. 100609, 2022.
  • [15] G. Shi, X. Shi, M. O’Connell, R. Yu, K. Azizzadenesheli, A. Anandkumar, Y. Yue, and S.-J. Chung, “Neural lander: Stable drone landing control using learned dynamics,” in 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 9784–9790.
  • [16] B. Guo and Z. Zhao, “On the convergence of an extended state observer for nonlinear systems with uncertainty,” Systems & Control Letters, vol. 60, no. 6, pp. 420–430, 2011.
  • [17] W. Bai, S. Chen, Y. Huang, B. Guo, and Z. Wu, “Observers and observability for uncertain nonlinear systems: A necessary and sufficient condition,” International Journal of Robust and Nonlinear Control, vol. 29, no. 10, pp. 2960–2977, 2019.
  • [18] J. Chen, Z. Gao, Y. Hu, and S. Shao, “A general model-based extended state observer with built-in zero dynamics,” arXiv preprint arXiv:2208.12314, 2023.
  • [19] Z. Gao, “Scaling and bandwidth-parameterization based controller tuning,” in Proceedings of the 2003 American Control Conference, 2003.   IEEE, 2003, pp. 4989–4996.
  • [20] B. Wie and D. S. Bernstein, “Benchmark problems for robust control design,” Journal of Guidance, Control, and Dynamics, vol. 15, no. 5, pp. 1057–1059, 1992.
  • [21] ECP Systems, “Model 205 torsional plant,” http://www.ecpsystems.com/controls_torplant.htm [Accessed: 3-23-2024].