I Introduction
Controlling an unstable plant over a noisy communication channel is a hurdle for emerging technologies such as autonomous vehicles, Internet of Things devices, and remote surgery systems. This problem setting deviates from Shannon’s communication problem in two ways that make it more challenging [2]. First, in the control setting, the data to be transmitted correspond to physical measurements and arrive in a streaming fashion instead of being made available in their entirety before transmission. Thus, we must design causal encoders and decoders for this task. Second, typical control systems are unstable, and their stabilization requires near-instantaneous and accurate estimates of the plant’s state to produce effective control actions. Consequently, codes must be low-delay yet highly reliable to perform control tasks over communication channels. We employ a class of low-delay joint source-channel codes to address both objectives.
In a seminal paper, Sahai and Mitter [3] proved that Shannon’s channel capacity is an insufficient characterization of channel quality when the goal is to stabilize a system over a channel. They introduced the notion of anytime capacity, which is in general upper-bounded by channel capacity, as an alternative measure. While the anytime capacity of a channel provides a useful converse on the channel quality required to stabilize a specific system, achievability schemes remain open in general. Tree codes, such as those studied by Schulman [4], achieve error probabilities that decay exponentially with the delay since a source symbol was emitted. While tree codes exist for a large class of discrete channels, they are only known to be efficiently decodable in limited settings [5].
A noiseless feedback channel connecting the decoder back to the encoder does not improve the Shannon capacity of the channel [6], but feedback can significantly simplify code design and improve the reliability-delay trade-offs for communication [7, 8]. Noiseless feedback channels are reasonable assumptions when the receiver has access to more power than the transmitter, which is often the case in control systems since the controller must provide essentially noiseless control inputs. Coding for bit-streaming sources over discrete channels with feedback, which is relevant to control systems where the state has been quantized for digital transmission, has been studied in [9, 5, 10]. This paper considers a setting where measurement and coding are analog operations applied in discrete time.
Consider the problem of estimating a vector-valued plant, modeled as a Gauss-Markov source, over a multiple-input multiple-output (MIMO) additive white Gaussian noise (AWGN) channel with feedback. The causal rate-distortion function [11] provides a lower bound to the channel capacity necessary for causally estimating the source subject to a given distortion over this channel [12]. The causal rate-distortion functions for both scalar and vector Gauss-Markov sources have been studied in [11] and [13] respectively. The lower bound to channel capacity provided by the causal rate-distortion function is known to be tight only when the source is matched to the channel at hand [14]. For example, a scalar Gauss-Markov source is matched to the scalar AWGN channel [11, 3].
When the criterion is finite MSE, a known converse result is that the Shannon capacity should be greater than the sum of logs of unstable eigenvalues of the source [15, Thm. 4.1]. In the case of a scalar source and a scalar channel, this bound is tight and can be achieved by a linear innovations’ encoder [16, 3, 17, 18]. For a vector source and parallel Gaussian channels with independent power constraints, [19] also proposes a periodic linear scheme leading to sufficient conditions for achieving finite MSE. The case of a vector source with scalar channel was studied in [1], which showed that the Shannon capacity remains a necessary and sufficient measure even though the source and channel dimensions are not matched.
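The converse above can be checked numerically for a given source. The following is a minimal sketch, assuming a scalar real AWGN channel with unit noise variance and a hypothetical list of source eigenvalues; the function names are illustrative and do not come from the cited works.

```python
import math

def required_rate_bits(eigenvalues):
    """Sum of log2|lambda| over the unstable eigenvalues (|lambda| > 1)."""
    return sum(math.log2(abs(lam)) for lam in eigenvalues if abs(lam) > 1.0)

def awgn_capacity_bits(power, noise_var=1.0):
    """Shannon capacity of a scalar real AWGN channel, in bits per channel use."""
    return 0.5 * math.log2(1.0 + power / noise_var)

eigs = [1.5, 1.2, 0.8]   # hypothetical eigenvalues; 0.8 is stable and does not count
need = required_rate_bits(eigs)
for power in (1.0, 3.0):
    cap = awgn_capacity_bits(power)
    print(f"P={power}: capacity={cap:.3f} bits, required>{need:.3f} bits, "
          f"converse satisfied: {cap > need}")
```

With these assumed values, only the larger power level clears the necessary condition; clearing it does not by itself guarantee that a particular code achieves finite MSE.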
This paper considers the general case of vector source and MIMO channels. We first focus on the fundamental limits of achieving finite MSE using linear time-invariant codes. The innovations’ encoder that generates channel inputs as a function of the source estimation error (at the decoder) is optimal for this general problem [1]. The sequential encoder’s structure implies that the optimal decoder is a Kalman filter and its MSE can be analyzed with linear estimation theory.
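To make the innovations-encoder idea concrete, the sketch below tracks the prediction-error variance in the scalar special case only. It assumes a scalar source with gain `a` and process-noise variance `q`, and a scalar AWGN channel with signal-to-noise ratio `snr`; these names and values are hypothetical. At each step the error is rescaled to the channel power, the MMSE update shrinks its variance by 1/(1 + snr), and the source dynamics then multiply it by a² and add q.

```python
def mse_recursion(a, q, snr, p0=1.0, steps=200):
    """Prediction-error variance of a scalar innovations encoder over AWGN."""
    p = p0
    history = [p]
    for _ in range(steps):
        # MMSE update over the channel, followed by one step of source dynamics.
        p = a * a * p / (1.0 + snr) + q
        history.append(p)
    return history

a, q = 1.5, 1.0                              # hypothetical unstable scalar source
stable = mse_recursion(a, q, snr=3.0)        # a^2 = 2.25 < 1 + snr = 4
unstable = mse_recursion(a, q, snr=1.0)      # a^2 = 2.25 > 1 + snr = 2
limit = q / (1.0 - a * a / (1.0 + 3.0))      # fixed point when a^2 < 1 + snr
print(stable[-1], limit, unstable[-1])
```

The recursion is bounded exactly when a² < 1 + snr, which is the scalar form of the capacity condition log a < ½ log(1 + snr).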
Our first result is a sufficient condition to achieve finite MSE by partitioning the vector source across different sub-channels. The analysis is carried out by showing an equivalence between achieving finite MSE and the existence of a stabilizing solution to a discrete algebraic Riccati equation (DARE). The sufficient condition (achievability) is then shown to be necessary in two cases, including that of a scalar source and a MIMO channel. In particular, it is shown that allocating the entire power to the best sub-channel is optimal, while typical water-filling solutions that distribute the power among the sub-channels are sub-optimal. Indeed, this example reveals that the Shannon channel capacity is not the figure of merit if the objective is finite MSE with linear codes. Motivated by this result, we define the linear stabilizing capacity (LSC) as an optimization problem, which is in general a lower bound to the channel capacity. Finite MSE is achievable using linear codes if and only if there exists a feasible solution. The optimization of the LSC is non-convex, but we are able to use it to show that linear codes are not optimal by comparing the LSC with rates that can be achieved using non-linear Shannon-Kotel’nikov mappings for a specific source-channel pair. The general case of our problem remains open but, based on numerical observation, we conjecture that the partitioning property is necessary. The equivalence to DARE feasibility allows us to extract an algebraic condition under which partitioning schemes achieve the fundamental limits.
The paper is organized as follows. Section II specifies the source and channel models and defines zero-delay joint source-channel codes with an MSE performance criterion. Section III presents an optimal linear code structure and applies it to the MIMO channel setting. It also defines the linear stabilizing capacity. Section IV presents our main contributions on the sufficient and necessary conditions for finite estimation error of a vector source over a MIMO Gaussian channel using linear codes and demonstrates that, in general, linear coding is not optimal.
Appendix A Proof of Theorem 1 and Conjecture 3
We start with the DARE (2) and power constraint (16) and show that a sequence of statements are equivalent.
First, we apply the transformation
(32)
to obtain a standard DARE.
Statement 1. A finite MSE is achievable if and only if there exists a satisfying (2) and (16).
(33)
(34)
We will define as a redundant intermediate variable, resulting in Statement 2.
Statement 2. A finite MSE is achievable if and only if there exists a satisfying
(35)
(36)
(37)
We equivalently write (36) as . Note also that any hardens the condition of Statement 2 since where is the unique positive solution of
\[
P^{+} = A P^{+} A^{T} + Q - A P^{+} \Gamma^{T} H^{T} \left(R + H (\Gamma P^{+} \Gamma^{T} + \Omega) H^{T}\right)^{-1} H \Gamma P^{+} A^{T}
\]
and is the unique positive solution of
\[
P = A P A^{T} + Q - A P \Gamma^{T} H^{T} \left(I + H (\Gamma P \Gamma^{T}) H^{T}\right)^{-1} H \Gamma P A^{T}
\]
Here, we use to denote the solution to (2) when and to denote the solution when . However, moving forward we will use to denote the solution of (A).
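As a sanity check on the form of the DARE, the sketch below solves its scalar instance by fixed-point iteration. The scalar reduction p = a²p + q − (a p γ h)² / (1 + h²γ²p) and all parameter values are assumptions made for illustration; the matrix case would need a proper Riccati solver.

```python
def dare_rhs(p, a, q, gamma, h):
    """Right-hand side of the scalar DARE with unit measurement-noise variance."""
    gain = (a * p * gamma * h) ** 2 / (1.0 + (h * gamma) ** 2 * p)
    return a * a * p + q - gain

def solve_scalar_dare(a, q, gamma, h, p0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate p <- rhs(p) until successive iterates agree to within tol."""
    p = p0
    for _ in range(max_iter):
        p_next = dare_rhs(p, a, q, gamma, h)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical values: unstable source gain 1.2, unit noise, unit encoder and channel gains.
p_star = solve_scalar_dare(a=1.2, q=1.0, gamma=1.0, h=1.0)
print(p_star, dare_rhs(p_star, 1.2, 1.0, 1.0, 1.0))
```

For these values the equation reduces to p² − 1.44p − 1 = 0, whose unique positive root is the value the iteration returns.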
Statement 3. A finite MSE is achievable if and only if there exists a satisfying
(38)
(39)
(40)
We reapply the transformation (32).
Statement 4. A finite MSE is achievable if and only if there exists a satisfying
(41)
(42)
(43)
Finally, we add two redundant constraints. First,
|
|
|
which holds iff (42) holds by the Schur complement lemma. Then, by substituting into (41), we obtain a Lyapunov equation for ,
\[
A J A^{T} - J + A \Gamma^{T} \Pi^{-1} \Gamma A^{T} + Q - A \Gamma^{T} H^{T} (I + H \Pi H^{T})^{-1} H \Gamma A^{T} - \Gamma^{T} \Pi^{-1} \Gamma = 0.
\]
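The Schur complement lemma invoked for the first redundant constraint can be illustrated on the smallest possible case. The sketch below assumes scalar blocks, that is, a symmetric 2×2 matrix [[a, b], [b, c]] with c > 0, and compares a direct positive-semidefiniteness test with the Schur-complement test a − b²/c ≥ 0. It illustrates the lemma only and is not the block matrix of the proof.

```python
def is_psd_2x2(a, b, c, tol=1e-12):
    """[[a, b], [b, c]] is PSD iff its trace and determinant are both nonnegative."""
    return a + c >= -tol and a * c - b * b >= -tol

def schur_test(a, b, c, tol=1e-12):
    """Equivalent test through the Schur complement of c, which must be positive."""
    if c <= 0:
        raise ValueError("this form of the lemma needs c > 0")
    return a - b * b / c >= -tol

cases = [(2.0, 1.0, 1.0), (1.0, 2.0, 1.0), (0.5, 1.0, 3.0)]   # hypothetical entries
for a, b, c in cases:
    print((a, b, c), is_psd_2x2(a, b, c), schur_test(a, b, c))
```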
With these redundant conditions, we obtain the statement:
Statement 5. A finite MSE is achievable if and only if there exists a satisfying
(44)
(45)
(46)
(47)
(48)
As in Statement 3, (47) should be satisfied with equality. We will focus our analysis on Statement 5.
Left- and right-multiplying (44) by $A^{-1}$ and $A^{-T}$, respectively, to obtain a stable Lyapunov equation, we have
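\[
A^{-1} J A^{-T} - J - \tilde{\Gamma}^{T} \Pi^{-1} \tilde{\Gamma} - A^{-1} Q A^{-T} + \tilde{\Gamma}^{T} H^{T} (I + H \Pi H^{T})^{-1} H \tilde{\Gamma} + A^{-1} \tilde{\Gamma}^{T} \Pi^{-1} \tilde{\Gamma} A^{-T} = 0
\]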
By linearity, we can separate where
(49)
(50)
The purpose of this step is to separate the terms involving so that only affects . If there exists a that makes strictly positive, we can arbitrarily scale so that is arbitrarily positive. By making arbitrarily positive, , so we will limit our investigation to the positivity of .
Let denote the th row of .
We now have the following chain of equalities
(51)
(52)
(53)
(54)
(55)
(56)
where (52) follows from the Matrix Inversion lemma, (54) follows from the singular value decomposition of
(57)
where is unitary and is diagonal, and (55) follows from the notation
(58)
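For reference, the matrix inversion lemma cited for (52) can be written out in the symbols $H$ and $\Pi$ of the Lyapunov equation above. The identities below hold for any invertible $\Pi$. Which of these forms is the one actually used in (52) is an assumption, since the displayed chain of equalities is not reproduced here.

```latex
\begin{align*}
(I + H \Pi H^{T})^{-1}
  &= I - H \left(\Pi^{-1} + H^{T} H\right)^{-1} H^{T}, \\
H^{T} (I + H \Pi H^{T})^{-1} H
  &= H^{T} H - H^{T} H \left(\Pi^{-1} + H^{T} H\right)^{-1} H^{T} H, \\
H^{T} (I + H \Pi H^{T})^{-1} H - \Pi^{-1}
  &= -\,\Pi^{-1} \left(\Pi^{-1} + H^{T} H\right)^{-1} \Pi^{-1}.
\end{align*}
```

The last line follows from the second by substituting $H^{T} H = (\Pi^{-1} + H^{T} H) - \Pi^{-1}$ and expanding. It shows that the difference of the two $\Pi$-dependent terms in the Lyapunov equation is negative semidefinite when $\Pi \succ 0$.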
Let be the unique solution to the Lyapunov equation
(59)
then it follows that the unique solution to is given by
(60)
From (57), , where indicates the th singular value in order.
To connect the definition of to the condition of Theorem 1, we have the following definition:
(61)
A-A Proof of sufficiency
Leveraging the result for vector sources and scalar channels, see Appendix B, if and only if
(62)
The sufficiency of the theorem follows by letting be diagonal, in which case . Let be the submatrix of formed by selecting the column and row indices that are members of , the support of the th row of . Note that will be strictly positive by the MISO theorem and zero elsewhere. Since the union of supports of the rows of include every possible index, and can be made arbitrarily positive by scaling . Such a , is sufficient for Statement 5.
A-B Proof of necessity
First, we show that a diagonal is necessary.
Lemma 3
If can be made arbitrarily positive by some , it can also be made arbitrarily positive by where is diagonal with .
Proof:
Fix a in (59). We will show that given an arbitrary satisfying the power constraint, the same is achievable by a diagonal and associated with less power.
Let
(63)
where by assumption, .
From (57),
(64)
This implies
(65)
where is unitary.
For a diagonal ,
(66)
Then by isolating and applying the trace to both sides,
(67)
and
(68)
Note that the entries of are in descending order and the entries of are listed in ascending order by assumption that .
We apply Ruhe’s Trace Inequality, which states that if are PSD matrices, with eigenvalues and ,
(69)
Here, , , and achieves the lower bound with equality since and are both diagonal with opposing ordering as in (69). has a sandwiched unitary in (68). Thus,
|
|
|
as desired.
∎
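Ruhe's trace inequality, used in the proof above, can be checked numerically. The sketch below works with 2×2 symmetric PSD matrices given as nested lists, with hypothetical entries. It computes tr(XY) together with the lower bound (eigenvalues paired in opposite order) and the upper bound (eigenvalues paired in the same order).

```python
import math

def eig_sym_2x2(m):
    """Eigenvalues of a symmetric 2x2 matrix, in descending order."""
    (a, b), (_, c) = m
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return [mean + radius, mean - radius]

def trace_of_product(x, y):
    """tr(XY) for 2x2 matrices."""
    return sum(x[i][k] * y[k][i] for i in range(2) for k in range(2))

def ruhe_bounds(x, y):
    """Lower and upper bounds on tr(XY) from the sorted eigenvalues of X and Y."""
    lx, ly = eig_sym_2x2(x), eig_sym_2x2(y)
    lower = sum(p * q for p, q in zip(lx, reversed(ly)))   # opposite ordering
    upper = sum(p * q for p, q in zip(lx, ly))             # same ordering
    return lower, upper

X = [[2.0, 1.0], [1.0, 2.0]]        # eigenvalues 3 and 1
Y = [[3.0, 0.0], [0.0, 1.0]]        # eigenvalues 3 and 1
lower, upper = ruhe_bounds(X, Y)
print(lower, trace_of_product(X, Y), upper)

# Two diagonal matrices with opposing orderings attain the lower bound.
D1 = [[3.0, 0.0], [0.0, 1.0]]
D2 = [[1.0, 0.0], [0.0, 3.0]]
print(trace_of_product(D1, D2), ruhe_bounds(D1, D2)[0])
```

The second pair mirrors the argument of the lemma: diagonal factors with opposing orderings meet the lower bound with equality.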
The above lemma shows that we can take to be diagonal without loss of generality since that is what minimizes the power. Then,
|
|
|
and
(70)
To show the partitioning property of Theorem 1, we need to show that the sets cover and are disjoint.
We show the necessity of . For to be arbitrarily positive, must excite all directions of . To see this, suppose an index exists such that for all in (58). Then, , where is the th standard basis vector, and cannot be positive definite.
What remains to be shown is the necessity of the disjointness of the sets . By showing Conjecture 1, (50) can be made arbitrarily positive by scaling of the form described in Conjecture 1. The sets are then disjoint.
We can consider the equivalent (redundant) constraints on in Statement 5. In our construction of in (47), we demonstrated that slack in only hardens the conditions, so we can always take . In Lemma 3, we showed that the optimal is diagonal. Consequently, can also be considered to be diagonal. We now ask what this diagonality constraint imposes on the structure of .
Recall our original DARE:
|
|
|
where .
Under the condition is diagonal, the measurement covariance is diagonal as well since is diagonal. We conjecture that the diagonality of imposes a structure on so that is consistent with the condition of Theorem 1. This means that has only a single non-zero entry in each column. Equivalently, each mode is assigned to a single channel. If this matrix-algebraic fact on the structure of holds, Conjecture 3 holds as well.
This concludes the proof.
Appendix B Proof of Lemma 3 - Vector Source over Scalar Channel
This proof was initially presented in [1], but we give it here for completeness.
The channel is scalar, so in (70). Further, we can always set since the optimal encoder uses all the available power. From (70), we have
(71)
Defining , i.e., the diagonal matrix whose components are the elements of the vector , we may now write
|
|
|
where is the vector of the diagonal elements of and is the all-one vector. (71) becomes
\[
\tilde{J} = A^{-1} \tilde{J} A^{-T} + D_{\Gamma} \left( a a^{T} - \frac{\mathbf{1}\mathbf{1}^{T}}{1 + h^{2} p} \right) D_{\Gamma}.
\]
Let be the solution to the Lyapunov equation
(72)
Let be controllable, which holds by Assumption 2. Then by the Lyapunov stability theorem, . We now claim that
(73)
This can be verified by plugging (73) into (B).
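Substitution checks of this kind are easy to automate when the stable matrix is diagonal, because a discrete Lyapunov equation X = F X Fᵀ + W then decouples entry by entry. The sketch below assumes F = diag(f) with all |f_i| < 1 and a hypothetical right-hand side W. It is a generic illustration and does not reproduce the specific matrices of (72) and (73).

```python
def lyapunov_diag(f, w):
    """Solve X = F X F^T + W for F = diag(f) with all |f_i| < 1.

    Entry (i, j) of F X F^T equals f_i * f_j * X_ij, so
    X_ij = W_ij / (1 - f_i * f_j).
    """
    n = len(f)
    if not all(abs(fi) < 1.0 for fi in f):
        raise ValueError("F must be stable")
    return [[w[i][j] / (1.0 - f[i] * f[j]) for j in range(n)] for i in range(n)]

def residual(f, w, x):
    """Largest entrywise violation of X = F X F^T + W."""
    n = len(f)
    return max(abs(x[i][j] - f[i] * f[j] * x[i][j] - w[i][j])
               for i in range(n) for j in range(n))

f = [1 / 1.5, 1 / 1.2]            # hypothetical: diagonal of A^{-1} for A = diag(1.5, 1.2)
w = [[1.0, 1.0], [1.0, 1.0]]      # hypothetical right-hand side (all-ones matrix)
x = lyapunov_diag(f, w)
print(x, residual(f, w, x))
```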
It follows that if and only if . But the latter is equivalent to
(74)
or
(75)
Assume satisfies the Lyapunov equation (72). Then
(76)
This follows since from (72). On the one hand
(77)
and on the other
\[
\det(M - \mathbf{1}\mathbf{1}^{T}) = \det(I - \mathbf{1}\mathbf{1}^{T} M^{-1}) \det M = (1 - \mathbf{1}^{T} M^{-1} \mathbf{1}) \det M,
\]
which yields the desired result.
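The determinant identity used in the last step, det(M − 11ᵀ) = (1 − 1ᵀM⁻¹1) det M, can be confirmed on a small example. The sketch below uses a hypothetical 2×2 positive definite M and plain-Python helpers.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix through the adjugate formula."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

M = [[3.0, 1.0], [1.0, 2.0]]                                      # hypothetical M
lhs = det2([[M[i][j] - 1.0 for j in range(2)] for i in range(2)])  # det(M - 1 1^T)
Minv = inv2(M)
quad = sum(Minv[i][j] for i in range(2) for j in range(2))         # 1^T M^{-1} 1
rhs = (1.0 - quad) * det2(M)
print(lhs, rhs)
```

For invertible M, the sign of det(M − 11ᵀ) is therefore governed by the scalar 1 − 1ᵀM⁻¹1 together with the sign of det M.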
Then, (75) and (76) imply that if and only if
(78)
or equivalently
(79)
which is the capacity condition we are seeking.
Note that when this capacity condition holds, and therefore in (73). We can arbitrarily scale and therefore to make arbitrarily positive. This demonstrates both sufficiency and necessity.
Appendix C Proof of Lemma 3 - Scalar Source over MIMO Channel
We also consider a scalar source and arbitrary rank channel in Lemma 3. We can solve (46) explicitly as
(80)
Plugging this into (45) we obtain
\[
J = -\frac{q}{a^{2}-1} + \tilde{\Gamma}^{T} \left( \frac{a^{2}}{a^{2}-1} H^{T} (I + H \Pi H^{T})^{-1} H - \Pi^{-1} \right) \tilde{\Gamma} \succeq 0
\]
Finite estimation error can be achieved iff there exists a and , such that . The above statement is equivalent to the statement: finite error is not achievable iff for all such that , , we have . This is what we set out to show.
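The displayed expression for J can be evaluated directly when H and Π are diagonal. The sketch below assumes H = diag(hs) and Π = diag(pis), and treats each π_i as a free positive number; how Π is tied to the power constraint is not modeled here, and all values are hypothetical. Under these assumptions the bracket decouples per sub-channel, and a short calculation shows that the i-th term is positive exactly when 1 + h_i²π_i > a².

```python
def j_value(a, q, gammas, hs, pis):
    """J for a scalar source with gain a > 1, diagonal H and diagonal Pi."""
    scale = a * a / (a * a - 1.0)
    quad = sum(g * g * (scale * h * h / (1.0 + h * h * p) - 1.0 / p)
               for g, h, p in zip(gammas, hs, pis))
    return -q / (a * a - 1.0) + quad

def channel_term_positive(a, h, p):
    """Sign test for one sub-channel's bracket term."""
    return 1.0 + h * h * p > a * a

a, q = 1.5, 1.0
# One sub-channel with 1 + h^2 * pi = 3 > a^2 = 2.25: J turns positive once gamma is scaled up.
print(j_value(a, q, [1.0], [1.0], [2.0]), j_value(a, q, [3.0], [1.0], [2.0]))
# One sub-channel with 1 + h^2 * pi = 2 < a^2: scaling gamma only makes J more negative.
print(j_value(a, q, [1.0], [1.0], [1.0]), j_value(a, q, [100.0], [1.0], [1.0]))
```

This matches the scaling argument used throughout the appendix: once the quadratic term is positive, the encoder gain can be scaled until it outweighs the process-noise term.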
Let
(81)
Note that for all if and only if .
(82)
The inequality is equivalent to
(83)
which is also equivalent to
(84)
(85)
Recall that is diagonal. Let be the entry of with the greatest magnitude. Note that has the best chance of overcoming (85) by placing all power on the best channel. Thus, if this alignment of does not violate (85), the inequality holds for any . Then, the final inequality holds for any if and only if
(86)
or
(87)
This is a necessary and sufficient condition for to be unachievable. Taking the inverse of the above, we obtain the desired condition.