Feedback Capacity of Parallel ACGN Channels and Kalman Filter: Power Allocation with Feedback
Song Fang and Quanyan Zhu
Department of Electrical and Computer Engineering, New York University, New York, USA
Email: {song.fang, quanyan.zhu}@nyu.edu
Abstract—In this paper, we relate the feedback capacity of parallel additive colored Gaussian noise (ACGN) channels to a variant of the Kalman filter. By doing so, we obtain lower bounds on the feedback capacity of such channels, as well as the corresponding feedback (recursive) coding schemes, which are essentially power allocation policies with feedback, to achieve the bounds. The results are seen to reduce to existing lower bounds in the case of a single ACGN feedback channel, whereas when it comes to parallel additive white Gaussian noise (AWGN) channels with feedback, the recursive coding scheme reduces to a feedback "water-filling" power allocation policy.
I. INTRODUCTION
The feedback capacity [1] of additive colored Gaussian noise (ACGN) channels has been a long-standing problem in information theory, generating numerous research papers over the years, due to its significance in understanding and applying communication/coding with feedback. In general, we refer to the breakthrough paper [2] and the references therein for a rather complete literature review; see also [3], [4] for possibly complementary surveys. Meanwhile, papers on this topic have continued to appear after [2], including but certainly not restricted to [5]–[17]. Most of the aforementioned works, however, focused merely on the feedback capacity of a single ACGN channel, so to speak, whereas when it comes to parallel ACGN channels, the corresponding results have been lacking in general. One exception is the recent paper [11], which generalized the computational approach of [10] to multi-antenna ACGN channels.

In this paper, we establish a connection between a parallel of ACGN channels with feedback and a variant of the multi-input multi-output (MIMO) Kalman filter for colored Gaussian noises. In light of this, we obtain lower bounds on the feedback capacity of parallel ACGN channels by examining the algebraic Riccati equations associated with the Kalman filter. Meanwhile, the Kalman filtering systems, which are essentially feedback (closed-loop) systems, naturally provide recursive coding schemes, in terms of feedback power allocation policies, to achieve the lower bounds. In addition, the lower bounds are shown to be consistent with existing feedback capacity results in the case of a single ACGN channel. It is also seen that in the special case of parallel additive white Gaussian noise (AWGN) channels, the recursive coding reduces to a feedback "water-filling" solution.

In a broad sense, in this paper we adopt a control-theoretic approach towards this problem, as in, e.g., [5]–[7], [10]–[14], [16], [18]–[20]; see also [21] and the references therein. Note also that although the organization of this paper resembles that of [16] to a certain extent, the results are not trivial generalizations of those therein, as evidenced by the results themselves as well as their proofs.

The rest of the paper is organized as follows. Section II provides the preliminary background on feedback capacity and the Kalman filter. In Section III, we present the main results of this paper. Concluding remarks are given in Section IV.

II. PRELIMINARIES
In this paper, we consider real-valued continuous random variables and the discrete-time stochastic processes they compose. All random variables and stochastic processes are assumed to be zero-mean for simplicity and without loss of generality. We represent random variables using boldface letters. The logarithm is defined with base 2. A stochastic process $\{\mathbf{x}_k\}$ is said to be asymptotically stationary if it is stationary as $k \to \infty$, and herein stationarity means strict stationarity [22]. Note in particular that, for simplicity and with abuse of notation, we utilize $\mathbf{x} \in \mathbb{R}$ and $\mathbf{x} \in \mathbb{R}^n$ to indicate that $\mathbf{x}$ is a real-valued random variable and that $\mathbf{x}$ is a real-valued $n$-dimensional random vector, respectively. The following definitions of entropy and entropy rate are adapted from, e.g., [23].
Definition 1: The differential entropy of a random vector $\mathbf{x}$ with density $p_{\mathbf{x}}(x)$ is defined as
$$h(\mathbf{x}) = -\int p_{\mathbf{x}}(x) \log p_{\mathbf{x}}(x)\,\mathrm{d}x.$$
The entropy rate of a stochastic process $\{\mathbf{x}_k\}$ is defined as
$$h_{\infty}(\mathbf{x}) = \limsup_{k \to \infty} \frac{h(\mathbf{x}_0, \ldots, \mathbf{x}_k)}{k+1}.$$

A. Feedback Capacity
Consider a parallel of $n$ additive colored Gaussian noise channels given by $\mathbf{y}_k = \mathbf{x}_k + \mathbf{z}_k$, where $\{\mathbf{x}_k\}$, $\mathbf{x}_k \in \mathbb{R}^n$, denotes the channel input, $\{\mathbf{y}_k\}$, $\mathbf{y}_k \in \mathbb{R}^n$, denotes the channel output, and $\{\mathbf{z}_k\}$, $\mathbf{z}_k \in \mathbb{R}^n$, denotes the additive noise, which is assumed to be stationary colored Gaussian. The feedback capacity $C_f$ of such a channel with power constraint $\bar{P}$ is given by [1], [23]
$$C_f = \sup_{\lim_{k \to \infty} \frac{1}{k+1} \sum_{i=0}^{k} \mathrm{tr}\,\mathbb{E}\left[\mathbf{x}_i \mathbf{x}_i^{\mathrm{T}}\right] \le \bar{P}} \left[h_{\infty}(\mathbf{y}) - h_{\infty}(\mathbf{z})\right]. \tag{1}$$
Fig. 1. The Kalman filtering system.
It is worth mentioning that in the case of $n = 1$, it was shown in [2] that the optimal channel input process $\{\mathbf{x}_k\}$ is stationary and of the form
$$\mathbf{x}_k = \sum_{i=1}^{\infty} b_i \mathbf{z}_{k-i}, \quad b_i \in \mathbb{R},$$
while satisfying $\mathbb{E}\left[\mathbf{x}_k^2\right] \le \bar{P}$. This has also been generalized to the case of $n$ parallel channels by [11]. For the purpose of this paper, however, it suffices to consider the original definition in (1).

B. Kalman Filter
We now give a brief review of (a special case of) the MIMO Kalman filter [24], [25]; note that hereinafter the notations are not to be confused with those in Section II-A. Particularly, consider the Kalman filtering system depicted in Fig. 1, where the state-space model of the plant to be estimated is given by
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \mathbf{y}_k = C\mathbf{x}_k + \mathbf{v}_k. \end{cases} \tag{2}$$
Herein, $\mathbf{x}_k \in \mathbb{R}^n$ is the state to be estimated, $\mathbf{y}_k \in \mathbb{R}^n$ is the plant output, and $\mathbf{v}_k \in \mathbb{R}^n$ is the measurement noise, whereas the process noise, normally denoted as $\{\mathbf{w}_k\}$ [24], [25], is assumed to be absent. The system matrix is $A \in \mathbb{R}^{n \times n}$ while the output matrix is $C \in \mathbb{R}^{n \times n}$, and we assume that $A$ is anti-stable (i.e., all its eigenvalues are unstable, with magnitudes greater than or equal to $1$) while the pair $(A, C)$ is observable (and thus detectable [26]). Suppose that $\{\mathbf{v}_k\}$ is white Gaussian with covariance $V = \mathbb{E}[\mathbf{v}_k \mathbf{v}_k^{\mathrm{T}}] \succ 0$ and the initial state $\mathbf{x}_0$ is Gaussian with covariance $\mathbb{E}[\mathbf{x}_0 \mathbf{x}_0^{\mathrm{T}}] \succ 0$. Furthermore, $\{\mathbf{v}_k\}$ and $\mathbf{x}_0$ are assumed to be uncorrelated. Correspondingly, the Kalman filter (in the observer form [26]) for (2) is given by
$$\begin{cases} \hat{\mathbf{x}}_{k+1} = A\hat{\mathbf{x}}_k + \mathbf{u}_k, \\ \hat{\mathbf{y}}_k = C\hat{\mathbf{x}}_k, \\ \mathbf{e}_k = \mathbf{y}_k - \hat{\mathbf{y}}_k, \\ \mathbf{u}_k = K_k \mathbf{e}_k, \end{cases} \tag{3}$$
where $\hat{\mathbf{x}}_k \in \mathbb{R}^n$, $\hat{\mathbf{y}}_k \in \mathbb{R}^n$, $\mathbf{e}_k \in \mathbb{R}^n$, and $\mathbf{u}_k \in \mathbb{R}^n$. Herein, $K_k$ denotes the observer gain [26] (note that the observer gain differs from the Kalman gain by a factor of $A$; see, e.g., [25], [26] for more details) given by
$$K_k = AP_kC^{\mathrm{T}}\left(CP_kC^{\mathrm{T}} + V\right)^{-1},$$
where $P_k$ denotes the state estimation error covariance
$$P_k = \mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right].$$
In addition, $P_k$ can be obtained iteratively by the Riccati equation
$$P_{k+1} = AP_kA^{\mathrm{T}} - AP_kC^{\mathrm{T}}\left(CP_kC^{\mathrm{T}} + V\right)^{-1}CP_kA^{\mathrm{T}},$$
with $P_0 = \mathbb{E}[\mathbf{x}_0\mathbf{x}_0^{\mathrm{T}}] \succ 0$. Additionally, it is known [24], [25] that when $(A, C)$ is detectable, the Kalman filtering system converges, i.e., the state estimation error $\{\mathbf{x}_k - \hat{\mathbf{x}}_k\}$ is asymptotically stationary. Moreover, in steady state, the optimal state estimation error covariance
$$P = \lim_{k \to \infty} \mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right]$$
attained by the Kalman filter is given by the (non-zero) positive semi-definite solution [25] to the algebraic Riccati equation
$$P = APA^{\mathrm{T}} - APC^{\mathrm{T}}\left(CPC^{\mathrm{T}} + V\right)^{-1}CPA^{\mathrm{T}}, \tag{4}$$
whereas the steady-state observer gain is given by
$$K = APC^{\mathrm{T}}\left(CPC^{\mathrm{T}} + V\right)^{-1}. \tag{5}$$
In fact, by letting $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k$ and $\tilde{\mathbf{y}}_k = \hat{\mathbf{y}}_k - \mathbf{z}_k = \hat{\mathbf{y}}_k - C\mathbf{x}_k$, we may integrate the steady-state systems of (2) and (3) into an equivalent form:
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \tilde{\mathbf{y}}_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}_k = -\tilde{\mathbf{y}}_k + \mathbf{v}_k, \\ \mathbf{u}_k = K\mathbf{e}_k, \end{cases} \tag{6}$$
as depicted in Fig. 2, since all the sub-systems are linear.
Fig. 2. The steady-state Kalman filtering system in integrated form.
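The steady-state quantities above are straightforward to compute numerically. The following is a minimal sketch (not from the paper) that iterates the Riccati recursion to its fixed point and recovers $P$ of (4) and the observer gain $K$ of (5); the matrices `A`, `C`, `V` are hypothetical placeholder values satisfying the standing assumptions (anti-stable $A$, observable $(A, C)$, $V \succ 0$).

```python
import numpy as np

def steady_state_kalman(A, C, V, iters=500):
    """Iterate the Riccati recursion until it (numerically) reaches the
    fixed point P of (4), then form the observer gain K of (5)."""
    P = np.eye(A.shape[0])  # any P0 > 0
    for _ in range(iters):
        S = C @ P @ C.T + V
        P = A @ P @ A.T - A @ P @ C.T @ np.linalg.solve(S, C @ P @ A.T)
    K = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + V)
    return P, K

# Hypothetical anti-stable A (|eigenvalues| >= 1), observable (A, C), V > 0
A = np.diag([1.5, -1.2])
C = np.eye(2)
V = np.diag([1.0, 0.5])
P, K = steady_state_kalman(A, C, V)

# P should solve the algebraic Riccati equation (4) ...
res = P - (A @ P @ A.T - A @ P @ C.T
           @ np.linalg.inv(C @ P @ C.T + V) @ C @ P @ A.T)
print(np.allclose(res, 0))                    # True
# ... and the closed loop A - K C should be stable (all |eig| < 1)
print(np.abs(np.linalg.eigvals(A - K @ C)))
```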
III. LOWER BOUNDS ON FEEDBACK CAPACITY OF PARALLEL ACGN CHANNELS AND RECURSIVE CODING
The approach we take in this paper to obtain lower bounds on the feedback capacity of parallel ACGN channels is to establish a connection between a parallel of ACGN channels with feedback and a variant of the Kalman filter for colored Gaussian noises. Towards this end, we first present the following variant of the Kalman filter.

A. A Variant of the Kalman Filter
Consider again the Kalman filtering system given in Fig. 1. Suppose that the plant to be estimated is still given by
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \mathbf{y}_k = C\mathbf{x}_k + \mathbf{v}_k, \end{cases} \tag{7}$$
only this time with an auto-regressive moving average (ARMA) colored Gaussian measurement noise $\{\mathbf{v}_k\}$, $\mathbf{v}_k \in \mathbb{R}^n$, represented as
$$\mathbf{v}_k = \sum_{i=1}^{p} F_i \mathbf{v}_{k-i} + \hat{\mathbf{v}}_k + \sum_{j=1}^{q} G_j \hat{\mathbf{v}}_{k-j}, \tag{8}$$
where $\{\hat{\mathbf{v}}_k\}$, $\hat{\mathbf{v}}_k \in \mathbb{R}^n$, is white Gaussian with covariance $\hat{V} = \mathbb{E}[\hat{\mathbf{v}}_k \hat{\mathbf{v}}_k^{\mathrm{T}}] \succ 0$. Equivalently, $\{\mathbf{v}_k\}$ may be represented [27] as the output of a linear time-invariant (LTI) filter $F(z)$ driven by the input $\{\hat{\mathbf{v}}_k\}$, where
$$F(z) = \left(I - \sum_{i=1}^{p} F_i z^{-i}\right)^{-1}\left(I + \sum_{j=1}^{q} G_j z^{-j}\right). \tag{9}$$
Herein, we assume that $F(z)$ is stable and minimum-phase. We may now generalize the method of dealing with colored noises as employed in [16] (which in turn was developed based on [25]; see the detailed discussions in [16]) to the case of MIMO Kalman filtering systems.
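As a concrete illustration of the noise model (8), the sketch below simulates a sample path of $\{\mathbf{v}_k\}$ from a white driving sequence $\{\hat{\mathbf{v}}_k\}$. It is a minimal sketch under hypothetical coefficients `F1`, `G1` chosen so that $F(z)$ in (9) is stable and minimum-phase.

```python
import numpy as np

def arma_noise(F, G, Vhat, steps, seed=0):
    """Simulate (8): v_k = sum_i F_i v_{k-i} + vhat_k + sum_j G_j vhat_{k-j},
    with {vhat_k} white Gaussian of covariance Vhat."""
    rng = np.random.default_rng(seed)
    n = Vhat.shape[0]
    L = np.linalg.cholesky(Vhat)              # Vhat = L L^T
    vhat = rng.standard_normal((steps, n)) @ L.T
    v = np.zeros((steps, n))
    for k in range(steps):
        v[k] = vhat[k]
        for i, Fi in enumerate(F, start=1):   # AR part
            if k - i >= 0:
                v[k] += Fi @ v[k - i]
        for j, Gj in enumerate(G, start=1):   # MA part
            if k - j >= 0:
                v[k] += Gj @ vhat[k - j]
    return v, vhat

# Hypothetical coefficients: F(z) stable and minimum-phase (|0.3|, |0.2| < 1)
F1 = [0.3 * np.eye(2)]
G1 = [0.2 * np.eye(2)]
v, vhat = arma_noise(F1, G1, np.eye(2), steps=10_000)
```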
Proposition 1: Suppose that $A = T\Lambda T^{-1}$, where
$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n), \tag{10}$$
and $|\lambda_\ell| \ge 1$, $\ell = 1, \ldots, n$. Denote
$$\hat{\mathbf{y}}_k = -\sum_{j=1}^{q} G_j \hat{\mathbf{y}}_{k-j} + \mathbf{y}_k - \sum_{i=1}^{p} F_i \mathbf{y}_{k-i}. \tag{11}$$
Then, (7) is equivalent to
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \hat{\mathbf{y}}_k = \hat{C}\mathbf{x}_k + \hat{\mathbf{v}}_k, \end{cases} \tag{12}$$
where
$$\hat{C} = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}} \otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}} \otimes I_{n\times n}\right)\mathrm{vec}(C)\Bigg]. \tag{13}$$
Proof: Note first that since $F(z)$ is stable and minimum-phase, the inverse filter
$$F^{-1}(z) = \left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1}$$
is also stable and minimum-phase. As a result, it holds for all $|z| \ge 1$ that
$$\left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1} \neq 0,$$
i.e., the region of convergence must include, though is not necessarily restricted to, $|z| \ge 1$. Consequently, for $|z| \ge 1$, we may expand
$$\left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1} = I - \sum_{i=1}^{\infty} H_i z^{-i},$$
and thus $\{\hat{\mathbf{v}}_k\}$ can be reconstructed from $\{\mathbf{v}_k\}$ as [27]
$$\hat{\mathbf{v}}_k = \mathbf{v}_k - \sum_{i=1}^{\infty}H_i\mathbf{v}_{k-i} = -\sum_{j=1}^{q}G_j\hat{\mathbf{v}}_{k-j} + \mathbf{v}_k - \sum_{i=1}^{p}F_i\mathbf{v}_{k-i}.$$
Accordingly, we may also rewrite
$$\hat{\mathbf{y}}_k = -\sum_{j=1}^{q}G_j\hat{\mathbf{y}}_{k-j} + \mathbf{y}_k - \sum_{i=1}^{p}F_i\mathbf{y}_{k-i} = \mathbf{y}_k - \sum_{i=1}^{\infty}H_i\mathbf{y}_{k-i} = \mathbf{y}_k - \sum_{i=1}^{\infty}H_i\left(C\mathbf{x}_{k-i} + \mathbf{v}_{k-i}\right) = C\mathbf{x}_k - \sum_{i=1}^{\infty}H_iC\mathbf{x}_{k-i} + \hat{\mathbf{v}}_k.$$
Meanwhile, since $A$ is anti-stable (and thus invertible), we have $\mathbf{x}_{k-i} = A^{-i}\mathbf{x}_k$. As a result,
$$\hat{\mathbf{y}}_k = C\mathbf{x}_k - \sum_{i=1}^{\infty}H_iC\mathbf{x}_{k-i} + \hat{\mathbf{v}}_k = \left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\mathbf{x}_k + \hat{\mathbf{v}}_k.$$
In addition,
$$\mathrm{vec}\left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right) = \mathrm{vec}(C) - \sum_{i=1}^{\infty}\left[\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C) = \left[I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C),$$
and hence
$$C - \sum_{i=1}^{\infty}H_iCA^{-i} = \mathrm{vec}^{-1}_{n\times n}\left\{\left[I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C)\right\}.$$
Note then that
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i = I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(T^{-\mathrm{T}}\Lambda^{-i}T^{\mathrm{T}}\right)\otimes\left(I_{n\times n}H_iI_{n\times n}\right) = I_{n^2\times n^2} - \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(\sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Moreover, since $T^{\mathrm{T}}$ is invertible and the eigenvalues of $T^{\mathrm{T}}\otimes I_{n\times n}$ are given by $n$ copies of the eigenvalues of $T^{\mathrm{T}}$, it follows that $T^{\mathrm{T}}\otimes I_{n\times n}$ is invertible and
$$I_{n^2\times n^2} = \left(T^{\mathrm{T}}\otimes I_{n\times n}\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right) = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Accordingly,
$$I_{n^2\times n^2} - \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(\sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right) = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
In addition,
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i = \mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{\infty}\lambda_1^{-i}H_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{\infty}\lambda_n^{-i}H_i\right).$$
Meanwhile, we have already shown that for all $|z| \ge 1$,
$$I_{n\times n} - \sum_{i=1}^{\infty}H_i z^{-i} = \left(I_{n\times n} - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I_{n\times n} + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1},$$
i.e., $I_{n\times n} - \sum_{i=1}^{\infty}H_i z^{-i}$ converges. As such, since $|\lambda_\ell| \ge 1$, $\ell = 1,\ldots,n$, we have
$$I_{n\times n} - \sum_{i=1}^{\infty}\lambda_\ell^{-i}H_i = \left(I_{n\times n} - \sum_{i=1}^{p}\lambda_\ell^{-i}F_i\right)\left(I_{n\times n} + \sum_{j=1}^{q}\lambda_\ell^{-j}G_j\right)^{-1}.$$
Therefore,
$$\mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{\infty}\lambda_1^{-i}H_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{\infty}\lambda_n^{-i}H_i\right) = \mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{p}\lambda_1^{-i}F_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{p}\lambda_n^{-i}F_i\right)\mathrm{diag}\left(I_{n\times n} + \sum_{j=1}^{q}\lambda_1^{-j}G_j,\ \ldots,\ I_{n\times n} + \sum_{j=1}^{q}\lambda_n^{-j}G_j\right)^{-1} = \left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}.$$
Fig. 3. The steady-state integrated Kalman filter for colored noises.
As a result,
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
To sum up, we have
$$C - \sum_{i=1}^{\infty}H_iCA^{-i} = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(C)\Bigg].$$
This completes the proof.

Meanwhile, the Kalman filter for (12) is given by
$$\begin{cases} \hat{\mathbf{x}}_{k+1} = A\hat{\mathbf{x}}_k + \mathbf{u}_k, \\ \bar{\mathbf{y}}_k = \hat{C}\hat{\mathbf{x}}_k, \\ \mathbf{e}_k = \hat{\mathbf{y}}_k - \bar{\mathbf{y}}_k, \\ \mathbf{u}_k = \hat{K}_k\mathbf{e}_k, \end{cases} \tag{14}$$
where $\hat{\mathbf{x}}_k \in \mathbb{R}^n$, $\bar{\mathbf{y}}_k \in \mathbb{R}^n$, $\mathbf{e}_k \in \mathbb{R}^n$, and $\mathbf{u}_k \in \mathbb{R}^n$. Furthermore, when $(A, \hat{C})$ is detectable, the Kalman filtering system converges, i.e., the state estimation error $\{\mathbf{x}_k - \hat{\mathbf{x}}_k\}$ is asymptotically stationary. Moreover, in steady state, the optimal state estimation error covariance
$$P = \lim_{k\to\infty}\mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right]$$
attained by the Kalman filter is given by the (non-zero) positive semi-definite solution to the algebraic Riccati equation
$$P = APA^{\mathrm{T}} - AP\hat{C}^{\mathrm{T}}\left(\hat{C}P\hat{C}^{\mathrm{T}} + \hat{V}\right)^{-1}\hat{C}PA^{\mathrm{T}}, \tag{15}$$
whereas the steady-state observer gain is given by
$$\hat{K} = AP\hat{C}^{\mathrm{T}}\left(\hat{C}P\hat{C}^{\mathrm{T}} + \hat{V}\right)^{-1}. \tag{16}$$
Again, by letting $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k$ and $\tilde{\mathbf{y}}_k = \bar{\mathbf{y}}_k - \hat{\mathbf{z}}_k = \bar{\mathbf{y}}_k - \hat{C}\mathbf{x}_k$, we may integrate the steady-state systems of (12) and (14) into
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \tilde{\mathbf{y}}_k = \hat{C}\tilde{\mathbf{x}}_k, \\ \mathbf{e}_k = -\tilde{\mathbf{y}}_k + \hat{\mathbf{v}}_k, \\ \mathbf{u}_k = \hat{K}\mathbf{e}_k, \end{cases} \tag{17}$$
as depicted in Fig. 3. In addition, it may be verified that the closed-loop system given in (17) and Fig. 3 is stable [25], [26].
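To make Proposition 1 concrete, the following sketch numerically cross-checks the closed-form expression (13) against the defining (truncated) series $\hat{C} = C - \sum_{i=1}^{\infty} H_i C A^{-i}$ for a first-order ARMA noise ($p = q = 1$). All matrices are hypothetical test values; `vec` is column stacking, so that $\mathrm{vec}(H C A^{-i}) = ((A^{-i})^{\mathrm{T}} \otimes H)\,\mathrm{vec}(C)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
lam = np.array([1.5, -1.2])                  # |lambda_l| >= 1
T = rng.standard_normal((n, n))
A = T @ np.diag(lam) @ np.linalg.inv(T)      # anti-stable A = T Lam T^{-1}
C = rng.standard_normal((n, n))
F1, G1 = 0.3 * np.eye(n), 0.2 * np.eye(n)    # hypothetical ARMA coefficients

I2 = np.eye(n * n)
Lam_inv = np.diag(1.0 / lam)
Tinv = np.linalg.inv(T)

# Closed-form (13): Chat = vec^{-1}[(T^{-T} (x) I)(I - Lam^{-1} (x) F1)
#                                   (I + Lam^{-1} (x) G1)^{-1}(T^T (x) I) vec(C)]
M = (np.kron(Tinv.T, np.eye(n))
     @ (I2 - np.kron(Lam_inv, F1))
     @ np.linalg.inv(I2 + np.kron(Lam_inv, G1))
     @ np.kron(T.T, np.eye(n)))
C_hat = (M @ C.flatten(order="F")).reshape((n, n), order="F")

# Series check: Chat = C - sum_i H_i C A^{-i}, with H_i the expansion
# coefficients of I - F^{-1}(z) when p = q = 1
C_series = C.copy()
Ainv = np.linalg.inv(A)
for i in range(1, 60):
    Hi = (F1 @ np.linalg.matrix_power(-G1, i - 1)
          - np.linalg.matrix_power(-G1, i))
    C_series -= Hi @ C @ np.linalg.matrix_power(Ainv, i)

print(np.allclose(C_hat, C_series))          # True
```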
In fact, we may design the matrix $C$ specifically to render the matrix $\hat{C}$ an identity matrix.

Proposition 2: Suppose that $A = T\Lambda T^{-1}$, where
$$\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n), \tag{18}$$
and $|\lambda_\ell| \ge 1$, $\ell = 1,\ldots,n$. If
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{19}$$
then
$$\hat{C} = I_{n\times n}. \tag{20}$$
Proof: It is known from the proof of Proposition 1 that
$$\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)$$
is invertible, and it can be verified that its inverse is given by
$$\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Hence, if (19) holds, then substituting (19) into (13) multiplies this matrix by its inverse, and
$$\hat{C} = \mathrm{vec}^{-1}_{n\times n}\left[\mathrm{vec}(I_{n\times n})\right] = I_{n\times n}.$$
This completes the proof.

Note that when $\hat{C} = I_{n\times n}$, the pair $(A, \hat{C})$ is always observable (and thus always detectable [26]). To see this, note that if $\hat{C} = I_{n\times n}$, then the observability matrix for $(A, \hat{C})$ becomes
$$\begin{bmatrix}\hat{C} \\ \hat{C}A \\ \hat{C}A^2 \\ \vdots \\ \hat{C}A^{n-1}\end{bmatrix} = \begin{bmatrix}I_{n\times n} \\ A \\ A^2 \\ \vdots \\ A^{n-1}\end{bmatrix}, \tag{21}$$
which has rank $n$, indicating that $(A, \hat{C})$ is observable [26], regardless of what $A$ is. As such, in this case the Kalman filtering system always converges, whereas (15) reduces to
$$P = APA^{\mathrm{T}} - AP\left(P + \hat{V}\right)^{-1}PA^{\mathrm{T}}, \tag{22}$$
and (16) reduces to
$$\hat{K} = AP\left(P + \hat{V}\right)^{-1}. \tag{23}$$
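Under the design of Proposition 2, the reduced equation (22) can again be solved by a simple fixed-point iteration, now yielding the gain $\hat{K}$ of (23); a minimal sketch with hypothetical $A$ and $\hat{V}$:

```python
import numpy as np

def reduced_riccati(A, Vhat, iters=500):
    """Fixed-point iteration of (22) (the Chat = I case), plus the gain (23)."""
    P = np.eye(A.shape[0])
    for _ in range(iters):
        P = A @ (P - P @ np.linalg.inv(P + Vhat) @ P) @ A.T
    K_hat = A @ P @ np.linalg.inv(P + Vhat)
    return P, K_hat

A = np.diag([1.5, -1.2])        # anti-stable, hypothetical
Vhat = np.diag([1.0, 0.5])
P, K_hat = reduced_riccati(A, Vhat)
res = P - (A @ P @ A.T - A @ P @ np.linalg.inv(P + Vhat) @ P @ A.T)
print(np.allclose(res, 0))      # True: P solves (22)
```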
B. Feedback Capacity of Parallel ACGN Channels

We now proceed to obtain lower bounds on the feedback capacity, as well as the corresponding recursive coding schemes, based upon the discussions in the previous sub-section. We first examine the solution to the algebraic Riccati equation given by (15) when $A$ and $C$ are designed specifically.
Theorem 1: Suppose that in (8), $\hat{V} = U_{\hat{v}}\Lambda_{\hat{v}}U_{\hat{v}}^{\mathrm{T}}$, where
$$\Lambda_{\hat{v}} = \mathrm{diag}\left(\hat{V}_1, \ldots, \hat{V}_n\right), \tag{24}$$
and $U_{\hat{v}}$ is an orthogonal matrix. If
$$A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}, \tag{25}$$
and
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(U_{\hat{v}}^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(U_{\hat{v}}^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{26}$$
where
$$\Lambda = \pm\left[\left(\Lambda_{\tilde{x}} + \Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]^{\frac{1}{2}}, \tag{27}$$
and
$$\Lambda_{\tilde{x}} = \mathrm{diag}(P_1, \ldots, P_n) \succeq 0, \quad \Lambda_{\tilde{x}} \neq 0, \tag{28}$$
then the (non-zero) positive semi-definite solution to (15) is given by
$$P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}. \tag{29}$$
Proof: Note first that the eigenvalues of
$$A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}} = \pm U_{\hat{v}}\left[\left(\Lambda_{\tilde{x}} + \Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]^{\frac{1}{2}}U_{\hat{v}}^{\mathrm{T}}$$
are given by
$$\lambda_\ell = \pm\sqrt{\frac{P_\ell + \hat{V}_\ell}{\hat{V}_\ell}}, \quad \ell = 1,\ldots,n.$$
Then, since $\mathrm{diag}(\hat{V}_1,\ldots,\hat{V}_n) \succ 0$ and $\mathrm{diag}(P_1,\ldots,P_n) \succeq 0$, we have
$$|\lambda_\ell| = \left|\sqrt{\frac{P_\ell + \hat{V}_\ell}{\hat{V}_\ell}}\right| \ge 1, \quad \ell = 1,\ldots,n.$$
Therefore, according to Propositions 1 and 2, when $C$ is given by (26), it follows that $\hat{C} = I_{n\times n}$ while (22) holds. On the other hand, it may be verified that $P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}$ is the (non-zero) positive semi-definite solution to (22), as
$$\begin{aligned}
&U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}} - AU_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}A^{\mathrm{T}} + AU_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}\left(U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}} + \hat{V}\right)^{-1}U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}A^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\Lambda + \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\tilde{x}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda + \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\tilde{x}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda^2\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda_{\tilde{x}}\right]U_{\hat{v}}^{\mathrm{T}} = 0,
\end{aligned}$$
where we have used the facts that diagonal matrices commute and that $\Lambda^2 = \left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}$. (Note also that clearly $P = 0$ is the other positive semi-definite solution to (22), which is not relevant herein though.) This completes the proof.

Note that herein $\Lambda$ and $P$ can respectively be rewritten as
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1}, \ldots, \sqrt{\frac{P_n}{\hat{V}_n}+1}\right), \tag{30}$$
and
$$P = U_{\hat{v}}\,\mathrm{diag}(P_1,\ldots,P_n)\,U_{\hat{v}}^{\mathrm{T}}. \tag{31}$$
Note also that in the special case when $\{\mathbf{v}_k\}$ is a white noise, i.e., when $F_i = G_j = 0$ in (8), (26) reduces to
$$C = I_{n\times n}. \tag{32}$$
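A quick numerical sanity check of Theorem 1 (a sketch, with hypothetical values for the $P_\ell$ and $\hat{V}_\ell$): construct $A$ from (25) and (27) and confirm that $P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}$ has zero residual in the reduced Riccati equation (22).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
P_diag = np.array([2.0, 1.0, 0.5])      # hypothetical error variances P_l
V_diag = np.array([1.0, 0.8, 0.4])      # hypothetical white-noise variances
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal U_vhat

Lam = np.diag(np.sqrt(P_diag / V_diag + 1.0))      # (27)/(30), "+" branch
A = U @ Lam @ U.T                                  # (25)
Vhat = U @ np.diag(V_diag) @ U.T
P = U @ np.diag(P_diag) @ U.T                      # claimed solution (29)

# Residual of the reduced algebraic Riccati equation (22)
res = P - (A @ P @ A.T - A @ P @ np.linalg.inv(P + Vhat) @ P @ A.T)
print(np.allclose(res, 0))   # True
```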
On the other hand, we may obtain an equivalent form of the system in Fig. 3 as given by (17).

Proposition 3: The system in Fig. 3 is equivalent to that in Fig. 4, where $K(z)$ is dynamic and is given by
$$K(z) = F^{-1}(z)\hat{K} = \left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1}\hat{K}. \tag{33}$$
Herein, $\hat{K}$ is given by (16). More specifically, the system in Fig. 4 is given by
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\left(\mathbf{e}'_k - \sum_{i=1}^{p}F_i\mathbf{e}'_{k-i}\right) - \sum_{j=1}^{q}G_j\mathbf{u}_{k-j}, \end{cases} \tag{34}$$
which is stable as a closed-loop system.
Fig. 4. The steady-state integrated Kalman filter for colored noises: equivalent form.
Fig. 5. The steady-state integrated Kalman filter for colored noises: equivalent form 2.
Proof: Note first that the system of Fig. 3 is equivalent to that of Fig. 5, since $\hat{K} = F(z)K(z)$. In addition, it is known from the proof of Proposition 1 that
$$\tilde{\mathbf{y}}_k = \hat{C}\tilde{\mathbf{x}}_k = \left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\tilde{\mathbf{x}}_k.$$
As such, since $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k = A\left(\hat{\mathbf{x}}_{k-1} - \mathbf{x}_{k-1}\right) = A\tilde{\mathbf{x}}_{k-1}$, we have
$$\left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\tilde{\mathbf{x}}_k = \left(C - \sum_{i=1}^{\infty}H_iCz^{-i}\right)\tilde{\mathbf{x}}_k = \left(I - \sum_{i=1}^{\infty}H_iz^{-i}\right)C\tilde{\mathbf{x}}_k = \left(I - \sum_{i=1}^{p}F_iz^{-i}\right)\left(I + \sum_{j=1}^{q}G_jz^{-j}\right)^{-1}C\tilde{\mathbf{x}}_k = F^{-1}(z)\,C\tilde{\mathbf{x}}_k.$$
Consequently, the system of Fig. 5 is equivalent to that of Fig. 6. Moreover, since all the sub-systems are linear, the system of Fig. 6 is equivalent to that of Fig. 7, which in turn equals that of Fig. 4; note that herein $F(z)$ is stable and minimum-phase, and thus there will be no issues caused by cancellations of unstable poles and nonminimum-phase zeros. Meanwhile, the closed-loop stability of the system given in (34) and Fig. 4 is the same as that of the system given by (17) and Fig. 3, since they are essentially the same feedback system.
Fig. 6. The steady-state integrated Kalman filter for colored noises: equivalent form 3.
Fig. 7. The steady-state integrated Kalman filter for colored noises: equivalent form 4.
As a matter of fact, in the system of Fig. 4, or equivalently, in the system of Fig. 8, we may view
$$\mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k \tag{35}$$
as a feedback channel [2], [5] with additive colored Gaussian noise $\{\mathbf{v}_k\}$, whereas $\{-\mathbf{y}'_k\}$ is the channel input while $\{\mathbf{e}'_k\}$ is the channel output. Note that in Fig. 8,
$$L(z) = C\left(zI - A\right)^{-1}K(z) \tag{36}$$
may be viewed as a particular class of feedback coding.
Fig. 8. The steady-state integrated Kalman filter for colored noises: equivalent form 5.
On the other hand, with the notations in (35), the feedback capacity is given by (cf. the definition in (1))
$$C_f = \sup_{\lim_{k\to\infty}\frac{1}{k+1}\sum_{i=0}^{k}\mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_i)(-\mathbf{y}'_i)^{\mathrm{T}}\right] \le \bar{P}}\left[h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v})\right]. \tag{37}$$
As such, if $A$ and $C$ are designed specifically as in Theorem 1, then (36) naturally provides a class of sub-optimal feedback coding schemes, by which the corresponding $h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v})$ that can be achieved is thus a lower bound of (37). In this view, the following lower bound on feedback capacity can be obtained.
Theorem 2: Suppose that in (8), $\hat{V} = U_{\hat{v}}\Lambda_{\hat{v}}U_{\hat{v}}^{\mathrm{T}}$, where
$$\Lambda_{\hat{v}} = \mathrm{diag}\left(\hat{V}_1,\ldots,\hat{V}_n\right), \tag{38}$$
and $U_{\hat{v}}$ is an orthogonal matrix. Then, a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1 \ge 0, \ldots, P_n \ge 0} \sum_{\ell=1}^{n} \frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell} + 1\right), \tag{39}$$
where $P_1,\ldots,P_n$ satisfy
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}. \tag{40}$$
Herein,
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(U_{\hat{v}}^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(U_{\hat{v}}^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{41}$$
where
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1},\ldots,\sqrt{\frac{P_n}{\hat{V}_n}+1}\right). \tag{42}$$
Proof: To start with, suppose that $A$ and $C$ are specifically designed as in Theorem 1. In this case, it is known from Proposition 3 that the system in (34) is stable. Note then that (34) implies
$$-Y'(z) = -C\left(zI - A\right)^{-1}K(z)\left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1}V(z),$$
and thus
$$-C\left(zI-A\right)^{-1}K(z)\left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1} \tag{43}$$
is stable. Accordingly, since $\{\mathbf{v}_k\}$ is stationary Gaussian, $\{-\mathbf{y}'_k\}$ is also stationary Gaussian. On the other hand, it holds that
$$E'(z) = \left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1}V(z),$$
and as a consequence (cf. the discussions in [5], [18]),
$$h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v}) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left|\det\left[I + C\left(\mathrm{e}^{\mathrm{j}\omega}I - A\right)^{-1}K\left(\mathrm{e}^{\mathrm{j}\omega}\right)\right]^{-1}\right|\mathrm{d}\omega = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left|\det\left[I + C\left(\mathrm{e}^{\mathrm{j}\omega}I-A\right)^{-1}F^{-1}\left(\mathrm{e}^{\mathrm{j}\omega}\right)\hat{K}\right]^{-1}\right|\mathrm{d}\omega = \sum_{\ell=1}^{n}\log|\lambda_\ell| = \sum_{\ell=1}^{n}\log\sqrt{\frac{P_\ell}{\hat{V}_\ell}+1} = \sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right),$$
where the first equality may be referred to [21], [22] while the third equality follows as a result of the Bode integral or Jensen's formula [21], [28]. Note that herein we have used the facts that $F^{-1}(z)$ is stable and minimum-phase, that $(A, C)$ is detectable (thus the set of unstable poles of $C(zI-A)^{-1}K(z)$ is exactly the same as the set of eigenvalues of $A$ with magnitudes greater than or equal to $1$; see, e.g., the discussions in [29]), and that $\left[I + C(zI-A)^{-1}K(z)\right]^{-1}$ is stable. Consequently, according to the definition of feedback capacity given in (37), it holds that
$$C_f \ge h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v}) = \sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right),$$
when the corresponding
$$\lim_{k\to\infty}\frac{1}{k+1}\sum_{i=0}^{k}\mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_i)(-\mathbf{y}'_i)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_k)(-\mathbf{y}'_k)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[\mathbf{y}'_k\left(\mathbf{y}'_k\right)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[\left(C\tilde{\mathbf{x}}_k\right)\left(C\tilde{\mathbf{x}}_k\right)^{\mathrm{T}}\right] = \mathrm{tr}\left\{C\,\mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k-\hat{\mathbf{x}}_k)^{\mathrm{T}}\right]C^{\mathrm{T}}\right\} = \mathrm{tr}\left(CPC^{\mathrm{T}}\right)$$
is no greater than the power constraint $\bar{P}$, i.e., when (see Theorem 1)
$$\mathrm{tr}\left(CPC^{\mathrm{T}}\right) = \mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] \le \bar{P}.$$
Note that herein we have used the fact that $\{-\mathbf{y}'_k\}$ is stationary. In particular, we may pick the allocation $P_1,\ldots,P_n$ that maximizes
$$\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right)$$
while satisfying
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}.$$
This completes the proof.

Note that the solution to (39) is essentially a power allocation policy with feedback. Note also that the lower bound in Theorem 2 is equal to
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{44}$$
where $a_1,\ldots,a_n$ satisfy
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(\left(a_1^2-1\right)\hat{V}_1,\ldots,\left(a_n^2-1\right)\hat{V}_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}. \tag{45}$$
Herein, $C$ is given by (41), where
$$\Lambda = \pm\,\mathrm{diag}(a_1,\ldots,a_n). \tag{46}$$
We now consider the case of independent parallel channels.

Corollary 1: Suppose that in (8),
$$\hat{V} = \mathrm{diag}\left(\hat{V}_1,\ldots,\hat{V}_n\right), \tag{47}$$
and
$$F_i = \mathrm{diag}\left(f_{i1},\ldots,f_{in}\right), \quad i = 1,\ldots,p, \tag{48}$$
while
$$G_j = \mathrm{diag}\left(g_{j1},\ldots,g_{jn}\right), \quad j = 1,\ldots,q, \tag{49}$$
which essentially model a parallel of independent ARMA noises. Then, a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1\ge 0,\ldots,P_n\ge 0}\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right), \tag{50}$$
where $P_1,\ldots,P_n$ satisfy
$$\sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2 P_\ell = \bar{P}, \tag{51}$$
or
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2 P_\ell = \bar{P}. \tag{52}$$
Herein,
$$a_\ell = \sqrt{\frac{P_\ell}{\hat{V}_\ell}+1}, \quad \ell = 1,\ldots,n. \tag{53}$$
Proof: Note first that in this case, $U_{\hat{v}} = I$ and hence
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\mathrm{vec}(I_{n\times n})\Bigg],$$
while
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \mathrm{tr}\left[C\,\mathrm{diag}\left(P_1,\ldots,P_n\right)C^{\mathrm{T}}\right].$$
As such, if $\Lambda = \mathrm{diag}(a_1,\ldots,a_n)$, where $a_1,\ldots,a_n$ are given by (53), then, similarly to the procedures in the proof of Proposition 1, it can be obtained that
$$\left(I_{n^2\times n^2}+\sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2}-\sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1} = \mathrm{diag}\Bigg[\left(I_{n\times n}+\sum_{j=1}^{q}a_1^{-j}G_j\right)\left(I_{n\times n}-\sum_{i=1}^{p}a_1^{-i}F_i\right)^{-1},\ \ldots,\ \left(I_{n\times n}+\sum_{j=1}^{q}a_n^{-j}G_j\right)\left(I_{n\times n}-\sum_{i=1}^{p}a_n^{-i}F_i\right)^{-1}\Bigg].$$
Meanwhile, since the $F_i$ and $G_j$ are themselves diagonal, it holds for $\ell = 1,\ldots,n$ that
$$I_{n\times n}+\sum_{j=1}^{q}a_\ell^{-j}G_j = \mathrm{diag}\left(1+\sum_{j=1}^{q}a_\ell^{-j}g_{j1},\ \ldots,\ 1+\sum_{j=1}^{q}a_\ell^{-j}g_{jn}\right),$$
and
$$\left(I_{n\times n}-\sum_{i=1}^{p}a_\ell^{-i}F_i\right)^{-1} = \mathrm{diag}\left[\left(1-\sum_{i=1}^{p}a_\ell^{-i}f_{i1}\right)^{-1},\ \ldots,\ \left(1-\sum_{i=1}^{p}a_\ell^{-i}f_{in}\right)^{-1}\right].$$
Thus,
$$C = \mathrm{diag}\left(\frac{1+\sum_{j=1}^{q}g_{j1}a_1^{-j}}{1-\sum_{i=1}^{p}f_{i1}a_1^{-i}},\ \ldots,\ \frac{1+\sum_{j=1}^{q}g_{jn}a_n^{-j}}{1-\sum_{i=1}^{p}f_{in}a_n^{-i}}\right),$$
and the power constraint becomes
$$\mathrm{tr}\left[C\,\mathrm{diag}\left(P_1,\ldots,P_n\right)C^{\mathrm{T}}\right] = \sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2 P_\ell = \bar{P}.$$
Similarly, if $\Lambda = -\,\mathrm{diag}(a_1,\ldots,a_n)$, then it can be obtained that
$$C = \mathrm{diag}\left[\frac{1+\sum_{j=1}^{q}g_{j1}\left(-a_1\right)^{-j}}{1-\sum_{i=1}^{p}f_{i1}\left(-a_1\right)^{-i}},\ \ldots,\ \frac{1+\sum_{j=1}^{q}g_{jn}\left(-a_n\right)^{-j}}{1-\sum_{i=1}^{p}f_{in}\left(-a_n\right)^{-i}}\right],$$
and the power constraint becomes
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2 P_\ell = \bar{P}.$$
This completes the proof.

Equivalently, the lower bound in Corollary 1 can be rewritten as
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{54}$$
where $a_1,\ldots,a_n$ satisfy
$$\sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2\left(a_\ell^2-1\right)\hat{V}_\ell = \bar{P}, \tag{55}$$
or
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2\left(a_\ell^2-1\right)\hat{V}_\ell = \bar{P}. \tag{56}$$
We next consider some special cases in which Theorem 2 (or Corollary 1) can be characterized more explicitly, including a parallel of AWGN channels (Example 1) and a single ACGN channel (Example 2).
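Since the constraint (51) of Corollary 1 is explicit, the lower bound (50) can be evaluated with a standard constrained optimizer. A minimal sketch for first-order independent noises ($p = q = 1$) with hypothetical coefficients; SciPy's SLSQP solver is used here purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical independent ARMA(1,1) noises (Corollary 1 setting)
Vhat = np.array([1.0, 0.6, 0.3])   # Vhat_l
f = np.array([0.3, 0.2, 0.1])      # f_{1l}
g = np.array([0.2, 0.1, 0.0])      # g_{1l}
P_bar = 5.0                        # power budget

def rate(P):                       # objective (50), in bits, to be maximized
    return np.sum(0.5 * np.log2(P / Vhat + 1.0))

def power_used(P):                 # left-hand side of constraint (51)
    a = np.sqrt(P / Vhat + 1.0)    # (53)
    c = (1.0 + g / a) / (1.0 - f / a)
    return np.sum(c**2 * P)

res = minimize(lambda P: -rate(P), x0=np.full(3, P_bar / 3),
               bounds=[(0.0, None)] * 3, method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda P: power_used(P) - P_bar}])
print(res.x, rate(res.x))          # allocation P_l and the achieved bound
```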
Example 1: In the special case when $\{\mathbf{v}_k\}$ is a white Gaussian noise with covariance $\hat{V}$, i.e., when $F_i = G_j = 0$ in (8), a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1\ge 0,\ldots,P_n\ge 0}\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right), \tag{57}$$
where $P_1,\ldots,P_n$ satisfy
$$\mathrm{tr}\left[U_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}\right] = \mathrm{tr}\left[\mathrm{diag}\left(P_1,\ldots,P_n\right)\right] = \sum_{\ell=1}^{n}P_\ell = \bar{P}. \tag{58}$$
As a matter of fact, the lower bound is tight in this case [23], and the optimal power allocation is given by the classical "water-filling" policy [23] as
$$P_\ell = \max\left\{0,\ \zeta - \hat{V}_\ell\right\}, \quad \ell = 1,\ldots,n, \tag{59}$$
where $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}P_\ell = \sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{60}$$
It is also worth mentioning that the lower bound in (57) can equivalently be rewritten as (cf. also the discussions after Theorem 2 for the general case)
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{61}$$
where $a_1,\ldots,a_n$ satisfy
$$\sum_{\ell=1}^{n}\left[\left(a_\ell^2-1\right)\hat{V}_\ell\right] = \bar{P}. \tag{62}$$
Correspondingly, the optimal "allocation" solution is given by
$$a_\ell = \sqrt{\max\left\{1,\ \frac{\zeta}{\hat{V}_\ell}\right\}}, \quad \ell=1,\ldots,n, \tag{63}$$
where $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}\left[\left(a_\ell^2-1\right)\hat{V}_\ell\right] = \sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{64}$$
This provides an alternative perspective on the water-filling allocation, while also displaying more clearly the connections with lower bounds in other cases, e.g., that of the subsequent Example 2.
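The water level $\zeta$ in (59)–(60) is easily found by bisection, since the allocated power is non-decreasing in $\zeta$. A minimal sketch with hypothetical noise variances, which also reports the equivalent "allocation" (63):

```python
import numpy as np

def water_filling(Vhat, P_bar, tol=1e-12):
    """Solve (59)-(60) by bisection on the water level zeta."""
    lo, hi = Vhat.min(), Vhat.max() + P_bar
    while hi - lo > tol:
        zeta = 0.5 * (lo + hi)
        if np.sum(np.maximum(0.0, zeta - Vhat)) < P_bar:
            lo = zeta
        else:
            hi = zeta
    zeta = 0.5 * (lo + hi)
    return np.maximum(0.0, zeta - Vhat), zeta

Vhat = np.array([1.0, 0.6, 0.3, 2.5])       # hypothetical Vhat_l
P, zeta = water_filling(Vhat, P_bar=2.0)
a = np.sqrt(np.maximum(1.0, zeta / Vhat))   # the equivalent "allocation" (63)
rate = np.sum(0.5 * np.log2(P / Vhat + 1.0))
print(P, a, rate)                           # P sums to P_bar
```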
Example 2:
For another special case, consider the scalar case of $n = 1$. In this case, (8) reduces to
$$\mathbf{v}_k = \sum_{i=1}^{p}f_i\mathbf{v}_{k-i} + \hat{\mathbf{v}}_k + \sum_{j=1}^{q}g_j\hat{\mathbf{v}}_{k-j}, \tag{65}$$
where $\{\hat{\mathbf{v}}_k\}$, $\hat{\mathbf{v}}_k \in \mathbb{R}$, is white Gaussian with variance $\sigma^2_{\hat{v}} > 0$. Accordingly, Theorem 2 reduces to the statement that a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P\ge 0}\ \frac{1}{2}\log\left(\frac{P}{\sigma^2_{\hat{v}}}+1\right), \tag{66}$$
where $P$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2 P = \bar{P}, \tag{67}$$
or
$$\left[\frac{1+\sum_{j=1}^{q}g_j\left(-a\right)^{-j}}{1-\sum_{i=1}^{p}f_i\left(-a\right)^{-i}}\right]^2 P = \bar{P}. \tag{68}$$
Herein,
$$a = \sqrt{\frac{P}{\sigma^2_{\hat{v}}}+1}. \tag{69}$$
It may then be verified that this lower bound can equivalently be rewritten as (cf. also the discussions after Theorem 2 or Corollary 1)
$$\max_{a\ge 1}\ \log a, \tag{70}$$
where $a$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}, \tag{71}$$
or
$$\left[\frac{1+\sum_{j=1}^{q}g_j\left(-a\right)^{-j}}{1-\sum_{i=1}^{p}f_i\left(-a\right)^{-i}}\right]^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}. \tag{72}$$
In fact, (66) coincides with the lower bound in [16], given as
$$\max_{a\in\mathbb{R}}\ \log|a|, \tag{73}$$
where $a$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}, \tag{74}$$
whereas this in turn reduces to the results in, e.g., [18] (see the detailed discussions in [16], which also relate to the formulae in, e.g., [2]).
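In the scalar case, (70)–(71) amounts to a single equation in the unknown $a \ge 1$: the left-hand side of (71) vanishes at $a = 1$ and grows like $a^2\sigma^2_{\hat{v}}$ for large $a$, so a bracketing root-finder applies. A minimal sketch with hypothetical first-order coefficients ($p = q = 1$):

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical scalar ARMA(1,1) noise parameters
f1, g1, sigma2 = 0.4, 0.2, 1.0
P_bar = 3.0

def constraint(a):
    """Left-hand side of (71) minus the power budget P_bar."""
    c = (1.0 + g1 / a) / (1.0 - f1 / a)
    return c**2 * (a**2 - 1.0) * sigma2 - P_bar

a = brentq(constraint, 1.0 + 1e-9, 1e3)   # root of (71) on a > 1
rate = np.log2(a)                          # the lower bound (70), in bits
P = (a**2 - 1.0) * sigma2                  # the corresponding P via (69)
print(a, rate, P)
```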
Note that (34) and Fig. 8 essentially provide a recursive coding scheme to achieve the lower bound in Theorem 2. This is seen more clearly in Fig. 9, where $L(z)$ is given by (36). More specifically, the recursive coding algorithm is given as follows.

Theorem 3: Suppose that the optimal solution to (39) is given by $P_1,\ldots,P_n$. Then, one class of recursive coding schemes to achieve the lower bound in Theorem 2 is given by
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\left(\mathbf{e}'_k - \sum_{i=1}^{p}F_i\mathbf{e}'_{k-i}\right) - \sum_{j=1}^{q}G_j\mathbf{u}_{k-j}. \end{cases} \tag{75}$$
Fig. 9. The steady-state integrated Kalman filter as a feedback coding scheme.
Herein, $A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}$ and $C$ is given by (41), where $\Lambda$ is given by
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1},\ldots,\sqrt{\frac{P_n}{\hat{V}_n}+1}\right). \tag{76}$$
In the case of parallel AWGN channels, Theorem 3 reduces to a recursive water-filling scheme.
Example 3: In the special case when $\{\mathbf{v}_k\}$ is a white noise, i.e., when $F_i = G_j = 0$ in (8), the coding scheme of (75) reduces to
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\mathbf{e}'_k. \end{cases} \tag{77}$$
Herein, $A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}$ and $C = I_{n\times n}$, where $\Lambda$ is given by
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\max\left\{1,\frac{\zeta}{\hat{V}_1}\right\}},\ \ldots,\ \sqrt{\max\left\{1,\frac{\zeta}{\hat{V}_n}\right\}}\right), \tag{78}$$
and $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{79}$$
Note that this is essentially a feedback ("closed-loop") water-filling power allocation scheme, which is potentially more "robust" than the classical "open-loop" water-filling policy; cf. the results in [30] for instance. We will, however, leave detailed discussions on this topic to future research.
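To illustrate the closed-loop character of the scheme, the sketch below simulates (77) for parallel AWGN channels with hypothetical variances and power budget (chosen so that all channels are active), and empirically checks that the stationary channel input power $\mathrm{tr}\,\mathbb{E}[\mathbf{y}'_k(\mathbf{y}'_k)^{\mathrm{T}}]$ is approximately the budget $\bar{P} = \sum_\ell P_\ell$:

```python
import numpy as np

rng = np.random.default_rng(2)
Vhat = np.array([1.0, 0.6, 0.3])            # hypothetical AWGN variances
P_bar = 2.0

# Water-filling allocation (59)-(60) via bisection on the level zeta
lo, hi = Vhat.min(), Vhat.max() + P_bar
for _ in range(200):
    zeta = 0.5 * (lo + hi)
    lo, hi = (zeta, hi) if np.sum(np.maximum(0, zeta - Vhat)) < P_bar \
             else (lo, zeta)
P = np.maximum(0.0, zeta - Vhat)

A = np.diag(np.sqrt(np.maximum(1.0, zeta / Vhat)))   # (78), "+" branch, U = I
Khat = A @ np.diag(P) @ np.linalg.inv(np.diag(P) + np.diag(Vhat))  # (23)

# Closed loop (77): x~_{k+1} = A x~_k + u_k, y'_k = x~_k (since C = I)
steps, burn = 200_000, 1_000
x = rng.standard_normal(3)
acc, cnt = 0.0, 0
for k in range(steps):
    v = rng.standard_normal(3) * np.sqrt(Vhat)
    u = Khat @ (-x + v)          # e'_k = -y'_k + v_k, u_k = Khat e'_k
    if k >= burn:
        acc += x @ x             # running sum of |y'_k|^2
        cnt += 1
    x = A @ x + u
print(acc / cnt, P_bar)          # empirical input power ~ P_bar
```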
IV. CONCLUSION
In this paper, from the perspective of a variant of the Kalman filter, we have obtained lower bounds on the feedback capacity of parallel ACGN channels, together with the accompanying recursive coding schemes in terms of power allocation policies with feedback. Possible future research directions include investigating the tightness of the lower bounds, as well as the special cases in which more explicit solutions (cf. water-filling) to the feedback power allocation policies may be derived.
REFERENCES
[1] T. M. Cover and S. Pombra, "Gaussian feedback capacity," IEEE Transactions on Information Theory, vol. 35, no. 1, pp. 37–43, 1989.
[2] Y.-H. Kim, "Feedback capacity of stationary Gaussian channels," IEEE Transactions on Information Theory, vol. 56, no. 1, pp. 57–85, 2010.
[3] Y.-H. Kim, "Feedback capacity of the first-order moving average Gaussian channel," IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 3063–3079, 2006.
[4] Y.-H. Kim, "Gaussian feedback capacity," Ph.D. dissertation, Stanford University, 2006.
[5] E. Ardestanizadeh and M. Franceschetti, "Control-theoretic approach to communication with feedback," IEEE Transactions on Automatic Control, vol. 57, no. 10, pp. 2576–2587, 2012.
[6] J. Liu and N. Elia, "Convergence of fundamental limitations in feedback communication, estimation, and feedback control over Gaussian channels," Communications in Information and Systems, vol. 14, no. 3, pp. 161–211, 2014.
[7] J. Liu, N. Elia, and S. Tatikonda, "Capacity-achieving feedback schemes for Gaussian finite-state Markov channels with channel state information," IEEE Transactions on Information Theory, vol. 61, no. 7, pp. 3632–3650, 2015.
[8] P. A. Stavrou, C. D. Charalambous, and C. K. Kourtellaris, "Sequential necessary and sufficient conditions for capacity achieving distributions of channels with memory and feedback," IEEE Transactions on Information Theory, vol. 63, no. 11, pp. 7095–7115, 2017.
[9] T. Liu and G. Han, "Feedback capacity of stationary Gaussian channels further examined," IEEE Transactions on Information Theory, vol. 65, no. 4, pp. 2492–2506, 2018.
[10] C. Li and N. Elia, "Youla coding and computation of Gaussian feedback capacity," IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 3197–3215, 2018.
[11] A. Rawat, N. Elia, and C. Li, "Computation of feedback capacity of single user multi-antenna stationary Gaussian channel," in Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, 2018, pp. 1128–1135.
[12] A. R. Pedram and T. Tanaka, "Some results on the computation of feedback capacity of Gaussian channels with memory," in Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, 2018, pp. 919–926.
[13] C. K. Kourtellaris and C. D. Charalambous, "Information structures of capacity achieving distributions for feedback channels with memory and transmission cost: Stochastic optimal control & variational equalities," IEEE Transactions on Information Theory, vol. 64, no. 7, pp. 4962–4992, 2018.
[14] A. Gattami, "Feedback capacity of Gaussian channels revisited," IEEE Transactions on Information Theory, vol. 65, no. 3, pp. 1948–1960, 2018.
[15] S. Ihara, "On the feedback capacity of the first-order moving average Gaussian channel," Japanese Journal of Statistics and Data Science, pp. 1–16, 2019.
[16] S. Fang and Q. Zhu, "A connection between feedback capacity and Kalman filter for colored Gaussian noises," in Proceedings of the IEEE International Symposium on Information Theory, 2020, pp. 2055–2060.
[17] Z. Aharoni, D. Tsur, Z. Goldfeld, and H. H. Permuter, "Capacity of continuous channels with memory via directed information neural estimator," in Proceedings of the IEEE International Symposium on Information Theory, 2020, pp. 2014–2019.
[18] N. Elia, "When Bode meets Shannon: Control-oriented feedback communication schemes," IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1477–1488, 2004.
[19] S. Yang, A. Kavcic, and S. Tatikonda, "On the feedback capacity of power-constrained Gaussian noise channels with memory," IEEE Transactions on Information Theory, vol. 53, no. 3, pp. 929–954, 2007.
[20] S. Tatikonda and S. Mitter, "The capacity of channels with feedback," IEEE Transactions on Information Theory, vol. 55, no. 1, pp. 323–349, 2009.
[21] S. Fang, J. Chen, and H. Ishii, Towards Integrating Control and Information Theories: From Information-Theoretic Measures to Control Performance Limitations. Springer, 2017.
[22] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 2002.
[23] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2006.
[24] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation. Prentice Hall, 2000.
[25] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Prentice-Hall, 1979.
[26] K. J. Åström and R. M. Murray, Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, 2010.
[27] P. P. Vaidyanathan, The Theory of Linear Prediction. Morgan & Claypool Publishers, 2007.
[28] M. M. Seron, J. H. Braslavsky, and G. C. Goodwin, Fundamental Limitations in Filtering and Control. Springer, 1997.
[29] S. Fang, H. Ishii, and J. Chen, "An integral characterization of optimal error covariance by Kalman filtering," in Proceedings of the American Control Conference, 2018, pp. 5031–5036.
[30] S. L. Fong and V. Y. Tan, "A tight upper bound on the second-order coding rate of the parallel Gaussian channel with feedback," IEEE Transactions on Information Theory, vol. 63, no. 10, pp. 6474–6486, 2017.