Feedback Capacity of Parallel ACGN Channels and Kalman Filter: Power Allocation with Feedback
Song Fang and Quanyan Zhu
Department of Electrical and Computer Engineering, New York University, New York, USA
Email: {song.fang, quanyan.zhu}@nyu.edu
Abstract—In this paper, we relate the feedback capacity of parallel additive colored Gaussian noise (ACGN) channels to a variant of the Kalman filter. By doing so, we obtain lower bounds on the feedback capacity of such channels, as well as the corresponding feedback (recursive) coding schemes, which are essentially power allocation policies with feedback, to achieve the bounds. The results are seen to reduce to existing lower bounds in the case of a single ACGN feedback channel, whereas when it comes to parallel additive white Gaussian noise (AWGN) channels with feedback, the recursive coding scheme reduces to a feedback "water-filling" power allocation policy.
I. INTRODUCTION
The feedback capacity [1] of additive colored Gaussian noise (ACGN) channels has been a long-standing problem in information theory, generating numerous research papers over the years, due to its significance in understanding and applying communication/coding with feedback. In general, we refer to the breakthrough paper [2] and the references therein for a rather complete literature review; see also [3], [4] for possibly complementary surveys. Meanwhile, papers on this topic have continued to appear after [2], including but certainly not restricted to [5]–[17]. Most of the aforementioned works, however, focused merely on the feedback capacity of a single ACGN channel, so to speak, whereas when it comes to parallel ACGN channels, the corresponding results have been lacking in general. One exception is the recent paper [11], which generalized the computational approach of [10] to multi-antenna ACGN channels.

In this paper, we establish a connection between a parallel of ACGN channels with feedback and a variant of the multi-input multi-output (MIMO) Kalman filter for colored Gaussian noises. In light of this, we obtain lower bounds on the feedback capacity of parallel ACGN channels by examining the algebraic Riccati equations associated with the Kalman filter. Meanwhile, the Kalman filtering systems, which are essentially feedback (closed-loop) systems, naturally provide recursive coding schemes, in terms of feedback power allocation policies, to achieve the lower bounds. In addition, the lower bounds are shown to be consistent with existing feedback capacity results in the case of a single ACGN channel. It is also seen that in the special case of parallel additive white Gaussian noise (AWGN) channels, the recursive coding reduces to a feedback "water-filling" solution.

In a broad sense, in this paper we adopt a control-theoretic approach towards this problem, as in, e.g., [5]–[7], [10]–[14], [16], [18]–[20]; see also [21] and the references therein. Note also that although the organization of this paper resembles that of [16] to a certain extent, the results are not trivial generalizations of those therein, as evidenced by the results themselves as well as their proofs.

The rest of the paper is organized as follows. Section II provides the preliminary background on feedback capacity and the Kalman filter. In Section III, we present the main results of this paper. Concluding remarks are given in Section IV.

II. PRELIMINARIES
In this paper, we consider real-valued continuous random variables and the discrete-time stochastic processes they compose. All random variables and stochastic processes are assumed to be zero-mean for simplicity and without loss of generality. We represent random variables using boldface letters. The logarithm is defined with base 2. A stochastic process $\{\mathbf{x}_k\}$ is said to be asymptotically stationary if it is stationary as $k \to \infty$, and herein stationarity means strict stationarity [22]. Note in particular that, for simplicity and with abuse of notation, we utilize $\mathbf{x} \in \mathbb{R}$ and $\mathbf{x} \in \mathbb{R}^n$ to indicate that $\mathbf{x}$ is a real-valued random variable and that $\mathbf{x}$ is a real-valued $n$-dimensional random vector, respectively. The following definitions of entropy and entropy rate are adapted from, e.g., [23].
Definition 1: The differential entropy of a random vector $\mathbf{x}$ with density $p_{\mathbf{x}}(x)$ is defined as
$$h(\mathbf{x}) = -\int p_{\mathbf{x}}(x) \log p_{\mathbf{x}}(x)\,\mathrm{d}x.$$
The entropy rate of a stochastic process $\{\mathbf{x}_k\}$ is defined as
$$h_{\infty}(\mathbf{x}) = \limsup_{k \to \infty} \frac{h(\mathbf{x}_0, \ldots, \mathbf{x}_k)}{k+1}.$$

A. Feedback Capacity
Consider a parallel of $n$ additive colored Gaussian noise channels given by $\mathbf{y}_k = \mathbf{x}_k + \mathbf{z}_k$, where $\{\mathbf{x}_k\}$, $\mathbf{x}_k \in \mathbb{R}^n$, denotes the channel input, $\{\mathbf{y}_k\}$, $\mathbf{y}_k \in \mathbb{R}^n$, denotes the channel output, and $\{\mathbf{z}_k\}$, $\mathbf{z}_k \in \mathbb{R}^n$, denotes the additive noise, which is assumed to be stationary colored Gaussian. The feedback capacity $C_f$ of such a channel with power constraint $\bar{P}$ is given by [1], [23]
$$C_f = \sup_{\lim_{k \to \infty} \frac{1}{k+1} \sum_{i=0}^{k} \mathrm{tr}\,\mathbb{E}\left[\mathbf{x}_i \mathbf{x}_i^{\mathrm{T}}\right] \le \bar{P}} \left[h_{\infty}(\mathbf{y}) - h_{\infty}(\mathbf{z})\right]. \tag{1}$$
Fig. 1. The Kalman filtering system.
It is worth mentioning that in the case of $n = 1$, it was shown in [2] that the optimal channel input process $\{\mathbf{x}_k\}$ is stationary and of the form
$$\mathbf{x}_k = \sum_{i=1}^{\infty} b_i \mathbf{z}_{k-i}, \quad b_i \in \mathbb{R},$$
while satisfying $\mathbb{E}\left[\mathbf{x}_k^2\right] \le \bar{P}$. This has also been generalized to the case of $n$ parallel channels by [11]. For the purpose of this paper, however, it suffices to consider the original definition in (1).

B. Kalman Filter
We now give a brief review of (a special case of) the MIMO Kalman filter [24], [25]; note that hereinafter the notations are not to be confused with those in Section II-A. Particularly, consider the Kalman filtering system depicted in Fig. 1, where the state-space model of the plant to be estimated is given by
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \mathbf{y}_k = C\mathbf{x}_k + \mathbf{v}_k. \end{cases} \tag{2}$$
Herein, $\mathbf{x}_k \in \mathbb{R}^n$ is the state to be estimated, $\mathbf{y}_k \in \mathbb{R}^n$ is the plant output, and $\mathbf{v}_k \in \mathbb{R}^n$ is the measurement noise, whereas the process noise, normally denoted as $\{\mathbf{w}_k\}$ [24], [25], is assumed to be absent. The system matrix is $A \in \mathbb{R}^{n \times n}$ while the output matrix is $C \in \mathbb{R}^{n \times n}$, and we assume that $A$ is anti-stable (i.e., all its eigenvalues are unstable, with magnitudes greater than or equal to $1$) while the pair $(A, C)$ is observable (and thus detectable [26]). Suppose that $\{\mathbf{v}_k\}$ is white Gaussian with covariance $V = \mathbb{E}[\mathbf{v}_k \mathbf{v}_k^{\mathrm{T}}] \succ 0$ and the initial state $\mathbf{x}_0$ is Gaussian with covariance $\mathbb{E}[\mathbf{x}_0 \mathbf{x}_0^{\mathrm{T}}] \succ 0$. Furthermore, $\{\mathbf{v}_k\}$ and $\mathbf{x}_0$ are assumed to be uncorrelated. Correspondingly, the Kalman filter (in the observer form [26]) for (2) is given by
$$\begin{cases} \hat{\mathbf{x}}_{k+1} = A\hat{\mathbf{x}}_k + \mathbf{u}_k, \\ \hat{\mathbf{y}}_k = C\hat{\mathbf{x}}_k, \\ \mathbf{e}_k = \mathbf{y}_k - \hat{\mathbf{y}}_k, \\ \mathbf{u}_k = K_k \mathbf{e}_k, \end{cases} \tag{3}$$
where $\hat{\mathbf{x}}_k \in \mathbb{R}^n$, $\hat{\mathbf{y}}_k \in \mathbb{R}^n$, $\mathbf{e}_k \in \mathbb{R}^n$, and $\mathbf{u}_k \in \mathbb{R}^n$. Herein, $K_k$ denotes the observer gain [26] (note that the observer gain differs from the Kalman gain by a factor of $A$; see, e.g., [25], [26] for more details) given by
$$K_k = AP_kC^{\mathrm{T}}\left(CP_kC^{\mathrm{T}} + V\right)^{-1},$$
where $P_k$ denotes the state estimation error covariance
$$P_k = \mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right].$$
In addition, $P_k$ can be obtained iteratively by the Riccati equation
$$P_{k+1} = AP_kA^{\mathrm{T}} - AP_kC^{\mathrm{T}}\left(CP_kC^{\mathrm{T}} + V\right)^{-1}CP_kA^{\mathrm{T}},$$
with $P_0 = \mathbb{E}[\mathbf{x}_0\mathbf{x}_0^{\mathrm{T}}] \succ 0$. Additionally, it is known [24], [25] that when $(A, C)$ is detectable, the Kalman filtering system converges, i.e., the state estimation error $\{\mathbf{x}_k - \hat{\mathbf{x}}_k\}$ is asymptotically stationary. Moreover, in steady state, the optimal state estimation error covariance
$$P = \lim_{k \to \infty} \mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right]$$
attained by the Kalman filter is given by the (non-zero) positive semi-definite solution [25] to the algebraic Riccati equation
$$P = APA^{\mathrm{T}} - APC^{\mathrm{T}}\left(CPC^{\mathrm{T}} + V\right)^{-1}CPA^{\mathrm{T}}, \tag{4}$$
whereas the steady-state observer gain is given by
$$K = APC^{\mathrm{T}}\left(CPC^{\mathrm{T}} + V\right)^{-1}. \tag{5}$$
In fact, by letting $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k$ and $\tilde{\mathbf{y}}_k = \hat{\mathbf{y}}_k - \mathbf{z}_k = \hat{\mathbf{y}}_k - C\mathbf{x}_k$, we may integrate the steady-state systems of (2) and (3) into an equivalent form:
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \tilde{\mathbf{y}}_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}_k = -\tilde{\mathbf{y}}_k + \mathbf{v}_k, \\ \mathbf{u}_k = K\mathbf{e}_k, \end{cases} \tag{6}$$
as depicted in Fig. 2, since all the sub-systems are linear.
Fig. 2. The steady-state Kalman filtering system in integrated form.
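The steady-state quantities above are straightforward to compute numerically. The following is a minimal sketch (not from the paper) that iterates the Riccati recursion to its fixed point and recovers $P$ of (4) and the observer gain $K$ of (5); the matrices `A`, `C`, `V` are hypothetical placeholder values satisfying the standing assumptions (anti-stable $A$, observable $(A, C)$, $V \succ 0$).

```python
import numpy as np

def steady_state_kalman(A, C, V, iters=500):
    """Iterate the Riccati recursion until it (numerically) reaches the
    fixed point P of (4), then form the observer gain K of (5)."""
    P = np.eye(A.shape[0])  # any P0 > 0
    for _ in range(iters):
        S = C @ P @ C.T + V
        P = A @ P @ A.T - A @ P @ C.T @ np.linalg.solve(S, C @ P @ A.T)
    K = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + V)
    return P, K

# Hypothetical anti-stable A (|eigenvalues| >= 1), observable (A, C), V > 0
A = np.diag([1.5, -1.2])
C = np.eye(2)
V = np.diag([1.0, 0.5])
P, K = steady_state_kalman(A, C, V)

# P should solve the algebraic Riccati equation (4) ...
res = P - (A @ P @ A.T - A @ P @ C.T
           @ np.linalg.inv(C @ P @ C.T + V) @ C @ P @ A.T)
print(np.allclose(res, 0))                    # True
# ... and the closed loop A - K C should be stable (all |eig| < 1)
print(np.abs(np.linalg.eigvals(A - K @ C)))
```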
III. LOWER BOUNDS ON FEEDBACK CAPACITY OF PARALLEL ACGN CHANNELS AND RECURSIVE CODING
The approach we take in this paper to obtain lower bounds on the feedback capacity of parallel ACGN channels is to establish a connection between a parallel of ACGN channels with feedback and a variant of the Kalman filter for colored Gaussian noises. Towards this end, we first present the following variant of the Kalman filter.

A. A Variant of the Kalman Filter
Consider again the Kalman filtering system given in Fig. 1. Suppose that the plant to be estimated is still given by
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \mathbf{y}_k = C\mathbf{x}_k + \mathbf{v}_k, \end{cases} \tag{7}$$
only this time with an auto-regressive moving average (ARMA) colored Gaussian measurement noise $\{\mathbf{v}_k\}$, $\mathbf{v}_k \in \mathbb{R}^n$, represented as
$$\mathbf{v}_k = \sum_{i=1}^{p} F_i \mathbf{v}_{k-i} + \hat{\mathbf{v}}_k + \sum_{j=1}^{q} G_j \hat{\mathbf{v}}_{k-j}, \tag{8}$$
where $\{\hat{\mathbf{v}}_k\}$, $\hat{\mathbf{v}}_k \in \mathbb{R}^n$, is white Gaussian with covariance $\hat{V} = \mathbb{E}[\hat{\mathbf{v}}_k \hat{\mathbf{v}}_k^{\mathrm{T}}] \succ 0$. Equivalently, $\{\mathbf{v}_k\}$ may be represented [27] as the output of a linear time-invariant (LTI) filter $F(z)$ driven by the input $\{\hat{\mathbf{v}}_k\}$, where
$$F(z) = \left(I - \sum_{i=1}^{p} F_i z^{-i}\right)^{-1}\left(I + \sum_{j=1}^{q} G_j z^{-j}\right). \tag{9}$$
Herein, we assume that $F(z)$ is stable and minimum-phase. We may now generalize the method of dealing with colored noises as employed in [16] (which in turn was developed based on [25]; see the detailed discussions in [16]) to the case of MIMO Kalman filtering systems.
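As a concrete illustration of the noise model (8), the sketch below simulates a sample path of $\{\mathbf{v}_k\}$ from a white driving sequence $\{\hat{\mathbf{v}}_k\}$. It is a minimal sketch under hypothetical coefficients `F1`, `G1` chosen so that $F(z)$ in (9) is stable and minimum-phase.

```python
import numpy as np

def arma_noise(F, G, Vhat, steps, seed=0):
    """Simulate (8): v_k = sum_i F_i v_{k-i} + vhat_k + sum_j G_j vhat_{k-j},
    with {vhat_k} white Gaussian of covariance Vhat."""
    rng = np.random.default_rng(seed)
    n = Vhat.shape[0]
    L = np.linalg.cholesky(Vhat)              # Vhat = L L^T
    vhat = rng.standard_normal((steps, n)) @ L.T
    v = np.zeros((steps, n))
    for k in range(steps):
        v[k] = vhat[k]
        for i, Fi in enumerate(F, start=1):   # AR part
            if k - i >= 0:
                v[k] += Fi @ v[k - i]
        for j, Gj in enumerate(G, start=1):   # MA part
            if k - j >= 0:
                v[k] += Gj @ vhat[k - j]
    return v, vhat

# Hypothetical coefficients: F(z) stable and minimum-phase (|0.3|, |0.2| < 1)
F1 = [0.3 * np.eye(2)]
G1 = [0.2 * np.eye(2)]
v, vhat = arma_noise(F1, G1, np.eye(2), steps=10_000)
```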
Proposition 1: Suppose that $A = T\Lambda T^{-1}$, where
$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n), \tag{10}$$
and $|\lambda_\ell| \ge 1$, $\ell = 1, \ldots, n$. Denote
$$\hat{\mathbf{y}}_k = -\sum_{j=1}^{q} G_j \hat{\mathbf{y}}_{k-j} + \mathbf{y}_k - \sum_{i=1}^{p} F_i \mathbf{y}_{k-i}. \tag{11}$$
Then, (7) is equivalent to
$$\begin{cases} \mathbf{x}_{k+1} = A\mathbf{x}_k, \\ \hat{\mathbf{y}}_k = \hat{C}\mathbf{x}_k + \hat{\mathbf{v}}_k, \end{cases} \tag{12}$$
where
$$\hat{C} = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}} \otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}} \otimes I_{n\times n}\right)\mathrm{vec}(C)\Bigg]. \tag{13}$$
Proof: Note first that since $F(z)$ is stable and minimum-phase, the inverse filter
$$F^{-1}(z) = \left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1}$$
is also stable and minimum-phase. As a result, it holds for all $|z| \ge 1$ that
$$\left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1} \neq 0,$$
i.e., the region of convergence must include, though is not necessarily restricted to, $|z| \ge 1$. Consequently, for $|z| \ge 1$, we may expand
$$\left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1} = I - \sum_{i=1}^{\infty} H_i z^{-i},$$
and thus $\{\hat{\mathbf{v}}_k\}$ can be reconstructed from $\{\mathbf{v}_k\}$ as [27]
$$\hat{\mathbf{v}}_k = \mathbf{v}_k - \sum_{i=1}^{\infty}H_i\mathbf{v}_{k-i} = -\sum_{j=1}^{q}G_j\hat{\mathbf{v}}_{k-j} + \mathbf{v}_k - \sum_{i=1}^{p}F_i\mathbf{v}_{k-i}.$$
Accordingly, we may also rewrite
$$\hat{\mathbf{y}}_k = -\sum_{j=1}^{q}G_j\hat{\mathbf{y}}_{k-j} + \mathbf{y}_k - \sum_{i=1}^{p}F_i\mathbf{y}_{k-i} = \mathbf{y}_k - \sum_{i=1}^{\infty}H_i\mathbf{y}_{k-i} = \mathbf{y}_k - \sum_{i=1}^{\infty}H_i\left(C\mathbf{x}_{k-i} + \mathbf{v}_{k-i}\right) = C\mathbf{x}_k - \sum_{i=1}^{\infty}H_iC\mathbf{x}_{k-i} + \hat{\mathbf{v}}_k.$$
Meanwhile, since $A$ is anti-stable (and thus invertible), we have $\mathbf{x}_{k-i} = A^{-i}\mathbf{x}_k$. As a result,
$$\hat{\mathbf{y}}_k = C\mathbf{x}_k - \sum_{i=1}^{\infty}H_iC\mathbf{x}_{k-i} + \hat{\mathbf{v}}_k = \left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\mathbf{x}_k + \hat{\mathbf{v}}_k.$$
In addition,
$$\mathrm{vec}\left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right) = \mathrm{vec}(C) - \sum_{i=1}^{\infty}\left[\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C) = \left[I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C),$$
and hence
$$C - \sum_{i=1}^{\infty}H_iCA^{-i} = \mathrm{vec}^{-1}_{n\times n}\left\{\left[I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i\right]\mathrm{vec}(C)\right\}.$$
Note then that
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i = I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(T^{-\mathrm{T}}\Lambda^{-i}T^{\mathrm{T}}\right)\otimes\left(I_{n\times n}H_iI_{n\times n}\right) = I_{n^2\times n^2} - \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(\sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Moreover, since $T^{\mathrm{T}}$ is invertible and the eigenvalues of $T^{\mathrm{T}}\otimes I_{n\times n}$ are given by $n$ copies of the eigenvalues of $T^{\mathrm{T}}$, it follows that $T^{\mathrm{T}}\otimes I_{n\times n}$ is invertible and
$$I_{n^2\times n^2} = \left(T^{\mathrm{T}}\otimes I_{n\times n}\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right) = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Accordingly,
$$I_{n^2\times n^2} - \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(\sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right) = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i\right)\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
In addition,
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\Lambda^{-i}\otimes H_i = \mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{\infty}\lambda_1^{-i}H_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{\infty}\lambda_n^{-i}H_i\right).$$
Meanwhile, we have already shown that for all $|z| \ge 1$,
$$I_{n\times n} - \sum_{i=1}^{\infty}H_i z^{-i} = \left(I_{n\times n} - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I_{n\times n} + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1},$$
i.e., $I_{n\times n} - \sum_{i=1}^{\infty}H_i z^{-i}$ converges. As such, since $|\lambda_\ell| \ge 1$, $\ell = 1,\ldots,n$, we have
$$I_{n\times n} - \sum_{i=1}^{\infty}\lambda_\ell^{-i}H_i = \left(I_{n\times n} - \sum_{i=1}^{p}\lambda_\ell^{-i}F_i\right)\left(I_{n\times n} + \sum_{j=1}^{q}\lambda_\ell^{-j}G_j\right)^{-1}.$$
Therefore,
$$\mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{\infty}\lambda_1^{-i}H_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{\infty}\lambda_n^{-i}H_i\right) = \mathrm{diag}\left(I_{n\times n} - \sum_{i=1}^{p}\lambda_1^{-i}F_i,\ \ldots,\ I_{n\times n} - \sum_{i=1}^{p}\lambda_n^{-i}F_i\right)\mathrm{diag}\left(I_{n\times n} + \sum_{j=1}^{q}\lambda_1^{-j}G_j,\ \ldots,\ I_{n\times n} + \sum_{j=1}^{q}\lambda_n^{-j}G_j\right)^{-1} = \left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}.$$
Fig. 3. The steady-state integrated Kalman filter for colored noises.
As a result,
$$I_{n^2\times n^2} - \sum_{i=1}^{\infty}\left(A^{-i}\right)^{\mathrm{T}}\otimes H_i = \left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
To sum up, we have
$$C - \sum_{i=1}^{\infty}H_iCA^{-i} = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(C)\Bigg].$$
This completes the proof.

Meanwhile, the Kalman filter for (12) is given by
$$\begin{cases} \hat{\mathbf{x}}_{k+1} = A\hat{\mathbf{x}}_k + \mathbf{u}_k, \\ \bar{\mathbf{y}}_k = \hat{C}\hat{\mathbf{x}}_k, \\ \mathbf{e}_k = \hat{\mathbf{y}}_k - \bar{\mathbf{y}}_k, \\ \mathbf{u}_k = \hat{K}_k\mathbf{e}_k, \end{cases} \tag{14}$$
where $\hat{\mathbf{x}}_k \in \mathbb{R}^n$, $\bar{\mathbf{y}}_k \in \mathbb{R}^n$, $\mathbf{e}_k \in \mathbb{R}^n$, and $\mathbf{u}_k \in \mathbb{R}^n$. Furthermore, when $(A, \hat{C})$ is detectable, the Kalman filtering system converges, i.e., the state estimation error $\{\mathbf{x}_k - \hat{\mathbf{x}}_k\}$ is asymptotically stationary. Moreover, in steady state, the optimal state estimation error covariance
$$P = \lim_{k\to\infty}\mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k - \hat{\mathbf{x}}_k)^{\mathrm{T}}\right]$$
attained by the Kalman filter is given by the (non-zero) positive semi-definite solution to the algebraic Riccati equation
$$P = APA^{\mathrm{T}} - AP\hat{C}^{\mathrm{T}}\left(\hat{C}P\hat{C}^{\mathrm{T}} + \hat{V}\right)^{-1}\hat{C}PA^{\mathrm{T}}, \tag{15}$$
whereas the steady-state observer gain is given by
$$\hat{K} = AP\hat{C}^{\mathrm{T}}\left(\hat{C}P\hat{C}^{\mathrm{T}} + \hat{V}\right)^{-1}. \tag{16}$$
Again, by letting $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k$ and $\tilde{\mathbf{y}}_k = \bar{\mathbf{y}}_k - \hat{\mathbf{z}}_k = \bar{\mathbf{y}}_k - \hat{C}\mathbf{x}_k$, we may integrate the steady-state systems of (12) and (14) into
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \tilde{\mathbf{y}}_k = \hat{C}\tilde{\mathbf{x}}_k, \\ \mathbf{e}_k = -\tilde{\mathbf{y}}_k + \hat{\mathbf{v}}_k, \\ \mathbf{u}_k = \hat{K}\mathbf{e}_k, \end{cases} \tag{17}$$
as depicted in Fig. 3. In addition, it may be verified that the closed-loop system given in (17) and Fig. 3 is stable [25], [26].
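To make Proposition 1 concrete, the following sketch numerically cross-checks the closed-form expression (13) against the defining (truncated) series $\hat{C} = C - \sum_{i=1}^{\infty} H_i C A^{-i}$ for a first-order ARMA noise ($p = q = 1$). All matrices are hypothetical test values; `vec` is column stacking, so that $\mathrm{vec}(H C A^{-i}) = ((A^{-i})^{\mathrm{T}} \otimes H)\,\mathrm{vec}(C)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
lam = np.array([1.5, -1.2])                  # |lambda_l| >= 1
T = rng.standard_normal((n, n))
A = T @ np.diag(lam) @ np.linalg.inv(T)      # anti-stable A = T Lam T^{-1}
C = rng.standard_normal((n, n))
F1, G1 = 0.3 * np.eye(n), 0.2 * np.eye(n)    # hypothetical ARMA coefficients

I2 = np.eye(n * n)
Lam_inv = np.diag(1.0 / lam)
Tinv = np.linalg.inv(T)

# Closed-form (13): Chat = vec^{-1}[(T^{-T} (x) I)(I - Lam^{-1} (x) F1)
#                                   (I + Lam^{-1} (x) G1)^{-1}(T^T (x) I) vec(C)]
M = (np.kron(Tinv.T, np.eye(n))
     @ (I2 - np.kron(Lam_inv, F1))
     @ np.linalg.inv(I2 + np.kron(Lam_inv, G1))
     @ np.kron(T.T, np.eye(n)))
C_hat = (M @ C.flatten(order="F")).reshape((n, n), order="F")

# Series check: Chat = C - sum_i H_i C A^{-i}, with H_i the expansion
# coefficients of I - F^{-1}(z) when p = q = 1
C_series = C.copy()
Ainv = np.linalg.inv(A)
for i in range(1, 60):
    Hi = (F1 @ np.linalg.matrix_power(-G1, i - 1)
          - np.linalg.matrix_power(-G1, i))
    C_series -= Hi @ C @ np.linalg.matrix_power(Ainv, i)

print(np.allclose(C_hat, C_series))          # True
```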
In fact, we may design the matrix $C$ specifically to render the matrix $\hat{C}$ an identity matrix.

Proposition 2: Suppose that $A = T\Lambda T^{-1}$, where
$$\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n), \tag{18}$$
and $|\lambda_\ell| \ge 1$, $\ell = 1,\ldots,n$. If
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{19}$$
then
$$\hat{C} = I_{n\times n}. \tag{20}$$
Proof: It is known from the proof of Proposition 1 that
$$\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right)$$
is invertible, and it can be verified that its inverse is given by
$$\left(T^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(T^{\mathrm{T}}\otimes I_{n\times n}\right).$$
Hence, if (19) holds, then substituting (19) into (13) multiplies this matrix by its inverse, and
$$\hat{C} = \mathrm{vec}^{-1}_{n\times n}\left[\mathrm{vec}(I_{n\times n})\right] = I_{n\times n}.$$
This completes the proof.

Note that when $\hat{C} = I_{n\times n}$, the pair $(A, \hat{C})$ is always observable (and thus always detectable [26]). To see this, note that if $\hat{C} = I_{n\times n}$, then the observability matrix for $(A, \hat{C})$ becomes
$$\begin{bmatrix}\hat{C} \\ \hat{C}A \\ \hat{C}A^2 \\ \vdots \\ \hat{C}A^{n-1}\end{bmatrix} = \begin{bmatrix}I_{n\times n} \\ A \\ A^2 \\ \vdots \\ A^{n-1}\end{bmatrix}, \tag{21}$$
which has rank $n$, indicating that $(A, \hat{C})$ is observable [26], regardless of what $A$ is. As such, in this case the Kalman filtering system always converges, whereas (15) reduces to
$$P = APA^{\mathrm{T}} - AP\left(P + \hat{V}\right)^{-1}PA^{\mathrm{T}}, \tag{22}$$
and (16) reduces to
$$\hat{K} = AP\left(P + \hat{V}\right)^{-1}. \tag{23}$$
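Under the design of Proposition 2, the reduced equation (22) can again be solved by a simple fixed-point iteration, now yielding the gain $\hat{K}$ of (23); a minimal sketch with hypothetical $A$ and $\hat{V}$:

```python
import numpy as np

def reduced_riccati(A, Vhat, iters=500):
    """Fixed-point iteration of (22) (the Chat = I case), plus the gain (23)."""
    P = np.eye(A.shape[0])
    for _ in range(iters):
        P = A @ (P - P @ np.linalg.inv(P + Vhat) @ P) @ A.T
    K_hat = A @ P @ np.linalg.inv(P + Vhat)
    return P, K_hat

A = np.diag([1.5, -1.2])        # anti-stable, hypothetical
Vhat = np.diag([1.0, 0.5])
P, K_hat = reduced_riccati(A, Vhat)
res = P - (A @ P @ A.T - A @ P @ np.linalg.inv(P + Vhat) @ P @ A.T)
print(np.allclose(res, 0))      # True: P solves (22)
```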
B. Feedback Capacity of Parallel ACGN Channels

We now proceed to obtain lower bounds on the feedback capacity, as well as the corresponding recursive coding schemes, based upon the discussions in the previous sub-section. We first examine the solution to the algebraic Riccati equation given by (15) when $A$ and $C$ are designed specifically.
Theorem 1: Suppose that in (8), $\hat{V} = U_{\hat{v}}\Lambda_{\hat{v}}U_{\hat{v}}^{\mathrm{T}}$, where
$$\Lambda_{\hat{v}} = \mathrm{diag}\left(\hat{V}_1, \ldots, \hat{V}_n\right), \tag{24}$$
and $U_{\hat{v}}$ is an orthogonal matrix. If
$$A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}, \tag{25}$$
and
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(U_{\hat{v}}^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(U_{\hat{v}}^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{26}$$
where
$$\Lambda = \pm\left[\left(\Lambda_{\tilde{x}} + \Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]^{\frac{1}{2}}, \tag{27}$$
and
$$\Lambda_{\tilde{x}} = \mathrm{diag}(P_1, \ldots, P_n) \succeq 0, \quad \Lambda_{\tilde{x}} \neq 0, \tag{28}$$
then the (non-zero) positive semi-definite solution to (15) is given by
$$P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}. \tag{29}$$
Proof: Note first that the eigenvalues of
$$A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}} = \pm U_{\hat{v}}\left[\left(\Lambda_{\tilde{x}} + \Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]^{\frac{1}{2}}U_{\hat{v}}^{\mathrm{T}}$$
are given by
$$\lambda_\ell = \pm\sqrt{\frac{P_\ell + \hat{V}_\ell}{\hat{V}_\ell}}, \quad \ell = 1,\ldots,n.$$
Then, since $\mathrm{diag}(\hat{V}_1,\ldots,\hat{V}_n) \succ 0$ and $\mathrm{diag}(P_1,\ldots,P_n) \succeq 0$, we have
$$|\lambda_\ell| = \left|\sqrt{\frac{P_\ell + \hat{V}_\ell}{\hat{V}_\ell}}\right| \ge 1, \quad \ell = 1,\ldots,n.$$
Therefore, according to Propositions 1 and 2, when $C$ is given by (26), it follows that $\hat{C} = I_{n\times n}$ while (22) holds. On the other hand, it may be verified that $P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}$ is the (non-zero) positive semi-definite solution to (22), as
$$\begin{aligned}
&U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}} - AU_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}A^{\mathrm{T}} + AU_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}\left(U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}} + \hat{V}\right)^{-1}U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}A^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\Lambda + \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\tilde{x}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda + \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\tilde{x}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\Lambda\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda^2\Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda_{\tilde{x}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)^{-1}\Lambda_{\hat{v}}\left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}\right]U_{\hat{v}}^{\mathrm{T}} \\
&\quad = U_{\hat{v}}\left[\Lambda_{\tilde{x}} - \Lambda_{\tilde{x}}\right]U_{\hat{v}}^{\mathrm{T}} = 0,
\end{aligned}$$
where we have used the facts that diagonal matrices commute and that $\Lambda^2 = \left(\Lambda_{\tilde{x}}+\Lambda_{\hat{v}}\right)\Lambda_{\hat{v}}^{-1}$. (Note also that clearly $P = 0$ is the other positive semi-definite solution to (22), which is not relevant herein though.) This completes the proof.

Note that herein $\Lambda$ and $P$ can respectively be rewritten as
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1}, \ldots, \sqrt{\frac{P_n}{\hat{V}_n}+1}\right), \tag{30}$$
and
$$P = U_{\hat{v}}\,\mathrm{diag}(P_1,\ldots,P_n)\,U_{\hat{v}}^{\mathrm{T}}. \tag{31}$$
Note also that in the special case when $\{\mathbf{v}_k\}$ is a white noise, i.e., when $F_i = G_j = 0$ in (8), (26) reduces to
$$C = I_{n\times n}. \tag{32}$$
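A quick numerical sanity check of Theorem 1 (a sketch, with hypothetical values for the $P_\ell$ and $\hat{V}_\ell$): construct $A$ from (25) and (27) and confirm that $P = U_{\hat{v}}\Lambda_{\tilde{x}}U_{\hat{v}}^{\mathrm{T}}$ has zero residual in the reduced Riccati equation (22).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
P_diag = np.array([2.0, 1.0, 0.5])      # hypothetical error variances P_l
V_diag = np.array([1.0, 0.8, 0.4])      # hypothetical white-noise variances
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal U_vhat

Lam = np.diag(np.sqrt(P_diag / V_diag + 1.0))      # (27)/(30), "+" branch
A = U @ Lam @ U.T                                  # (25)
Vhat = U @ np.diag(V_diag) @ U.T
P = U @ np.diag(P_diag) @ U.T                      # claimed solution (29)

# Residual of the reduced algebraic Riccati equation (22)
res = P - (A @ P @ A.T - A @ P @ np.linalg.inv(P + Vhat) @ P @ A.T)
print(np.allclose(res, 0))   # True
```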
On the other hand, we may obtain an equivalent form of the system in Fig. 3 as given by (17).

Proposition 3: The system in Fig. 3 is equivalent to that in Fig. 4, where $K(z)$ is dynamic and is given by
$$K(z) = F^{-1}(z)\hat{K} = \left(I - \sum_{i=1}^{p}F_i z^{-i}\right)\left(I + \sum_{j=1}^{q}G_j z^{-j}\right)^{-1}\hat{K}. \tag{33}$$
Herein, $\hat{K}$ is given by (16). More specifically, the system in Fig. 4 is given by
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\left(\mathbf{e}'_k - \sum_{i=1}^{p}F_i\mathbf{e}'_{k-i}\right) - \sum_{j=1}^{q}G_j\mathbf{u}_{k-j}, \end{cases} \tag{34}$$
which is stable as a closed-loop system.
Fig. 4. The steady-state integrated Kalman filter for colored noises: equivalent form.
Fig. 5. The steady-state integrated Kalman filter for colored noises: equivalent form 2.
Proof: Note first that the system of Fig. 3 is equivalent to that of Fig. 5, since $\hat{K} = F(z)K(z)$. In addition, it is known from the proof of Proposition 1 that
$$\tilde{\mathbf{y}}_k = \hat{C}\tilde{\mathbf{x}}_k = \left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\tilde{\mathbf{x}}_k.$$
As such, since $\tilde{\mathbf{x}}_k = \hat{\mathbf{x}}_k - \mathbf{x}_k = A\left(\hat{\mathbf{x}}_{k-1} - \mathbf{x}_{k-1}\right) = A\tilde{\mathbf{x}}_{k-1}$, we have
$$\left(C - \sum_{i=1}^{\infty}H_iCA^{-i}\right)\tilde{\mathbf{x}}_k = \left(C - \sum_{i=1}^{\infty}H_iCz^{-i}\right)\tilde{\mathbf{x}}_k = \left(I - \sum_{i=1}^{\infty}H_iz^{-i}\right)C\tilde{\mathbf{x}}_k = \left(I - \sum_{i=1}^{p}F_iz^{-i}\right)\left(I + \sum_{j=1}^{q}G_jz^{-j}\right)^{-1}C\tilde{\mathbf{x}}_k = F^{-1}(z)\,C\tilde{\mathbf{x}}_k.$$
Consequently, the system of Fig. 5 is equivalent to that of Fig. 6. Moreover, since all the sub-systems are linear, the system of Fig. 6 is equivalent to that of Fig. 7, which in turn equals that of Fig. 4; note that herein $F(z)$ is stable and minimum-phase, and thus there will be no issues caused by cancellations of unstable poles and nonminimum-phase zeros. Meanwhile, the closed-loop stability of the system given in (34) and Fig. 4 is the same as that of the system given by (17) and Fig. 3, since they are essentially the same feedback system.
Fig. 6. The steady-state integrated Kalman filter for colored noises: equivalent form 3.
Fig. 7. The steady-state integrated Kalman filter for colored noises: equivalent form 4.
As a matter of fact, in the system of Fig. 4, or equivalently, in the system of Fig. 8, we may view
$$\mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k \tag{35}$$
as a feedback channel [2], [5] with additive colored Gaussian noise $\{\mathbf{v}_k\}$, whereas $\{-\mathbf{y}'_k\}$ is the channel input while $\{\mathbf{e}'_k\}$ is the channel output. Note that in Fig. 8,
$$L(z) = C\left(zI - A\right)^{-1}K(z) \tag{36}$$
may be viewed as a particular class of feedback coding.
Fig. 8. The steady-state integrated Kalman filter for colored noises: equivalent form 5.
On the other hand, with the notations in (35), the feedback capacity is given by (cf. the definition in (1))
$$C_f = \sup_{\lim_{k\to\infty}\frac{1}{k+1}\sum_{i=0}^{k}\mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_i)(-\mathbf{y}'_i)^{\mathrm{T}}\right] \le \bar{P}}\left[h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v})\right]. \tag{37}$$
As such, if $A$ and $C$ are designed specifically as in Theorem 1, then (36) naturally provides a class of sub-optimal feedback coding schemes, by which the corresponding $h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v})$ that can be achieved is thus a lower bound of (37). In this view, the following lower bound on feedback capacity can be obtained.
Theorem 2: Suppose that in (8), $\hat{V} = U_{\hat{v}}\Lambda_{\hat{v}}U_{\hat{v}}^{\mathrm{T}}$, where
$$\Lambda_{\hat{v}} = \mathrm{diag}\left(\hat{V}_1,\ldots,\hat{V}_n\right), \tag{38}$$
and $U_{\hat{v}}$ is an orthogonal matrix. Then, a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1 \ge 0, \ldots, P_n \ge 0} \sum_{\ell=1}^{n} \frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell} + 1\right), \tag{39}$$
where $P_1,\ldots,P_n$ satisfy
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}. \tag{40}$$
Herein,
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(U_{\hat{v}}^{-\mathrm{T}}\otimes I_{n\times n}\right)\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\left(U_{\hat{v}}^{\mathrm{T}}\otimes I_{n\times n}\right)\mathrm{vec}(I_{n\times n})\Bigg], \tag{41}$$
where
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1},\ldots,\sqrt{\frac{P_n}{\hat{V}_n}+1}\right). \tag{42}$$
Proof: To start with, suppose that $A$ and $C$ are specifically designed as in Theorem 1. In this case, it is known from Proposition 3 that the system in (34) is stable. Note then that (34) implies
$$-Y'(z) = -C\left(zI - A\right)^{-1}K(z)\left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1}V(z),$$
and thus
$$-C\left(zI-A\right)^{-1}K(z)\left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1} \tag{43}$$
is stable. Accordingly, since $\{\mathbf{v}_k\}$ is stationary Gaussian, $\{-\mathbf{y}'_k\}$ is also stationary Gaussian. On the other hand, it holds that
$$E'(z) = \left[I + C\left(zI-A\right)^{-1}K(z)\right]^{-1}V(z),$$
and as a consequence (cf. the discussions in [5], [18]),
$$h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v}) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left|\det\left[I + C\left(\mathrm{e}^{\mathrm{j}\omega}I - A\right)^{-1}K\left(\mathrm{e}^{\mathrm{j}\omega}\right)\right]^{-1}\right|\mathrm{d}\omega = \frac{1}{2\pi}\int_{-\pi}^{\pi}\log\left|\det\left[I + C\left(\mathrm{e}^{\mathrm{j}\omega}I-A\right)^{-1}F^{-1}\left(\mathrm{e}^{\mathrm{j}\omega}\right)\hat{K}\right]^{-1}\right|\mathrm{d}\omega = \sum_{\ell=1}^{n}\log|\lambda_\ell| = \sum_{\ell=1}^{n}\log\sqrt{\frac{P_\ell}{\hat{V}_\ell}+1} = \sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right),$$
where the first equality may be referred to [21], [22] while the third equality follows as a result of the Bode integral or Jensen's formula [21], [28]. Note that herein we have used the facts that $F^{-1}(z)$ is stable and minimum-phase, that $(A, C)$ is detectable (thus the set of unstable poles of $C(zI-A)^{-1}K(z)$ is exactly the same as the set of eigenvalues of $A$ with magnitudes greater than or equal to $1$; see, e.g., the discussions in [29]), and that $\left[I + C(zI-A)^{-1}K(z)\right]^{-1}$ is stable. Consequently, according to the definition of feedback capacity given in (37), it holds that
$$C_f \ge h_{\infty}(\mathbf{e}') - h_{\infty}(\mathbf{v}) = \sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right),$$
when the corresponding
$$\lim_{k\to\infty}\frac{1}{k+1}\sum_{i=0}^{k}\mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_i)(-\mathbf{y}'_i)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[(-\mathbf{y}'_k)(-\mathbf{y}'_k)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[\mathbf{y}'_k\left(\mathbf{y}'_k\right)^{\mathrm{T}}\right] = \mathrm{tr}\,\mathbb{E}\left[\left(C\tilde{\mathbf{x}}_k\right)\left(C\tilde{\mathbf{x}}_k\right)^{\mathrm{T}}\right] = \mathrm{tr}\left\{C\,\mathbb{E}\left[(\mathbf{x}_k - \hat{\mathbf{x}}_k)(\mathbf{x}_k-\hat{\mathbf{x}}_k)^{\mathrm{T}}\right]C^{\mathrm{T}}\right\} = \mathrm{tr}\left(CPC^{\mathrm{T}}\right)$$
is no greater than the power constraint $\bar{P}$, i.e., when (see Theorem 1)
$$\mathrm{tr}\left(CPC^{\mathrm{T}}\right) = \mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] \le \bar{P}.$$
Note that herein we have used the fact that $\{-\mathbf{y}'_k\}$ is stationary. In particular, we may pick the allocation $P_1,\ldots,P_n$ that maximizes
$$\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right)$$
while satisfying
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}.$$
This completes the proof.

Note that the solution to (39) is essentially a power allocation policy with feedback. Note also that the lower bound in Theorem 2 is equal to
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{44}$$
where $a_1,\ldots,a_n$ satisfy
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(\left(a_1^2-1\right)\hat{V}_1,\ldots,\left(a_n^2-1\right)\hat{V}_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \bar{P}. \tag{45}$$
Herein, $C$ is given by (41), where
$$\Lambda = \pm\,\mathrm{diag}(a_1,\ldots,a_n). \tag{46}$$
We now consider the case of independent parallel channels.

Corollary 1: Suppose that in (8),
$$\hat{V} = \mathrm{diag}\left(\hat{V}_1,\ldots,\hat{V}_n\right), \tag{47}$$
and
$$F_i = \mathrm{diag}\left(f_{i1},\ldots,f_{in}\right), \quad i = 1,\ldots,p, \tag{48}$$
while
$$G_j = \mathrm{diag}\left(g_{j1},\ldots,g_{jn}\right), \quad j = 1,\ldots,q, \tag{49}$$
which essentially model a parallel of independent ARMA noises. Then, a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1\ge 0,\ldots,P_n\ge 0}\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right), \tag{50}$$
where $P_1,\ldots,P_n$ satisfy
$$\sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2 P_\ell = \bar{P}, \tag{51}$$
or
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2 P_\ell = \bar{P}. \tag{52}$$
Herein,
$$a_\ell = \sqrt{\frac{P_\ell}{\hat{V}_\ell}+1}, \quad \ell = 1,\ldots,n. \tag{53}$$
Proof: Note first that in this case, $U_{\hat{v}} = I$ and hence
$$C = \mathrm{vec}^{-1}_{n\times n}\Bigg[\left(I_{n^2\times n^2} + \sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2} - \sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1}\mathrm{vec}(I_{n\times n})\Bigg],$$
while
$$\mathrm{tr}\left[CU_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}C^{\mathrm{T}}\right] = \mathrm{tr}\left[C\,\mathrm{diag}\left(P_1,\ldots,P_n\right)C^{\mathrm{T}}\right].$$
As such, if $\Lambda = \mathrm{diag}(a_1,\ldots,a_n)$, where $a_1,\ldots,a_n$ are given by (53), then, similarly to the procedures in the proof of Proposition 1, it can be obtained that
$$\left(I_{n^2\times n^2}+\sum_{j=1}^{q}\Lambda^{-j}\otimes G_j\right)\left(I_{n^2\times n^2}-\sum_{i=1}^{p}\Lambda^{-i}\otimes F_i\right)^{-1} = \mathrm{diag}\Bigg[\left(I_{n\times n}+\sum_{j=1}^{q}a_1^{-j}G_j\right)\left(I_{n\times n}-\sum_{i=1}^{p}a_1^{-i}F_i\right)^{-1},\ \ldots,\ \left(I_{n\times n}+\sum_{j=1}^{q}a_n^{-j}G_j\right)\left(I_{n\times n}-\sum_{i=1}^{p}a_n^{-i}F_i\right)^{-1}\Bigg].$$
Meanwhile, since the $F_i$ and $G_j$ are themselves diagonal, it holds for $\ell = 1,\ldots,n$ that
$$I_{n\times n}+\sum_{j=1}^{q}a_\ell^{-j}G_j = \mathrm{diag}\left(1+\sum_{j=1}^{q}a_\ell^{-j}g_{j1},\ \ldots,\ 1+\sum_{j=1}^{q}a_\ell^{-j}g_{jn}\right),$$
and
$$\left(I_{n\times n}-\sum_{i=1}^{p}a_\ell^{-i}F_i\right)^{-1} = \mathrm{diag}\left[\left(1-\sum_{i=1}^{p}a_\ell^{-i}f_{i1}\right)^{-1},\ \ldots,\ \left(1-\sum_{i=1}^{p}a_\ell^{-i}f_{in}\right)^{-1}\right].$$
Thus,
$$C = \mathrm{diag}\left(\frac{1+\sum_{j=1}^{q}g_{j1}a_1^{-j}}{1-\sum_{i=1}^{p}f_{i1}a_1^{-i}},\ \ldots,\ \frac{1+\sum_{j=1}^{q}g_{jn}a_n^{-j}}{1-\sum_{i=1}^{p}f_{in}a_n^{-i}}\right),$$
and the power constraint becomes
$$\mathrm{tr}\left[C\,\mathrm{diag}\left(P_1,\ldots,P_n\right)C^{\mathrm{T}}\right] = \sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2 P_\ell = \bar{P}.$$
Similarly, if $\Lambda = -\,\mathrm{diag}(a_1,\ldots,a_n)$, then it can be obtained that
$$C = \mathrm{diag}\left[\frac{1+\sum_{j=1}^{q}g_{j1}\left(-a_1\right)^{-j}}{1-\sum_{i=1}^{p}f_{i1}\left(-a_1\right)^{-i}},\ \ldots,\ \frac{1+\sum_{j=1}^{q}g_{jn}\left(-a_n\right)^{-j}}{1-\sum_{i=1}^{p}f_{in}\left(-a_n\right)^{-i}}\right],$$
and the power constraint becomes
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2 P_\ell = \bar{P}.$$
This completes the proof.

Equivalently, the lower bound in Corollary 1 can be rewritten as
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{54}$$
where $a_1,\ldots,a_n$ satisfy
$$\sum_{\ell=1}^{n}\left(\frac{1+\sum_{j=1}^{q}g_{j\ell}a_\ell^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}a_\ell^{-i}}\right)^2\left(a_\ell^2-1\right)\hat{V}_\ell = \bar{P}, \tag{55}$$
or
$$\sum_{\ell=1}^{n}\left[\frac{1+\sum_{j=1}^{q}g_{j\ell}\left(-a_\ell\right)^{-j}}{1-\sum_{i=1}^{p}f_{i\ell}\left(-a_\ell\right)^{-i}}\right]^2\left(a_\ell^2-1\right)\hat{V}_\ell = \bar{P}. \tag{56}$$
We next consider some special cases in which Theorem 2 (or Corollary 1) can be characterized more explicitly, including a parallel of AWGN channels (Example 1) and a single ACGN channel (Example 2).
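Since the constraint (51) of Corollary 1 is explicit, the lower bound (50) can be evaluated with a standard constrained optimizer. A minimal sketch for first-order independent noises ($p = q = 1$) with hypothetical coefficients; SciPy's SLSQP solver is used here purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical independent ARMA(1,1) noises (Corollary 1 setting)
Vhat = np.array([1.0, 0.6, 0.3])   # Vhat_l
f = np.array([0.3, 0.2, 0.1])      # f_{1l}
g = np.array([0.2, 0.1, 0.0])      # g_{1l}
P_bar = 5.0                        # power budget

def rate(P):                       # objective (50), in bits, to be maximized
    return np.sum(0.5 * np.log2(P / Vhat + 1.0))

def power_used(P):                 # left-hand side of constraint (51)
    a = np.sqrt(P / Vhat + 1.0)    # (53)
    c = (1.0 + g / a) / (1.0 - f / a)
    return np.sum(c**2 * P)

res = minimize(lambda P: -rate(P), x0=np.full(3, P_bar / 3),
               bounds=[(0.0, None)] * 3, method="SLSQP",
               constraints=[{"type": "eq",
                             "fun": lambda P: power_used(P) - P_bar}])
print(res.x, rate(res.x))          # allocation P_l and the achieved bound
```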
Example 1: In the special case when $\{\mathbf{v}_k\}$ is a white Gaussian noise with covariance $\hat{V}$, i.e., when $F_i = G_j = 0$ in (8), a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P_1\ge 0,\ldots,P_n\ge 0}\sum_{\ell=1}^{n}\frac{1}{2}\log\left(\frac{P_\ell}{\hat{V}_\ell}+1\right), \tag{57}$$
where $P_1,\ldots,P_n$ satisfy
$$\mathrm{tr}\left[U_{\hat{v}}\,\mathrm{diag}\left(P_1,\ldots,P_n\right)U_{\hat{v}}^{\mathrm{T}}\right] = \mathrm{tr}\left[\mathrm{diag}\left(P_1,\ldots,P_n\right)\right] = \sum_{\ell=1}^{n}P_\ell = \bar{P}. \tag{58}$$
As a matter of fact, the lower bound is tight in this case [23], and the optimal power allocation is given by the classical "water-filling" policy [23] as
$$P_\ell = \max\left\{0,\ \zeta - \hat{V}_\ell\right\}, \quad \ell = 1,\ldots,n, \tag{59}$$
where $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}P_\ell = \sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{60}$$
It is also worth mentioning that the lower bound in (57) can equivalently be rewritten as (cf. also the discussions after Theorem 2 for the general case)
$$\max_{a_1\ge 1,\ldots,a_n\ge 1}\sum_{\ell=1}^{n}\log a_\ell, \tag{61}$$
where $a_1,\ldots,a_n$ satisfy
$$\sum_{\ell=1}^{n}\left[\left(a_\ell^2-1\right)\hat{V}_\ell\right] = \bar{P}. \tag{62}$$
Correspondingly, the optimal "allocation" solution is given by
$$a_\ell = \sqrt{\max\left\{1,\ \frac{\zeta}{\hat{V}_\ell}\right\}}, \quad \ell=1,\ldots,n, \tag{63}$$
where $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}\left[\left(a_\ell^2-1\right)\hat{V}_\ell\right] = \sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{64}$$
This provides an alternative perspective on the water-filling allocation, while also displaying more clearly the connections with lower bounds in other cases, e.g., that of the subsequent Example 2.
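The water level $\zeta$ in (59)–(60) is easily found by bisection, since the allocated power is non-decreasing in $\zeta$. A minimal sketch with hypothetical noise variances, which also reports the equivalent "allocation" (63):

```python
import numpy as np

def water_filling(Vhat, P_bar, tol=1e-12):
    """Solve (59)-(60) by bisection on the water level zeta."""
    lo, hi = Vhat.min(), Vhat.max() + P_bar
    while hi - lo > tol:
        zeta = 0.5 * (lo + hi)
        if np.sum(np.maximum(0.0, zeta - Vhat)) < P_bar:
            lo = zeta
        else:
            hi = zeta
    zeta = 0.5 * (lo + hi)
    return np.maximum(0.0, zeta - Vhat), zeta

Vhat = np.array([1.0, 0.6, 0.3, 2.5])       # hypothetical Vhat_l
P, zeta = water_filling(Vhat, P_bar=2.0)
a = np.sqrt(np.maximum(1.0, zeta / Vhat))   # the equivalent "allocation" (63)
rate = np.sum(0.5 * np.log2(P / Vhat + 1.0))
print(P, a, rate)                           # P sums to P_bar
```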
Example 2:
For another special case, consider the scalar case of $n = 1$. In this case, (8) reduces to
$$\mathbf{v}_k = \sum_{i=1}^{p}f_i\mathbf{v}_{k-i} + \hat{\mathbf{v}}_k + \sum_{j=1}^{q}g_j\hat{\mathbf{v}}_{k-j}, \tag{65}$$
where $\{\hat{\mathbf{v}}_k\}$, $\hat{\mathbf{v}}_k \in \mathbb{R}$, is white Gaussian with variance $\sigma^2_{\hat{v}} > 0$. Accordingly, Theorem 2 reduces to the statement that a lower bound of the feedback capacity with power constraint $\bar{P}$ is given by
$$\max_{P\ge 0}\ \frac{1}{2}\log\left(\frac{P}{\sigma^2_{\hat{v}}}+1\right), \tag{66}$$
where $P$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2 P = \bar{P}, \tag{67}$$
or
$$\left[\frac{1+\sum_{j=1}^{q}g_j\left(-a\right)^{-j}}{1-\sum_{i=1}^{p}f_i\left(-a\right)^{-i}}\right]^2 P = \bar{P}. \tag{68}$$
Herein,
$$a = \sqrt{\frac{P}{\sigma^2_{\hat{v}}}+1}. \tag{69}$$
It may then be verified that this lower bound can equivalently be rewritten as (cf. also the discussions after Theorem 2 or Corollary 1)
$$\max_{a\ge 1}\ \log a, \tag{70}$$
where $a$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}, \tag{71}$$
or
$$\left[\frac{1+\sum_{j=1}^{q}g_j\left(-a\right)^{-j}}{1-\sum_{i=1}^{p}f_i\left(-a\right)^{-i}}\right]^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}. \tag{72}$$
In fact, (66) coincides with the lower bound in [16], given as
$$\max_{a\in\mathbb{R}}\ \log|a|, \tag{73}$$
where $a$ satisfies
$$\left(\frac{1+\sum_{j=1}^{q}g_j a^{-j}}{1-\sum_{i=1}^{p}f_i a^{-i}}\right)^2\left(a^2-1\right)\sigma^2_{\hat{v}} = \bar{P}, \tag{74}$$
whereas this in turn reduces to the results in, e.g., [18] (see the detailed discussions in [16], which also relate to the formulae in, e.g., [2]).
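In the scalar case, (70)–(71) amounts to a single equation in the unknown $a \ge 1$: the left-hand side of (71) vanishes at $a = 1$ and grows like $a^2\sigma^2_{\hat{v}}$ for large $a$, so a bracketing root-finder applies. A minimal sketch with hypothetical first-order coefficients ($p = q = 1$):

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical scalar ARMA(1,1) noise parameters
f1, g1, sigma2 = 0.4, 0.2, 1.0
P_bar = 3.0

def constraint(a):
    """Left-hand side of (71) minus the power budget P_bar."""
    c = (1.0 + g1 / a) / (1.0 - f1 / a)
    return c**2 * (a**2 - 1.0) * sigma2 - P_bar

a = brentq(constraint, 1.0 + 1e-9, 1e3)   # root of (71) on a > 1
rate = np.log2(a)                          # the lower bound (70), in bits
P = (a**2 - 1.0) * sigma2                  # the corresponding P via (69)
print(a, rate, P)
```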
Note that (34) and Fig. 8 essentially provide a recursive coding scheme to achieve the lower bound in Theorem 2. This is seen more clearly in Fig. 9, where $L(z)$ is given by (36). More specifically, the recursive coding algorithm is given as follows.

Theorem 3: Suppose that the optimal solution to (39) is given by $P_1,\ldots,P_n$. Then, one class of recursive coding schemes to achieve the lower bound in Theorem 2 is given by
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\left(\mathbf{e}'_k - \sum_{i=1}^{p}F_i\mathbf{e}'_{k-i}\right) - \sum_{j=1}^{q}G_j\mathbf{u}_{k-j}. \end{cases} \tag{75}$$
Fig. 9. The steady-state integrated Kalman filter as a feedback coding scheme.
Herein, $A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}$ and $C$ is given by (41), where $\Lambda$ is given by
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\frac{P_1}{\hat{V}_1}+1},\ldots,\sqrt{\frac{P_n}{\hat{V}_n}+1}\right). \tag{76}$$
In the case of parallel AWGN channels, Theorem 3 reduces to a recursive water-filling scheme.
Example 3: In the special case when $\{\mathbf{v}_k\}$ is a white noise, i.e., when $F_i = G_j = 0$ in (8), the coding scheme of (75) reduces to
$$\begin{cases} \tilde{\mathbf{x}}_{k+1} = A\tilde{\mathbf{x}}_k + \mathbf{u}_k, \\ \mathbf{y}'_k = C\tilde{\mathbf{x}}_k, \\ \mathbf{e}'_k = -\mathbf{y}'_k + \mathbf{v}_k, \\ \mathbf{u}_k = \hat{K}\mathbf{e}'_k. \end{cases} \tag{77}$$
Herein, $A = U_{\hat{v}}\Lambda U_{\hat{v}}^{\mathrm{T}}$ and $C = I_{n\times n}$, where $\Lambda$ is given by
$$\Lambda = \pm\,\mathrm{diag}\left(\sqrt{\max\left\{1,\frac{\zeta}{\hat{V}_1}\right\}},\ \ldots,\ \sqrt{\max\left\{1,\frac{\zeta}{\hat{V}_n}\right\}}\right), \tag{78}$$
and $\zeta > 0$ satisfies
$$\sum_{\ell=1}^{n}\max\left\{0,\ \zeta-\hat{V}_\ell\right\} = \bar{P}. \tag{79}$$
Note that this is essentially a feedback ("closed-loop") water-filling power allocation scheme, which is potentially more "robust" than the classical "open-loop" water-filling policy; cf. the results in [30] for instance. We will, however, leave detailed discussions on this topic to future research.
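To illustrate the closed-loop character of the scheme, the sketch below simulates (77) for parallel AWGN channels with hypothetical variances and power budget (chosen so that all channels are active), and empirically checks that the stationary channel input power $\mathrm{tr}\,\mathbb{E}[\mathbf{y}'_k(\mathbf{y}'_k)^{\mathrm{T}}]$ is approximately the budget $\bar{P} = \sum_\ell P_\ell$:

```python
import numpy as np

rng = np.random.default_rng(2)
Vhat = np.array([1.0, 0.6, 0.3])            # hypothetical AWGN variances
P_bar = 2.0

# Water-filling allocation (59)-(60) via bisection on the level zeta
lo, hi = Vhat.min(), Vhat.max() + P_bar
for _ in range(200):
    zeta = 0.5 * (lo + hi)
    lo, hi = (zeta, hi) if np.sum(np.maximum(0, zeta - Vhat)) < P_bar \
             else (lo, zeta)
P = np.maximum(0.0, zeta - Vhat)

A = np.diag(np.sqrt(np.maximum(1.0, zeta / Vhat)))   # (78), "+" branch, U = I
Khat = A @ np.diag(P) @ np.linalg.inv(np.diag(P) + np.diag(Vhat))  # (23)

# Closed loop (77): x~_{k+1} = A x~_k + u_k, y'_k = x~_k (since C = I)
steps, burn = 200_000, 1_000
x = rng.standard_normal(3)
acc, cnt = 0.0, 0
for k in range(steps):
    v = rng.standard_normal(3) * np.sqrt(Vhat)
    u = Khat @ (-x + v)          # e'_k = -y'_k + v_k, u_k = Khat e'_k
    if k >= burn:
        acc += x @ x             # running sum of |y'_k|^2
        cnt += 1
    x = A @ x + u
print(acc / cnt, P_bar)          # empirical input power ~ P_bar
```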
IV. CONCLUSION
In this paper, from the perspective of a variant of the Kalman filter, we have obtained lower bounds on the feedback capacity of parallel ACGN channels, together with the accompanying recursive coding schemes in terms of power allocation policies with feedback. Possible future research directions include investigating the tightness of the lower bounds, as well as the special cases in which more explicit solutions (cf. water-filling) to the feedback power allocation policies may be derived.
REFERENCES
[1] T. M. Cover and S. Pombra, "Gaussian feedback capacity," IEEE Transactions on Information Theory, vol. 35, no. 1, pp. 37–43, 1989.
[2] Y.-H. Kim, "Feedback capacity of stationary Gaussian channels," IEEE Transactions on Information Theory, vol. 56, no. 1, pp. 57–85, 2010.
[3] Y.-H. Kim, "Feedback capacity of the first-order moving average Gaussian channel," IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 3063–3079, 2006.
[4] Y.-H. Kim, "Gaussian feedback capacity," Ph.D. dissertation, Stanford University, 2006.
[5] E. Ardestanizadeh and M. Franceschetti, "Control-theoretic approach to communication with feedback," IEEE Transactions on Automatic Control, vol. 57, no. 10, pp. 2576–2587, 2012.
[6] J. Liu and N. Elia, "Convergence of fundamental limitations in feedback communication, estimation, and feedback control over Gaussian channels," Communications in Information and Systems, vol. 14, no. 3, pp. 161–211, 2014.
[7] J. Liu, N. Elia, and S. Tatikonda, "Capacity-achieving feedback schemes for Gaussian finite-state Markov channels with channel state information," IEEE Transactions on Information Theory, vol. 61, no. 7, pp. 3632–3650, 2015.
[8] P. A. Stavrou, C. D. Charalambous, and C. K. Kourtellaris, "Sequential necessary and sufficient conditions for capacity achieving distributions of channels with memory and feedback," IEEE Transactions on Information Theory, vol. 63, no. 11, pp. 7095–7115, 2017.
[9] T. Liu and G. Han, "Feedback capacity of stationary Gaussian channels further examined," IEEE Transactions on Information Theory, vol. 65, no. 4, pp. 2492–2506, 2018.
[10] C. Li and N. Elia, "Youla coding and computation of Gaussian feedback capacity," IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 3197–3215, 2018.
[11] A. Rawat, N. Elia, and C. Li, "Computation of feedback capacity of single user multi-antenna stationary Gaussian channel," in Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, 2018, pp. 1128–1135.
[12] A. R. Pedram and T. Tanaka, "Some results on the computation of feedback capacity of Gaussian channels with memory," in Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, 2018, pp. 919–926.
[13] C. K. Kourtellaris and C. D. Charalambous, "Information structures of capacity achieving distributions for feedback channels with memory and transmission cost: Stochastic optimal control & variational equalities," IEEE Transactions on Information Theory, vol. 64, no. 7, pp. 4962–4992, 2018.
[14] A. Gattami, "Feedback capacity of Gaussian channels revisited," IEEE Transactions on Information Theory, vol. 65, no. 3, pp. 1948–1960, 2018.
[15] S. Ihara, "On the feedback capacity of the first-order moving average Gaussian channel," Japanese Journal of Statistics and Data Science, pp. 1–16, 2019.
[16] S. Fang and Q. Zhu, "A connection between feedback capacity and Kalman filter for colored Gaussian noises," in Proceedings of the IEEE International Symposium on Information Theory, 2020, pp. 2055–2060.
[17] Z. Aharoni, D. Tsur, Z. Goldfeld, and H. H. Permuter, "Capacity of continuous channels with memory via directed information neural estimator," in Proceedings of the IEEE International Symposium on Information Theory, 2020, pp. 2014–2019.
[18] N. Elia, "When Bode meets Shannon: Control-oriented feedback communication schemes," IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1477–1488, 2004.
[19] S. Yang, A. Kavcic, and S. Tatikonda, "On the feedback capacity of power-constrained Gaussian noise channels with memory," IEEE Transactions on Information Theory, vol. 53, no. 3, pp. 929–954, 2007.
[20] S. Tatikonda and S. Mitter, "The capacity of channels with feedback," IEEE Transactions on Information Theory, vol. 55, no. 1, pp. 323–349, 2009.
[21] S. Fang, J. Chen, and H. Ishii, Towards Integrating Control and Information Theories: From Information-Theoretic Measures to Control Performance Limitations. Springer, 2017.
[22] A. Papoulis and S. U. Pillai, Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 2002.
[23] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2006.
[24] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation. Prentice Hall, 2000.
[25] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Prentice-Hall, 1979.
[26] K. J. Åström and R. M. Murray, Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, 2010.
[27] P. P. Vaidyanathan, The Theory of Linear Prediction. Morgan & Claypool Publishers, 2007.
[28] M. M. Seron, J. H. Braslavsky, and G. C. Goodwin, Fundamental Limitations in Filtering and Control. Springer, 1997.
[29] S. Fang, H. Ishii, and J. Chen, "An integral characterization of optimal error covariance by Kalman filtering," in Proceedings of the American Control Conference, 2018, pp. 5031–5036.
[30] S. L. Fong and V. Y. Tan, "A tight upper bound on the second-order coding rate of the parallel Gaussian channel with feedback," IEEE Transactions on Information Theory, vol. 63, no. 10, pp. 6474–6486, 2017.