Grassmannian Predictive Coding for Limited Feedback in Multiple Antenna Wireless Systems

Abstract

Limited feedback is a paradigm for the feedback of channel state information in wireless systems. In multiple antenna wireless systems, limited feedback usually entails quantizing a source that lives on the Grassmann manifold. Most work on limited feedback beamforming considered single-shot quantization. In wireless systems, however, the channel is temporally correlated, which can be used to reduce feedback requirements. Unfortunately, conventional predictive quantization does not incorporate the non-Euclidean structure of the Grassmann manifold. In this paper, we propose a Grassmannian predictive coding algorithm where the differential geometric structure of the Grassmann manifold is used to formulate a predictive vector quantization encoder and decoder. We analyze the quantization error and derive bounds on the distortion attained by the proposed algorithm. We apply the algorithm to a multiuser multiple-input multiple-output wireless system and show that it improves the achievable sum rate as the temporal correlation of the channel increases.



Takao Inoue, Member, IEEE, and Robert W. Heath, Jr., Fellow, IEEE


Index Terms: Prediction methods, correlation, feedback communication, MIMO systems, quantization, vector quantization.

I. INTRODUCTION

Multiple antenna wireless communication systems can improve throughput and reliability when channel state information (CSI) is known at the transmitter. Limited feedback is a flexible approach for providing quantized channel state information from the receiver to the transmitter. Most prior work on limited feedback uses one-shot feedback, which makes an instantaneous channel measurement and sends back the quantized CSI without memory. In a mobile environment, however, the channel exhibits coherence over time that may be exploited to improve the resolution of the quantized CSI at the transmitter.

Predictive vector quantization (PVQ) is a class of memory based coding techniques used in applications such as speech, image, and video processing [1]–[4]. In PVQ, the error signal between the current observed vector and the predicted vector based on past observations is quantized. When the observed data to be encoded are correlated, usually in time or space, quantizing the error signal leads to lower distortion compared with memoryless vector quantization [3]. The effectiveness of PVQ rests on the correlation exhibited by the data, the prediction function, and the quantization technique employed. Due to temporal correlation in the propagation channel, it is natural to consider a predictive coding approach for encoding CSI in temporally correlated channels. Classical PVQ has been applied to signals in a linear vector space where the usual difference, addition, and prediction are well understood. Unfortunately, in multiple antenna limited feedback beamforming in wireless communication, the CSI to be encoded often lives on the Grassmann manifold. The Grassmann manifold, denoted $\mathcal{G}_{n,p}$, is the set of $p$-dimensional subspaces of $n$-dimensional Euclidean space. Because it is a nonlinear manifold, extending classical PVQ is challenging since the usual linear operations, not to mention important functions like prediction, are not well defined.

Footnote: This material is based in part upon work supported by the National Science Foundation under grant CCF-830615. This work has appeared in part in the 2011 IEEE Int. Conf. on Acoustics, Speech and Signal Process. Takao Inoue is with National Instruments, 11500 N. Mopac Expwy, Austin, TX 78759 USA. Email: [email protected]. Robert W. Heath, Jr. is with The University of Texas at Austin, Department of Electrical and Computer Engineering, Wireless Networking and Communication Group, 1 University Station C0803, Austin, TX, 78712-0240 USA. Email: [email protected].

Motivated by applications in multiple-input multiple-output (MIMO) wireless communication, there has been research in analyzing [5], quantizing [6]–[8], and coding [9]–[11] on the Grassmann manifold, driven in part by applications to commercial wireless systems [12], [13]. Prior work exists for designing suitable memoryless quantization codebooks such as Grassmannian line packing [14], vector quantization [15], Grassmannian frames [16], and Kerdock codebooks [17] (see also the references in [18]). Several techniques have been previously proposed to exploit the temporal correlation of the propagation channel [19]–[27].
In [19], [20], modeling the feedback state transitions allows the net feedback rate to be reduced. The resolution, however, is fixed by the codebook size. To improve the quantization error, an adaptive codebook approach was proposed that can adapt to a given channel distribution [21]. Additional feedback overhead to retrain or synchronize the pre-computed codebooks may be needed when the channel distribution changes. Alternatively, a hierarchical codebook strategy uses two codebooks, coarse and fine, for layered feedback in temporally correlated channels [22], [23]. A codeword describing the coarse encoding region is updated infrequently and a finer local codebook is used for frequent feedback. A more flexible approach is to use a progressive refinement strategy in which rotation and scaling are applied to a structured codebook so as to provide high resolution feedback [24], [25]. An approach related to our paper is the complex Householder transform based PVQ-like technique for correlated normalized channel vectors in multiple-input single-output communication systems [26]. The current vector channel is decomposed into the previous vector channel and a weighted sum of orthogonal subspaces to represent the temporal variation. While the algorithm is presented in the form of PVQ, the actual operation is successive decomposition and projection using the complex Householder transform with unit delay, which was shown to be optimal for the specific application. A differential feedback approach using a rotation codebook has been proposed for spatial multiplexing systems [27]. This approach requires long term correlation statistics to design a suitable codebook and exploits the structure of the Riemannian manifold. Unfortunately, the codebooks are specific to the given long term statistics and may become outdated.

The Grassmannian predictive coding technique proposed in this paper was presented in part in [28].
In [28], we proposed the Grassmannian predictive coding algorithm applied to a multiuser MIMO system. We did not, however, provide the details of the derivation, nor did we consider an efficient codebook representation or a distortion analysis.

In this paper, we propose a predictive coding algorithm for correlated data on the Grassmann manifold, which we call the Grassmannian predictive coding (GPC) algorithm. The GPC algorithm is derived using the intrinsic geometry of the manifold and corresponding mathematical operations that respect the curved manifold structure. The main contributions of this paper are as follows.

• Grassmannian predictive coding algorithm: We propose a framework for predictive coding on the Grassmann manifold. The key idea of our approach is to use the tangent vector to establish the notion of a difference between points on the manifold. The proposed prediction function uses parallel transport as a one step prediction. The prediction step uses the immediate past difference; formulating a higher order prediction function remains for future work. The concepts of tangent vector and parallel transport have been used in [29] for optimization problems, but have not been exploited to develop a predictive coding concept.

• Efficient codebook structure: A design of the tangent space codebook using the Lloyd algorithm is proposed. The codebook lives in the tangent space of $\mathbb{C}^{N_t}$ with magnitude dependent on the correlation exhibited by the channel and the prediction function. An efficient codebook storage strategy is proposed that exploits the direction and magnitude decomposition of the tangent space vector.

• Distortion bounds: Based on a geometric interpretation of our GPC algorithm, a simple model of the quantization region is obtained. Using metric volume computations on the Grassmann manifold [7], lower and upper bounds on the quantization error are derived. We compare the obtained bounds with the distortion obtained in simulations. Furthermore, we show that the distortion for the proposed GPC algorithm is lower than the lower bound of memoryless quantizer distortion for a given codebook size.

• Application to limited feedback multiuser MIMO systems: We apply the GPC algorithm to limited feedback zero-forcing multiuser MIMO systems with multiple transmit antennas and a single receive antenna at each mobile terminal [30]. We show that the proposed GPC algorithm provides substantial sum rate improvement over the memoryless random codebook technique with the same feedback rate [30]. The sum rate improvement, however, depends on the channel correlation. When the channel is highly correlated, the proposed GPC algorithm is shown to provide sum rates close to a system with perfect CSI at the transmitter, i.e., infinite feedback.

Notation: We use lower case bold letters, e.g., $\mathbf{v}$, to denote vectors and upper case bold letters, e.g., $\mathbf{H}$, to denote matrices. The 2-norm is denoted by $\|\cdot\|$ and a normalized vector is denoted by $\vec{\mathbf{v}} = \mathbf{v}/\|\mathbf{v}\|$. The $n \times n$ identity matrix is denoted by $\mathbf{I}_n$. The spaces of integers and complex numbers are denoted by $\mathbb{N}$ and $\mathbb{C}$, respectively, with an appropriate superscript to denote the dimension of the respective spaces. We use $T$, $*$, and $\dagger$ to denote the transposition, Hermitian transpose, and pseudo inverse, respectively. The $n$-th column of a matrix $\mathbf{A}$ is denoted by $[\mathbf{A}]_{:,n}$. The expectation is denoted by $E[\cdot]$.

II. SYSTEM MODEL

In this paper we apply the GPC algorithm to limited feedback multiuser MIMO communication. It can also be applied to single user MIMO and to multi-cell MIMO systems. Multiuser MIMO is a challenging application of limited feedback as it requires high resolution quantization [30] and is known to be sensitive to channel variations [31]. We consider a multiuser limited feedback system with $N_t$ transmit antennas at the base station and $U \le N_t$ mobile users, each equipped with a single receive antenna. To isolate the impact of using predictive coding for limited feedback, we assume that the $U$ users are scheduled a priori from a possibly large pool of users; we do not consider scheduling or the effects of multiuser diversity in this paper. Let $s_u[k]$, $\mathbf{v}_u[k]$, and $\mathbf{h}_u[k]$ be the complex transmit symbol, $N_t \times 1$ unit norm beamforming vector, and $N_t \times 1$ channel vector for the $u$-th user at time index $k$, respectively. We assume that the transmit vector $\mathbf{s} = [s_1[k] \cdots s_U[k]]^T$ satisfies the total transmit power constraint $E[\|\mathbf{s}\|^2] \le P$. Then, the input-output relationship for the $u$-th user may be written as

$$ y_u[k] = \mathbf{h}_u^*[k]\mathbf{v}_u[k]s_u[k] + \mathbf{h}_u^*[k] \sum_{n=1,\, n \ne u}^{U} \mathbf{v}_n[k]s_n[k] + n_u[k] \tag{1} $$

where $n_u$ is independent identically distributed (i.i.d.) zero mean complex Gaussian noise with unit variance at user $u$. The first term in (1) is the desired signal for the $u$-th user while the second summation term is the interference signal. The signal to interference plus noise ratio (SINR) for the $u$-th user can be written as

$$ \mathrm{SINR}_u = \frac{\frac{P}{U}\,|\mathbf{h}_u^*\mathbf{v}_u|^2}{1 + \sum_{n \ne u} \frac{P}{N_t}\,|\mathbf{h}_u^*\mathbf{v}_n|^2}. \tag{2} $$

If the transmit signal $s_u$ is assumed to be Gaussian, the achievable rate for user $u$ is given by

$$ R_u = \log_2(1 + \mathrm{SINR}_u) \tag{3} $$

and the sum rate is $R = \sum_{u=1}^{U} R_u$.

The SINR expression (2) shows that the amount of interference depends on the design of the beamforming vectors. Zero forcing uses beamforming vectors that are orthogonal to the other users' channel vectors, i.e., $\mathbf{h}_u^*[k]\mathbf{v}_n[k] = 0$ for $n \ne u$, to null the inter-user interference [32]. Let $\mathbf{H} = [\mathbf{h}_1 \cdots \mathbf{h}_U]^*$ be the $U \times N_t$ composite channel matrix. With perfect CSI, the interference can be completely eliminated by choosing the unit norm beamforming vectors as the normalized columns of the pseudo inverse of the composite channel matrix, i.e., $\mathbf{v}_u = [\mathbf{H}^\dagger]_{:,u} / \|[\mathbf{H}^\dagger]_{:,u}\|$. Zero forcing creates $U$ interference free parallel channels, providing a nearly linear throughput increase as a function of the number of users but with some power loss due to normalization [33].

In limited feedback multiuser MIMO systems, quantized CSI is fed back to the transmitter from each user [30], [31]. Assuming that a perfect channel estimate $\mathbf{h}_u$ is obtained, we consider the quantization of the channel direction $\mathbf{g}_u = \mathbf{h}_u/\|\mathbf{h}_u\|$ and assume that the scalar channel gain is known perfectly [31]. We assume that the channel gain depends on longer term statistics that vary much more slowly than the channel direction. Since the channel gain is a real valued quantity that is easier to feed back, we assume that the channel gain is known perfectly at the transmitter and consider the effects of the channel shape quantization only [31]. In this regime, the SINR can be rewritten as

$$ \mathrm{SINR}_u = \frac{\frac{P}{N_t}\,\|\mathbf{h}_u\|^2\,|\mathbf{g}_u^*\mathbf{v}_u|^2}{1 + \sum_{n \ne u} \frac{P}{N_t}\,\|\mathbf{h}_u\|^2\,|\mathbf{g}_u^*\mathbf{v}_n|^2}. \tag{4} $$

We make two observations from (4). First, if the channel vector $\mathbf{h}_u$ is an i.i.d. vector distributed according to $\mathcal{CN}(0,1)$, $\mathbf{g}_u$ is isotropically distributed on the $N_t$-dimensional hyper-sphere. Second, due to the absolute value around $\mathbf{g}_u^*\mathbf{v}_u$, the SINR is independent of arbitrary unitary rotations of the channel direction. That is, $|\mathbf{g}_u^*\mathbf{v}_u| = |e^{j\theta}\mathbf{g}_u^*\mathbf{v}_u|$ for $\theta \in (0, 2\pi]$. Therefore, we may identify the space of channel shapes as the Grassmann manifold. Thus, the problem of transmit beamformer design is to feed back channel shapes on the Grassmann manifold from each user $u$, and to use the collected channel shape information at the transmitter to design the beamforming vectors by zero forcing.

In conventional codebook based limited feedback multiuser MIMO systems, each user has a normalized channel vector codebook of size $N_{RC}$ which is shared with the transmitter [30], [31]. The transmitter maintains $U$ codebook tables, each of size $N_{RC}$. Each user selects the codeword with minimum chordal distance from the normalized channel vector estimate. The index of the selected codeword is fed back to the transmitter using $\log_2(N_{RC})$ bits. The transmitter collects the decoded channel vectors $\hat{\mathbf{h}}_u$ for each user $u$ to form the composite channel matrix $\hat{\mathbf{H}} = [\hat{\mathbf{h}}_1 \cdots \hat{\mathbf{h}}_U]^*$. The beamforming vectors are computed as $\hat{\mathbf{v}}_u = [\hat{\mathbf{H}}^\dagger]_{:,u} / \|[\hat{\mathbf{H}}^\dagger]_{:,u}\|$. Using a random codebook, it was shown in [30] that the sum rate performance becomes interference limited as the signal to noise ratio (SNR) increases and that the codebook size needs to be increased linearly as a function of SNR, in dB, to maintain the multiplexing gain. Herein lies the practical limitation of the conventional codebook approach: the codebook size that approaches the achievable sum rate becomes impractical even for moderate SNR. The proposed GPC algorithm overcomes this problem.

III. GRASSMANN MANIFOLD: PRELIMINARIES

The geometric and linear algebraic properties of the Grassmann manifold will be fundamental in the derivation of our proposed algorithm. In this section we review key definitions, properties, and mathematical tools pertaining to designing algorithms for the Grassmann manifold. Then we propose a predictor on the Grassmann manifold built from the tangent vector, the mapping from the tangent onto the manifold, and parallel transport.

Let $\mathcal{U}_n = \{\mathbf{X} \in \mathbb{C}^{n \times n} : \mathbf{X}^*\mathbf{X} = \mathbf{I}_n\}$ be the unitary group formed by $n \times n$ unitary matrices. For $p < n$, the Grassmann manifold, $\mathcal{G}_{n,p}$, is the set of subspaces spanned by the columns of the quotient group $\mathcal{U}_n/\mathcal{U}_{n-p}$. It may also be identified as the quotient space of the unitary group, $\mathcal{U}_n/(\mathcal{U}_{n-p} \times \mathcal{U}_p)$. A point $\mathbf{X} \in \mathcal{G}_{n,p}$ may be considered as an equivalence class, i.e., $[\mathbf{X}] := \{\mathbf{X}\mathbf{U}_p : \mathbf{U}_p \in \mathcal{U}_p\}$. For notational brevity, we denote $\mathbf{X} \in \mathcal{G}_{n,p}$ to mean the equivalence class of matrices whose columns span the same $p$-dimensional subspace. For numerical computation, we understand $\mathbf{X} \in \mathcal{G}_{n,p}$ to be one representative of the equivalence class. The Grassmann manifold is a smooth topological manifold with a locally Euclidean property and smooth tangent space structure [34], both of which will be essential in the derivation of the proposed algorithm. In this paper, we consider the Grassmann manifold $\mathcal{G}_{n,1}$; the general case of $p > 1$ is a topic of future work.

Let the inner product of $\mathbf{x}, \mathbf{y} \in \mathcal{G}_{n,1}$ be denoted by $\rho = \mathbf{x}^*\mathbf{y}$. Let $\theta = \cos^{-1}(|\rho|)$ be the subspace angle between $\mathbf{x}$ and $\mathbf{y}$ [35]. The chordal distance metric for $\mathcal{G}_{n,1}$ is given by [29], [36]

$$ d(\mathbf{x}, \mathbf{y}) = \sqrt{1 - |\rho|^2} = |\sin\theta|. \tag{5} $$

For notational brevity, we use $d$ without the arguments when there is no confusion.
Unlike the arc length, given by $|\theta|$, the chordal distance is differentiable everywhere and provides a close approximation of the arc length when the points are close [37].

Using the chordal distance metric, we define the correlation of two sequences $\{\mathbf{x}[k]\}_{k \in \mathbb{N}}, \{\mathbf{y}[i]\}_{i \in \mathbb{N}} \in \mathcal{G}_{n,1}$ by $\zeta_{\mathbf{x},\mathbf{y}}[n] = E_k[d(\mathbf{x}[k], \mathbf{y}[k+n])]$, which can be interpreted as the mean chordal distance between two sequences on the Grassmann manifold.

Based on the smooth manifold structure of the Grassmann manifold, it is possible to relate two points $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$ by considering the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. Fig. 1 illustrates the concept. The tangent has been used successfully in the development of Newton and conjugate gradient algorithms with orthogonality constraints [29], [38]–[40]. We utilize the tangent relationship for its computational benefits and geometric insight into the problem.

Lemma 1 (Tangent): If $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$, then the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$ is

$$ \mathbf{e} = \tan^{-1}\!\left(\frac{d}{|\rho|}\right) \frac{\mathbf{x}[k+1]/\rho - \mathbf{x}[k]}{\|\mathbf{x}[k+1]/\rho - \mathbf{x}[k]\|} \tag{6} $$

such that $\|\mathbf{e}\| = \tan^{-1}(d/|\rho|)$ is the arc length between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ and

$$ \vec{\mathbf{e}} = \frac{\mathbf{x}[k+1]/\rho - \mathbf{x}[k]}{d/|\rho|} $$

is the unit tangent direction vector.

Proof: See Appendix A.
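The chordal distance (5) and the tangent vector of Lemma 1 can be checked numerically. The following is a minimal sketch, not the authors' implementation: vectors are plain Python lists of complex numbers, the function names `inner`, `chordal_distance`, and `tangent` are illustrative, and the sketch assumes unit norm inputs that are neither orthogonal nor identical.

```python
import math

def inner(x, y):
    """Hermitian inner product x^* y of two complex vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def chordal_distance(x, y):
    """Chordal distance (5) between unit norm vectors x and y."""
    rho = inner(x, y)
    return math.sqrt(max(0.0, 1.0 - abs(rho) ** 2))

def tangent(x, y):
    """Tangent vector (6) at x pointing toward y.

    Assumes unit norm inputs with rho != 0 and d > 0; otherwise the
    direction in (6) is not defined.
    """
    rho = inner(x, y)
    d = chordal_distance(x, y)
    if abs(rho) == 0.0 or d == 0.0:
        raise ValueError("tangent direction is undefined for this pair")
    scale = d / abs(rho)                 # equals the norm of y/rho - x
    direction = [(b / rho - a) / scale for a, b in zip(x, y)]
    magnitude = math.atan(scale)         # arc length between x and y
    return [magnitude * c for c in direction]

# Two points on G_{2,1} separated by a subspace angle of 0.3 rad.
x = [1 + 0j, 0j]
y = [complex(math.cos(0.3)), 1j * math.sin(0.3)]
e = tangent(x, y)
```

In this example the tangent vector has norm equal to the subspace angle and is orthogonal to `x`, which is what Lemma 1 states.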

Lemma 1 provides a compact formula for the tangent vector relating points $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ on $\mathcal{G}_{n,1}$. For notational brevity, we denote $\mathbf{e} = L(\mathbf{x}[k], \mathbf{x}[k+1])$. The tangent vector can be interpreted as a length preserving unwrapping of the arc between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ onto the tangent space at $\mathbf{x}[k]$. Furthermore, it is conveniently expressed as the product of a magnitude component and the normalized directional component. The decomposition will be exploited in codebook design for efficient storage.

The tangent vector describes the shortest distance path along the arc from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$, called the geodesic [29]. The geodesic can be parameterized by a single parameter $t \in [0, 1]$ using the tangent vector, as the next lemma shows.

Lemma 2 (Geodesic): If $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$, and $\mathbf{e}$, $\|\mathbf{e}\|$, and $\vec{\mathbf{e}}$ are the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$, the norm of the tangent vector, and the normalized tangent vector, respectively, then the geodesic path between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ is

$$ G(\mathbf{x}[k], \mathbf{e}, t) = \mathbf{x}[k]\cos(\|\mathbf{e}\|t) + \vec{\mathbf{e}}\sin(\|\mathbf{e}\|t) \tag{7} $$

for $t \in [0, 1]$ such that $G(\mathbf{x}[k], \mathbf{e}, 0) = \mathbf{x}[k]$ and $G(\mathbf{x}[k], \mathbf{e}, 1) = \mathbf{x}[k+1]$.

Proof: See Appendix B.

Lemma 2 provides a convenient formula to relate points between $\mathbf{x}[k]$ and $\mathbf{x}[k+1]$ in terms of the tangent vector and the step size $t$. To introduce the notion of prediction, we use the tangent vector with respect to $\mathbf{x}[k+1]$ such that it extends the geodesic path from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. The translation of the tangent vector along the Grassmann manifold is accomplished by the parallel transport.

Lemma 3 (Parallel Transport):

Let $\mathbf{x}[k], \mathbf{x}[k+1] \in \mathcal{G}_{n,1}$ and $\mathbf{e}$ be the tangent vector emanating from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. Then, the parallel transported tangent vector emanating from $\mathbf{x}[k+1]$ along the geodesic direction $\mathbf{e}$ is

$$ \hat{\mathbf{e}} = \tan^{-1}\!\left(\frac{d}{|\rho|}\right) \frac{\mathbf{x}[k+1]\rho^* - \mathbf{x}[k]}{d}. \tag{8} $$

Proof: See Appendix C.

Note that the general expression in [29] involves a singular value decomposition (SVD), which is typically expensive to implement. To the best of our knowledge, a compact form without an SVD on $\mathcal{G}_{n,1}$ has not appeared in the literature before. Thus Lemma 3 provides a convenient expression for transporting the base of the tangent vector from $\mathbf{x}[k]$ to $\mathbf{x}[k+1]$. It can be interpreted as transforming the tangent vector onto another tangent space connected by the geodesic.

Using the concepts of the tangent vector, geodesic, and parallel transport, we propose a one step prediction for $\mathcal{G}_{n,1}$.

Definition 4 (One Step Grassmannian Prediction): Let $\mathbf{x}[k], \mathbf{x}[k-1] \in \mathcal{G}_{n,1}$. The one step predicted vector $\tilde{\mathbf{x}} \in \mathcal{G}_{n,1}$ along the geodesic direction from $\mathbf{x}[k-1]$ to $\mathbf{x}[k]$ is

$$ \tilde{\mathbf{x}}[k+1] = |\rho|\,\mathbf{x}[k] + \rho^*\mathbf{x}[k] - \mathbf{x}[k-1] \tag{9} $$

where $\rho = \mathbf{x}^*[k-1]\mathbf{x}[k]$, such that $d(\mathbf{x}[k], \tilde{\mathbf{x}}[k+1]) = d(\mathbf{x}[k-1], \mathbf{x}[k])$.

See Appendix D for a detailed derivation. It is surprising that the predicted vector $\tilde{\mathbf{x}}[k+1]$ can be computed from the knowledge of $\mathbf{x}[k-1]$ and $\mathbf{x}[k]$ using linear operations and that the result remains on the Grassmann manifold. This simplification only happens for the case of taking a full step using $t = 1$. It is also possible to consider smaller steps $t < 1$ as well as adaptive step sizes, but we defer this to future work.

IV. GRASSMANNIAN PREDICTIVE CODING

In this section, we describe the proposed GPC algorithm.First, a general overview of the algorithm is provided. Second,the codebook design for encoding the error tangent vector isdescribed. Finally, strategies for initialization are considered.
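Since the algorithm relies on the geodesic map (7) and the one step prediction of Definition 4 from Section III, a small numerical sketch of both may help. It is an illustration under stated assumptions rather than the authors' code: the names `geodesic` and `predict` are hypothetical, and `predict` first rotates the representative of $\mathbf{x}[k]$ by a unit phase so that $\rho$ is real and nonnegative. Under that assumption (9) reduces to $2|\rho|\,\mathbf{x}[k] - \mathbf{x}[k-1]$, which is what the code evaluates.

```python
import cmath
import math

def inner(x, y):
    """Hermitian inner product x^* y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def geodesic(x, e, t=1.0):
    """Geodesic map (7): move from x along tangent vector e with step t."""
    mag = math.sqrt(sum(abs(c) ** 2 for c in e))
    if mag == 0.0:
        return list(x)               # zero tangent: stay at x
    return [a * math.cos(mag * t) + (c / mag) * math.sin(mag * t)
            for a, c in zip(x, e)]

def predict(x_prev, x_curr):
    """One step prediction along the geodesic from x_prev through x_curr.

    Assumption of this sketch: x_curr is phase aligned to x_prev first,
    so that rho is real and nonnegative. Requires rho != 0.
    """
    rho = inner(x_prev, x_curr)
    phase = rho / abs(rho)
    aligned = [b / phase for b in x_curr]     # x_prev^* aligned == |rho|
    return [2.0 * abs(rho) * b - a for a, b in zip(x_prev, aligned)]

# x_curr is 0.3 rad away from x_prev and carries an arbitrary phase.
x_prev = [1 + 0j, 0j]
phase = cmath.exp(0.7j)
x_curr = [phase * math.cos(0.3), phase * math.sin(0.3)]
x_next = predict(x_prev, x_curr)
```

In this example the prediction continues the rotation by another 0.3 rad, so `x_next` lies 0.6 rad from `x_prev`, and the arbitrary phase of `x_curr` does not affect the result.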

A. GPC Algorithm

Let $\{\mathbf{x}[k]\}_{k \in \mathbb{N}} \in \mathcal{G}_{n,1}$ be a correlated input sequence with time index $k$. The general operation of the proposed GPC algorithm closely follows that of the conventional predictive vector quantization technique [3]. Linear operations such as difference, quantization, addition, and prediction are replaced by equivalent operators on the Grassmann manifold using the concepts derived in Section III. The main idea of predictive coding is to quantize the error $\mathbf{e}[k]$ between the predicted vector $\tilde{\mathbf{x}}[k]$ and the current observed vector $\mathbf{x}[k]$. The left panel of Fig. 2 illustrates this graphically. Then, the quantized error is applied to the predicted vector to construct the estimate $\hat{\mathbf{x}}[k]$ of the current observed vector. The right panel of Fig. 2 illustrates this graphically. The current and previous estimated vectors, $\hat{\mathbf{x}}[k]$ and $\hat{\mathbf{x}}[k-1]$, are used to compute the predicted vector $\tilde{\mathbf{x}}[k+1]$ as shown in Section III and Fig. 1. Since both the encoder and decoder use estimated vectors for prediction, they both obtain the same predicted vectors. This is in contrast to quantizing $\mathbf{x}[k]$ directly in the conventional one-shot approach [18]. By exploiting memory, predictive vector quantization offers higher resolution for a given number of bits.

Fig. 3 illustrates the proposed GPC encoder; the pseudo code is provided in Algorithm 1. At time $k$, an error tangent vector is computed from the predicted vector $\tilde{\mathbf{x}}[k]$ to the current observed vector $\mathbf{x}[k]$. Using (6), the error tangent vector emanating from $\tilde{\mathbf{x}}[k]$ to $\mathbf{x}[k]$ is computed as

$$ \mathbf{e}[k] = \tan^{-1}\!\left(\frac{d}{|\rho|}\right) \frac{\mathbf{x}[k]/\rho - \tilde{\mathbf{x}}[k]}{\|\mathbf{x}[k]/\rho - \tilde{\mathbf{x}}[k]\|} \tag{10} $$

where $\rho = \tilde{\mathbf{x}}^*[k]\mathbf{x}[k]$ and $d = \sqrt{1 - |\rho|^2}$.

If $\mathcal{C} = \{\mathbf{c}_i\}_{i=1}^{N_C}$ is the size $N_C = 2^b$ codebook of error tangent vectors, the index of the quantized error tangent vector is obtained by

$$ i[k] = \arg\min_{i \in \{1, 2, \ldots, N_C\}} d\big(G(\tilde{\mathbf{x}}[k], \mathbf{c}_i, 1), \mathbf{x}[k]\big). \tag{11} $$

The corresponding codeword is $\mathbf{c}_{i[k]}$. That is, the codeword that yields the geodesic map with the shortest distance to the observed vector $\mathbf{x}[k]$ is selected. For notational brevity, we denote the quantization step by $Q : \mathbb{C}^n \to \mathbb{N}$, which takes the error tangent vector and outputs the codeword index, i.e., $i[k] = Q(\mathbf{e}[k])$. The design of the codebook and an efficient representation of the codebook for implementation are described in Section IV-B.

Continuing at the encoder, the estimated vector becomes

$$ \hat{\mathbf{x}}[k] = G(\tilde{\mathbf{x}}[k], \mathbf{c}_{i[k]}, 1). \tag{12} $$

Finally, the prediction using Definition 4 is performed using the two previous estimates

$$ \tilde{\mathbf{x}}[k+1] = |\rho|\,\hat{\mathbf{x}}[k] + \rho^*\hat{\mathbf{x}}[k] - \hat{\mathbf{x}}[k-1] \tag{13} $$

where $\rho = \hat{\mathbf{x}}^*[k-1]\hat{\mathbf{x}}[k]$. For notational brevity, we denote the prediction operation by a map $P : \mathcal{G}_{n,1} \times \mathcal{G}_{n,1} \to \mathcal{G}_{n,1}$ which takes the previous and current state vectors and outputs the predicted vector, i.e., $\tilde{\mathbf{x}}[k+1] = P(\hat{\mathbf{x}}[k-1], \hat{\mathbf{x}}[k])$. The predicted vector is used in the next step to compute the error tangent vector. The encoding procedure is repeated for each time $k+1, k+2, \ldots$.

Fig. 4 illustrates the proposed GPC decoder; the pseudo code is shown in Algorithm 2. The same error tangent codebook as at the encoder is assumed to be available. The received indices are decoded in $Q^{-1}$ to recover $\mathbf{c}_{i[k]}$. The predicted vector $\tilde{\mathbf{x}}[k]$ is mapped to the estimated vector $\hat{\mathbf{x}}[k]$ using the codeword as in (12). As at the encoder, the prediction is performed using (13) to obtain $\tilde{\mathbf{x}}[k+1]$ for the next time period. Note that for the first iteration of the decoder, the knowledge of $\tilde{\mathbf{x}}[k]$, or equivalently $\hat{\mathbf{x}}[k-1]$ and $\hat{\mathbf{x}}[k-2]$, is needed. Synchronizing the initial vectors with the encoder is important because if $\tilde{\mathbf{x}}[k]$ differs from that at the encoder, the received codeword no longer represents the correct error tangent vector. In Section IV-C, we provide an efficient strategy for initialization over a finite rate communication channel.
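The encoder and decoder recursions in (11)–(13) can be sketched as a pair of symmetric step functions. This is a hedged illustration, not Algorithm 1 or Algorithm 2 from the paper. All function names are hypothetical. The prediction assumes a phase aligned representative, so that (13) reduces to $2|\rho|\,\hat{\mathbf{x}}[k] - \hat{\mathbf{x}}[k-1]$. As a further assumption of this sketch only, stored codewords are projected onto the tangent space at the predicted vector before use; the text does not specify this step.

```python
import math

def inner(x, y):
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(v):
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def chordal(x, y):
    return math.sqrt(max(0.0, 1.0 - abs(inner(x, y)) ** 2))

def geodesic(x, e):
    """Full step (t = 1) of the geodesic map (7) from x along tangent e."""
    mag = norm(e)
    if mag == 0.0:
        return list(x)
    return [a * math.cos(mag) + (c / mag) * math.sin(mag) for a, c in zip(x, e)]

def predict(x_prev, x_curr):
    """One step prediction; assumes rho != 0 and aligns the phase of x_curr."""
    rho = inner(x_prev, x_curr)
    phase = rho / abs(rho)
    return [2.0 * abs(rho) * (b / phase) - a for a, b in zip(x_prev, x_curr)]

def to_tangent(x, c):
    """Sketch assumption: project codeword c onto the tangent space at x,
    keeping its stored magnitude."""
    rho = inner(x, c)
    p = [ci - rho * xi for xi, ci in zip(x, c)]
    pn = norm(p)
    if pn == 0.0:
        return [0j] * len(x)
    return [norm(c) * pi / pn for pi in p]

def encode_step(state, x, codebook):
    """state = (x_hat[k-2], x_hat[k-1]); returns index i[k] and the new state."""
    x_tilde = predict(*state)
    candidates = [geodesic(x_tilde, to_tangent(x_tilde, c)) for c in codebook]
    index = min(range(len(codebook)), key=lambda i: chordal(candidates[i], x))
    return index, (state[1], candidates[index])

def decode_step(state, index, codebook):
    """Mirror of encode_step driven by the received index."""
    x_tilde = predict(*state)
    x_hat = geodesic(x_tilde, to_tangent(x_tilde, codebook[index]))
    return x_hat, (state[1], x_hat)

# Toy run: a slowly rotating source and a hypothetical 2-bit codebook.
codebook = [[0j, 0j, 0j], [0j, 0.1 + 0j, 0j], [0j, -0.1 + 0j, 0j], [0j, 0j, 0.1 + 0j]]
source = [[complex(math.cos(0.1 * k)), complex(math.sin(0.1 * k)), 0j]
          for k in range(1, 6)]
init = ([1 + 0j, 0j, 0j], [1 + 0j, 0j, 0j])
enc_state, dec_state = init, init
for x in source:
    index, enc_state = encode_step(enc_state, x, codebook)
    x_hat, dec_state = decode_step(dec_state, index, codebook)
```

Because both sides start from the same state and apply the same operations to the same index, the decoder reproduces the encoder's estimate at every step in this toy run.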
With appropriate initialization, symmetric operation at the encoder and decoder yields the same predicted vector $\tilde{\mathbf{x}}[k]$ for each time $k$.

B. Codebook Design

One strategy for PVQ codebook design is to employ an open-loop approach followed by a closed-loop approach to refine the codebook [3]. The open-loop approach uses the prior vectors from a training data set to perform the prediction $\tilde{\mathbf{x}}[k+1] = P(\mathbf{x}[k-1], \mathbf{x}[k])$ instead of predicting using the estimates, i.e., $P(\hat{\mathbf{x}}[k-1], \hat{\mathbf{x}}[k])$. The error tangent vector is computed using (10). Then the Lloyd iterative algorithm is used to obtain the open-loop codebook. Using the open-loop codebook, GPC is performed on the training data set to obtain a sequence of error tangent vectors. The Lloyd iteration is performed on these closed-loop error tangent vectors to obtain the final codebook. It is difficult to show the optimality of the Lloyd iteration for the open-loop and closed-loop approaches due to the feedback structure of the GPC, but these approaches are known to provide good results in the PVQ literature [3]. Thus, in this paper, we employ the open-loop and closed-loop approach to obtain the error tangent vector codebook.

For storage of the codebook, we propose an efficient codebook representation that exploits the product structure of the tangent space. We quantize the tangent vector magnitude and direction separately [3]. Shape-gain vector quantization is widely used, for example, in speech and video coding [41]. We use the shape-gain decomposition to provide efficient codebook storage and exploit it to analyze the rate-distortion of the proposed GPC, which is otherwise very difficult. The tangent magnitude $\|\mathbf{e}[k]\|$ depends on the distance between the predicted vector and the observed vector, which in turn depends on the rate of change of the input vectors.
The unit norm error tangent vector depends on the location at which the tangent is computed and the directional statistics of the error. If $\mathcal{C}$ is the obtained error tangent codebook of size $N_C$, the shape-gain decomposed codebooks are $\mathcal{C}_d = \{\mathbf{c}_{d,i}\}_{i=1}^{N_d}$ for the error tangent direction codebook and $\mathcal{C}_m = \{c_{m,i}\}_{i=1}^{N_m}$ for the error tangent magnitude codebook. The desired codeword is reconstructed as $\mathbf{c}_{i[k]} = c_{m,i[k]}\,\mathbf{c}_{d,i[k]}$ at time $k$. With some heuristic design, it is possible to express a codebook of vectors by a smaller codebook of scalars representing the magnitude and a smaller codebook of vectors representing the normalized tangent directions. Thus codebook storage reduction is possible at the expense of extra computation to reconstruct the codeword.

C. Initialization

Similar to PVQ, the initial states of the GPC at the encoder and the decoder need to match to obtain the correct results. For example, in next generation wireless standards such as IEEE 802.16m, various feedback initialization intervals are defined [42, Sec.16.3.6]. Thus, an efficient mechanism for initialization is also important. Two approaches may be considered for initialization. One approach is to perform an initialization process so that the two estimated vectors $\hat{\mathbf{x}}[k-1]$ and $\hat{\mathbf{x}}[k-2]$ are communicated from the encoder to the decoder. Since the complete description of $\hat{\mathbf{x}}[k-1]$ and $\hat{\mathbf{x}}[k-2]$ must be communicated to the decoder, there is system dependent communication overhead. Another approach is to use the one-shot memoryless quantization technique to initialize the two vectors. This approach is attractive because it does not add any implementation overhead to systems already using a one-shot feedback approach, e.g., 3GPP LTE. In particular, if the same codebook is used for the error tangent direction codebook and the one-shot memoryless quantization codebook, there is no codebook memory overhead, resulting in an efficient implementation. A consequence of using the memoryless quantization approach for initialization is that there may be an initial transient period in which the quantization error is larger than in the steady state condition. As we show in Section V, this is because memoryless quantization generally results in a larger quantization error.

V. PERFORMANCE ANALYSIS OF GPC

In this section, we provide a quantization error analysis under a small angle approximation. We derive upper and lower distortion bounds, and then derive a closed loop gain metric for the GPC algorithm.

A. Small Angle Approximation

In this section we use the locally Euclidean property of the Grassmann manifold to derive an expression for the prediction error as a function of the tangent vector. If $\mathbf{y} \in \mathcal{G}_{n,1}$ is obtained by small changes to $\mathbf{x} \in \mathcal{G}_{n,1}$, we can approximate the chordal distance between $\mathbf{x}$ and $\mathbf{y}$ as

$$ d(\mathbf{x}, \mathbf{y}) = \sqrt{1 - |\mathbf{x}^*\mathbf{y}|^2} = |\sin(\theta)| \tag{14} $$

$$ \approx \|\mathbf{x} - \mathbf{y}\| \tag{15} $$

where (14) follows from the subspace angle of vectors [35] and (15) follows from the small angle approximation. Thus, for a sufficiently small perturbation around $\mathbf{x}$, the subspace distance between $\mathbf{x}$ and $\mathbf{y}$ is approximated by the usual Euclidean distance.

We may express the current observed vector at time $k$, $\mathbf{x}[k]$, in terms of the predicted vector and the error tangent vector as

$$ \mathbf{x}[k] = G(\tilde{\mathbf{x}}[k], \mathbf{e}[k], 1) \approx \tilde{\mathbf{x}}[k] + \vec{\mathbf{e}}[k]\,\|\mathbf{e}[k]\| = \tilde{\mathbf{x}}[k] + \mathbf{e}[k] \tag{16} $$

using the small angle approximation. Furthermore,

$$ \mathbf{x}^*[k]\mathbf{x}[k] \approx (\tilde{\mathbf{x}}[k] + \mathbf{e}[k])^*(\tilde{\mathbf{x}}[k] + \mathbf{e}[k]) = 1 + 2\|\mathbf{e}[k]\|\,\Re(\vec{\mathbf{e}}[k]^*\tilde{\mathbf{x}}[k]) + \|\mathbf{e}[k]\|^2 \approx 1. \tag{17} $$

The second term, involving $\vec{\mathbf{e}}^*[k]\tilde{\mathbf{x}}[k]$, in (17) is zero because the unit norm tangent vector $\vec{\mathbf{e}}[k]$ is orthogonal to $\tilde{\mathbf{x}}[k]$. Similarly, if $\mathbf{c}_{i[k]}$ is the selected error tangent codeword, the estimated signal can be expanded as

$$ \hat{\mathbf{x}}[k] = G(\tilde{\mathbf{x}}[k], \mathbf{c}_{i[k]}, 1) \approx \tilde{\mathbf{x}}[k] + \mathbf{c}_{i[k]}. \tag{18} $$

Both (16) and (18) reveal that for a small enough change, both vectors are expressed as an additive correction to the predicted vector. Thanks to the locally Euclidean property and using the usual 2-norm for the local difference, the prediction error is

$$ \|\mathbf{x}[k] - \hat{\mathbf{x}}[k]\| \approx \|\mathbf{e}[k] - \mathbf{c}_{i[k]}\|. \tag{19} $$

Therefore, the estimation error can be approximated as the normed difference between the actual tangent vector and the quantized tangent vector.
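The approximation (19) can be checked numerically for a small error tangent vector. The sketch below is only an illustration: the predicted vector, the error tangent vector `e`, and the codeword `c` are made-up values chosen to be orthogonal to the predicted vector, and `geodesic` is a hypothetical helper implementing (7) with $t = 1$.

```python
import math

def norm(v):
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def geodesic(x, e):
    """Full step (t = 1) of the geodesic map (7) from x along tangent e."""
    mag = norm(e)
    if mag == 0.0:
        return list(x)
    return [a * math.cos(mag) + (c / mag) * math.sin(mag) for a, c in zip(x, e)]

x_tilde = [1 + 0j, 0j, 0j]        # predicted vector
e = [0j, 0.05 + 0j, 0.02j]        # hypothetical error tangent vector
c = [0j, 0.04 + 0j, 0j]           # hypothetical selected codeword

x = geodesic(x_tilde, e)          # observed vector, as in (16)
x_hat = geodesic(x_tilde, c)      # estimated vector, as in (18)

lhs = norm([a - b for a, b in zip(x, x_hat)])   # ||x[k] - x_hat[k]||
rhs = norm([a - b for a, b in zip(e, c)])       # ||e[k] - c_i[k]||
```

For these magnitudes the two sides of (19) agree to within about $10^{-5}$, and the gap grows as the tangent vectors get longer.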
Thus, for small changes in the observed vector, the accuracy of the tangent direction and tangent magnitude determines the accuracy of the estimate.

B. Distortion Bounds

The average distortion induced by a quantizer is a typical measure of performance. In what follows, we derive an upper and lower bound on the distortion for the proposed GPC algorithm. Recall that a metric ball $B_\delta(\mathbf{z})$ with radius $\delta$ centered at $\mathbf{z} \in \mathcal{G}_{n,1}$ on the Grassmann manifold is defined as

$$B_\delta(\mathbf{z}) = \{ \mathbf{y} \in \mathcal{G}_{n,1} : d(\mathbf{y}, \mathbf{z}) \le \delta \} \tag{20}$$

such that $B_\delta(\mathbf{z}) \subset \mathcal{G}_{n,1}$. A closed form volume formula for $B_\delta(\mathbf{z})$ is given as [43]

$$\mathrm{Vol}(B_\delta(\mathbf{z})) = \delta^{2(n-1)}. \tag{21}$$

Consider $B_\gamma(\mathbf{z}) \subset \mathcal{G}_{n,1}$ with $\delta \le \gamma$ and volume of $B_\delta(\mathbf{z})$ given by (21). Let $(d\mathbf{y})$ denote the differential form of the Haar measure on $\mathcal{G}_{n,1}$. The distortion in the ball normalized by the volume of the ball was shown to be [7, Lemma 1]

$$\frac{\int_{B_\gamma(\mathbf{z})} d^2(\mathbf{y}, \mathbf{z})\,(d\mathbf{y})}{\mathrm{Vol}(B_\gamma(\mathbf{z}))} = \left(\frac{n-1}{n}\right) \gamma^2. \tag{22}$$

For memoryless quantization, the volume together with a point density and covering assumption over the entire $\mathcal{G}_{n,1}$ are used to characterize distortion. For the proposed GPC algorithm, the Voronoi region is determined by the tangent direction and tangent magnitude codebooks, which makes the covering argument difficult. To overcome this difficulty, we assume that the tangent magnitude codebook provides concentric annular partitions of the sphere cap centered around the predicted vector and that the tangent direction codebook partitions each annulus into equiangle sectors. We obtain the bounds by considering the ball that is enclosed in the smallest annular sector and the ball that encloses the largest annular sector. Similarly, the distortion upper bound is given by the volume of the ball that covers the Voronoi cell.

Let $\gamma_d = \min_{\mathbf{c}_{d,i}, \mathbf{c}_{d,k} \in \mathcal{C}_d,\, i \ne k} d(\mathbf{c}_{d,i}, \mathbf{c}_{d,k})$ denote the minimum chordal distance between the tangent direction codewords and $\gamma_m = \min_{c_{m,i}, c_{m,k} \in \mathcal{C}_m,\, i \ne k} |c_{m,i} - c_{m,k}|$ denote the minimum Euclidean distance between the tangent magnitude codewords. Similarly, let $\lambda_d = \max_{\mathbf{c}_{d,i}, \mathbf{c}_{d,k} \in \mathcal{C}_d,\, i \ne k} d(\mathbf{c}_{d,i}, \mathbf{c}_{d,k})$ denote the maximum chordal distance between the tangent direction codewords and $\lambda_m = \max_{c_{m,i}, c_{m,k} \in \mathcal{C}_m,\, i \ne k} |c_{m,i} - c_{m,k}|$ denote the maximum Euclidean distance between the tangent magnitude codewords. Suppose that the tangent direction and magnitude codebooks map uniformly to equiangle sectors of concentric annuli centered at the predicted vector. Then the following lemma provides the bounds on the distortion for the GPC algorithm.

Lemma 5 (Distortion bounds): If $\gamma_{\mathrm{lower}} = \min\{\gamma_d, \gamma_m\}$ and $\lambda_{\mathrm{upper}} = \max\{\lambda_d, \lambda_m\}$, lower and upper quantization distortion bounds are given by

$$D_{\mathrm{lower}} = \left(\frac{n-1}{n}\right) \left(\frac{\gamma_{\mathrm{lower}}}{2}\right)^2, \qquad D_{\mathrm{upper}} = \left(\frac{n-1}{n}\right) \left(\frac{\lambda_{\mathrm{upper}}}{2}\right)^2. \tag{23}$$

Proof: The lower bound is given by the volume of a metric ball whose radius is the smaller of half the minimum chordal distance of the tangent direction codebook and half the minimum distance of the tangent magnitude codebook. The upper bound is similarly obtained by considering the volume of a metric ball which covers a Voronoi region. The bounds are exact since the metric ball volume formula is accurate [7, Lemma 1].

No claim is made on the tightness of the bound since an accurate description of the Voronoi region obtained by the proposed tangent codebook remains an open problem. In Section VI-A, we provide numerical examples comparing the bounds obtained with actual distortion using fixed codebooks.

Using the obtained lower bound, we may further quantify the reduction in the distortion lower bound compared to memoryless quantization on the Grassmann manifold. For $\mathcal{G}_{n,1}$, the lower bound on the fixed rate quantizer on the Grassmann manifold was shown to be

$$D_{\mathcal{G}_{n,1}}(N) = \left(\frac{n-1}{n}\right) N^{-\frac{1}{n-1}} \tag{24}$$

where $N$ is the size of the codebook with rate $\log_2(N)$ bits [6], [7]. Suppose that $\gamma_{\mathrm{lower}}$ is dominated by the tangent direction codebook such that $\gamma_{\mathrm{lower}} = \gamma_d$ and that a Grassmannian codebook is used for the tangent direction codebook. Then, the lower bound for the GPC algorithm can be expressed as

$$D_{\mathrm{lower}} = \left(\frac{n-1}{n}\right) \left(\frac{\gamma_{\mathrm{lower}}}{2}\right)^2 = \frac{1}{4} \left(\frac{n-1}{n}\right) N_d^{-\frac{1}{n-1}} = \frac{1}{4}\, D_{\mathcal{G}_{n,1}}(N_d) \tag{25}$$

showing that the lower bound is smaller than $D_{\mathcal{G}_{n,1}}(N_d)$ when $\gamma_d < \gamma_m$.

C. Performance Measures

The closed loop prediction gain ratio is often used in the vector quantization literature [3] as a measure of how well the predictor performs with respect to the changes in the input. The closed loop prediction gain is usually written as the ratio of the mean squared norm of the observed signal over the mean squared norm of the prediction error. We define the mean squared error to be $E[d^2(\tilde{\mathbf{x}}[k], \mathbf{x}[k])]$. For our GPC algorithm, we measure the closed loop prediction performance by

$$G_{\mathrm{clp}} = \frac{E[\|\mathbf{x}[k]\|^2]}{E[d^2(\tilde{\mathbf{x}}[k], \mathbf{x}[k])]} = \frac{1}{E[d^2(\tilde{\mathbf{x}}[k], \mathbf{x}[k])]} \tag{26}$$

where $d^2(\tilde{\mathbf{x}}[k], \mathbf{x}[k])$ denotes the squared chordal prediction error. In fact, (26) can be further expressed as a function of the tangent vector assuming that the small angle approximation holds. Using (15) and (16), the distance function in the denominator can be approximated as $d^2(\tilde{\mathbf{x}}[k], \mathbf{x}[k]) \approx \|\mathbf{e}[k]\|^2$. Therefore, the closed loop prediction gain for the GPC algorithm becomes

$$G_{\mathrm{clp}} \approx \frac{1}{E[\|\mathbf{e}[k]\|^2]} \tag{27}$$

which shows the dependence of the closed loop prediction gain performance on the tangent magnitude. The tangent magnitude is in turn dependent on the changes in the observed process. A closed form relationship between the observed process and the tangent magnitude is in general difficult to obtain. In Section VI-B, we show some empirical results of the closed loop prediction gain performance for the proposed GPC algorithm.

VI. SIMULATION RESULTS

In this section, we provide numerical results to illustrate the performance of the proposed GPC algorithm.
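The experiments in this section drive the quantizer with temporally correlated unit-norm vectors. As a rough companion, the sketch below generates such a sequence from the first order Gauss-Markov recursion $\mathbf{h}[k] = \alpha\,\mathbf{h}[k-1] + \sqrt{1-\alpha^2}\,\mathbf{z}[k]$ used in Section VI-B and normalizes each sample. It assumes NumPy; the function names, the seed handling, and passing $\alpha$ directly (rather than deriving it from a Doppler frequency) are assumptions made for illustration and are not taken from the authors' simulation code.

```python
import numpy as np


def gauss_markov_directions(n_t, num_samples, alpha, seed=0):
    """Unit-norm x[k] = h[k] / ||h[k]|| with h[k] = alpha h[k-1] + sqrt(1 - alpha^2) z[k]."""
    rng = np.random.default_rng(seed)

    def complex_gaussian(size):
        # i.i.d. zero mean, unit variance complex Gaussian entries
        return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2.0)

    h = complex_gaussian(n_t)
    xs = np.empty((num_samples, n_t), dtype=complex)
    for k in range(num_samples):
        h = alpha * h + np.sqrt(1.0 - alpha ** 2) * complex_gaussian(n_t)
        xs[k] = h / np.linalg.norm(h)
    return xs


def mean_squared_chordal_step(xs):
    """Average of d^2(x[k-1], x[k]) = 1 - |x[k-1]^* x[k]|^2 over the sequence."""
    inner = np.sum(np.conj(xs[:-1]) * xs[1:], axis=1)
    return float(np.mean(1.0 - np.abs(inner) ** 2))
```

A larger `alpha` yields smaller sample-to-sample chordal steps, which is the regime in which a predictive quantizer is expected to help.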

A. Distortion Bounds

We present a numerical example illustrating the operational distortion and compare it with the upper and lower bounds given in Lemma 5. Correlated $n \times 1$ vectors were generated according to a second order autoregressive model with memory coefficients $\alpha_1$ = 0. and $\alpha_2$ = 0. with additive noise distributed according to zero mean complex Gaussian with variance (0. , i.e.,

$$\mathbf{h}[k] = \alpha_1 \mathbf{h}[k-1] + \alpha_2 \mathbf{h}[k-2] + \sqrt{1 - \alpha_1^2 - \alpha_2^2}\;\mathbf{z}[k].$$

The normalized vectors were considered to be the samples on $\mathcal{G}_{n,1}$ to which the proposed GPC algorithm was applied. For this experiment, an N d = 2 tangent direction codebook was used and the tangent magnitude codebook size was varied from N m = 2 to . Fig. 6 shows the operational distortion with upper and lower distortion bounds obtained in Lemma 5 as a function of the tangent magnitude codebook size. The lower bound captures the distortion trend over the range of codebook sizes while the upper bound seems too loose. We also illustrate the lower bound of a memoryless quantization using a Grassmannian codebook with codebook sizes of , , , and bits so that the total number of bits used for the codebook matches that of the proposed GPC algorithm. We see that the proposed GPC algorithm provides significant improvement in distortion over the memoryless quantization technique. Unfortunately, the upper bound from Lemma 5 is dominated by the resolution of the -bit tangent direction codebook, which has higher distortion than the memoryless quantization with the adjusted codebook size. Nevertheless, the result shows that a significant reduction in distortion is achieved by the proposed GPC algorithm and that the achievable distortion can be controlled by the tangent magnitude codebook, which is a simple scalar codebook.

B. Closed Loop Prediction Gain and Prediction Error

To illustrate the dependence on the tangent direction and tangent magnitude codebooks, Fig. 7 shows the closed loop prediction gains for various error tangent magnitude codebook sizes and a fixed tangent direction codebook of size $N_d = 64$. For these numerical examples, a correlated $N_t \times 1$ vector sequence was generated according to a first order autoregressive model (or Gauss-Markov model [44]) with correlation coefficient $\alpha = J_0(2\pi\beta)$, where $J_0$ is the Bessel function of zeroth order and $\beta$ is the normalized Doppler frequency. The sequence of channel coefficients is generated according to

$$\mathbf{h}[k] = \alpha\,\mathbf{h}[k-1] + \sqrt{1 - \alpha^2}\;\mathbf{z}[k] \tag{28}$$

where $k$ is the time index and $\mathbf{z}[k]$ is an $N_t \times 1$ vector with each entry drawn from an i.i.d. zero mean complex white Gaussian process. The normalized vectors $\mathbf{x}[k] = \mathbf{h}[k] / \|\mathbf{h}[k]\|$ are the correlated sequence on the Grassmann manifold. For the tangent direction codebook, an $N_d = 2^6$ Grassmannian codebook [45] was used and the tangent magnitude codebooks were based on a uniform quantization between and using , , , and bits. For an upper bound, the closed loop prediction gain without quantizing the tangent magnitude is also shown. The result illustrates the dependence of the closed loop prediction gain on the tangent magnitude codebook size as a function of the correlation parameter $\beta$. For highly correlated data, the tangent magnitude codebook resolution has a higher impact on the closed loop prediction gain. This is because the smallest tangent magnitude quantization level may be larger than the prediction error, leading to an overestimation. If the tangent magnitude codebook is adjusted based on the correlation, e.g., quantize in the range of [0 , . instead of [0 , , this gap may be closed.

Another useful performance measure is the chordal distance error between the estimated vector $\hat{\mathbf{x}}[k]$ and the observed vector $\mathbf{x}[k]$. The chordal distance error $d(\hat{\mathbf{x}}[k], \mathbf{x}[k])$ shows how close the estimated vector is to the observed vector using the proposed GPC algorithm. In the MIMO communication application considered in Section VI-C, the chordal distance error has a direct impact on the respective communication theoretic performance measures. In Fig. 8, we show the chordal distance between $\hat{\mathbf{x}}[k]$ and $\mathbf{x}[k]$ and the chordal distance between the quantized vector and the observed vector for memoryless quantization using a Grassmannian codebook with $N = 2^9$. Fig. 8 illustrates the substantial improvement in the quantization accuracy compared with the memoryless technique.

To further illustrate the quantizer accuracy, we show the operational mean squared chordal distance error (MSE) as a function of $\beta$ for the proposed GPC algorithm and the memoryless quantizer using a Grassmannian codebook in Fig. 9. The memoryless quantizer provides approximately − dB of MSE whereas the proposed GPC algorithm provides as little as − dB of MSE, which shows that significant accuracy can be obtained over memoryless quantization techniques. As the correlation decreases, the MSE approaches that of the memoryless quantization.

C. Application to Zero Forcing Multiuser MIMO System

In this section, we illustrate the application of the proposed GPC algorithm to a limited feedback multiuser MIMO system using zero forcing precoding [30]. We assume that the transmitter has $N_t = 4$ transmit antennas and each user is equipped with a single receive antenna. We assume that the encoder and decoder are initialized and that each user has a perfect channel estimate. Then, each user performs the prediction as described in Section IV and feeds back the indices of the quantized tangent direction and tangent magnitude codewords. The transmitter uses the received indices and performs the prediction as depicted in Fig. 4. Then, the predicted channel vectors are used to form the composite channel matrix to compute the zero forcing precoder. The channel to each user is assumed to be temporally correlated with correlation according to $J_0(2\pi f_D T_s)$ [46]. Each user's channel is independently generated assuming the same temporal correlation.

To compare the random codebook approach and the proposed GPC algorithm, we compare the achievable sum rate for three scenarios. First, the achievable sum rate assuming perfect CSI at the transmitter is obtained. For the perfect CSI case, an i.i.d. channel is assumed. The perfect CSI case provides the baseline for what can be achieved. The second scenario is the random vector codebook approach, also assuming an i.i.d. channel [30]. Finally, the proposed GPC algorithm is evaluated using a 9-bit codebook for $f_D T_s = 0.001$, $0.01$, $0.02$, and $0.04$, which correspond to Doppler frequencies of 0.2 Hz, 2 Hz, 4 Hz, and 8 Hz at the 5 ms update interval found in LTE-Advanced and IEEE 802.16m.

Fig. 10 illustrates the achievable sum rate for the cases being considered. Contrary to the random codebook strategy, the proposed GPC algorithm provides significant sum rate gain. In fact, for f D T s = 0. , the system starts to become interference limited above an SNR of 20 dB, illustrating the superior CSI accuracy when the channel is highly correlated. Furthermore, each user is equipped with the same codebooks, which eliminates the need to store multiple codebooks at the transmitter, thus reducing the overhead for practical applications.

Fig. 11 illustrates the sum rate improvement of the proposed technique over the Householder technique in [26] over a range of SNR for channels with various normalized Doppler frequencies. Both methods used -bit feedback per channel use. The plot shows that the proposed GPC algorithm outperforms the Householder technique, especially at high SNR, illustrating the higher CSI resolution obtained by the GPC algorithm.

VII. CONCLUSION

In this paper, we proposed a new predictive coding algorithm on the Grassmann manifold for limited feedback in multiple antenna wireless systems. Building on the classical predictive vector quantization on linear vector space and the geometric properties of the Grassmann manifold, we derived a predictive coding framework for $\mathcal{G}_{n,1}$. Distortion bounds were obtained showing possible distortion improvement over the memoryless quantization technique. In simulations we showed that the proposed GPC algorithm provides significant sum rate improvement for a multiuser MIMO system using a practical codebook size. Future work should consider the optimization of the tangent magnitude codebook and extensions to a higher dimensional Grassmann manifold, i.e., $\mathcal{G}_{n,p}$ for $p > 1$.

APPENDIX A
PROOF OF LEMMA

Proof:

It was shown in [47] that the tangent vector between $\mathbf{x}_1$ and $\mathbf{x}_2$ in $\mathcal{G}_{n,1}$ can be written as

$$\mathbf{e} = \tan^{-1}\!\left( \left\| \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \right\| \right) \frac{\mathbf{x}_2/\rho - \mathbf{x}_1}{\|\mathbf{x}_2/\rho - \mathbf{x}_1\|}. \tag{29}$$

The normed term can be simplified as

$$\left\| \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \right\|^2 = \left( \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \right)^* \left( \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \right) = \frac{1}{|\rho|^2} - 1. \tag{30}$$

Therefore,

$$\left\| \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \right\| = \sqrt{\frac{1 - |\rho|^2}{|\rho|^2}} = \frac{d}{|\rho|}$$

where $d = \sqrt{1 - |\rho|^2}$ is the chordal distance between $\mathbf{x}_1$ and $\mathbf{x}_2$. Clearly, $\|\mathbf{e}\| = \tan^{-1}(d/|\rho|) \ge 0$ and $\vec{\mathbf{e}} = (\mathbf{x}_2/\rho - \mathbf{x}_1)/(d/|\rho|)$ such that $\mathbf{e} = \|\mathbf{e}\|\,\vec{\mathbf{e}}$. Using the exponential form of the trigonometric identities $\tan^{-1}(x) = (j/2) \ln\{(1 - jx)/(1 + jx)\}$ and $\cos^{-1}(x) = -j \ln(x + \sqrt{x^2 - 1})$, we have

$$\tan^{-1}\!\left( \frac{d}{|\rho|} \right) = \frac{j}{2} \ln\!\left[ \frac{1 - j\left(\frac{d}{|\rho|}\right)}{1 + j\left(\frac{d}{|\rho|}\right)} \right] = -j \ln\!\left( |\rho| + \sqrt{|\rho|^2 - 1} \right) = \cos^{-1}|\rho|. \tag{31}$$

Since $|\rho|$ is the cosine of the subspace angle between $\mathbf{x}_1$ and $\mathbf{x}_2$, this shows that the norm of the tangent vector is equal to the arc length, i.e., $|\theta|$ with subspace angle $\theta$ [35, p. 603].

APPENDIX B
PROOF OF LEMMA

Proof:

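For the general case where $\mathbf{X}_1, \mathbf{X}_2 \in \mathcal{G}_{n,p}$ with n > p > , the geodesic between $\mathbf{X}_1$ and $\mathbf{X}_2$ was shown to be [29]

$$\mathbf{X}(t) = \mathbf{X}_1 \mathbf{V} \cos(\boldsymbol{\Sigma} t) \mathbf{V}^* + \mathbf{U} \sin(\boldsymbol{\Sigma} t) \mathbf{V}^*$$

where $\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$ is the compact singular value decomposition of the tangent emanating from $\mathbf{X}_1$ to $\mathbf{X}_2$. For the case $\mathbf{x}_1, \mathbf{x}_2 \in \mathcal{G}_{n,1}$, let $\mathbf{e}$ be the tangent vector emanating from $\mathbf{x}_1$ to $\mathbf{x}_2$. Then, we may assume $\mathbf{V} = 1$ without loss of generality and identify $\mathbf{U}$ with $\vec{\mathbf{e}}$ and $\boldsymbol{\Sigma}$ with $\|\mathbf{e}\|$ to obtain

$$G(\mathbf{x}_1, \mathbf{e}, t) = \mathbf{x}_1 \cos(\|\mathbf{e}\| t) + \vec{\mathbf{e}} \sin(\|\mathbf{e}\| t). \tag{32}$$

It is clear that $G(\mathbf{x}_1, \mathbf{e}, 0) = \mathbf{x}_1$. At $t = 1$, we have

$$G(\mathbf{x}_1, \mathbf{e}, 1) = \frac{\mathbf{x}_1}{\sqrt{1 + d^2/|\rho|^2}} + \frac{\mathbf{x}_2/\rho - \mathbf{x}_1}{d/|\rho|} \cdot \frac{d/|\rho|}{\sqrt{1 + d^2/|\rho|^2}} \tag{33}$$

$$= \frac{\mathbf{x}_2}{\rho \sqrt{1 + d^2/|\rho|^2}} \tag{34}$$

$$= \mathbf{x}_2$$

where we have used the identities

$$\sin(\tan^{-1}(x)) = \frac{x}{\sqrt{1 + x^2}}, \qquad \cos(\tan^{-1}(x)) = \frac{1}{\sqrt{1 + x^2}} \tag{35}$$

in (33) and the fact that $|\rho| \sqrt{1 + d^2/|\rho|^2} = 1$ in (34), so that (34) equals $\mathbf{x}_2$ up to the unit-modulus factor $|\rho|/\rho$, which does not change the point on $\mathcal{G}_{n,1}$.

To verify that $G(\mathbf{x}_1, \mathbf{e}, t)$ for $t \in [0, 1]$ is a valid point on the Grassmann manifold, taking the inner product of $G(\mathbf{x}_1, \mathbf{e}, t)$ with itself yields

$$G(\mathbf{x}_1, \mathbf{e}, t)^* G(\mathbf{x}_1, \mathbf{e}, t) = \cos^2(\|\mathbf{e}\| t) + \sin^2(\|\mathbf{e}\| t) = 1$$

for $t \in [0, 1]$ by using the fact that $\mathbf{x}_1 \perp \mathbf{e}$.

APPENDIX C
PROOF OF LEMMA

Proof: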

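For the general case where $\mathbf{X}_1, \mathbf{X}_2 \in \mathcal{G}_{n,p}$, n > p > , the parallel transport of the tangent $\mathbf{E}$ emanating from $\mathbf{X}_1$ along the geodesic direction $\boldsymbol{\Delta}$ with compact singular value decomposition $\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$ was shown to be [29]

$$\hat{\mathbf{E}} = \left[ -\mathbf{X}_1 \mathbf{V} \sin(\boldsymbol{\Sigma} t) \mathbf{U}^* + \mathbf{U} \cos(\boldsymbol{\Sigma} t) \mathbf{U}^* + (\mathbf{I} - \mathbf{U}\mathbf{U}^*) \right] \mathbf{E}. \tag{36}$$

We need to show the parallel transport of the tangent vector $\mathbf{e}$ emanating from $\mathbf{x}_1$ to $\mathbf{x}_2$ in the geodesic direction $\mathbf{e}$ for the case $\mathbf{x}_1, \mathbf{x}_2 \in \mathcal{G}_{n,1}$. Without loss of generality, we assume that the singular value decomposition of $\mathbf{e}$ is given with $\vec{\mathbf{e}}$ as the left singular vector, $\|\mathbf{e}\|$ as the singular value, and 1 for the right singular vector. Then

$$\hat{\mathbf{e}}(t) = \left[ -\mathbf{x}_1 \vec{\mathbf{e}}^* \sin(\|\mathbf{e}\| t) + \vec{\mathbf{e}}\,\vec{\mathbf{e}}^* \cos(\|\mathbf{e}\| t) + (\mathbf{I} - \vec{\mathbf{e}}\,\vec{\mathbf{e}}^*) \right] \mathbf{e} = -\mathbf{x}_1 \|\mathbf{e}\| \sin(\|\mathbf{e}\| t) + \mathbf{e} \cos(\|\mathbf{e}\| t). \tag{37}$$

Since $G(\mathbf{x}_1, \mathbf{e}, 1) = \mathbf{x}_2$, the parallel transported tangent vector emanating from $\mathbf{x}_2$ is found by evaluating (37) for $t = 1$. Using (6) and (35), we have

$$\begin{aligned}
\hat{\mathbf{e}} &= -\mathbf{x}_1 \|\mathbf{e}\| \sin(\|\mathbf{e}\|) + \mathbf{e} \cos(\|\mathbf{e}\|) \\
&= -\mathbf{x}_1 \frac{\tan^{-1}(d/|\rho|)\,(d/|\rho|)}{\sqrt{1 + d^2/|\rho|^2}} + \frac{\tan^{-1}(d/|\rho|)\,(\mathbf{x}_2/\rho - \mathbf{x}_1)}{(d/|\rho|) \sqrt{1 + d^2/|\rho|^2}} \\
&= \frac{\tan^{-1}(d/|\rho|)}{(d/|\rho|) \sqrt{1 + d^2/|\rho|^2}} \left( \frac{\mathbf{x}_2}{\rho} - \mathbf{x}_1 \left( 1 + \frac{d^2}{|\rho|^2} \right) \right) \\
&= \tan^{-1}\!\left( \frac{d}{|\rho|} \right) \frac{\mathbf{x}_2 \rho^* - \mathbf{x}_1}{d}
\end{aligned} \tag{38}$$

which is the desired result.

APPENDIX D
DERIVATION OF PREDICTION FUNCTION IN DEFINITION

The parallel transported tangent vector $\hat{\mathbf{e}}$ emanating from $\mathbf{x}_2$ is given in (8). Computing the geodesic from $\mathbf{x}_2$ along $\hat{\mathbf{e}}$ at $t = 1$ gives

$$\hat{\mathbf{x}} = G(\mathbf{x}_2, \hat{\mathbf{e}}, 1) = \frac{\mathbf{x}_2}{\sqrt{1 + d^2/|\rho|^2}} + \frac{\mathbf{x}_2 \rho^* - \mathbf{x}_1}{|\rho| \sqrt{1 + d^2/|\rho|^2}} = |\rho|\,\mathbf{x}_2 + \rho^* \mathbf{x}_2 - \mathbf{x}_1. \tag{39}$$

To see that $\hat{\mathbf{x}} \in \mathcal{G}_{n,1}$, we have

$$\hat{\mathbf{x}}^* \hat{\mathbf{x}} = (|\rho|\,\mathbf{x}_2 + \rho^* \mathbf{x}_2 - \mathbf{x}_1)^* (|\rho|\,\mathbf{x}_2 + \rho^* \mathbf{x}_2 - \mathbf{x}_1) = 1 \tag{40}$$

where we have used the fact that $\rho = \mathbf{x}_1^* \mathbf{x}_2$. To see that the prediction is distance preserving, the inner product of $\mathbf{x}_2$ and $\hat{\mathbf{x}}$ gives

$$\mathbf{x}_2^* \hat{\mathbf{x}} = \mathbf{x}_2^* \mathbf{x}_2 |\rho| + \mathbf{x}_2^* \mathbf{x}_2 \rho^* - \mathbf{x}_2^* \mathbf{x}_1 = |\rho|. \tag{41}$$

Therefore,

$$d(\mathbf{x}_2, \hat{\mathbf{x}}) = \sqrt{1 - |\rho|^2} = d(\mathbf{x}_1, \mathbf{x}_2). \tag{42}$$

REFERENCES

[1] A. Haoui and D. Messerschmitt, “Predictive vector quantization,” in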

Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 9, 1984, pp. 420–423.
[2] H.-M. Hang and J. Woods, “Predictive vector quantization of images,” IEEE Trans. Commun., vol. 33, no. 11, pp. 1208–1219, 1985.
[3] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Kluwer Academic, 1991.
[4] H. Khalil, K. Rose, and S. L. Regunathan, “The asymptotic closed-loop approach to predictive vector quantizer design with application in video coding,” IEEE Trans. Image Process., vol. 10, no. 1, pp. 15–23, 2001.
[5] A. Barg and D. Nogin, “Bounds on packings of spheres in the Grassmann manifold,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2450–2454, 2002.
[6] W. Dai, Y. Liu, and B. Rider, “Quantization bounds on Grassmann manifolds of arbitrary dimensions and MIMO communications with feedback,” in Proc. of IEEE Global Telecom. Conf., vol. 3, 2005, pp. 1456–1460.
[7] B. Mondal, S. Dutta, and R. W. Heath Jr., “Quantization on the Grassmann manifold,” IEEE Trans. Signal Process., vol. 55, no. 8, pp. 4208–4216, 2007.
[8] A. Ashikhmin and R. Gopalan, “Grassmannian packings for efficient quantization in MIMO broadcast systems,” in Proc. of IEEE Int. Symp. on Info. Theory, 2007, pp. 1811–1815.
[9] L. Zheng and D. N. C. Tse, “Communication on the Grassmann manifold: a geometric approach to the noncoherent multiple-antenna channel,” IEEE Trans. Inf. Theory, vol. 48, no. 2, pp. 359–383, 2002.
[10] I. Kammoun and J.-C. Belfiore, “A new family of Grassmann space-time codes for non-coherent MIMO systems,” IEEE Commun. Lett., vol. 7, no. 11, pp. 528–530, 2003.
[11] A. M. Cipriano, I. Kammoun, and J.-C. Belfiore, “Simplified decoding for some non-coherent codes over the Grassmannian,” in Proc. of IEEE Int. Conf. on Commun., vol. 2, 2005, pp. 757–761.
[12] IEEE, “IEEE 802.16e-2005 IEEE Standard for Local and metropolitan area networks, Part 16: Air interface for fixed broadcast wireless access systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in License Bands and Corrigendum 1,” Dec. 2005.
[13] 3GPP, “Physical layer aspects of UTRA high speed downlink packet access,” Technical Report TR25.814
IEEE J. Sel. Areas Commun., vol. 16, no. 8, pp. 1423–1436, 1998.
[15] J. C. Roh and B. D. Rao, “Transmit beamforming in multiple-antenna systems with finite rate feedback: a VQ-based approach,” IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1101–1112, 2006.
[16] J. A. Tropp, I. S. Dhillon, R. W. Heath Jr., and T. Strohmer, “Designing structured tight frames via an alternating projection method,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 188–209, 2005.
[17] T. Inoue and R. W. Heath Jr., “Kerdock codes for limited feedback precoded MIMO systems,” IEEE Trans. Signal Process., vol. 57, no. 9, pp. 3711–3716, 2009.
[18] D. J. Love, R. W. Heath Jr., V. K. N. Lau, D. Gesbert, B. D. Rao, and M. Andrews, “An overview of limited feedback in wireless communication systems,” IEEE J. Sel. Areas Commun., vol. 26, no. 8, pp. 1341–1365, 2008.
[19] K. Huang, B. Mondal, R. W. Heath Jr., and J. G. Andrews, “Markov models for limited feedback MIMO systems,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 4, 2006, pp. 9–12.
[20] K. Huang, R. W. Heath Jr., and J. G. Andrews, “Limited feedback beamforming over temporally-correlated channels,” IEEE Trans. Signal Process., vol. 57, no. 5, pp. 1959–1975, 2009.
[21] B. Mondal and R. W. Heath Jr., “Adaptive feedback for MIMO beamforming systems,” in Proc. of IEEE Workshop on Signal Process. Adv. in Wireless Commun., 2004, pp. 213–217.
[22] R. Samanta and R. W. Heath Jr., “Codebook adaptation for quantized MIMO beamforming systems,” in Proc. of Asilomar Conf. on Signals, Systems and Computers, 2005, pp. 376–380.
[23] J. H. Kim, W. Zirwas, and M. Haardt, “Efficient feedback via subspace-based channel quantization for distributed cooperative antenna systems with temporally correlated channels,” EURASIP Journal on Advances in Signal Processing
IEEE J. Sel. Areas Commun., vol. 25, no. 7, pp. 1298–1310, 2007.
[25] R. W. Heath Jr., T. Wu, and A. C. K. Soong, “Progressive refinement for high resolution limited feedback multiuser MIMO beamforming,” in Proc. of Asilomar Conf. on Signals, Systems and Computers, 2008, pp. 743–747.
[26] L. Liu and H. Jafarkhani, “Novel transmit beamforming schemes for time-selective fading multiantenna systems,” IEEE Trans. Signal Process., vol. 54, no. 12, pp. 4767–4781, 2006.
[27] T. Kim, D. J. Love, and B. Clerckx, “MIMO systems with limited rate differential feedback in slowly varying channels,” IEEE Trans. Commun., vol. 59, no. 4, pp. 1175–1189, 2011.
[28] T. Inoue and R. W. Heath Jr., “Grassmannian predictive coding for limited feedback multiuser MIMO systems,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., 2011.
[29] A. Edelman, T. A. Arias, and S. T. Smith, “The geometry of algorithms with orthogonality constraints,” SIAM J. Matrix Analysis and Applications, vol. 20, no. 2, pp. 303–353, 1998.
[30] N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Trans. Inf. Theory, vol. 52, no. 11, pp. 5045–5060, 2006.
[31] K. Huang, J. G. Andrews, and R. W. Heath Jr., “Orthogonal beamforming for SDMA downlink with limited feedback,” in Proc. of IEEE Int. Conf. on Acoustics, Speech and Signal Process., vol. 3, 2007, pp. 97–100.
[32] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1691–1706, 2003.
[33] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: channel inversion and regularization,” IEEE Trans. Commun., vol. 53, no. 1, pp. 195–202, 2005.
[34] J. M. Lee, Introduction to Smooth Manifolds, ser. Graduate texts in mathematics; 218. Springer, 2003.
[35] G. H. Golub and C. H. Van Loan, Matrix Computations, 3rd ed. The Johns Hopkins University Press, 1996.
[36] D. J. Love, R. W. Heath Jr., and T. Strohmer, “Grassmannian beamforming for multiple-input multiple-output wireless systems,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2735–2747, 2003.
[37] J. H. Conway, R. H. Hardin, and N. J. A. Sloane, “Packing lines, planes, etc.: packings in Grassmannian spaces,” Experimental Mathematics, vol. 5, pp. 139–159, 1996.
[38] R. E. Mahony, “Optimization algorithms on homogeneous spaces: with applications in linear system theory,” Ph.D. dissertation, Australian National University, 1994.
[39] J. H. Manton, “Modified steepest descent and Newton algorithms for orthogonally constrained optimisation. Part I. The complex Stiefel manifold,” in Proc. of Int. Symp. on Signal Process. and its Appl., vol. 1, 2001, pp. 80–83.
[40] J. H. Manton, “Modified steepest descent and Newton algorithms for orthogonally constrained optimisation. Part II. The complex Grassmann manifold,” in Proc. of Int. Symp. on Signal Process. and its Appl., vol. 1, 2001, pp. 84–87.
[41] M. Sabin and R. Gray, “Product code vector quantizers for waveform and voice coding,” IEEE Trans. Acoust., Speech, Signal Process., vol. 32, no. 3, pp. 474–488, 1984.
[42] IEEE, “Part 16: Air Interface for Broadband Wireless Access Systems,” DRAFT Amendment to IEEE Standard for Local and metropolitan area networks, D12, 2011. [Online]. Available: http://ieee802.org/16/pubs/80216m.html
[43] W. Dai, Y. Liu, and B. Rider, “Quantization bounds on Grassmann manifolds and applications to MIMO communications,” IEEE Trans. Inf. Theory, vol. 54, no. 3, pp. 1108–1123, 2008.
[44] R. H. Etkin and D. N. C. Tse, “Degrees of freedom in some underspread MIMO fading channels,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1576–1608, 2006.
[45] D. J. Love, “Grassmannian subspace packing webpage.” [Online]. Available: http://cobweb.ecn.purdue.edu/~djlove/grass.html
[46] C. Simon, R. de Francisco, D. T. M. Slock, and G. Leus, “Feedback compression for correlated broadcast channels,” in Proc. of IEEE Symp. on Commun. and Veh. Technol. in the Benelux, 2007, pp. 1–4.
[47] T. Inoue and R. W. Heath Jr., “Geodesic prediction for limited feedback multiuser MIMO systems in temporally correlated channels,” in Proc. of IEEE Radio and Wireless Symposium, 2009, pp. 167–170.

Algorithm 1: GPC encoder algorithm

Input: x[k]
Initialize x̃[1] and x̂[0]
for all k = 1, 2, ... do
    e[k] = L(x̃[k], x[k])
    i[k] = Q(e[k])
    x̂[k] = G(x̃[k], c_i[k], 1)
    x̃[k+1] = P(x̂[k−1], x̂[k])
end for
Output: i[k]

Algorithm 2: GPC decoder algorithm

Input: i[k]
Initialize x̃[1] and x̂[0]
for all k = 1, 2, ... do
    c_i[k] = Q⁻¹(i[k])
    x̂[k] = G(x̃[k], c_i[k], 1)
    x̃[k+1] = P(x̂[k−1], x̂[k])
end for
Output: x̂[k]

[Figure: tangent plane at x̂[k], tangent vectors e[k] and ê[k+1], and the predicted value x̃[k+1] obtained from the parallel transported tangent.]
Fig. 1: Conceptual illustration of the tangent vector and the parallel transport on the Grassmann manifold.

[Figure: left panel shows the predicted vector x̃[k], the observed vector x[k], and the prediction error e[k]; right panel shows the quantized prediction error, the quantized vector x̂[k], and the quantization error.]
Fig. 2: Conceptual illustration of obtaining prediction error (left) and quantizing the prediction error to obtain the estimated vector (right).
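Algorithms 1 and 2 can be sketched in a few lines once the tangent map $L$, the geodesic $G$, and the prediction function $P$ are written out from (29), (32), and (39). The code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, the function names are hypothetical, and the `quantize` and `dequantize` callables (including their signatures, which receive the predicted vector) are placeholders for the tangent direction and magnitude codebooks, whose design is not reproduced here.

```python
import numpy as np


def chordal(x, y):
    """Chordal distance sqrt(1 - |x^* y|^2) between unit-norm vectors."""
    return float(np.sqrt(max(0.0, 1.0 - abs(np.vdot(x, y)) ** 2)))


def tangent(x1, x2, eps=1e-12):
    """L(x1, x2): tangent vector at x1 pointing toward x2, following (29)."""
    rho = np.vdot(x1, x2)                     # rho = x1^* x2
    if abs(rho) < eps:
        raise ValueError("x1 and x2 are orthogonal; this sketch does not handle that case")
    w = x2 / rho - x1                         # orthogonal to x1 by construction
    nw = np.linalg.norm(w)                    # equals d / |rho|
    if nw < eps:                              # x1 and x2 span the same line
        return np.zeros_like(x1)
    return np.arctan(nw) * w / nw


def geodesic(x, e, t=1.0, eps=1e-12):
    """G(x, e, t) = x cos(||e|| t) + (e / ||e||) sin(||e|| t), following (32)."""
    ne = np.linalg.norm(e)
    if ne < eps:
        return x.copy()
    return x * np.cos(ne * t) + (e / ne) * np.sin(ne * t)


def predict(x1, x2):
    """P(x1, x2) = |rho| x2 + rho^* x2 - x1 with rho = x1^* x2, following (39)."""
    rho = np.vdot(x1, x2)
    return (abs(rho) + np.conj(rho)) * x2 - x1


def gpc_encode(xs, quantize, x_tilde, x_hat_prev):
    """Algorithm 1; quantize(e, x_tilde) -> (index, codeword) is a hypothetical interface."""
    indices, estimates = [], []
    for x in xs:
        e = tangent(x_tilde, x)               # prediction error tangent e[k]
        idx, c = quantize(e, x_tilde)         # i[k] and selected codeword c_i[k]
        x_hat = geodesic(x_tilde, c, 1.0)     # estimate x_hat[k]
        x_tilde = predict(x_hat_prev, x_hat)  # prediction for time k + 1
        x_hat_prev = x_hat
        indices.append(idx)
        estimates.append(x_hat)
    return indices, estimates


def gpc_decode(indices, dequantize, x_tilde, x_hat_prev):
    """Algorithm 2; dequantize(index, x_tilde) -> codeword is a hypothetical interface."""
    estimates = []
    for idx in indices:
        c = dequantize(idx, x_tilde)
        x_hat = geodesic(x_tilde, c, 1.0)
        x_tilde = predict(x_hat_prev, x_hat)
        x_hat_prev = x_hat
        estimates.append(x_hat)
    return estimates
```

With an identity "quantizer" that returns the tangent vector itself, the estimate coincides with the observed vector up to a unit-modulus phase, which is the same point on the manifold. Because encoder and decoder run the same recursion from the same initialization, their estimates stay in lockstep for any codebook.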

[Figure: blocks L, Q, G, and P with signals x[k], e[k], i[k], c_i[k], x̂[k], and x̃[k].]
Fig. 3: Block diagram of predictive encoder on the Grassmann manifold.

[Figure: blocks Q⁻¹, G, and P with signals i[k], c_i[k], x̂[k], and x̃[k].]
Fig. 4: Block diagram of predictive decoder on the Grassmann manifold.

[Figure: lower bound ball with radius γ_lower and upper bound ball with radius γ_upper around x̃[k], with γ_m and γ_d marked.]
Fig. 5: Illustration of quantization region around the predicted vector and lower and upper distortion bound balls.

[Plot "Comparison of Distortion": distortion Log(D) versus tangent magnitude codebook size (log_2(N)); curves: memoryless quantization, upper bound, measured distortion, lower bound.]
Fig. 6: Comparison of operational distortion against the lower and upper bound for various tangent magnitude codebook sizes. The lower distortion bound for the memoryless quantizer using a Grassmannian codebook is also shown.

[Plot "Closed-Loop Prediction Gain": gain (dB) versus β; curves: G_clp for N_d = 64 with N_m = ∞, 32, 16, 8, and 4.]
Fig. 7: Closed loop prediction gain, $G_{\mathrm{clp}}$, for $\mathcal{G}_{n,1}$ data with fixed tangent codebook ($N_d = 64$) and different tangent magnitude codebooks ($N_m = 2^2, 2^3, 2^4, 2^5$) over various correlation parameter $\beta$.

[Plot "Comparison of Chordal Distance Error Over Time": chordal distance versus time index (k); curves: memoryless quantization with 9-bit codebook, proposed GPC algorithm with 9-bit codebook.]
Fig. 8: Chordal distance comparison over time between memoryless quantizer using 9-bit Grassmannian codebook and the proposed GPC algorithm using 9-bit codebook.

[Plot "Mean Squared Error": MSE (dB) versus β; curves: MSE memoryless 6-bit codebook, MSE memoryless 9-bit codebook, MSE GPC 9-bit codebook.]
Fig. 9: Mean squared error comparison between memoryless quantization using 6-bit and 9-bit Grassmannian codebook and the proposed GPC algorithm using 9-bit codebook.

[Plot "Sum Rate for Nt=4 and U=4": sum rate (bps/Hz) versus SNR (dB); curves: perfect CSI i.i.d. channel; proposed, 9-bit, fdTs = 0.001, 0.01, 0.02, 0.04; Grassmannian 9-bit i.i.d. channel.]
Fig. 10: Sum rate for $N_t = U = 4$ i.i.d. channel with perfect CSI, i.i.d. channel with 9-bit Grassmannian codebook, and the proposed GPC algorithm with 9-bit codebook for various normalized Doppler frequencies.

[Plot "Sum Rate Improvement for GPC over Householder method": sum rate improvement (bps/Hz) versus SNR (dB); curves: fdTs = 0.001, 0.01, 0.02, 0.04.]
Fig. 11: Sum rate improvement obtained by the proposed GPC algorithm over Householder technique for $N_t = U = 4$ using 9