Blind Estimation of Effective Downlink Channel Gains in Massive MIMO
Hien Quoc Ngo Erik G. Larsson
Department of Electrical Engineering (ISY), Linköping University, 581 83 Linköping, Sweden
ABSTRACT
We consider the massive MIMO downlink with time-division duplex (TDD) operation and conjugate beamforming transmission. To reliably decode the desired signals, the users need to know the effective channel gain. In this paper, we propose a blind channel estimation method which can be applied at the users and which does not require any downlink pilots. We show that our proposed scheme can substantially outperform the case where each user has only statistical channel knowledge, and that the difference in performance is particularly large in certain types of channel, most notably keyhole channels. Compared to schemes that rely on downlink pilots (e.g., [1]), our proposed scheme yields more accurate channel estimates for a wide range of signal-to-noise ratios and avoids spending time-frequency resources on pilots.
Index Terms: Blind channel estimation, downlink, massive MIMO, time-division duplex.
1. INTRODUCTION
Massive multiple-input multiple-output (MIMO) is one of the most promising technologies to meet the demands for high throughput and communication reliability of next generation cellular networks [2–5]. In massive MIMO, time-division duplex (TDD) operation is preferable since then the pilot overhead does not depend on the number of base station antennas. With TDD, the channels are estimated at the base station through the uplink training. For the downlink, under the assumption of channel reciprocity, the channels estimated at the base station are used to precode the data, and the precoded data are sent to the users. To coherently decode the transmitted signals, each user should have channel state information (CSI), that is, know its effective channel from the base station.

In most previous works, the users are assumed to have statistical knowledge of the effective downlink channels, that is, they know the mean of the effective channel gain and use this for the signal detection [6, 7]. In these papers, Rayleigh fading channels were assumed. Under Rayleigh fading, the effective channel gains become nearly deterministic (the channel "hardens") when the number of base station antennas grows large, and hence, using the mean of the effective channel gain for signal detection works very well. However, in practice, propagation scenarios may be encountered where the channel does not harden. In that case, using the mean effective channel gain may not be accurate enough, and a better estimate of the effective channel should be used. In [1], we proposed a scheme where the base station (in addition to the beamformed data) also sent a beamformed downlink pilot sequence to the users. With this scheme, a performance improvement (compared to the case when the mean of the effective channel gain is used) was obtained. However, this scheme requires time-frequency resources in order to send the downlink pilots. The associated overhead is proportional to the number of users, which can be in the order of several tens, and hence, in a high-mobility environment (where the channel coherence interval is short) the spectral efficiency is significantly reduced.

This work was supported in part by the Swedish Research Council (VR) and ELLIIT.
Contribution:
In this paper, we consider the massive MIMO downlink with conjugate beamforming. We propose a scheme with which the users blindly estimate the effective channel gain from the received data. The scheme exploits the asymptotic properties of the mean of the received signal power when the number of base station antennas is large. The accuracy of our proposed scheme is investigated for two specific, very different, types of channels: (i) independent Rayleigh fading and (ii) keyhole channels. We show that when the number of base station antennas goes to infinity, the channel estimate provided by our scheme becomes exact. Also, numerical results quantitatively show the benefits of our proposed scheme, especially in keyhole channels, compared to the case where the mean of the effective channel gain is used as if it were the true channel gain, and compared to the case where the beamforming training scheme of [1] is used.
Notation:
We use boldface upper- and lower-case letters to denote matrices and column vectors, respectively. The superscripts $(\cdot)^T$ and $(\cdot)^H$ stand for the transpose and conjugate transpose, respectively. The Euclidean norm, the trace, and the expectation operators are denoted by $\|\cdot\|$, $\mathrm{Tr}(\cdot)$, and $\mathbb{E}\{\cdot\}$, respectively. The notation $\xrightarrow{P}$ means convergence in probability, and $\xrightarrow{\text{a.s.}}$ means almost sure convergence. Finally, we use $z \sim \mathcal{CN}(0, \sigma^2)$ to denote a circularly symmetric complex Gaussian random variable (RV) $z$ with zero mean and variance $\sigma^2$.
2. SYSTEM MODEL
Consider the downlink of a massive MIMO system. An $M$-antenna base station serves $K$ single-antenna users, where $M \gg K \gg 1$. The base station uses conjugate beamforming to simultaneously transmit data to all $K$ users in the same time-frequency resource. (We consider conjugate beamforming since it is simple and nearly optimal in many massive MIMO scenarios. More importantly, conjugate beamforming can be implemented in a distributed manner.) Since we focus on the downlink channel estimation here, we assume that the base station perfectly estimates the channels in the uplink training phase. (In future work, this assumption may be relaxed.)

Denote by $\mathbf{g}_k$ the $M \times 1$ channel vector between the base station and the $k$th user. The channel $\mathbf{g}_k$ results from a combination of small-scale fading and large-scale fading, and is modeled as

$$\mathbf{g}_k = \sqrt{\beta_k}\,\mathbf{h}_k, \tag{1}$$

where $\beta_k$ represents large-scale fading which is constant over many coherence intervals, and $\mathbf{h}_k$ is an $M \times 1$ small-scale channel vector. We assume that the elements of $\mathbf{h}_k$ are i.i.d. with zero mean and unit variance.

Let $s_k$, with $\mathbb{E}\{|s_k|^2\} = 1$, $k = 1, \ldots, K$, be the symbol intended for the $k$th user. With conjugate beamforming, the $M \times 1$ precoded signal vector is given by

$$\mathbf{x} = \sqrt{\alpha}\,\mathbf{G}\mathbf{s}, \tag{2}$$

where $\mathbf{s} \triangleq [s_1, s_2, \ldots, s_K]^T$, $\mathbf{G} \triangleq [\mathbf{g}_1 \ldots \mathbf{g}_K]$ is an $M \times K$ channel matrix between the $K$ users and the base station, and $\alpha$ is a normalization constant chosen to satisfy the average power constraint at the base station: $\mathbb{E}\{\|\mathbf{x}\|^2\} = \rho$. Hence,

$$\alpha = \frac{\rho}{\mathbb{E}\{\mathrm{Tr}(\mathbf{G}\mathbf{G}^H)\}}. \tag{3}$$

The signal received at the $k$th user is

$$y_k = \mathbf{g}_k^H \mathbf{x} + n_k = \sqrt{\alpha}\,\mathbf{g}_k^H \mathbf{G}\mathbf{s} + n_k = \sqrt{\alpha}\,\|\mathbf{g}_k\|^2 s_k + \sqrt{\alpha} \sum_{k' \neq k}^{K} \mathbf{g}_k^H \mathbf{g}_{k'} s_{k'} + n_k, \tag{4}$$

where $n_k \sim \mathcal{CN}(0, 1)$ is the additive Gaussian noise at the $k$th user. Then, the desired signal $s_k$ is decoded.
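The signal model in (1) to (4) can be sketched numerically as follows. This is a minimal illustration rather than the authors' simulation code: it assumes i.i.d. Rayleigh fading for the small-scale vectors, unit-power 4-QAM symbols, and NumPy. The function names `rayleigh_channels` and `downlink_samples` are hypothetical. The closed form $\mathbb{E}\{\mathrm{Tr}(\mathbf{G}\mathbf{G}^H)\} = M \sum_{k} \beta_k$ used for $\alpha$ follows from the zero-mean, unit-variance assumption on the elements of $\mathbf{h}_k$.

```python
import numpy as np

def rayleigh_channels(M, K, beta, rng):
    """Draw G = [g_1 ... g_K] with g_k = sqrt(beta_k) h_k, h_k ~ CN(0, I_M), as in (1)."""
    H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    return H * np.sqrt(beta)          # broadcasting scales column k by sqrt(beta_k)

def downlink_samples(G, rho, beta, T, rng):
    """Conjugate beamforming over T symbol periods; returns (y, x, alpha)."""
    M, K = G.shape
    # (3): for zero-mean, unit-variance entries of h_k, E{Tr(G G^H)} = M * sum_k beta_k
    alpha = rho / (M * np.sum(beta))
    # unit-power 4-QAM symbols (an illustrative choice), one column per symbol period
    s = (rng.choice([-1.0, 1.0], (K, T)) + 1j * rng.choice([-1.0, 1.0], (K, T))) / np.sqrt(2)
    x = np.sqrt(alpha) * (G @ s)      # (2): M x T precoded signal
    n = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)
    y = G.conj().T @ x + n            # (4): row k holds y_k(1), ..., y_k(T)
    return y, x, alpha

rng = np.random.default_rng(0)
beta = np.ones(20)
G = rayleigh_channels(100, 20, beta, rng)
y, x, alpha = downlink_samples(G, rho=10.0, beta=beta, T=200, rng=rng)
# The time-averaged transmit power should be near rho for this realization.
print(alpha, np.mean(np.sum(np.abs(x) ** 2, axis=0)))
```

Row `k` of `y` is what the $k$th user observes; the only quantities it is assumed to know beyond these samples are large-scale ones such as `alpha`.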
3. PROPOSED DOWNLINK BLIND CHANNEL ESTIMATION TECHNIQUE
The $k$th user wants to detect $s_k$ from $y_k$ in (4). For this purpose, it needs to know the effective channel gain $\|\mathbf{g}_k\|^2$. If the channel is Rayleigh fading, then by the law of large numbers, we have

$$\frac{1}{M}\|\mathbf{g}_k\|^2 \xrightarrow{P} \beta_k, \quad \text{as } M \to \infty.$$

This implies that when $M$ is large, $\|\mathbf{g}_k\|^2 \approx M\beta_k$ (we say that the channel hardens). So we can use the statistical properties of the channel, i.e., use $\mathbb{E}\{\|\mathbf{g}_k\|^2\} = M\beta_k$ as a good estimate of $\|\mathbf{g}_k\|^2$ when detecting $s_k$. This assumption is widely made in the massive MIMO literature. However, in practice, the channel is not always Rayleigh fading, and does not always harden when $M \to \infty$. For example, consider a keyhole channel, where the small-scale fading component $\mathbf{h}_k$ is modeled as follows [8, 9]:

$$\mathbf{h}_k = \nu_k \bar{\mathbf{h}}_k, \tag{5}$$

where $\nu_k$ and the $M$ elements of $\bar{\mathbf{h}}_k$ are i.i.d. $\mathcal{CN}(0, 1)$ RVs. For the keyhole channel (5), by the law of large numbers, we have

$$\frac{1}{M}\|\mathbf{g}_k\|^2 - \beta_k |\nu_k|^2 \xrightarrow{P} 0,$$

which is not deterministic, and hence the channel does not harden. In this case, using $\mathbb{E}\{\|\mathbf{g}_k\|^2\} = M\beta_k$ as an estimate of the true effective channel $\|\mathbf{g}_k\|^2$ to detect $s_k$ may result in poor performance.

For the reasons explained, it is desirable that the users estimate their effective channels. One way to do this is to have the base station transmit beamformed downlink pilots as proposed in [1]. With this scheme, at least $K$ downlink pilot symbols are required. This can significantly reduce the spectral efficiency. For example, suppose $M = 300$ antennas serve $K = 50$ terminals, in a coherence interval of length […] symbols. If half of the coherence interval is used for the downlink, then with the downlink beamforming training of [1], we need to spend at least […] symbols for sending pilots.
As a result, less than […] of the downlink symbols are used for payload in each coherence interval, and the insertion of the downlink pilots reduces the overall (uplink + downlink) spectral efficiency by a factor of […]/[…].

In what follows, we propose a blind channel estimation method which does not require any downlink pilots.

Consider the average power of the received signal at the $k$th user (averaged over $\mathbf{s}$ and $n_k$). From (4), we have

$$\mathbb{E}\{|y_k|^2\} = \alpha \|\mathbf{g}_k\|^4 + \alpha \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 + 1. \tag{6}$$

The second term of (6) can be rewritten as

$$\alpha \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 = \alpha \sum_{k' \neq k}^{K} \mathbf{g}_{k'}^H \mathbf{g}_k \mathbf{g}_k^H \mathbf{g}_{k'} = \alpha\, \tilde{\mathbf{g}}_k^H \mathbf{A}\, \tilde{\mathbf{g}}_k, \tag{7}$$

where $\tilde{\mathbf{g}}_k \triangleq [\mathbf{g}_1^T \ldots \mathbf{g}_{k-1}^T\ \mathbf{g}_{k+1}^T \ldots \mathbf{g}_K^T]^T$, and $\mathbf{A}$ is an $M(K-1) \times M(K-1)$ block-diagonal matrix whose $(i, i)$-block is the $M \times M$ matrix $\mathbf{g}_k \mathbf{g}_k^H$. Since $\mathbf{A}$ and $\tilde{\mathbf{g}}_k$ are independent, as $M(K-1) \to \infty$, the Trace lemma gives [10]

$$\frac{1}{M(K-1)} \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 - \frac{1}{M(K-1)} \sum_{k' \neq k}^{K} \beta_{k'} \|\mathbf{g}_k\|^2 \xrightarrow{\text{a.s.}} 0. \tag{8}$$

Substituting (8) into (6), as $M(K-1) \to \infty$, we have

$$\frac{\mathbb{E}\{|y_k|^2\}}{M(K-1)} - \frac{1}{M(K-1)} \left( \alpha \|\mathbf{g}_k\|^4 + \alpha \sum_{k' \neq k}^{K} \beta_{k'} \|\mathbf{g}_k\|^2 + 1 \right) \xrightarrow{\text{a.s.}} 0. \tag{9}$$

The above result implies that when $M$ and $K$ are large,

$$\mathbb{E}\{|y_k|^2\} \approx \alpha \|\mathbf{g}_k\|^4 + \alpha \sum_{k' \neq k}^{K} \beta_{k'} \|\mathbf{g}_k\|^2 + 1. \tag{10}$$

Therefore, the effective channel gain $\|\mathbf{g}_k\|^2$ can be estimated from $\mathbb{E}\{|y_k|^2\}$ by solving the quadratic equation (10).

As discussed in Section 3.1, we can estimate the effective channel gain $\|\mathbf{g}_k\|^2$ by solving the quadratic equation (10). It is then required that the $k$th user knows $\alpha$, $\sum_{k' \neq k}^{K} \beta_{k'}$, and $\mathbb{E}\{|y_k|^2\}$. We assume that the $k$th user knows $\alpha$ and $\sum_{k' \neq k}^{K} \beta_{k'}$. This assumption is reasonable since the terms $\alpha$ and $\sum_{k' \neq k}^{K} \beta_{k'}$ depend on the large-scale fading coefficients, which stay constant over many coherence intervals.
The $k$th user can estimate these terms, or the base station may inform the $k$th user about them. Regarding $\mathbb{E}\{|y_k|^2\}$, in practice, it is unavailable. However, we can use the received samples during a whole coherence interval to form a sample estimate of $\mathbb{E}\{|y_k|^2\}$ as follows:

$$\mathbb{E}\{|y_k|^2\} \approx \xi_k \triangleq \frac{|y_k(1)|^2 + |y_k(2)|^2 + \ldots + |y_k(T)|^2}{T}, \tag{11}$$

where $y_k(n)$ is the $n$th receive sample, and $T$ is the length (in symbols) of the coherence interval used for the downlink transmission.

[Figure 1: normalized mean-square error versus SNR (dB); curves for "without channel estimation, use $\mathbb{E}\{\|\mathbf{g}_k\|^2\}$", "DL pilots [1]", and "proposed scheme (Algorithm 1)" at several values of $T$, including $T = \infty$.]
Fig. 1. Normalized MSE versus SNR for different channel estimation schemes, for Rayleigh fading channels.

The algorithm for estimating $\|\mathbf{g}_k\|^2$ is summarized as follows:

Algorithm 1 (Proposed blind downlink channel estimation method)
1. Using a data block of $T$ samples, compute $\xi_k$ as in (11).
2. The channel estimate of $\|\mathbf{g}_k\|^2$, denoted by $a_k$, is determined as

$$a_k = \frac{-\alpha \sum_{k' \neq k}^{K} \beta_{k'} + \sqrt{\alpha^2 \left(\sum_{k' \neq k}^{K} \beta_{k'}\right)^2 + 4\alpha(\xi_k - 1)}}{2\alpha}. \tag{12}$$

Note that $a_k$ in (12) is the positive root of the quadratic equation $\xi_k = \alpha a_k^2 + \alpha \sum_{k' \neq k}^{K} \beta_{k'}\, a_k + 1$, which comes from (10) and (11).

In this section, we analyze the accuracy of our proposed scheme for two specific propagation environments: Rayleigh fading and keyhole channels. For keyhole channels, we use the model (5). We assume that the $k$th user perfectly estimates $\mathbb{E}\{|y_k|^2\}$. This is true when the number of symbols of the coherence interval allocated to the downlink, $T$, is large. In the numerical results, we shall show that the estimate of $\mathbb{E}\{|y_k|^2\}$ in (11) is very close to $\mathbb{E}\{|y_k|^2\}$ even for modest values of $T$ (e.g., $T \approx$ […] symbols). With the assumption $\xi_k = \mathbb{E}\{|y_k|^2\}$, from (6) and (12), the estimate of $\|\mathbf{g}_k\|^2$ can be written as:

$$a_k = -\frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \sqrt{\left(\frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \|\mathbf{g}_k\|^2\right)^2 + \epsilon_k}, \tag{13}$$

where

$$\epsilon_k \triangleq \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 - \sum_{k' \neq k}^{K} \beta_{k'} \|\mathbf{g}_k\|^2. \tag{14}$$
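[Figure 2: normalized mean-square error versus SNR (dB), shown for SNR from −5 to 20 dB; same curves as in Fig. 1.]
Fig. 2. Normalized MSE versus SNR for different channel estimation schemes, for keyhole channels.

We can see from (13) that if $|\epsilon_k| \ll \left(\frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \|\mathbf{g}_k\|^2\right)^2$, then $a_k \approx \|\mathbf{g}_k\|^2$. In order to see under what conditions $|\epsilon_k| \ll \left(\frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \|\mathbf{g}_k\|^2\right)^2$, we consider $\varrho_k$, which is defined as:

$$\varrho_k \triangleq \mathbb{E}\left\{ \left| \frac{\epsilon_k}{\mathbb{E}\left\{ \left| \frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \|\mathbf{g}_k\|^2 \right|^2 \right\}} \right|^2 \right\}. \tag{15}$$

Hence,

$$\varrho_k = \begin{cases} \dfrac{M(M+1)\,\beta_k^2 \sum_{k' \neq k}^{K} \beta_{k'}^2}{\left( \frac{1}{4}\bar{\beta}_k^2 + M\beta_k \sum_{k'=1}^{K} \beta_{k'} + \beta_k^2 M^2 \right)^2}, & \text{for Rayleigh fading channels}, \\[3ex] \dfrac{6M(M+1)\,\beta_k^2 \sum_{k' \neq k}^{K} \beta_{k'}^2}{\left( \frac{1}{4}\bar{\beta}_k^2 + M\beta_k \sum_{k'=1}^{K} \beta_{k'} + \beta_k^2 M(2M+1) \right)^2}, & \text{for keyhole channels}, \end{cases} \tag{16}$$

where $\bar{\beta}_k \triangleq \sum_{k' \neq k}^{K} \beta_{k'}$. The detailed derivations of (16) are presented in the Appendix. Since the numerator grows as $M^2$ and the denominator as $M^4$, we can see that $\varrho_k = O(1/M^2)$. Thus, when $M \gg 1$, $|\epsilon_k|$ is much smaller than $\left(\frac{\sum_{k' \neq k}^{K} \beta_{k'}}{2} + \|\mathbf{g}_k\|^2\right)^2$. As a result, our proposed channel estimation scheme is expected to work well.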
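Algorithm 1 amounts to a sample average followed by the positive root of a quadratic, which can be sketched as below. This is an illustrative sketch, not the authors' implementation: the function name `blind_gain_estimate` is hypothetical, and the two clipping steps (a non-negative discriminant and a non-negative estimate) are safeguards added here for short or noisy blocks; they are not part of (12).

```python
import numpy as np

def blind_gain_estimate(y_k, alpha, beta_bar_k):
    """Algorithm 1: estimate ||g_k||^2 from the T received samples in y_k.

    alpha is the normalization constant in (3); beta_bar_k is the sum of
    beta_k' over k' != k. Both are assumed known at user k.
    """
    xi_k = np.mean(np.abs(y_k) ** 2)                        # step 1, eq. (11)
    disc = (alpha * beta_bar_k) ** 2 + 4.0 * alpha * (xi_k - 1.0)
    disc = max(disc, 0.0)        # safeguard added here, not in (12)
    a_k = (-alpha * beta_bar_k + np.sqrt(disc)) / (2.0 * alpha)   # step 2, eq. (12)
    return max(a_k, 0.0)         # safeguard added here: a gain cannot be negative

# Round-trip check: feed samples whose power satisfies (10) exactly for a known gain.
alpha, beta_bar_k, gain = 0.005, 19.0, 100.0
xi = alpha * gain ** 2 + alpha * beta_bar_k * gain + 1.0
y_k = np.sqrt(xi) * np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 50))
print(blind_gain_estimate(y_k, alpha, beta_bar_k))          # close to gain, up to rounding
```

The round trip isolates step 2 from the approximation error $\epsilon_k$ and from the finite-$T$ error in $\xi_k$, both of which appear only when real received samples are used.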
4. NUMERICAL RESULTS
In this section, we provide numerical results to evaluate our proposed channel estimation scheme for finite $M$. As performance metric we consider the normalized mean-square error (MSE) at the $k$th user:

$$\mathrm{MSE}_k \triangleq \mathbb{E}\left\{ \left| \frac{a_k - \|\mathbf{g}_k\|^2}{\mathbb{E}\{\|\mathbf{g}_k\|^2\}} \right|^2 \right\}. \tag{17}$$

For the simulation, we choose $M = 100$, $K = 20$, and $\beta_k = 1$, $\forall k = 1, \ldots, K$. We define $\mathrm{SNR} \triangleq \rho$. Figures 1 and 2 show the normalized MSE versus SNR for Rayleigh fading and keyhole channels, respectively. The curves labeled "without channel estimation, use $\mathbb{E}\{\|\mathbf{g}_k\|^2\}$" represent the case when the $k$th user uses the statistical properties of the channels, i.e., it uses $\mathbb{E}\{\|\mathbf{g}_k\|^2\}$ as the estimate of $\|\mathbf{g}_k\|^2$. The curves "DL pilots [1]" represent the case when the beamforming training scheme of [1] with MMSE channel estimation is applied. The curves "proposed scheme (Algorithm 1)" represent our proposed scheme for different $T$ ($T = \infty$ implies that the $k$th user perfectly knows $\mathbb{E}\{|y_k|^2\}$). For the beamforming training scheme, the duration of the downlink training is $K$. For our proposed blind channel estimation scheme, $s_k$, $k = 1, \ldots, K$, are random 4-QAM symbols.

We can see that in Rayleigh fading channels, the MSEs of the three schemes are comparable. Using $\mathbb{E}\{\|\mathbf{g}_k\|^2\}$ in lieu of the true $\|\mathbf{g}_k\|^2$ for signal detection works rather well. However, in keyhole channels, since the channels do not harden, the MSE when using $\mathbb{E}\{\|\mathbf{g}_k\|^2\}$ as the estimate of $\|\mathbf{g}_k\|^2$ is very large. In both propagation environments, our proposed scheme works very well. For a wide range of SNRs, our scheme outperforms the beamforming training scheme, even for short coherence intervals (e.g., $T = 100$ symbols). Note again that, with the beamforming training scheme of [1], we additionally have to spend at least $K$ symbols on training pilots (this is not accounted for here, since we only evaluated MSE). By contrast, our proposed scheme does not require any resources for downlink training.
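The comparison between the blind estimate and the mean-based estimate can be reproduced in spirit with a small Monte Carlo sketch of the metric (17). It is a sketch under stated assumptions, not the code behind Figs. 1 and 2: it uses $M = 100$, $K = 20$, $\beta_k = 1$, $T = 100$, and 4-QAM as in the text, fixes $\rho = 10$ (an illustrative operating point), evaluates only the first user, omits the downlink-pilot scheme of [1], and clips the estimate at zero as an added safeguard. The helper names `cn`, `nmse_trial`, and `nmse` are hypothetical.

```python
import numpy as np

def cn(rng, *shape):
    """i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def nmse_trial(M, K, rho, T, keyhole, rng):
    """One channel realization with beta_k = 1; returns the squared normalized
    errors (blind estimate, mean-based estimate) for the first user."""
    G = cn(rng, M, K)
    if keyhole:
        G = G * cn(rng, 1, K)                       # (5): column k scaled by nu_k
    alpha = rho / (M * K)                           # (3) with beta_k = 1
    s = (rng.choice([-1.0, 1.0], (K, T)) + 1j * rng.choice([-1.0, 1.0], (K, T))) / np.sqrt(2)
    y = np.sqrt(alpha) * (G[:, 0].conj() @ G) @ s + cn(rng, T)   # (4) for the first user
    xi = np.mean(np.abs(y) ** 2)                    # (11)
    beta_bar = K - 1                                # sum of beta_k' over k' != k
    disc = max((alpha * beta_bar) ** 2 + 4 * alpha * (xi - 1), 0.0)
    a = max((-alpha * beta_bar + np.sqrt(disc)) / (2 * alpha), 0.0)   # (12), clipped
    gain = np.linalg.norm(G[:, 0]) ** 2
    # (17) with E{||g_k||^2} = M; the mean-based estimate of the gain is M itself
    return ((a - gain) / M) ** 2, ((M - gain) / M) ** 2

def nmse(M=100, K=20, rho=10.0, T=100, keyhole=False, trials=500, seed=0):
    rng = np.random.default_rng(seed)
    errs = np.array([nmse_trial(M, K, rho, T, keyhole, rng) for _ in range(trials)])
    return errs.mean(axis=0)                        # (blind NMSE, mean-based NMSE)

print("Rayleigh:", nmse(keyhole=False))
print("keyhole: ", nmse(keyhole=True))
```

For the mean-based estimate, the normalized MSE has the closed forms $1/M$ under Rayleigh fading and $1 + 2/M$ under the keyhole model (5) with $\beta_k = 1$, which gives a quick sanity check on the simulated values.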
5. CONCLUDING REMARKS
Massive MIMO systems may encounter propagation conditions in which the channels do not harden. Then, to facilitate detection of the data in the downlink, the users need to estimate their effective channel gain rather than relying on knowledge of the average effective channel gain. We proposed a channel estimation approach by which the users can blindly estimate the effective channel gain from the data received during a coherence interval. The approach is computationally easy, it does not require any resources for downlink pilots, it can be applied regardless of the type of propagation channel, and it performs very well.

Future work may include studying rate expressions rather than channel estimation MSE, and taking into account the channel estimation errors in the uplink. (We hypothesize that the latter will not qualitatively affect our results or conclusions.) Blind estimation of $\beta_k$ by the users may also be addressed.
6. APPENDIX
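Here, we provide the proof of (16). From (15), we have

$$\varrho_k = \frac{\mathbb{E}\{|\epsilon_k|^2\}}{\left( \mathbb{E}\left\{ \left| \frac{1}{2}\sum_{k' \neq k}^{K} \beta_{k'} + \|\mathbf{g}_k\|^2 \right|^2 \right\} \right)^2}. \tag{18}$$

• Rayleigh Fading Channels:
For Rayleigh fading channels, we have

$$\mathbb{E}\left\{ \left| \frac{1}{2}\sum_{k' \neq k}^{K} \beta_{k'} + \|\mathbf{g}_k\|^2 \right|^2 \right\} = \frac{1}{4}\left(\sum_{k' \neq k}^{K} \beta_{k'}\right)^2 + \sum_{k' \neq k}^{K} \beta_{k'}\, \mathbb{E}\{\|\mathbf{g}_k\|^2\} + \mathbb{E}\{\|\mathbf{g}_k\|^4\} = \frac{1}{4}\left(\sum_{k' \neq k}^{K} \beta_{k'}\right)^2 + M\beta_k \sum_{k'=1}^{K} \beta_{k'} + \beta_k^2 M^2, \tag{19}$$

where the last equality follows [11, Lemma 2.9]. We next compute $\mathbb{E}\{|\epsilon_k|^2\}$. From (14), we have

$$\mathbb{E}\{|\epsilon_k|^2\} = \mathbb{E}\left\{ \left| \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 \right|^2 \right\} + \left(\sum_{k' \neq k}^{K} \beta_{k'}\right)^2 \mathbb{E}\{\|\mathbf{g}_k\|^4\} - 2\sum_{k' \neq k}^{K} \beta_{k'}\, \mathbb{E}\left\{ \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 \|\mathbf{g}_k\|^2 \right\}. \tag{20}$$

We have

$$\mathbb{E}\left\{ \left| \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 \right|^2 \right\} = \mathbb{E}\left\{ \|\mathbf{g}_k\|^4 \left| \sum_{k' \neq k}^{K} |z_{k'}|^2 \right|^2 \right\}, \tag{21}$$

where $z_{k'} \triangleq \frac{\mathbf{g}_k^H \mathbf{g}_{k'}}{\|\mathbf{g}_k\|}$. Conditioned on $\mathbf{g}_k$, $z_{k'}$ is complex Gaussian distributed with zero mean and variance $\beta_{k'}$, which is independent of $\mathbf{g}_k$. Thus, $z_{k'} \sim \mathcal{CN}(0, \beta_{k'})$ and is independent of $\mathbf{g}_k$. This yields

$$\mathbb{E}\left\{ \left| \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 \right|^2 \right\} = \mathbb{E}\{\|\mathbf{g}_k\|^4\}\, \mathbb{E}\left\{ \left| \sum_{k' \neq k}^{K} |z_{k'}|^2 \right|^2 \right\} = \beta_k^2 M(M+1) \left( \sum_{i \neq k}^{K} \beta_i^2 + \sum_{i \neq k}^{K} \sum_{j \neq k}^{K} \beta_i \beta_j \right). \tag{22}$$

Similarly,

$$\mathbb{E}\left\{ \sum_{k' \neq k}^{K} \left|\mathbf{g}_k^H \mathbf{g}_{k'}\right|^2 \|\mathbf{g}_k\|^2 \right\} = \mathbb{E}\{\|\mathbf{g}_k\|^4\}\, \mathbb{E}\left\{ \sum_{k' \neq k}^{K} |z_{k'}|^2 \right\} = \beta_k^2 M(M+1) \sum_{k' \neq k}^{K} \beta_{k'}. \tag{23}$$

Substituting (22), (23), and $\mathbb{E}\{\|\mathbf{g}_k\|^4\} = \beta_k^2 M(M+1)$ into (20), we obtain

$$\mathbb{E}\{|\epsilon_k|^2\} = M(M+1)\,\beta_k^2 \sum_{k' \neq k}^{K} \beta_{k'}^2. \tag{24}$$

Inserting (19) and (24) into (18), we obtain (16) for the Rayleigh fading case.

• Keyhole Channels: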
By using the fact that

$$z_{k'} = \frac{\mathbf{g}_k^H \mathbf{g}_{k'}}{\|\mathbf{g}_k\|} = \sqrt{\beta_{k'}}\, \nu_{k'}\, \frac{\mathbf{g}_k^H \bar{\mathbf{h}}_{k'}}{\|\mathbf{g}_k\|} \tag{25}$$

is the product of two independent Gaussian RVs, and following a similar methodology used in the Rayleigh fading case, we obtain (16) for keyhole channels.

7. REFERENCES
[1] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, "Massive MU-MIMO downlink TDD systems with linear precoding and downlink pilots," in Proc. Allerton Conference on Communication, Control, and Computing, Illinois, Oct. 2013.
[2] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, "Massive MIMO for next generation wireless systems," IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
[3] Q. Zhang, S. Jin, K.-K. Wong, H. Zhu, and M. Matthaiou, "Power scaling of uplink massive MIMO systems with arbitrary-rank channel means," IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 966–981, Oct. 2014.
[4] A. Liu and V. K. N. Lau, "Phase only RF precoding for massive MIMO systems with limited RF chains," IEEE Trans. Signal Process., vol. 62, no. 17, pp. 4505–4515, Sept. 2014.
[5] S. Noh, M. D. Zoltowski, Y. Sung, and D. J. Love, "Pilot beam pattern design for channel estimation in massive MIMO systems," IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 787–801, Oct. 2014.
[6] H. Yang and T. L. Marzetta, "Performance of conjugate and zero-forcing beamforming in large-scale antenna systems," IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 172–179, Feb. 2013.
[7] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, "Pilot contamination and precoding in multi-cell TDD systems," IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2640–2651, Aug. 2011.
[8] H. Shin and J. H. Lee, "Capacity of multiple-antenna fading channels: Spatial fading correlation, double scattering, and keyhole," IEEE Trans. Inform. Theory, vol. 49, no. 10, pp. 2636–2647, Oct. 2003.
[9] C. Zhong, S. Jin, K.-K. Wong, and M. R. McKay, "Ergodic mutual information analysis for multi-keyhole MIMO channels," IEEE Trans. Wireless Commun., vol. 10, no. 6, pp. 1754–1763, Jun. 2011.
[10] S. Wagner, R. Couillet, M. Debbah, and D. T. M. Slock, "Large system analysis of linear precoding in correlated MISO broadcast channels under limited feedback," IEEE Trans. Inform. Theory, vol. 58, no. 7, pp. 4509–4537, Jul. 2012.
[11] A. M. Tulino and S. Verdú, "Random matrix theory and wireless communications,"