On the Uplink Transmission of Multi-user Extra-large Scale Massive MIMO Systems
Xi Yang, Student Member, IEEE, Fan Cao, Student Member, IEEE, Michail Matthaiou, Senior Member, IEEE, and Shi Jin, Senior Member, IEEE
Abstract
With its inherent benefits, such as better cell coverage and higher area throughput, extra-large scale massive MIMO has great potential to be one of the key technologies for next-generation wireless communication systems. However, in practice, when the antenna dimensions grow large, spatial non-stationarities occur and users see only a portion of the base station antenna array, which we call the visibility region (VR). To exploit the impact of spatial non-stationarities, in this paper we investigate the uplink transmission of multi-user extra-large scale massive MIMO systems by considering VRs. In particular, we first propose a subarray-based system architecture for extra-large scale massive MIMO systems. Then, tight closed-form uplink spectral efficiency (SE) approximations with linear receivers are derived. With the objective of maximizing the system sum achievable SE, we also propose schemes for the subarray phase coefficient design. In addition, two statistical CSI-based greedy user scheduling algorithms are developed. Our results indicate that statistical CSI-based scheduling algorithms achieve great performance for extra-large scale massive MIMO systems, and that it is not necessary to simultaneously turn on all the subarrays and radio frequency chains to serve the users.
Index Terms
Ergodic spectral efficiency, extra-large scale massive MIMO, scheduling, spatial non-stationarity, subarray design.
I. INTRODUCTION
Massive multiple-input multiple-output (MIMO), whose idea is to employ a large-scale antenna array at a base station (BS) to serve multiple users simultaneously, thus achieving great spatial multiplexing gains and better spectral efficiency, has been identified as one of the key technologies in fifth-generation wireless communication systems [1–3]. When the antenna dimension continues to increase, the arrays become physically very large, and the benefits promised by massive MIMO, such as channel hardening, asymptotic inter-user channel orthogonality, cell coverage, and area throughput, can be fully harnessed from a theoretical point of view. These extra-large scale antenna arrays could be developed and integrated into large infrastructures, such as the roofs of airports, the walls of stadiums, or large shopping malls [4]. With these enhanced benefits, extra-large scale massive MIMO is a very promising technology for the sixth generation of wireless communications [5].

However, the reality is not so ideal: based on some recent measurement results in [6], when the antenna dimension becomes large, spatial non-stationarities start to kick in. This arises from the fact that when the dimension of an antenna array is large, the far-field propagation assumption breaks down, since the distances between the BS and scatterers or users are smaller than the Rayleigh distance [7], and users can only see a portion of the BS antenna array due to the energy-limited scattering propagation paths and the extra-large array size.

Xi Yang, Fan Cao, and Shi Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing, 210096, P. R. China (e-mail: [email protected]; [email protected]; [email protected]). Michail Matthaiou is with the Institute of Electronics, Communications and Information Technology (ECIT), Queen’s University Belfast, Belfast, U.K. (e-mail: [email protected]).
The portion of the BS antenna array seen by a user is called its visibility region (VR) [8]. Each user has its specific VR, and the VRs of different users can be separate, partially overlapped, or completely overlapped, depending on the surrounding environment and the users’ relative positions along the antenna array.

Due to the existence of VRs, the performance of extra-large scale massive MIMO systems cannot be characterized by simply letting the number of BS antennas in a stationary massive MIMO system go to infinity. Several works have explored the performance of extra-large scale massive MIMO systems. In [9], with the objective of improving the computational efficiency, a disjoint subarray-based receiver architecture and a distributed linear data fusion receiver with a bipartite graph-based user selection method were proposed. Moreover, [10] presented a capacity analysis of extra-large scale massive MIMO by introducing a tractable non-stationary channel model, which divides the scattering clusters into two categories, i.e., wholly visible clusters and partially visible clusters, and regards these clusters as an array with virtual antennas. A simple non-stationary channel model, which has connections with the stationary massive MIMO channel, was proposed in [11], together with a downlink performance analysis of multi-user massive MIMO systems with linear precoders. The results in [11] indicated that the VR significantly impacts the performance of linear precoders. Despite that, few works investigate the uplink performance of extra-large scale massive MIMO systems when the spatial non-stationarities are taken into account. Motivated by these existing works, we mainly focus on the uplink transmission of multi-user extra-large scale massive MIMO systems by considering VRs.
In particular, we first propose a practical system architecture suitable for extra-large scale massive MIMO, where a subarray-based hybrid architecture is adopted to alleviate the overall hardware cost and complexity. Then, the uplink achievable spectral efficiencies (SEs) of the extra-large scale massive MIMO system with linear receivers are examined. Afterwards, we investigate the design of the subarray in order to maximize the achievable SE. Two statistical channel state information (CSI)-based greedy user scheduling algorithms are also proposed, and numerical simulations are performed to validate their performance. The main contributions of this paper can be summarized as follows:

• We derive tight closed-form ergodic uplink achievable SE approximations for the multi-user extra-large scale massive MIMO system with linear receivers, i.e., the maximum ratio combining (MRC) receiver and the linear minimum mean squared error (LMMSE) receiver. The approximation for the MRC receiver shows that, in order to maximize the system sum achievable SE, users whose VRs cover different subarrays, or whose VRs overlap less, should be scheduled simultaneously. On the other hand, the approximation for the LMMSE receiver indicates that we should simultaneously schedule as many users as possible.

• By considering two subarray architectures, i.e., the subarray with phase shifters and the on-off switch-based subarray, we investigate the design of the subarray for the extra-large scale massive MIMO system. For the subarray with phase shifters, the optimal phase coefficients are the phases of the eigenvectors corresponding to the maximum eigenvalues of the main diagonal blocks of the channel correlation matrix. For the on-off switch-based subarray, the user that radiates the larger sum energy to the subarrays should be selected for communication to maximize the sum achievable SE.
• With the aim of maximizing the sum achievable SE, we propose two statistical CSI-based greedy scheduling algorithms, i.e., the statistical CSI-based greedy user scheduling algorithm and the statistical CSI-based greedy joint user and subarray scheduling algorithm. Numerical results show that in the extra-large scale massive MIMO regime, it is not necessary to simultaneously turn on all subarrays and radio frequency (RF) chains to serve the users. Introducing dynamic subarray scheduling helps achieve better system performance with lower energy consumption.

The rest of this paper is organized as follows: In Section II, we present the system architecture and the signal model for the extra-large scale massive MIMO system. Section III investigates the uplink ergodic achievable SEs under the linear receivers as well as the phase coefficient design of subarrays. The proposed statistical CSI-based user scheduling algorithms are provided in Section IV. Section V presents the numerical results and we conclude the paper in Section VI.

Throughout the paper, we use bold lowercase $\mathbf{a}$ and bold uppercase $\mathbf{A}$ to denote vectors and matrices, respectively. The superscripts $(\cdot)^*$, $(\cdot)^T$, and $(\cdot)^H$ represent the conjugate, transpose, and conjugate-transpose operations, respectively; $\mathbf{I}_N$ is the identity matrix of dimension $N \times N$; $\odot$ and $\otimes$ denote the element-wise product and the Kronecker product, respectively. Also, $\mathbb{E}\{\cdot\}$ is the expectation operation and $\|\mathbf{a}\|$ stands for the norm of the vector $\mathbf{a}$. $\mathrm{tr}(\mathbf{A})$, $\det(\mathbf{A})$, and $\mathbf{A}^{-1}$ stand for the trace, determinant, and inverse of the matrix $\mathbf{A}$, respectively; $\mathrm{diag}(x_1, x_2, \ldots, x_N)$ represents a diagonal matrix with diagonal elements $x_i$, $i = 1, \ldots, N$, while $\mathrm{blkdiag}(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N)$ represents a block diagonal matrix with diagonal blocks $\mathbf{X}_i$, $i = 1, \ldots, N$.

II. SYSTEM MODEL
In this section, we first describe the system architecture of the extra-large scale massive MIMO system and then provide the signal model.
A. System Architecture
Consider the multi-user extra-large scale massive MIMO system illustrated in Fig. 1, where a BS equipped with an $M$-element large uniform linear array (ULA) serves $K$ single-antenna users simultaneously.

Since the massive number of antennas at the BS renders an independent RF chain per antenna element impractical in terms of hardware cost and system complexity, we propose a subarray-based hybrid architecture as presented in Fig. 1. The BS consists of subarrays, an RF chain pool, and a baseband processing unit. Each subarray includes $M/N$ antenna elements and thus $N$ subarrays are configured. The RF chain pool contains RF chains and each RF chain can be statically or dynamically assigned to a dedicated subarray. Digital processing, such as channel estimation, data detection, or user scheduling, is performed in the baseband processing unit. Note that each subarray is connected with an RF chain and can therefore support one data stream. Due to the existence of VRs in extra-large scale massive MIMO systems, one user may cover multiple consecutive subarrays. Moreover, when overlapped VRs occur, these consecutive subarrays may also support other users simultaneously, which inevitably creates inter-user interference.

Fig. 1. The system architecture of a multi-user extra-large scale massive MIMO system, in which a BS equipped with an $M$-element large uniform linear array (ULA) serves $K$ single-antenna users simultaneously. A subarray-based hybrid architecture is adopted at the BS. Each subarray includes $M/N$ antenna elements and there are $N$ subarrays in total.

To harvest the array gain provided by the large number of antennas, two different subarray architectures, i.e., the subarray with phase shifters and the on-off switch-based subarray, are considered. In the former architecture, each antenna element in the subarray is connected with a phase shifter, while in the latter, the phase shifter is replaced by a switch which is always turned on in the uplink. Hence, the signals acquired by the antennas in an on-off switch-based subarray are directly combined, without programmed phase shifts, before being conveyed to the RF chain. It is worth noting that, compared with the subarray with phase shifters and despite the anticipated performance loss, the on-off switch-based subarray is much cheaper and more hardware-implementation friendly, especially in the extra-large scale massive MIMO regime. In addition, as will be presented in the numerical results, the on-off switch-based subarray also yields great performance when combined with the LMMSE receiver.

B. Signal Model
We focus on the uplink transmission of the multi-user extra-large scale massive MIMO system in this paper. Taking the spatial non-stationarity of the extra-large scale MIMO channels into consideration, we model the channel $\mathbf{h}_k \in \mathbb{C}^{M \times 1}$ between the $k$-th user and the BS as

$$\mathbf{h}_k = \boldsymbol{\Theta}_k^{1/2} \mathbf{g}_k, \tag{1}$$

where $\mathbf{g}_k \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ and $\boldsymbol{\Theta}_k$ represents the correlation matrix at the BS for the $k$-th user, given as $\boldsymbol{\Theta}_k = \mathbf{D}_k^{1/2} \mathbf{R}_k \mathbf{D}_k^{1/2}$ [11]. Note that $\mathbf{R}_k$ denotes the classical spatial correlation matrix of user $k$ corresponding to the case of a stationary massive MIMO channel, while the spatial non-stationarity, i.e., the user’s VR, in the extra-large scale massive MIMO channel is characterized by the real diagonal matrix $\mathbf{D}_k = \mathrm{diag}(d_1^{(k)}, d_2^{(k)}, \ldots, d_M^{(k)})$, where $d_m^{(k)}$, $m = 1, \ldots, M$, denotes the spatial non-stationarity for user $k$ at antenna element $m$. We point out that only a few diagonal elements of $\mathbf{D}_k$ are non-zero [11].

(Footnote: Although the subsequent analysis is applicable to general cases, we set the non-zero diagonal elements of $\mathbf{D}_k$ to be 1 in the simulation section for simplicity.)

When a ULA is employed at the BS, the matrix $\mathbf{R}_k$ in $\boldsymbol{\Theta}_k$ can be expressed as [13]

$$\mathbf{R}_k = \left[\mathbf{a}(\theta_k)\mathbf{a}^H(\theta_k)\right] \odot \mathbf{P}(\theta_k, \sigma_k), \tag{2}$$

where $\mathbf{a}(\theta_k)$ is the steering vector of the ULA, defined as

$$\mathbf{a}(\theta_k) = \left[1, e^{j2\pi d \sin\theta_k}, \ldots, e^{j2\pi (M-1) d \sin\theta_k}\right]^T, \tag{3}$$

where $\theta_k$ represents the mean angle of arrival (AoA) of the $k$-th user and $d$ denotes the antenna element spacing normalized by the carrier wavelength. In our simulations, we set $d = 1/2$. Most importantly, $\mathbf{P}(\theta_k, \sigma_k)$ captures the angular spectrum of the AoA, and its entries come from a Gaussian angular spread distribution with variance $\sigma_k^2$ [13]. The $\{m, n\}$-th entry of $\mathbf{P}(\theta_k, \sigma_k)$ is given by

$$\{\mathbf{P}(\theta_k, \sigma_k)\}_{m,n} = e^{-2[\pi d (m-n)]^2 \sigma_k^2 \cos^2\theta_k}, \quad m, n = 1, \ldots, M. \tag{4}$$

Therefore, in the uplink transmission of the multi-user extra-large scale MIMO system, the received signal at the BS can be written as

$$\mathbf{y} = \sqrt{p_u}\,\mathbf{H}\mathbf{x} + \mathbf{n}, \tag{5}$$

where $p_u$ is the transmit power of each user, $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_K]$ represents the multi-user uplink channel matrix, and $\mathbf{x} = [x_1, x_2, \ldots, x_K]^T$ is the transmitted signal from the $K$ users with $x_k \sim \mathcal{CN}(0, 1)$ for $k = 1, 2, \ldots, K$; $\mathbf{n}$ is the complex Gaussian noise satisfying $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$. Without loss of generality, we set $\sigma^2 = 1$. Additionally, for ease of exposition, we assume that $\mathbb{E}\{\mathbf{g}_i \mathbf{g}_k^H\} = \mathbf{0}$, $\forall i \neq k$, that is to say, there is no correlation between any pair of channels across different users.

Since we employ a subarray-based hybrid architecture in the multi-user extra-large scale MIMO system as presented in Fig. 1, the received signals at the BS are first combined in the analog domain and then linearly demodulated in the digital domain. Thus, the processing procedure can be formulated as

$$\mathbf{r} = \sqrt{p_u}\,\mathbf{A}^H \mathbf{W}^H \mathbf{H}\mathbf{x} + \mathbf{A}^H \mathbf{W}^H \mathbf{n}, \tag{6}$$

where $\mathbf{W} = \mathrm{blkdiag}(\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N) \in \mathbb{C}^{M \times N}$ represents the combining matrix in the analog domain and can be expressed as

$$\mathbf{W} = \begin{bmatrix} \mathbf{w}_1 & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{w}_N \end{bmatrix}, \tag{7}$$

where $\mathbf{w}_i \in \mathbb{C}^{(M/N) \times 1}$ is a constant-modulus vector, that is to say, all the elements of $\mathbf{w}_i$ have a constant amplitude of $\sqrt{N/M}$ and $\mathbf{w}_i^H \mathbf{w}_i = 1$ for $i = 1, 2, \ldots, N$. Also, $\mathbf{A} \in \mathbb{C}^{N \times K}$ is the linear detection matrix in the digital domain, which has different expressions for different linear receivers, such as the MRC receiver and the LMMSE receiver. Here, we define $\mathbf{F} = \mathbf{W}^H \mathbf{H}$ and assume that perfect CSI is available at the BS; then $\mathbf{A}$ can be given by

$$\mathbf{A} = \begin{cases} \mathbf{F}, & \text{for MRC}, \\ \mathbf{F}\left(\mathbf{F}^H \mathbf{F} + \frac{1}{p_u}\mathbf{I}_K\right)^{-1}, & \text{for LMMSE}. \end{cases} \tag{8}$$

(Footnote: Channel state information can be obtained by various methods, such as by sending orthogonal pilots from users. Note that only the low-dimension effective channel $\mathbf{F}$ needs to be estimated in the uplink. In this paper, we assume perfect CSI in order to assess in detail the impact of non-stationarities without introducing complicated notation. Note also that our results can be regarded as upper bounds of what will be achieved in practice.)

Hence, the signal of the $k$-th user at the BS can be expressed as

$$r_k = \sqrt{p_u}\,\mathbf{a}_k^H \mathbf{W}^H \mathbf{h}_k x_k + \sqrt{p_u} \sum_{i=1, i \neq k}^{K} \mathbf{a}_k^H \mathbf{W}^H \mathbf{h}_i x_i + \mathbf{a}_k^H \mathbf{W}^H \mathbf{n}, \tag{9}$$

where $\mathbf{a}_k = \mathbf{A}(:, k)$ is the $k$-th column of the matrix $\mathbf{A}$.

By modeling the noise-plus-interference term as additive Gaussian noise independent of $x_k$, with zero mean and variance $p_u \sum_{i=1, i \neq k}^{K} |\mathbf{a}_k^H \mathbf{W}^H \mathbf{h}_i|^2 + \|\mathbf{a}_k^H \mathbf{W}^H\|^2$ [12], we obtain the ergodic achievable SE of the $k$-th user in the uplink data transmission as

$$R_k = \mathbb{E}_{\mathbf{h}}\left\{ \log_2\left( 1 + \frac{p_u |\mathbf{a}_k^H \mathbf{W}^H \mathbf{h}_k|^2}{p_u \sum_{i=1, i \neq k}^{K} |\mathbf{a}_k^H \mathbf{W}^H \mathbf{h}_i|^2 + \|\mathbf{a}_k^H \mathbf{W}^H\|^2} \right) \right\}, \tag{10}$$

and the sum uplink achievable SE of the multi-user extra-large scale massive MIMO system becomes

$$R = \sum_{i=1}^{K} R_i. \tag{11}$$

In the next section, we examine the ergodic achievable SEs in (10) and (11) under two different types of receivers, i.e., the MRC receiver and the LMMSE receiver, and then identify the influence of specific VR distributions on the system ergodic achievable SE.

III. UPLINK ACHIEVABLE SPECTRAL EFFICIENCY ANALYSIS
In this section, we investigate the uplink achievable SEs of linear receivers, i.e., the MRC receiver and the LMMSE receiver, for extra-large scale massive MIMO systems under perfect CSI. The design of the subarray is also discussed, corresponding to the hardware architecture illustrated in Section II.
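As a baseline for the closed-form analysis in this section, the ergodic SE in (10) can be estimated by Monte Carlo simulation. The sketch below is only an illustration in Python with NumPy, not the authors' simulation code: the function names (`correlation`, `psd_sqrt`, `switch_combiner`, `ergodic_se`), the boolean VR mask, and all parameter values are assumptions introduced here. It builds $\boldsymbol{\Theta}_k$ from (1)-(4), uses the on-off switch combiner (all entries of $\mathbf{w}_i$ equal to $\sqrt{N/M}$), and averages the per-user SE over random channel draws for the receivers in (8).

```python
import numpy as np

def correlation(M, theta, sigma, vr, d=0.5):
    """Theta_k = D^{1/2} R D^{1/2}, with R from (2)-(4); vr is a boolean VR mask."""
    m = np.arange(M)
    a = np.exp(1j * 2 * np.pi * d * m * np.sin(theta))            # steering vector (3)
    diff = m[:, None] - m[None, :]
    P = np.exp(-2 * (np.pi * d * diff) ** 2 * sigma**2 * np.cos(theta) ** 2)  # (4)
    R = np.outer(a, a.conj()) * P                                  # (2)
    d_half = np.sqrt(vr.astype(float))                             # non-zero d_m set to 1
    return d_half[:, None] * R * d_half[None, :]

def psd_sqrt(T):
    """Hermitian square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(T)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def switch_combiner(M, N):
    """Block-diagonal W of (7) for the on-off switch subarray (assumed all-on)."""
    L = M // N
    W = np.zeros((M, N))
    for n in range(N):
        W[n * L:(n + 1) * L, n] = np.sqrt(N / M)
    return W

def ergodic_se(Thetas, W, p_u, receiver="MRC", trials=200, seed=0):
    """Monte Carlo estimate of the per-user SE in (10)."""
    rng = np.random.default_rng(seed)
    roots = [psd_sqrt(T) for T in Thetas]
    M, K = W.shape[0], len(Thetas)
    se = np.zeros(K)
    for _ in range(trials):
        G = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
        H = np.column_stack([roots[k] @ G[:, k] for k in range(K)])  # channel model (1)
        F = W.conj().T @ H                                          # effective channel
        if receiver == "MRC":
            A = F
        else:  # LMMSE detector from (8)
            A = F @ np.linalg.inv(F.conj().T @ F + np.eye(K) / p_u)
        gains = np.abs(A.conj().T @ F) ** 2        # entry (k, i) is |a_k^H W^H h_i|^2
        signal = np.diag(gains)
        interference = gains.sum(axis=1) - signal
        noise = np.sum(np.abs(A.conj().T @ W.conj().T) ** 2, axis=1)
        se += np.log2(1 + p_u * signal / (p_u * interference + noise))
    return se / trials

if __name__ == "__main__":
    M, N = 16, 4
    thetas = [correlation(M, 0.2, 0.1, np.arange(M) < 8),
              correlation(M, -0.4, 0.1, np.arange(M) >= 4)]
    W = switch_combiner(M, N)
    print(ergodic_se(thetas, W, 1.0, "MRC", trials=50))
    print(ergodic_se(thetas, W, 1.0, "LMMSE", trials=50))
```

Because both receivers are evaluated on the same seeded channel draws, the two estimates can be compared directly against the approximations derived in the remainder of this section.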
A. MRC Receiver
When the MRC receiver is employed at the BS, the approximation of the ergodic achievable SE is provided in the following theorem.
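Theorem 1: When the MRC receiver is adopted in the uplink of the multi-user extra-large scale massive MIMO system, the ergodic achievable SE of the $k$-th user can be approximated by

$$R_k^{\mathrm{MRC,app}} = \log_2\left( 1 + \frac{\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k\mathbf{B}\boldsymbol{\Theta}_k) + \mathrm{tr}^2(\mathbf{B}\boldsymbol{\Theta}_k)}{\sum_{i=1, i \neq k}^{K} \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i\mathbf{B}\boldsymbol{\Theta}_k) + \frac{1}{p_u}\mathrm{tr}(\mathbf{B}\mathbf{B}^H\boldsymbol{\Theta}_k)} \right). \tag{12}$$

Proof: When the MRC receiver is employed at the BS, we have the linear detection matrix $\mathbf{A} = \mathbf{F}$, and

$$\mathbf{a}_k = \mathbf{W}^H \mathbf{h}_k. \tag{13}$$

Substituting (13) into (10), the ergodic uplink achievable SE for the $k$-th user can be written as

$$R_k^{\mathrm{MRC}} = \mathbb{E}_{\mathbf{h}}\left\{ \log_2\left( 1 + \frac{p_u |\mathbf{h}_k^H \mathbf{W}\mathbf{W}^H \mathbf{h}_k|^2}{p_u \sum_{i=1, i \neq k}^{K} |\mathbf{h}_k^H \mathbf{W}\mathbf{W}^H \mathbf{h}_i|^2 + \|\mathbf{h}_k^H \mathbf{W}\mathbf{W}^H\|^2} \right) \right\} \overset{(a)}{\approx} \log_2\left( 1 + \frac{p_u \mathbb{E}_{\mathbf{h}}\{|\mathbf{h}_k^H \mathbf{B} \mathbf{h}_k|^2\}}{p_u \sum_{i=1, i \neq k}^{K} \mathbb{E}_{\mathbf{h}}\{|\mathbf{h}_k^H \mathbf{B} \mathbf{h}_i|^2\} + \mathbb{E}_{\mathbf{h}}\{\|\mathbf{h}_k^H \mathbf{B}\|^2\}} \right), \tag{14}$$

where (a) applies the approximation $\mathbb{E}\{\log_2(1 + X/Y)\} \approx \log_2(1 + \mathbb{E}\{X\}/\mathbb{E}\{Y\})$ from [14] and we define $\mathbf{B} = \mathbf{W}\mathbf{W}^H$. Note that $\mathbf{W}$ is a block diagonal matrix owing to the subarray-based hardware architecture, thus $\mathbf{B}$ is also a block diagonal matrix. Writing $\mathbf{B} = \mathrm{blkdiag}(\bar{\mathbf{B}}_1, \bar{\mathbf{B}}_2, \ldots, \bar{\mathbf{B}}_N)$, we have $\bar{\mathbf{B}}_i = \mathbf{w}_i \mathbf{w}_i^H$, $i = 1, \ldots, N$. The result can be derived directly by calculating the terms in the numerator and the denominator of (14) as follows:

$$\mathbb{E}_{\mathbf{h}}\{\mathbf{h}_k^H \mathbf{B} \mathbf{h}_k\} = \mathrm{tr}(\mathbf{B}\,\mathbb{E}_{\mathbf{h}}\{\mathbf{h}_k \mathbf{h}_k^H\}) = \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k), \tag{15}$$

$$\mathbb{E}_{\mathbf{h}}\{|\mathbf{h}_k^H \mathbf{B} \mathbf{h}_k|^2\} = \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k\mathbf{B}\boldsymbol{\Theta}_k) + \mathrm{tr}^2(\mathbf{B}\boldsymbol{\Theta}_k), \tag{16}$$

$$\mathbb{E}_{\mathbf{h}}\{|\mathbf{h}_k^H \mathbf{B} \mathbf{h}_i|^2\} = \mathbb{E}_{\mathbf{h}}\{\mathrm{tr}(\mathbf{B}\mathbf{h}_i \mathbf{h}_i^H \mathbf{B}^H \mathbf{h}_k \mathbf{h}_k^H)\} = \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i\mathbf{B}\boldsymbol{\Theta}_k), \tag{17}$$

$$\mathbb{E}_{\mathbf{h}}\{\|\mathbf{h}_k^H \mathbf{B}\|^2\} = \mathrm{tr}(\mathbf{B}\mathbf{B}^H\boldsymbol{\Theta}_k). \tag{18}$$

Substituting (15)-(18) into (14), we obtain (12).

Observe from Theorem 1 that, to maximize the ergodic achievable SE per user and consequently the system sum achievable SE, $\sum_{i=1, i \neq k}^{K} \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i\mathbf{B}\boldsymbol{\Theta}_k)$ should be minimized in (12). Since $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i\mathbf{B}\boldsymbol{\Theta}_k) \geq 0$, $\boldsymbol{\Theta}_i = \mathbf{D}_i^{1/2}\mathbf{R}_i\mathbf{D}_i^{1/2}$ and $\boldsymbol{\Theta}_k = \mathbf{D}_k^{1/2}\mathbf{R}_k\mathbf{D}_k^{1/2}$ are block diagonal matrices, and $\mathbf{B}$ is a block diagonal matrix corresponding to the subarray architecture as well, we obtain $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i\mathbf{B}\boldsymbol{\Theta}_k) = 0$ when $\mathbb{1}(\mathbf{D}_i) \odot \mathbb{1}(\mathbf{D}_k) = \mathbf{0}$, in which $\mathbb{1}(\mathbf{D}_i)$ denotes the $N$-dimensional indicator function with its $n$-th element calculated by

$$[\mathbb{1}(\mathbf{D}_i)]_n = \begin{cases} 1, & \mathbf{D}_i \odot \mathrm{blkdiag}(\mathbf{0}, \ldots, \bar{\mathbf{B}}_n, \ldots, \mathbf{0}) \neq \mathbf{0}, \\ 0, & \mathbf{D}_i \odot \mathrm{blkdiag}(\mathbf{0}, \ldots, \bar{\mathbf{B}}_n, \ldots, \mathbf{0}) = \mathbf{0}. \end{cases} \tag{19}$$

Therefore, for the MRC receiver, in order to maximize the system sum achievable SE, users whose VRs cover different subarrays, or whose VRs overlap less, should be scheduled simultaneously in the extra-large scale massive MIMO system. Recalling that the MRC receiver cannot cancel inter-user interference, which becomes more problematic in the high signal-to-noise ratio (SNR) regime, we turn to the LMMSE receiver in the next subsection.

B. LMMSE Receiver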
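Similarly, when the LMMSE receiver is employed at the BS, we have the linear detection matrix $\mathbf{A} = \mathbf{F}\left(\mathbf{F}^H\mathbf{F} + \frac{1}{p_u}\mathbf{I}_K\right)^{-1}$. Substituting $\mathbf{A}$ into (10), the ergodic achievable SE of the $k$-th user under the LMMSE receiver can be expressed as

$$R_k^{\mathrm{LMMSE}} = \mathbb{E}_{\mathbf{h}}\left\{ \log_2\left( \frac{1}{\left[(\mathbf{I}_K + p_u \mathbf{F}^H\mathbf{F})^{-1}\right]_{kk}} \right) \right\}. \tag{20}$$

Since $[\mathbf{M}^{-1}]_{kk} = \det(\mathbf{M}^{kk}) / \det(\mathbf{M})$, where $\mathbf{M}^{kk}$ is the $(k, k)$-th minor of the matrix $\mathbf{M}$ [15], and $(\mathbf{F}^H\mathbf{F})^{kk} = \mathbf{F}_{(k)}^H \mathbf{F}_{(k)}$, where $\mathbf{F}_{(k)}$ corresponds to $\mathbf{F}$ with the $k$-th column removed [16], we can rewrite (20) as

$$R_k^{\mathrm{LMMSE}} = \mathbb{E}_{\mathbf{h}}\left\{\log_2 \det\left(\mathbf{I}_K + p_u \mathbf{F}^H\mathbf{F}\right)\right\} - \mathbb{E}_{\mathbf{h}}\left\{\log_2 \det\left(\mathbf{I}_{K-1} + p_u \mathbf{F}_{(k)}^H \mathbf{F}_{(k)}\right)\right\}. \tag{21}$$

Note that it is greatly challenging, if not impossible, to directly evaluate (21) in general, hence in what follows we analyze the ergodic achievable SE by treating two cases separately, i.e., (i) the completely overlapped VR case and (ii) the partially overlapped VR case.

Completely Overlapped VR Case: In this case, users are closely distributed in a relatively small region in front of the extra-large scale massive MIMO system, therefore the VRs of different users completely overlap. To further simplify the problem, we assume that $\boldsymbol{\Theta}_1 = \cdots = \boldsymbol{\Theta}_K = \boldsymbol{\Theta}$, then

$$\mathbf{H} = \boldsymbol{\Theta}^{1/2}[\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_K] = \boldsymbol{\Theta}^{1/2}\mathbf{G}, \tag{22}$$

where $\mathbf{G} \triangleq [\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_K]$ and $\mathbf{G} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M \otimes \mathbf{I}_K)$. Therefore,

$$\mathbf{F}^H\mathbf{F} = \mathbf{G}^H \tilde{\boldsymbol{\Theta}} \mathbf{G}, \tag{23}$$

where we define $\tilde{\boldsymbol{\Theta}} \triangleq \boldsymbol{\Theta}^{1/2}\mathbf{W}\mathbf{W}^H\boldsymbol{\Theta}^{1/2}$.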
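The step from (20) to (21) rests on a determinant identity that holds for every channel realization, so it can be checked numerically before any expectation is taken. The following Python/NumPy sketch is an illustrative check only: the matrix sizes, the value of `p_u`, and the helper names `se_from_inverse` and `se_from_logdets` are assumptions made here, and `F` is drawn as a generic complex Gaussian matrix rather than from the channel model.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, p_u = 6, 4, 2.0  # illustrative sizes, not taken from the paper
F = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

def se_from_inverse(F, k, p_u):
    """Instantaneous SE as in (20): log2 of 1 / [(I_K + p_u F^H F)^{-1}]_kk."""
    K = F.shape[1]
    M_mat = np.eye(K) + p_u * F.conj().T @ F
    return np.log2(1.0 / np.linalg.inv(M_mat)[k, k].real)

def se_from_logdets(F, k, p_u):
    """Instantaneous SE as in (21): difference of two log-determinants."""
    K = F.shape[1]
    F_k = np.delete(F, k, axis=1)  # F with the k-th column removed
    _, ld_full = np.linalg.slogdet(np.eye(K) + p_u * F.conj().T @ F)
    _, ld_reduced = np.linalg.slogdet(np.eye(K - 1) + p_u * F_k.conj().T @ F_k)
    return (ld_full - ld_reduced) / np.log(2)  # slogdet returns natural logs

for k in range(K):
    print(k, se_from_inverse(F, k, p_u), se_from_logdets(F, k, p_u))
```

The two columns printed for each user should agree up to floating-point error, which is what allows (20) to be rewritten as (21) inside the expectation.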
Substituting (23) into (21), we have the ergodic achievable SE of the $k$-th user under the completely overlapped VR case as

$$R_k^{\mathrm{LMMSE,Com}} = \mathbb{E}_{\mathbf{h}}\left\{\log_2 \det\left(\mathbf{I}_K + p_u \mathbf{G}^H \tilde{\boldsymbol{\Theta}} \mathbf{G}\right)\right\} - \mathbb{E}_{\mathbf{h}}\left\{\log_2 \det\left(\mathbf{I}_{K-1} + p_u \mathbf{G}_{(k)}^H \tilde{\boldsymbol{\Theta}} \mathbf{G}_{(k)}\right)\right\} \overset{(a)}{=} \ldots$$

[...]

Theorem 2: When the LMMSE receiver is adopted in the uplink of the multi-user extra-large scale massive MIMO system, the ergodic achievable SE of the $k$-th user can be approximated by

$$R_k^{\mathrm{LMMSE,app}} = \log_2\left[1 + p_u \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k)\right]. \tag{26}$$

Proof: See Appendix A.

(Footnote: In fact, when we approximate $(\mathbf{I}_K + p_u \mathbf{F}^H\mathbf{F})$ with $\mathrm{diag}(\mathbf{I}_K + p_u \mathbf{F}^H\mathbf{F})$, the ergodic achievable SE for the completely overlapped VR case can also be approximated by (26) in Theorem 2, although less tightly.)

Note that in the partially overlapped VR case, there exists a special scenario in which only a few users are sparsely distributed in front of the extra-large scale massive MIMO system and thus no overlapped VRs appear. The analysis for this special scenario is also presented in Appendix A. Furthermore, since the VRs of different users are unlikely to overlap completely, especially when user scheduling algorithms are applied, we mainly focus on the partially overlapped VR case in the subsequent analysis.

Theorem 2 indicates that, for the purpose of maximizing the system sum achievable SE with the LMMSE receiver, we should simultaneously schedule as many users as possible who have larger $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k)$. In addition, the subarray architecture, including the number of antennas per subarray, the phase coefficients of the phase shifter network, and so on, should also be designed to match the correlation matrix $\boldsymbol{\Theta}$ such that $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta})$ is maximized. In the next subsection, we elaborate on the phase coefficient design of the subarray with a phase shifter network. Another architecture, i.e., the on-off switch-based subarray, is also investigated to provide insights into the corresponding user scheduling.

C. Design of Subarray

Referring back to the hardware architecture illustrated in Section II, the designs of two subarray architectures, i.e., the subarray with phase shifters and the on-off switch-based subarray, are considered in this subsection.

1) Subarray with Phase Shifters: When subarrays with phase shifters are deployed, every antenna at the BS is connected with an independent phase shifter. In what follows, we consider high-precision phase shifters; low-resolution phase shifters can also be employed to further reduce the hardware cost, which is, however, beyond the scope of this paper. On the basis of (12) in Theorem 1 and (26) in Theorem 2, in order to maximize the ergodic achievable SE for each user and thus for the whole system, we propose a phase coefficient design that maximizes $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta})$. Since $\mathbf{B} = \mathrm{blkdiag}(\bar{\mathbf{B}}_1, \bar{\mathbf{B}}_2, \ldots, \bar{\mathbf{B}}_N)$ and $\bar{\mathbf{B}}_i = \mathbf{w}_i \mathbf{w}_i^H$, $i = 1, \ldots, N$, we define

$$\boldsymbol{\Theta}_k = \begin{bmatrix} \bar{\boldsymbol{\Theta}}_{k,11} & \cdots & \bar{\boldsymbol{\Theta}}_{k,1N} \\ \vdots & \ddots & \vdots \\ \bar{\boldsymbol{\Theta}}_{k,N1} & \cdots & \bar{\boldsymbol{\Theta}}_{k,NN} \end{bmatrix}, \tag{27}$$

where $\bar{\boldsymbol{\Theta}}_{k,ij} \in \mathbb{C}^{(M/N) \times (M/N)}$, $\forall i, j = 1, \ldots, N$, denotes the block in the $i$-th block row and $j$-th block column of $\boldsymbol{\Theta}_k$; then

$$\mathbf{B}\boldsymbol{\Theta}_k = \begin{bmatrix} \bar{\mathbf{B}}_1 \bar{\boldsymbol{\Theta}}_{k,11} & \cdots & \bar{\mathbf{B}}_1 \bar{\boldsymbol{\Theta}}_{k,1N} \\ \vdots & \ddots & \vdots \\ \bar{\mathbf{B}}_N \bar{\boldsymbol{\Theta}}_{k,N1} & \cdots & \bar{\mathbf{B}}_N \bar{\boldsymbol{\Theta}}_{k,NN} \end{bmatrix}. \tag{28}$$

Hence,

$$\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k) = \mathrm{tr}\left( \sum_{i=1}^{N} \bar{\mathbf{B}}_i \bar{\boldsymbol{\Theta}}_{k,ii} \right) \overset{(a)}{=} \sum_{i \in S_k} \mathrm{tr}(\mathbf{w}_i^H \bar{\boldsymbol{\Theta}}_{k,ii} \mathbf{w}_i) = \sum_{i \in S_k} \mathbf{w}_i^H \bar{\boldsymbol{\Theta}}_{k,ii} \mathbf{w}_i, \tag{29}$$

where $S_k$ represents the set of the non-zero diagonal blocks $\bar{\boldsymbol{\Theta}}_{k,ii}$ of $\boldsymbol{\Theta}_k$ and (a) utilizes the trace property $\mathrm{tr}(\mathbf{A}\mathbf{B}) = \mathrm{tr}(\mathbf{B}\mathbf{A})$. Consequently, based on (29), to maximize the ergodic achievable SE for user $k$, $\mathbf{w}_i$ should be chosen as the eigenvector of $\bar{\boldsymbol{\Theta}}_{k,ii}$ corresponding to the maximum eigenvalue.

(Footnote: For the MRC receiver in (12), because $\mathrm{tr}^2(\mathbf{B}\boldsymbol{\Theta}_k) > \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k\mathbf{B}\boldsymbol{\Theta}_k)$, we also concentrate on maximizing $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_k)$, as with the LMMSE receiver in (26).)
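The eigenvector choice suggested by (29) can be illustrated numerically for a single diagonal block. The sketch below, in Python with NumPy, is an assumption-laden illustration rather than the authors' implementation: the helper names `phase_shifter_weights` and `gain`, the block size, and the rank-one test block are all introduced here. To respect the constant-modulus constraint on $\mathbf{w}_i$ from Section II-B, it keeps only the phases of the dominant eigenvector and scales every entry to amplitude $\sqrt{N/M}$, so that $\mathbf{w}_i^H\mathbf{w}_i = 1$.

```python
import numpy as np

def phase_shifter_weights(theta_block):
    """Constant-modulus weights built from the dominant eigenvector of one block.

    theta_block is an (M/N) x (M/N) Hermitian positive semidefinite matrix.
    """
    _, vecs = np.linalg.eigh(theta_block)   # eigenvalues come back in ascending order
    v = vecs[:, -1]                         # eigenvector of the largest eigenvalue
    L = theta_block.shape[0]                # L = M/N antennas in the subarray
    return np.exp(1j * np.angle(v)) / np.sqrt(L)   # amplitude sqrt(N/M) = 1/sqrt(L)

def gain(theta_block, w):
    """The quadratic form w^H Theta w that appears in (29)."""
    return float(np.real(w.conj() @ theta_block @ w))

if __name__ == "__main__":
    L = 8
    # Hypothetical rank-one block built from a half-wavelength steering vector.
    a = np.exp(1j * np.pi * np.arange(L) * np.sin(0.3))
    block = np.outer(a, a.conj())
    w_phase = phase_shifter_weights(block)
    w_switch = np.ones(L) / np.sqrt(L)      # on-off switch: no phase control
    print(gain(block, w_phase), gain(block, w_switch))
```

For this rank-one block the dominant eigenvector already has constant modulus, so the phase-only weights attain the largest eigenvalue exactly; for a general block, dropping the eigenvector amplitudes may cost part of the gain.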
Considering that $\mathbf{w}_i$ is realized by phase shifters with constant-modulus constraints, we design

$$\mathbf{w}_i = \sqrt{\frac{N}{M}}\, e^{j\angle \mathbf{v}_{k,i}}, \tag{30}$$

where $\sqrt{N/M}$ is introduced for normalization and $\mathbf{v}_{k,i}$ is the eigenvector of $\bar{\boldsymbol{\Theta}}_{k,ii}$ corresponding to the maximum eigenvalue. It is important to note that the phase coefficients of each subarray in (30) are designed in terms of the low-dimension matrix $\bar{\boldsymbol{\Theta}}_{k,ii} \in \mathbb{C}^{(M/N) \times (M/N)}$, instead of the correlation matrix $\boldsymbol{\Theta}_k \in \mathbb{C}^{M \times M}$ across the whole extra-large scale antenna array. Therefore, the calculation of $\mathbf{w}_i$ for different subarrays can be executed in parallel and the computational complexity can be greatly reduced.

2) On-Off Switch-Based Subarray: At the expense of performance degradation, an on-off switch-based subarray requires much lower hardware complexity and hardware cost than a subarray with phase shifters. Without phase shifters, $\mathbf{B}$ for the on-off switch-based subarray becomes

$$\mathbf{B} = \frac{N}{M}\,\mathrm{blkdiag}(\mathbf{1}_{M/N}, \mathbf{1}_{M/N}, \ldots, \mathbf{1}_{M/N}), \tag{31}$$

where $\mathbf{1}_{M/N}$ denotes the all-ones matrix, i.e.,

$$\mathbf{1}_{M/N} = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}_{(M/N) \times (M/N)}. \tag{32}$$

Suppose $\boldsymbol{\Lambda} = \mathrm{blkdiag}(\mathbf{1}_{M/N}, \mathbf{1}_{M/N}, \ldots, \mathbf{1}_{M/N})$, then $\mathbf{B} = N\boldsymbol{\Lambda}/M$, and (12) and (26) can be simplified to

$$R_k^{\mathrm{MRC,app}} = \log_2\left( 1 + \frac{\mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k\boldsymbol{\Lambda}\boldsymbol{\Theta}_k) + \mathrm{tr}^2(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k)}{\sum_{i=1, i \neq k}^{K} \mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_i\boldsymbol{\Lambda}\boldsymbol{\Theta}_k) + \frac{M}{p_u N}\mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k)} \right) \tag{33}$$

and

$$R_k^{\mathrm{LMMSE,app}} = \log_2\left[ 1 + \frac{N p_u\, \mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k)}{M} \right], \tag{34}$$

respectively. As with the subarray with phase shifters, to maximize the ergodic achievable SE, we focus on the analysis of $\mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k)$ as well. In the on-off switch-based subarray, we have

$$\mathrm{tr}(\boldsymbol{\Lambda}\boldsymbol{\Theta}_k) = \mathrm{tr}\left( \sum_{i=1}^{N} \mathbf{1}_{M/N} \bar{\boldsymbol{\Theta}}_{k,ii} \right) = \sum_{i \in S_k} \sum_{m} \cdots$$

[...]

Scheduling is of great significance in multi-user communication systems, especially for multi-user extra-large scale massive MIMO, where the existence of VRs could be used to further improve the spectral and energy efficiency. However, instantaneous CSI-based scheduling is impractical for extra-large scale massive MIMO due to the unconventionally large number of antenna elements and the relatively large number of users to be served. As such, the acquisition of instantaneous CSI for all users would result in unaffordable computational complexity and training overhead.

To tackle this problem, we propose two statistical CSI-based greedy scheduling schemes with the aim of maximizing the system sum achievable SE with linear receivers. When the MRC receiver is adopted, as we can see from (12), users whose VRs cover different subarrays, or whose VRs overlap less, should be scheduled so as to maximize the sum achievable SE. When an LMMSE receiver is employed, (26) shows that, to maximize the achievable SE, as many users as possible with larger $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i)$ should be scheduled. In the following, we first schedule users utilizing statistical CSI in a greedy manner. Then, we provide an algorithm that investigates the feasibility of jointly scheduling users and subarrays after taking energy consumption into consideration.

A. Statistical CSI-based Greedy User Scheduling

In general, an exhaustive search reaches the optimal solution of a scheduling problem; however, it is not appropriate for multi-user extra-large scale massive MIMO due to its extremely large computational complexity and long runtime. Therefore, we propose a sub-optimal greedy user scheduling algorithm, whose main idea is to achieve the best result at each scheduling step and thus greatly reduce the algorithm complexity. We summarize the proposed statistical CSI-based greedy user scheduling algorithm in Algorithm 1.

Algorithm 1 Statistical CSI-based Greedy User Scheduling Algorithm
Input: $U_s = \emptyset$, $U_n = \{1, 2, \ldots, K\}$, $N_s = 0$, $N_u$, $R = 0$, $R_{temp} = 0$.
while $N_s < N_u$ do
    for each $u_i \in U_n$ do
        calculate the system sum achievable SE $R_{U_s \cup u_i}$;
    end for
    select $u_i$ with the largest $R$ among $R_{U_s \cup u_i}$ as a newly scheduled user candidate $u_{sel}$;
    if $R_{temp} \leq R$ then
        $U_s = U_s \cup u_{sel}$, $U_n = U_n \setminus \{u_{sel}\}$, $R_{temp} = R$, $N_s = N_s + 1$;
    else
        break;
    end if
    $R = R_{temp}$;
end while
Output: $U_s$, $R$.

In Algorithm 1, we first initialize all the system parameters, including the scheduled user set $U_s = \emptyset$, the number of scheduled users $N_s = 0$, and the unscheduled user set $U_n = \{1, 2, \ldots, K\}$. The total number of users to be scheduled and served is $N_u$, and the system sum achievable SE is initialized as $R = 0$.

Next, we select the users in the unscheduled user set $U_n$ one by one and calculate their corresponding updated system sum achievable SEs based on the scheduling results of the previous iteration. A user is added to the scheduled user set only if it attains the maximum updated system sum achievable SE among all unscheduled users in $U_n$ and produces a positive gain compared with the result of the last iteration. To update the system sum achievable SE in this step, (12) and (26) are used when the MRC and LMMSE receivers are adopted, respectively. Note that in the phase shifter-based subarray, if a subarray is not covered by any user’s VR, its phase coefficients are set to the default value zero, i.e., $\angle \mathbf{v}_{k,i} = \mathbf{0}$; if a subarray is covered by multiple users simultaneously, its phase coefficients are set to the sum of the phases corresponding to these users.

Then, the algorithm keeps running until $N_s = N_u$ or there is no SE gain when adding a new user. Finally, the algorithm outputs the final scheduling results, i.e., the scheduled user set $U_s$ and the system sum achievable SE $R$.
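The greedy procedure described above can be sketched in a few lines. The Python/NumPy code below is a simplified illustration under stated assumptions, not the authors' implementation: it uses the on-off switch matrix $\mathbf{B}$, scores candidate user sets with the MRC approximation (12), and adds a user only when the sum SE strictly improves. The function names (`switch_B`, `mrc_sum_se`, `greedy_schedule`) and the toy correlation matrices in the demo are hypothetical.

```python
import numpy as np

def switch_B(M, N):
    """B = (N/M) blkdiag(1, ..., 1) for the on-off switch-based subarray."""
    L = M // N
    return (N / M) * np.kron(np.eye(N), np.ones((L, L)))

def mrc_sum_se(users, Thetas, B, p_u):
    """Sum over the scheduled users of the MRC approximation (12)."""
    BT = {k: B @ Thetas[k] for k in users}
    total = 0.0
    for k in users:
        num = np.trace(BT[k] @ BT[k]).real + np.trace(BT[k]).real ** 2
        interference = sum(np.trace(BT[i] @ BT[k]).real for i in users if i != k)
        noise = np.trace(B @ B.conj().T @ Thetas[k]).real / p_u
        total += np.log2(1 + num / (interference + noise))
    return total

def greedy_schedule(K, n_max, sum_se):
    """Greedy user selection: add the best candidate while the sum SE grows."""
    scheduled, remaining, best = [], set(range(K)), 0.0
    while len(scheduled) < n_max and remaining:
        # Candidate whose addition gives the largest updated sum SE.
        cand = max(sorted(remaining), key=lambda u: sum_se(scheduled + [u]))
        value = sum_se(scheduled + [cand])
        if value <= best:        # no SE gain: stop scheduling
            break
        scheduled.append(cand)
        remaining.remove(cand)
        best = value
    return scheduled, best

if __name__ == "__main__":
    M, N, p_u = 8, 4, 10.0
    B = switch_B(M, N)
    # Toy statistics (assumed): users 0 and 1 share a VR, user 2 sees the other half.
    masks = [np.r_[np.ones(4), np.zeros(4)],
             np.r_[np.ones(4), np.zeros(4)],
             np.r_[np.zeros(4), np.ones(4)]]
    Thetas = [np.diag(m) for m in masks]
    print(greedy_schedule(3, 3, lambda u: mrc_sum_se(u, Thetas, B, p_u)))
```

In this toy setting the scheduler keeps the two users with disjoint VRs and rejects the user whose VR fully overlaps an already scheduled one, which matches the qualitative conclusion drawn from (12).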
Note that the proposed greedy user scheduling algorithm exploits only statistical CSI, i.e., the knowledge of the channel correlation information instead of the instantaneous channel gains. This is beneficial and more practical, especially for extra-large scale massive MIMO systems. Moreover, considering the existence of VRs and the cases in which some subarrays may be covered by no user, we further examine the possibility of jointly scheduling users and subarrays and propose a statistical CSI-based greedy joint user and subarray scheduling scheme in the next subsection.

B. Statistical CSI-based Greedy Joint User and Subarray Scheduling

Algorithm 2 Statistical CSI-based Greedy Joint User and Subarray Scheduling Algorithm
Input: $U_s = \emptyset$, $S = \emptyset$, $U_n = \{1, 2, \ldots, K\}$, $S_n = \{1, 2, \ldots, N\}$, $N_s = 0$, $N_u$, $R = 0$, $Sub_{max}$, $Sub_{min}$, $R_{temp} = 0$.
while $N_s < N_u$ do
    for each $u_i \in U_n$ do
        for each $S_{u_i} \subset S_n$ do
            if $Sub_{min} \leq |S_{u_i}| \leq Sub_{max}$ then
                calculate the system sum achievable SE $R_{U_s \cup u_i, S_{u_i}}$;
            else
                continue;
            end if
        end for
        select $S_{sel,u_i}$ with the largest $R$ among $R_{U_s \cup u_i, S_{u_i}}$ as $u_i$’s subarray candidate;
    end for
    select $u_{sel}$ with the largest $R$ among $R_{U_s \cup u_i, S_{sel,u_i}}$ as a newly scheduled user candidate;
    if $R_{temp} \leq R_{U_s \cup u_{sel}, S_{sel,u_{sel}}}$ then
        $U_s = U_s \cup u_{sel}$, $S = S \cup S_{sel,u_{sel}}$, $U_n = U_n \setminus \{u_{sel}\}$, $S_n = S_n \setminus S_{sel,u_{sel}}$, $R_{temp} = R$, $N_s = N_s + 1$;
    else
        break;
    end if
    $R = R_{temp}$;
end while
Output: $U_s$, $S$, $R$.

Similar to Algorithm 1, we initialize all the system parameters at the first step of Algorithm 2, including the scheduled user set $U_s = \emptyset$, the scheduled subarray set $S = \emptyset$, the unscheduled user set $U_n = \{1, 2, \ldots, K\}$, the unscheduled subarray set $S_n = \{1, 2, \ldots, N\}$, the number of scheduled users $N_s = 0$, the total number of users to be scheduled and served $N_u$, and the system sum achievable SE $R = 0$.
The maximum and minimum numbers of scheduled subarrays per user are set as $Sub_{max}$ and $Sub_{min}$, respectively.

Next, for each user in the unscheduled user set $U_n$, we select its best subarray set from the unscheduled subarray set $S_n$ (the number of selected subarrays must be no greater than $Sub_{max}$ and no less than $Sub_{min}$) for transmission and receive combining, so that the system sum achievable SE is maximized after the current user is added. Note that (12) and (26) are used to calculate the system sum achievable SE when the MRC and LMMSE receivers are adopted, respectively. If the updated system sum achievable SE is larger than its counterpart from the previous iteration, we record the corresponding user index together with its selected subarray set and the updated system sum achievable SE. As a result, this user becomes a candidate.

TABLE I
VALUES OF MAIN PARAMETERS USED IN NUMERICAL SIMULATIONS.

    Parameter   Fig. 2   Fig. 3   Fig. 4   Fig. 5   Fig. 6   Fig. 10
    M                    1024     1024     1024     1024     1024
    N           128      128      128      128      128      128
    K           20       10       5        5        22       22
    E           160      128      160      160      128      128

Based on the obtained candidates, the user who contributes the largest gain to the sum achievable SE is finally selected and added to $U_s$, and its selected subarray set $S_{sel,u_{sel}}$ is added to $S$. At the same time, its user index $u_{sel}$ and its selected subarray set $S_{sel,u_{sel}}$ are removed from $U_n$ and $S_n$, respectively. After that, the number of scheduled users, $N_s$, increases by one. When $N_s = N_u$, or when adding a new user yields no SE gain, the scheduling algorithm terminates and outputs $U_s$, $S$, and $R$.

Since each user covers only a limited portion of the BS antenna array and some BS subarrays may be covered by no user, the proposed statistical CSI-based greedy joint user and subarray scheduling algorithm significantly enhances the energy efficiency by turning off uncovered subarrays, thereby facilitating the practical implementation of extra-large scale massive MIMO systems.

V. NUMERICAL RESULTS

In this section, we first investigate the tightness of the approximated uplink ergodic achievable SEs under both the MRC and LMMSE receivers. Then, we verify the effectiveness of the phase coefficient design proposed in Section III.C. The performance of the two proposed statistical CSI-based user scheduling algorithms, i.e., the statistical CSI-based greedy user scheduling algorithm and the statistical CSI-based greedy joint user and subarray scheduling algorithm, is also evaluated.

A. Tightness of the Approximated Uplink Ergodic Achievable SE

We first verify the diagonal-dominant property of the matrix $\mathbf{Z}$ with the LMMSE receiver. Fig. 2 shows the amplitudes of all the elements in $\mathbf{Z}$ when $K = 20$ and all the users are randomly located along the extra-large antenna array, with each user's VR covering 160 antenna elements, i.e., $E = 160$, where $E$ denotes the number of antenna elements each user's VR covers.¹ Table I summarizes the values of the main parameters used in the numerical simulations for each figure. As can be seen from Fig. 2, the diagonal elements of $\mathbf{Z}$ are clearly larger than the off-diagonal ones, which verifies the conclusion drawn in Section III.B.

Fig. 2. The amplitudes of all the elements in $\mathbf{Z}$ when $K = 20$ and all the users are randomly located along the extra-large antenna array with each user's VR covering 160 antenna elements, i.e., $E = 160$.

¹ For simplicity, we assume that each user's VR covers the same number of antenna elements. However, the simulation methodology also supports the general case, i.e., different $E$ for different users.

Next, we investigate the tightness of the approximated uplink achievable SEs. Fig. 3 presents the uplink sum achievable SEs under the architectures of the phase shifter-based subarray and the on-off switch-based subarray. Users are randomly located along the extra-large antenna array without user scheduling. Note that in the phase shifter-based subarray, if a subarray is not covered by any user's VR, the phase coefficients of the subarray are set to zero by default; if a subarray is covered by multiple users simultaneously, the phase coefficients of the subarray are set to the sum of the phases corresponding to these users.

[Figure: sum achievable SE (bits/s/Hz) versus transmit SNR $p_u$ (dB). Curves: Monte-Carlo MRC, MRC-Approx, Monte-Carlo LMMSE, LMMSE-Approx, for the on-off switch-based subarray and the subarray with phase shifters.]
Fig. 3. The uplink sum achievable SEs under the architectures of the phase shifter-based subarray and the on-off switch-based subarray. $M = 1024$, $N = 128$, $K = 10$, and $E = 128$, and users are randomly located along the extra-large antenna array without user scheduling.

In Fig. 3, the proposed achievable SE approximations match the Monte-Carlo results well, which indicates that the approximations in (12) and (26) are useful for the subsequent user scheduling that maximizes the system sum achievable SE. In addition, as the transmit power increases, the system sum achievable SE keeps increasing with the LMMSE receiver, since it can effectively eliminate the interference between different users. With the MRC receiver, however, the system sum achievable SE rapidly saturates due to the persistent inter-user interference. Furthermore, comparing the results in Fig. 3, we find that, although the on-off switch-based subarray has an apparent performance loss in comparison to the subarray with phase shifters, it can still achieve nearly spectral efficiency performance with the MRC receiver and with the LMMSE receiver. For example, with the MRC receiver, the saturated sum achievable SE under the phase shifter-based subarray architecture is bits/s/Hz, while the saturated sum achievable SE with the on-off switch-based subarray is about bits/s/Hz; for the LMMSE receiver at the transmit SNR of dB, the system sum achievable SEs are .
bits/s/Hz and . bits/s/Hz under the on-off switch-based subarray and the phase shifter-based subarray, respectively. Moreover, the on-off switch-based subarray architecture can effectively reduce the hardware cost of the system by replacing the expensive phase shifters with low-cost switches. Hence, the on-off switch-based subarray architecture would be a more practical hardware solution for extra-large scale massive MIMO.

Fig. 4 presents the sum achievable SE under the special scenario of no overlapped VR, namely, users are far apart from each other and the signals radiated by different users cover different portions of the antenna array. The on-off switch-based subarray architecture is considered. As can be observed in Fig. 4, the proposed achievable SE approximations (12) and (26) are again very tight. Moreover, since the users' VRs do not overlap and thus no interference exists between users, the system sum achievable SE under the MRC receiver keeps increasing with the transmit power. Consequently, the MRC receiver achieves the same spectral efficiency as the LMMSE receiver. Hence, when there are fewer users to be served or when the scheduled users have no overlapped VR, the hardware-friendly MRC receiver should be considered, as it achieves lower computational complexity with satisfactory performance.

[Figure: sum achievable SE (bits/s/Hz) versus transmit SNR $p_u$ (dB). Curves: Monte-Carlo MRC, MRC-Approx, Monte-Carlo LMMSE, LMMSE-Approx, all with the on-off switch.]
Fig. 4. The uplink sum achievable SEs under the scenario of no VR overlapping: $M = 1024$, $N = 128$, $K = 5$, and $E = 160$. The architecture of the on-off switch-based subarray is considered.

[Figure: sum achievable SE (bits/s/Hz) versus transmit SNR $p_u$ (dB). Curves: Monte-Carlo MRC, MRC-Approx, Monte-Carlo LMMSE, LMMSE-Approx, for phases of eigenvectors, random phases, and the on-off switch-based subarray.]
Fig. 5.
Comparison of the different subarray phase coefficient designs under the scenario of no VR overlapping: $M = 1024$, $N = 128$, $K = 5$, and $E = 160$.

B. Comparison of Different Subarray Phase Coefficient Design

The sum achievable SEs under the special scenario of no overlapped VR, for both the phase shifter-based subarray architecture and the on-off switch-based subarray architecture, are provided in Fig. 5. The LMMSE receiver is employed, and two phase coefficient designs, i.e., the proposed eigenvector-based phase coefficient design in Section III.C and the random phase coefficient design, are considered for the phase shifter-based subarray. The results in Fig. 5 indicate that the proposed eigenvector-based phase coefficient design achieves the best performance and reaps about bits/s/Hz and bits/s/Hz sum achievable SE gains over the random phase coefficient design and the on-off switch-based subarray design, respectively. Moreover, due to the lack of phase alignment, the on-off switch-based subarray has the lowest system sum achievable SE. However, it also incurs the lowest hardware cost and computational complexity, and therefore offers a low-cost alternative. Additionally, even if we adopt the on-off switch-based subarray architecture, bits/s/Hz system sum achievable SEs can still be achieved at the SNR of dB, which means that the average achievable SE per user is . bits/s/Hz.

[Figure: sum achievable SE (bits/s/Hz) versus transmit SNR $p_u$ (dB). Curves: greedy user scheduling with LMMSE and phase shifter, LMMSE and on-off switch, MRC and phase shifter, MRC and on-off switch.]
Fig. 6. The uplink sum achievable SEs under the architectures of the phase shifter-based subarray and the on-off switch-based subarray, respectively. The parameters are $M = 1024$, $N = 128$, $K = 22$, and $E = 128$, and users are randomly located along the extra-large antenna array.
The statistical CSI-based greedy user scheduling algorithm is utilized and the number of users to be scheduled and served is $N_u = 12$.

C. User Scheduling

Two user scheduling algorithms, i.e., the statistical CSI-based greedy user scheduling algorithm and the statistical CSI-based greedy joint user and subarray scheduling algorithm, were proposed in Section IV. We first investigate the performance of the statistical CSI-based greedy user scheduling algorithm. Fig. 6 presents the system sum achievable SEs for the MRC and LMMSE receivers under this algorithm. Users are randomly distributed along the extra-large antenna array as shown in Fig. 7, and the number of users to be served is $N_u = 12$. The proposed eigenvector-based phase coefficient design is used in the phase shifter-based subarray architecture.

As can be observed from Fig. 6, regardless of the type of linear receiver (i.e., MRC or LMMSE), the phase shifter-based subarray provides an apparent performance improvement over the on-off switch-based subarray; with the MRC receiver in particular, nearly bits/s/Hz sum achievable SE gains are achieved. At low SNR, the LMMSE receiver does not clearly outperform the MRC receiver. However, as the transmit SNR increases, inter-user interference becomes stronger owing to the large number of scheduled users and the overlapped VRs. Therefore, the LMMSE receiver begins to exhibit its superiority. Note that nearly bits/s/Hz sum achievable SE gains can be acquired by the LMMSE receiver at high SNR.

[Figure: antenna index at the BS versus UE ID.]
Fig. 7. The VR coverage distribution of the total users in the extra-large scale massive MIMO system.

It is also important to mention that, with increasing transmit SNR and thus stronger inter-user interference, the number of users finally scheduled under the MRC receiver does not always reach the target number of scheduled users, i.e., $N_u$.
For example, only users are scheduled when using the MRC receiver at the transmit SNR of dB under the architecture of the phase shifter-based subarray. The index vector of the finally scheduled users is [1 , , , , , , and Fig. 8 plots their positions and the corresponding VR coverage. In contrast, the LMMSE receiver shows its superiority in supporting more users. The scheduled users at the transmit SNR of dB when using the LMMSE receiver are presented in Fig. 9, with their index vector being [1 , , , , , , , , , , , . Additionally, Figs. 8 and 9 verify that, to maximize the system sum achievable SE, users whose VRs cover different subarrays or have fewer overlaps should be scheduled for the MRC receiver, while as many users with larger $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i)$ as possible should be scheduled for the LMMSE receiver.

[Figure: antenna index at the BS versus UE ID.]
Fig. 8. The scheduled users when using the MRC receiver at the transmit SNR of dB under the architecture of the phase shifter-based subarray. The index vector of the scheduled users is [1 , , , , , , .

Next, we examine the performance of the statistical CSI-based greedy joint user and subarray scheduling algorithm in Fig. 10. The maximum and minimum numbers of subarrays that each user can be allocated in the joint user and subarray scheduling algorithm are $Sub_{max} = 8$ and $Sub_{min} = 6$, respectively, and we set $N_u = 12$. Hence, the number of subarrays (namely, RF chains) for each user in the joint user and subarray scheduling algorithm is much smaller than that in the greedy user scheduling algorithm. Nevertheless, compared with Fig. 6, Fig. 10 indicates that the performance of the two linear receivers with the on-off switch-based subarray deteriorates only slightly, and the performance loss with the phase shifter-based subarray is negligible.
Based on these results, we find that, in extra-large scale massive MIMO systems, it is not necessary to simultaneously turn on all subarrays and RF chains to serve the users. Introducing dynamic subarray scheduling helps to achieve better system performance with lower system energy consumption. Besides, the statistical CSI-based greedy joint user and subarray scheduling algorithm, combined with the on-off switch-based subarray architecture and the LMMSE receiver, is a promising practical solution for extra-large scale massive MIMO.

[Figure: antenna index at the BS versus UE ID.]
Fig. 9. The scheduled users when using the LMMSE receiver at the transmit SNR of dB under the architecture of the phase shifter-based subarray. The index vector of the scheduled users is [1 , , , , , , , , , , , .

VI. CONCLUSION

This paper has investigated the uplink transmission of multi-user extra-large scale massive MIMO systems. To this end, a subarray-based system architecture was first proposed. Then, we derived tight closed-form uplink achievable SE approximations for the extra-large scale massive MIMO system under linear receivers. Based on these approximations, users whose VRs cover different subarrays or have less overlap should be scheduled simultaneously under the MRC receiver, while as many users as possible with larger $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta})$ should be selected under the LMMSE receiver. The design of the subarray with the objective of maximizing the system sum achievable SE has also been investigated. Our results indicate that, for the subarray with phase shifters, the optimum phase coefficient design is given by the phases of the eigenvectors corresponding to the maximum eigenvalues of the main block matrices of $\boldsymbol{\Theta}$. Afterwards, we proposed two statistical CSI-based greedy user scheduling algorithms. Numerical results show that, in the extra-large scale massive MIMO system, it is not necessary to simultaneously turn on all subarrays and RF chains to serve the users.
There is a tradeoff between the hardware cost and the system performance. Specifically, the statistical CSI-based greedy joint user and subarray scheduling algorithm, combined with the on-off switch-based subarray architecture and the LMMSE receiver, is a promising practical solution for extra-large scale massive MIMO systems.

[Figure: sum achievable SE (bits/s/Hz) versus transmit SNR $p_u$ (dB). Curves: greedy user scheduling with LMMSE and phase shifter, LMMSE and on-off switch, MRC and phase shifter, MRC and on-off switch.]
Fig. 10. The uplink sum achievable SEs under the architectures of the phase shifter-based subarray and the on-off switch-based subarray, respectively. The parameters are $M = 1024$, $N = 128$, $K = 22$, and $E = 128$, and users are randomly located along the extra-large antenna array. The statistical CSI-based greedy joint user and subarray scheduling algorithm is utilized. The maximum and minimum numbers of subarrays that each user can be allocated are $Sub_{max} = 8$ and $Sub_{min} = 6$, respectively, and the number of users to be served is $N_u = 12$.

APPENDIX A
PROOF OF THEOREM

\[
\mathbf{F}^H \mathbf{F} = \mathbf{H}^H \mathbf{B} \mathbf{H} =
\begin{bmatrix}
\mathbf{g}_1^H \boldsymbol{\Theta}_1^{1/2} \mathbf{B} \boldsymbol{\Theta}_1^{1/2} \mathbf{g}_1 & \cdots & \mathbf{g}_1^H \boldsymbol{\Theta}_1^{1/2} \mathbf{B} \boldsymbol{\Theta}_K^{1/2} \mathbf{g}_K \\
\vdots & \ddots & \vdots \\
\mathbf{g}_K^H \boldsymbol{\Theta}_K^{1/2} \mathbf{B} \boldsymbol{\Theta}_1^{1/2} \mathbf{g}_1 & \cdots & \mathbf{g}_K^H \boldsymbol{\Theta}_K^{1/2} \mathbf{B} \boldsymbol{\Theta}_K^{1/2} \mathbf{g}_K
\end{bmatrix}.
\tag{36}
\]

Since $\mathbb{E}\{\mathbf{g}_i \mathbf{g}_k^H\} = \mathbf{0}$, $\forall i \neq k$, we obtain $\mathbb{E}\{\mathbf{g}_i^H \boldsymbol{\Theta}_i^{1/2} \mathbf{B} \boldsymbol{\Theta}_k^{1/2} \mathbf{g}_k\} = 0$, $\forall i \neq k$; hence,

\[
\mathbb{E}\{\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F}\} = \mathrm{diag}\left(1 + p_u \mathbf{g}_1^H \boldsymbol{\Theta}_1^{1/2} \mathbf{B} \boldsymbol{\Theta}_1^{1/2} \mathbf{g}_1, \ldots, 1 + p_u \mathbf{g}_K^H \boldsymbol{\Theta}_K^{1/2} \mathbf{B} \boldsymbol{\Theta}_K^{1/2} \mathbf{g}_K\right).
\tag{37}
\]

Furthermore, since the VRs of different users only partially overlap, we can safely conclude that $(\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F})$ is a diagonal-dominant matrix. This diagonal-dominant property has been verified in the numerical results in Section V.
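The zero-mean step behind (37) can be spelled out in one line per case. The sketch below assumes the normalization $\mathbb{E}\{\mathbf{g}_i \mathbf{g}_i^H\} = \mathbf{I}$, which is not restated in this appendix but is consistent with step (b) of (47); the shorthand $\mathbf{A}_{ik}$ is introduced here only for readability.

```latex
% Shorthand (not in the original): A_{ik} = \Theta_i^{1/2} B \Theta_k^{1/2}.
\begin{align*}
\mathbb{E}\{\mathbf{g}_i^H \mathbf{A}_{ik} \mathbf{g}_k\}
  &= \mathrm{tr}\left(\mathbf{A}_{ik}\, \mathbb{E}\{\mathbf{g}_k \mathbf{g}_i^H\}\right) = 0,
  \qquad i \neq k, \\
\mathbb{E}\{\mathbf{g}_i^H \mathbf{A}_{ii} \mathbf{g}_i\}
  &= \mathrm{tr}\left(\mathbf{A}_{ii}\, \mathbb{E}\{\mathbf{g}_i \mathbf{g}_i^H\}\right)
   = \mathrm{tr}\left(\boldsymbol{\Theta}_i^{1/2} \mathbf{B} \boldsymbol{\Theta}_i^{1/2}\right)
   = \mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i).
\end{align*}
```

Both lines use the identity $\mathbf{x}^H \mathbf{A} \mathbf{y} = \mathrm{tr}(\mathbf{A} \mathbf{y} \mathbf{x}^H)$ and the cyclic property of the trace. Under this assumption, the off-diagonal entries of $\mathbf{F}^H \mathbf{F}$ have zero mean, while the $i$th diagonal entry has mean $\mathrm{tr}(\mathbf{B}\boldsymbol{\Theta}_i)$.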
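Additionally, from (11) and (20), we have

\[
\begin{aligned}
R^{\mathrm{LMMSE}} &= \sum_{i=1}^{K} R_i^{\mathrm{LMMSE}}
= \sum_{i=1}^{K} \mathbb{E}_{\mathbf{h}} \left\{ \log_2 \left( \frac{1}{\left[ (\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F})^{-1} \right]_{ii}} \right) \right\} \\
&\overset{(a)}{\geq} -K\, \mathbb{E}_{\mathbf{h}} \left\{ \log_2 \left( \frac{1}{K} \sum_{i=1}^{K} \left[ (\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F})^{-1} \right]_{ii} \right) \right\} \\
&= -K\, \mathbb{E}_{\mathbf{h}} \left\{ \log_2 \left( \frac{1}{K} \mathrm{tr} \left( \mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F} \right)^{-1} \right) \right\} \\
&\overset{(b)}{\geq} -K \log_2 \left( \frac{1}{K} \mathbb{E}_{\mathbf{h}} \left\{ \mathrm{tr} \left( \mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F} \right)^{-1} \right\} \right),
\end{aligned}
\tag{38}
\]

where (a) follows from the inequality of arithmetic and geometric means and (b) from Jensen's inequality. Define $\mathbf{Z} \triangleq \mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F}$ and $\boldsymbol{\Lambda} = \mathrm{diag}(1/z_{11}, 1/z_{22}, \ldots, 1/z_{KK})$; then $\mathbf{Z}$ is a diagonal-dominant matrix. Therefore, according to the Neumann series [17], the inverse of the diagonal-dominant matrix $\mathbf{Z}$ can be expressed as

\[
\mathbf{Z}^{-1} \approx \sum_{n=0}^{L} (\mathbf{I}_K - \boldsymbol{\Lambda} \mathbf{Z})^{n} \boldsymbol{\Lambda},
\tag{39}
\]

where $L$ represents the number of terms used in the Neumann series. For simplicity, we set $L = 1$ and thus

\[
\mathbf{Z}^{-1} \approx 2\boldsymbol{\Lambda} - \boldsymbol{\Lambda} \mathbf{Z} \boldsymbol{\Lambda}.
\tag{40}
\]

Applying (40) to (38), we obtain

\[
\begin{aligned}
R^{\mathrm{LMMSE}} &\geq -K \log_2 \left( \frac{1}{K} \mathbb{E}_{\mathbf{h}} \left\{ \mathrm{tr}(\mathbf{Z}^{-1}) \right\} \right) \\
&\approx -K \log_2 \left( \frac{1}{K} \mathbb{E}_{\mathbf{h}} \left\{ \mathrm{tr}(2\boldsymbol{\Lambda} - \boldsymbol{\Lambda} \mathbf{Z} \boldsymbol{\Lambda}) \right\} \right) \\
&= -K \log_2 \left( \frac{1}{K} \mathrm{tr} \left( \mathbb{E}_{\mathbf{h}} \{ \boldsymbol{\Lambda} \} \right) \right),
\end{aligned}
\tag{41}
\]

where the last equality holds because the $i$th diagonal entry of $\boldsymbol{\Lambda} \mathbf{Z} \boldsymbol{\Lambda}$ is $z_{ii}/z_{ii}^2 = 1/z_{ii}$, so that $\mathrm{tr}(\boldsymbol{\Lambda} \mathbf{Z} \boldsymbol{\Lambda}) = \mathrm{tr}(\boldsymbol{\Lambda})$. Moreover, $z_{ii} = 1 + p_u \mathbf{g}_i^H \boldsymbol{\Theta}_i^{1/2} \mathbf{B} \boldsymbol{\Theta}_i^{1/2} \mathbf{g}_i$ and

\[
\mathrm{tr} \left( \mathbb{E}_{\mathbf{h}} \{ \boldsymbol{\Lambda} \} \right) \geq \sum_{i=1}^{K} \frac{1}{\mathbb{E}_{\mathbf{g}} \{ z_{ii} \}} = \sum_{i=1}^{K} \frac{1}{1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_i)}.
\tag{42}
\]

Substituting (42) into (41), we have

\[
R^{\mathrm{LMMSE}} \approx -K \log_2 \left( \frac{1}{K} \sum_{i=1}^{K} \frac{1}{1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_i)} \right)
\overset{(a)}{\leq} \sum_{i=1}^{K} \log_2 \left[ 1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_i) \right],
\tag{43}
\]

where (a) uses the inequality of arithmetic and geometric means. Hence, the approximated ergodic system sum achievable SE under the partially overlapped VR scenario can be expressed as

\[
R^{\mathrm{LMMSE,Partial}}_{\mathrm{App}} = \sum_{i=1}^{K} \log_2 \left[ 1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_i) \right],
\tag{44}
\]

with each user contributing

\[
R^{\mathrm{LMMSE,Partial}}_{\mathrm{App},k} = \log_2 \left[ 1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_k) \right].
\tag{45}
\]

The proof is concluded.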
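Two steps of the proof above lend themselves to a quick numerical sanity check: the truncated Neumann series (39)-(40) for a diagonal-dominant matrix, and the arithmetic-geometric mean step in (43). The following is a minimal pure-Python sketch under stated assumptions. The helper names (`neumann_inverse`, `residual`, `lmmse_se_bounds`) and the small real matrix `Z_TOY` are hypothetical illustrations, not quantities from the paper; in the paper, $\mathbf{Z}$ is a random complex matrix.

```python
import math


def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def neumann_inverse(z, num_terms=1):
    """Approximate Z^{-1} by sum_{n=0}^{L} (I - Lambda Z)^n Lambda, cf. (39),
    where Lambda = diag(1/z_11, ..., 1/z_KK) and L = num_terms."""
    k = len(z)
    lam = [[1.0 / z[i][i] if i == j else 0.0 for j in range(k)] for i in range(k)]
    lam_z = matmul(lam, z)
    eye = identity(k)
    e = [[eye[i][j] - lam_z[i][j] for j in range(k)] for i in range(k)]  # I - Lambda Z
    power = identity(k)                                                   # (I - Lambda Z)^0
    acc = [[0.0] * k for _ in range(k)]
    for _ in range(num_terms + 1):
        term = matmul(power, lam)
        acc = [[acc[i][j] + term[i][j] for j in range(k)] for i in range(k)]
        power = matmul(power, e)
    return acc


def residual(z, z_inv_approx):
    """Largest absolute entry of I - Z * Z_inv_approx (zero for an exact inverse)."""
    k = len(z)
    prod = matmul(z, z_inv_approx)
    return max(abs((1.0 if i == j else 0.0) - prod[i][j])
               for i in range(k) for j in range(k))


def lmmse_se_bounds(traces, p_u):
    """Left and right sides of step (a) in (43) for given values of tr(B Theta_i)."""
    k = len(traces)
    left = -k * math.log2(sum(1.0 / (1.0 + p_u * t) for t in traces) / k)
    right = sum(math.log2(1.0 + p_u * t) for t in traces)
    return left, right


# Hypothetical diagonal-dominant example matrix (real-valued for simplicity).
Z_TOY = [[4.0, 0.2, 0.1],
         [0.2, 5.0, 0.3],
         [0.1, 0.3, 6.0]]
```

For `Z_TOY`, the diagonal of `neumann_inverse(Z_TOY, 1)` equals $1/z_{ii}$ up to rounding, which is the observation used to pass from the second to the third line of (41). Increasing `num_terms` reduces `residual`, as expected when the off-diagonal entries are small relative to the diagonal. `lmmse_se_bounds` returns equal values when all traces are equal, which is the equality case of the arithmetic-geometric mean inequality.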
Special Scenario: When the VRs of different users do not overlap, we have $\boldsymbol{\Theta}_1 \odot \boldsymbol{\Theta}_2 \odot \cdots \odot \boldsymbol{\Theta}_K = \mathbf{0}$, thus

\[
\mathbf{F}^H \mathbf{F} = \mathbf{H}^H \mathbf{B} \mathbf{H} = \mathrm{diag}\left( \mathbf{g}_1^H \boldsymbol{\Theta}_1^{1/2} \mathbf{B} \boldsymbol{\Theta}_1^{1/2} \mathbf{g}_1, \ldots, \mathbf{g}_K^H \boldsymbol{\Theta}_K^{1/2} \mathbf{B} \boldsymbol{\Theta}_K^{1/2} \mathbf{g}_K \right),
\tag{46}
\]

and

\[
\begin{aligned}
\mathbb{E}_{\mathbf{h}} \left\{ \left[ \left( \mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F} \right)^{-1} \right]_{kk} \right\}
&= \mathbb{E}_{\mathbf{h}} \left\{ \left( 1 + p_u \mathbf{g}_k^H \boldsymbol{\Theta}_k^{1/2} \mathbf{B} \boldsymbol{\Theta}_k^{1/2} \mathbf{g}_k \right)^{-1} \right\} \\
&\overset{(a)}{\geq} \left( \mathbb{E}_{\mathbf{h}} \left\{ 1 + p_u \mathbf{g}_k^H \boldsymbol{\Theta}_k^{1/2} \mathbf{B} \boldsymbol{\Theta}_k^{1/2} \mathbf{g}_k \right\} \right)^{-1} \\
&\overset{(b)}{=} \left[ 1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_k) \right]^{-1},
\end{aligned}
\tag{47}
\]

where (a) applies Jensen's inequality $\mathbb{E}\{1/x\} \geq 1/\mathbb{E}\{x\}$ for $x > 0$ and (b) comes from $\mathbb{E}_{\mathbf{h}} \{ p_u \mathbf{g}_k^H \boldsymbol{\Theta}_k^{1/2} \mathbf{B} \boldsymbol{\Theta}_k^{1/2} \mathbf{g}_k \} = p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_k)$. From (20), we have

\[
R_k^{\mathrm{LMMSE}} = \mathbb{E}_{\mathbf{h}} \left\{ \log_2 \left( \frac{1}{\left[ (\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F})^{-1} \right]_{kk}} \right) \right\}
\overset{(a)}{\geq} \log_2 \left( \frac{1}{\mathbb{E}_{\mathbf{h}} \left\{ \left[ (\mathbf{I}_K + p_u \mathbf{F}^H \mathbf{F})^{-1} \right]_{kk} \right\}} \right),
\tag{48}
\]

where Jensen's inequality $\mathbb{E}\{\log_2(1/x)\} \geq \log_2(1/\mathbb{E}\{x\})$ for $x > 0$ is applied in (a). Combining (48) with (47), the approximated ergodic achievable SE of the $k$th user under the no overlapped VR scenario can be given by

\[
R^{\mathrm{LMMSE,No}}_{\mathrm{App},k} = \log_2 \left[ 1 + p_u \mathrm{tr}(\mathbf{B} \boldsymbol{\Theta}_k) \right],
\tag{49}
\]

which is consistent with (45), as expected.

REFERENCES

[1] F. Rusek et al., “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40-60, Jan. 2013.
[2] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186-195, Feb. 2014.
[3] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590-3600, Nov. 2010.
[4] A. O. Martinez, E. D. Carvalho, and J. Nielsen, “Towards very large aperture massive MIMO: A measurement based study,” in Proc. IEEE GLOBECOM Workshops, Dec. 2014, pp. 281-286.
[5] E. Björnson, L. Sanguinetti, H. Wymeersch, J. Hoydis, and T. L. Marzetta, “Massive MIMO is a reality–What is next? Five promising research directions for antenna arrays,” arXiv preprint arXiv:1902.07678v1, 2019.
[6] X. Gao, O. Edfors, F. Rusek, and F. Tufvesson, “Massive MIMO performance evaluation based on measured propagation data,” IEEE Trans. Wireless Commun., vol. 14, no. 7, pp. 3899-3911, Jul. 2015.
[7] A. D. Yaghjian, “An overview of near-field antenna measurements,” IEEE Trans. Antennas Propag., vol. 34, no. 1, pp. 30-45, Jan. 1986.
[8] E. D. Carvalho, A. Ali, A. Amiri, M. Angjelichinoski, and R. W. Heath Jr., “Non-stationarities in extra-large scale massive MIMO,” arXiv preprint arXiv:1903.03085v1, 2019.
[9] A. Amiri and R. W. Heath Jr., “Extremely large aperture massive MIMO: Low complexity receiver architectures,” arXiv preprint arXiv:1810.02092v1, 2018.
[10] X. Li, S. Zhou, E. Björnson, and J. Wang, “Capacity analysis for spatially non-wide sense stationary uplink massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 14, no. 12, pp. 7044-7056, Dec. 2015.
[11] A. Ali, E. D. Carvalho, and R. W. Heath Jr., “Linear receivers in non-stationary massive MIMO channels with visibility regions,” IEEE Wireless Commun. Lett., pp. 1-1, Feb. 2019.
[12] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” IEEE Trans. Commun., vol. 61, no. 4, pp. 1436-1449, Apr. 2013.
[13] M. R. McKay, Random matrix theory analysis of multiple antenna communication systems. PhD thesis, 2006.
[14] Q. Zhang, S. Jin, K.-K. Wong, H. Zhu, and M. Matthaiou, “Power scaling of uplink massive MIMO systems with arbitrary-rank channel means,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 966-981, Oct. 2014.
[15] R. A. Horn and C. R. Johnson, Matrix Analysis, 4th ed. New York: Cambridge Univ. Press, 1990.
[16] M. R. McKay, I. B. Collings, and A. M. Tulino, “Achievable sum rate of MIMO MMSE receiver: A general analytic framework,” IEEE Trans. Inf. Theory, vol. 56, no. 1, pp. 396-410, Jan. 2010.
[17] D. Zhu, B. Li, and P. Liang, “On the matrix inversion approximation based on Neumann series in massive MIMO systems,” in Proc. IEEE ICC, June 2015, pp. 1763-1769.
[18] T. S. Rappaport et al., “Millimeter wave mobile communications for 5G cellular: It will work!”