MSE-based Precoding for MIMO Downlinks in Heterogeneous Networks
arXiv preprint, [cs.IT]
Yongyu Dai, Student Member, IEEE, Xiaodai Dong, Senior Member, IEEE, and Wu-Sheng Lu, Fellow, IEEE
Abstract: Considering a heterogeneous network (HetNet) system consisting of a macro tier overlaid with a second tier of small cells (SCs), this paper studies the mean square error (MSE) based precoding design to be employed by the macro base station and the SC nodes for multiple-input multiple-output (MIMO) downlinks. First, a new minimization problem based on the sum-MSE of all users is proposed to design a set of macro cell (MC) and SC transmit precoding matrices or vectors. To solve it, two different algorithms are presented. One is a relaxed-constraints based alternating optimization (RAO), realized by efficient alternating optimization and by relaxing non-convex constraints to convex ones. The other is an unconstrained alternating optimization with normalization (UAON), implemented by introducing the constraints into the iterations through a normalization operation. Second, a separate MSE minimization based precoding is proposed by considering the signal and interference terms corresponding to the macro tier and the SCs separately. Simulation results show that the sum-MSE based RAO algorithm provides the best MSE performance among the proposed schemes under a number of system configurations. When the number of antennas at the macro-BS is sufficiently large, the MSE of the separate MSE-based precoding is found to approach that of RAO and surpass that of UAON. Together, this paper provides a suite of three new precoding techniques that is expected to meet the need in a broad range of HetNet environments with adequate balance between performance and complexity.
I. INTRODUCTION
It is widely acknowledged that further improvements in network capacity are only possible by increasing the node deployment density [1, 2]. On the other hand, deploying more macro tiers in already dense networks may be prohibitively expensive and result in significantly reduced cell splitting gains due to severe inter-cell interference [3]. Heterogeneous networks (HetNets), which embed a large number of low-power nodes into an existing macro network with the aim of offloading traffic from the macro cell to small cells, have emerged as a viable and cost-effective way to increase network capacity [1–4].

In a typical HetNet consisting of a macro cell (MC) and several small cells (SCs), the MC serves its user equipments (UEs) in a large region by a high-power base station (BS), while each SC serves its UEs in its own coverage region by a low-power SC node if there is no cooperative transmission between the BSs and SCs (in this paper, SC is also used to denote the SC node for simplicity). Due to the large number of potential interfering nodes in the network, properly mitigating both the inter-cell and intra-cell multiuser interference is a crucial issue facing HetNets. Interference control (IC) for interference networks has recently been intensively studied and applied in HetNets [5–9], and coordinated multi-point (CoMP) transmission, including joint processing (JP) and coordinated beamforming (CB), is demonstrated to be an effective approach in [5]. When the backhaul among the coordinated tiers is able to share both user data and channel state information (CSI), the CoMP-JP transmission is shown to provide high spectral efficiency [8, 9]. However, JP is difficult to implement in practice because of its high signaling overhead.

Y. Dai, X. Dong, and W.-S. Lu are with the Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada.
On the other hand, with the BSs and SCs cooperating at the beamformer or precoder level, CB strategies only require sharing CSI in order to mitigate the cross-tier interference between the macro cell and co-channel deployed SCs. Reference [6] implemented cross-tier IC with CB based on a prioritized user selection scheme. Later, a joint selection based IC was presented to achieve more balanced performance between the macro cell UEs and the SC UEs [7]. Nevertheless, these schemes with closed-form expressions are only available in certain cases, such as a two-user multiple-input multiple-output (MIMO) interference channel.

In practical systems, the design of specific interference control schemes is subject to various criteria and constraints. Typically, interference control is formulated as a problem that optimizes certain system utility functions, which are directly associated with the UE rates or the mean square error (MSE). Since the signal-to-noise ratio (SNR) is not so high in practical wireless systems, especially at the cell edge, improving performance in the low and intermediate SNR region becomes a motivation in the IC scheme design. In [10], new MSE-based transceiver schemes are designed through efficient iterative algorithms for the peer-to-peer MIMO interference channel. In addition, source and relay precoding designs based on the MSE criterion in MIMO two-way relay systems are investigated in [11, 12]. Unfortunately, due to differences in network architecture, they may not be employed directly in HetNet systems, where there are hierarchical nodes including the BS and SCs and each of them can transmit to multiple users.

To the best knowledge of the authors, there are no MSE-based precoding schemes for HetNet in the literature. In this paper, we develop three new MSE-based precoding schemes for MIMO downlinks in HetNet systems consisting of a macro tier overlaid with a second tier of SCs.
Collectively, the proposed precoding schemes form a design toolbox that is expected to cover a wide spectrum of system needs, ranging from superior precoding performance for systems with sufficient computing power to a non-iterative precoder for systems without the need to exchange CSI among cells. First, the design of transmit precoding matrices and vectors is tackled by jointly minimizing a sum-MSE of all users subject to individual transmit power constraints at each cell. Based on this formulation, two alternating optimization algorithms named relaxed-constraints based alternating optimization (RAO) and unconstrained alternating optimization with normalization (UAON) are presented, where the RAO relaxes the non-convex constraints involved to convex ones first and then employs an alternating optimization technique to produce the solution, while the UAON is performed by embedding the constraints into the optimization process via a normalization step. Motivated by the techniques aimed at multi-cell time division duplex (TDD) systems [13, 15], next we develop a low complexity precoding scheme for HetNet where the precoder in each cell is designed separately without the need to exchange user data or CSI over the backhaul. By employing block diagonalization (BD) techniques at the node side [16], we derive a two-level precoder by a non-iterative algorithm where different interference thresholds are utilized to control the relative weights associated with the interferences for performance enhancement. Moreover, robust precoding schemes are presented correspondingly with imperfect CSI known at each node. Finally, we present results from numerical experiments for the proposed precoding strategies under different system configurations as well as a comparison study on performance in terms of MSE and bit error rate (BER).

The rest of the paper is organized as follows. The system model for the MIMO downlinks in HetNet systems is described in Section II.
In Section III, a new sum-MSE based precoding scheme for HetNet is proposed and two implementation algorithms are elaborated. In Section IV, a separate MSE based precoding algorithm is developed for the BS and SCs, respectively, and two-level precoders are derived. Then, robust precoders are designed based on the estimated channel knowledge in Section V. Simulation results for several different system configurations are presented in Section VI to demonstrate the performance of the proposed precoding techniques. Finally, we draw our conclusions in Section VII.
Notations: We use $\mathrm{tr}\{\mathbf{X}\}$, $\mathbf{X}^T$, $\mathbf{X}^H$ and $\|\mathbf{X}\|_F$ to denote the trace, transpose, Hermitian transpose, and Frobenius norm of matrix $\mathbf{X}$, respectively. The symbol $\|\mathbf{x}\|$ denotes the 2-norm of vector $\mathbf{x}$, $\mathrm{diag}\{\mathbf{x}\}$ denotes a diagonal matrix with $\mathbf{x}$ being its diagonal, and $\mathrm{bd}\{\mathbf{X}_1, \ldots, \mathbf{X}_K\}$ denotes a block diagonal matrix with the main diagonal blocks as matrices $\mathbf{X}_1, \ldots, \mathbf{X}_K$. The $N \times N$ identity matrix is denoted by $\mathbf{I}_N$. Furthermore, the expectation of a random variable is denoted by $E\{\cdot\}$, and $\mathrm{vec}\{\cdot\}$ denotes a vector composed of all columns of a matrix in sequence.

II. SYSTEM MODEL
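We consider a two-tier network architecture with one cell consisting of one macro BS, which is overlaid with a dense tier of $S$ uniformly distributed SCs as shown in Fig. 1. Assume that the BS and SCs are respectively equipped with $N_{\mathrm{BS}}$ and $N_{\mathrm{SC}}$ antennas, while each user is dropped uniformly in the cell area and possesses $N_{\mathrm{UE}}$ antennas. Based on the maximum reference signal received power (RSRP) [1], the users served by the macro BS are assigned to a macro UE (MUE) set, and those served by the SCs are assigned to a small cell UE (SUE) set. Suppose the macro BS serves $K$ MUEs with $K \le N_{\mathrm{BS}}$ while the $s$-th SC ($s \in \Omega = \{1, 2, \ldots, S\}$) serves $L_s \le N_{\mathrm{SC}}$ SUEs; thus the MUE and $s$-th SUE sets can be denoted by $\mathcal{I} = \{1, 2, \ldots, K\}$ and $\mathcal{J}_s = \{1, 2, \ldots, L_s\}$, respectively.

[Fig. 1: System model for HetNet with SCs deployment (macro-BS, small cell nodes S-1 and S-2, users, backhaul).]

If the BS and SCs apply linear precoding to serve their UEs during the downlink transmissions, then the received signals at the $i$-th ($i \in \mathcal{I}$) MUE and the $j$-th ($j \in \mathcal{J}_s$) SUE in the $s$-th SC are given by

$$\mathbf{y}_{\mathrm{BS}}^{(i)} = \sum_{k=1}^{K} \sqrt{P_{\mathrm{BS}}} \big(\mathbf{H}_{\mathrm{B-M}}^{(i)}\big)^H \mathbf{W}_{\mathrm{BS}}^{(k)} \mathbf{x}_{\mathrm{BS}}^{(k)} + \sum_{s=1}^{S} \sum_{l=1}^{L_s} \sqrt{P_{\mathrm{SC}}} \big(\mathbf{H}_{\mathrm{S-M}}^{(s,i)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s,l)} \mathbf{x}_{\mathrm{SC}}^{(s,l)} + \mathbf{n}_{\mathrm{BS}}^{(i)} = \big(\mathbf{G}_{\mathrm{B-M}}^{(i)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} + \sum_{s=1}^{S} \big(\mathbf{G}_{\mathrm{S-M}}^{(s,i)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \mathbf{x}_{\mathrm{SC}}^{(s)} + \mathbf{n}_{\mathrm{BS}}^{(i)} \tag{1}$$

$$\mathbf{y}_{\mathrm{SC}}^{(s,j)} = \sum_{k=1}^{K} \sqrt{P_{\mathrm{BS}}} \big(\mathbf{H}_{\mathrm{B-S}}^{(s,j)}\big)^H \mathbf{W}_{\mathrm{BS}}^{(k)} \mathbf{x}_{\mathrm{BS}}^{(k)} + \sum_{t=1}^{S} \sum_{l=1}^{L_t} \sqrt{P_{\mathrm{SC}}} \big(\mathbf{H}_{\mathrm{S-S}}^{(t,s,j)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t,l)} \mathbf{x}_{\mathrm{SC}}^{(t,l)} + \mathbf{n}_{\mathrm{SC}}^{(s,j)} = \big(\mathbf{G}_{\mathrm{B-S}}^{(s,j)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} + \sum_{t=1}^{S} \big(\mathbf{G}_{\mathrm{S-S}}^{(t,s,j)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \mathbf{x}_{\mathrm{SC}}^{(t)} + \mathbf{n}_{\mathrm{SC}}^{(s,j)} \tag{2}$$

respectively, where $P_{\mathrm{BS}}$ and $P_{\mathrm{SC}}$ represent the average power at the macro BS and SCs; $\mathbf{H}_{\mathrm{B-M}}^{(i)}$ and $\mathbf{H}_{\mathrm{B-S}}^{(s,j)}$ denote the $N_{\mathrm{BS}} \times N_{\mathrm{UE}}$ channel matrices from the BS to the $i$-th MUE and to the $j$-th SUE in the $s$-th SC, respectively; $\mathbf{H}_{\mathrm{S-M}}^{(s,i)}$ and $\mathbf{H}_{\mathrm{S-S}}^{(t,s,j)}$ denote the $N_{\mathrm{SC}} \times N_{\mathrm{UE}}$ channel matrices from the $s$-th SC to the $i$-th MUE and from the $t$-th SC to the $j$-th SUE in the $s$-th SC, respectively; $\mathbf{x}_{\mathrm{BS}}^{(k)} \in \mathbb{C}^{N_S \times 1}$ and $\mathbf{x}_{\mathrm{SC}}^{(s,j)} \in \mathbb{C}^{N_S \times 1}$ are the $N_S$ complex-valued Gaussian transmitted symbol streams from the BS to its $k$-th MUE and from the $s$-th SC to its $j$-th SUE; $\mathbf{W}_{\mathrm{BS}}^{(k)}$ and $\mathbf{W}_{\mathrm{SC}}^{(s,j)}$ are the $N_{\mathrm{BS}} \times N_S$ and $N_{\mathrm{SC}} \times N_S$ precoding matrices, respectively; and $\mathbf{n}_{\mathrm{BS}}^{(i)}$ and $\mathbf{n}_{\mathrm{SC}}^{(s,j)}$ are the additive white Gaussian noise vectors with each element of variance $N_0$. Besides, $\mathbf{W}_{\mathrm{BS}} = \big[\mathbf{W}_{\mathrm{BS}}^{(1)}, \mathbf{W}_{\mathrm{BS}}^{(2)}, \ldots, \mathbf{W}_{\mathrm{BS}}^{(K)}\big]$, $\mathbf{W}_{\mathrm{SC}}^{(t)} = \big[\mathbf{W}_{\mathrm{SC}}^{(t,1)}, \mathbf{W}_{\mathrm{SC}}^{(t,2)}, \ldots, \mathbf{W}_{\mathrm{SC}}^{(t,L_t)}\big]$, $\mathbf{x}_{\mathrm{BS}} = \big[\mathbf{x}_{\mathrm{BS}}^{(1)}; \mathbf{x}_{\mathrm{BS}}^{(2)}; \ldots; \mathbf{x}_{\mathrm{BS}}^{(K)}\big]$, $\mathbf{x}_{\mathrm{SC}}^{(t)} = \big[\mathbf{x}_{\mathrm{SC}}^{(t,1)}; \mathbf{x}_{\mathrm{SC}}^{(t,2)}; \ldots; \mathbf{x}_{\mathrm{SC}}^{(t,L_t)}\big]$, $\mathbf{G}_{\mathrm{B-M}}^{(i)} = \sqrt{P_{\mathrm{BS}}}\, \mathbf{H}_{\mathrm{B-M}}^{(i)}$, $\mathbf{G}_{\mathrm{B-S}}^{(s,j)} = \sqrt{P_{\mathrm{BS}}}\, \mathbf{H}_{\mathrm{B-S}}^{(s,j)}$, $\mathbf{G}_{\mathrm{S-M}}^{(s,i)} = \sqrt{P_{\mathrm{SC}}}\, \mathbf{H}_{\mathrm{S-M}}^{(s,i)}$ and $\mathbf{G}_{\mathrm{S-S}}^{(t,s,j)} = \sqrt{P_{\mathrm{SC}}}\, \mathbf{H}_{\mathrm{S-S}}^{(t,s,j)}$ are defined for analysis simplicity.

Moreover, the propagation factor here is defined as the product of a fast fading factor and an amplitude factor that accounts for geometric attenuation and shadow fading. For example, $h_{\mathrm{B-M}}^{(m_1,n_1,i)}$ (the $(m_1, n_1)$-th element of $\mathbf{H}_{\mathrm{B-M}}^{(i)}$) and $h_{\mathrm{S-M}}^{(m_2,n_2,s,i)}$ (the $(m_2, n_2)$-th element of $\mathbf{H}_{\mathrm{S-M}}^{(s,i)}$) in (1) assume the form

$$h_{\mathrm{B-M}}^{(m_1,n_1,i)} = \sqrt{\beta_{\mathrm{B-M}}^{(i)}}\, \upsilon_{\mathrm{B-M}}^{(m_1,n_1,i)}, \qquad h_{\mathrm{S-M}}^{(m_2,n_2,s,i)} = \sqrt{\beta_{\mathrm{S-M}}^{(s,i)}}\, \upsilon_{\mathrm{S-M}}^{(m_2,n_2,s,i)} \tag{3}$$

where $m_1 \in \{1, 2, \ldots, N_{\mathrm{BS}}\}$, $m_2 \in \{1, 2, \ldots, N_{\mathrm{SC}}\}$, $n_1, n_2 \in \{1, 2, \ldots, N_{\mathrm{UE}}\}$; $\upsilon_{\mathrm{B-M}}^{(m_1,n_1,i)} \sim \mathcal{CN}(0, 1)$ and $\upsilon_{\mathrm{S-M}}^{(m_2,n_2,s,i)} \sim \mathcal{CN}(0, 1)$ denote the fast fading coefficients; and $\beta_{\mathrm{B-M}}^{(i)}$ and $\beta_{\mathrm{S-M}}^{(s,i)}$ are the amplitude factors.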
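As an illustration of the channel model in (3), the following minimal sketch draws one channel matrix as the product of a large-scale factor and i.i.d. $\mathcal{CN}(0,1)$ fast fading, and then forms the power-scaled channel $\mathbf{G} = \sqrt{P}\,\mathbf{H}$. The helper name `draw_channel`, the antenna counts, and the values of `beta` and `P_BS` are assumptions chosen for the example and are not taken from the paper.

```python
import numpy as np

def draw_channel(n_tx, n_ue, beta, rng):
    """Draw an n_tx x n_ue channel whose entries are sqrt(beta) * CN(0, 1), cf. (3)."""
    # CN(0, 1): real and imaginary parts are independent N(0, 1/2).
    fast = (rng.standard_normal((n_tx, n_ue))
            + 1j * rng.standard_normal((n_tx, n_ue))) / np.sqrt(2.0)
    return np.sqrt(beta) * fast

rng = np.random.default_rng(0)
N_BS, N_UE = 8, 2      # illustrative sizes (assumption)
beta = 1e-3            # hypothetical large-scale factor, linear scale
P_BS = 10.0            # hypothetical average transmit power

H = draw_channel(N_BS, N_UE, beta, rng)
G = np.sqrt(P_BS) * H  # power-scaled channel used in (1) and (2)

# Sanity check: the average entry power of many draws should be close to beta.
samples = draw_channel(2000, 50, beta, rng)
ratio = np.mean(np.abs(samples) ** 2) / beta
```

Here `ratio` should be close to one, which confirms that the large-scale factor alone sets the average channel gain while the fast fading only contributes unit-power fluctuations.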
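Because the geometric and shadow fading change slowly over space, $\beta_{\mathrm{B-M}}^{(i)}$ and $\beta_{\mathrm{S-M}}^{(s,i)}$ are treated as constants with respect to the index of the base station antenna, and we can write

$$\beta_{\mathrm{B-M}}^{(i)} = \zeta_{\mathrm{BS}}\, \theta_{\mathrm{BS}}\big(d_{\mathrm{B-M}}^{(i)}\big), \qquad \beta_{\mathrm{S-M}}^{(s,i)} = \zeta_{\mathrm{SC}}\, \theta_{\mathrm{SC}}\big(d_{\mathrm{S-M}}^{(s,i)}\big) \tag{4}$$

where $\zeta_{\mathrm{BS}}$ and $\zeta_{\mathrm{SC}}$ denote the corresponding penetration losses, which are independent over all the indices [17], and the functions $\theta_{\mathrm{BS}}\big(d_{\mathrm{B-M}}^{(i)}\big)$ and $\theta_{\mathrm{SC}}\big(d_{\mathrm{S-M}}^{(s,i)}\big)$ represent the pathloss model at the BS and the SCs, respectively, where the arguments $d_{\mathrm{B-M}}^{(i)}$ and $d_{\mathrm{S-M}}^{(s,i)}$ are the distance between the BS and the $i$-th MUE and the distance between the $s$-th SC and the $i$-th MUE, respectively. Similar expressions for the propagation factors $\mathbf{H}_{\mathrm{B-S}}^{(s,j)}$ and $\mathbf{H}_{\mathrm{S-S}}^{(t,s,j)}$ in (2) can be obtained. We assume that time division duplex is adopted with channel reciprocity satisfied, i.e., the propagation factor is the same for both forward and reverse links, and that the block fading remains constant over a block of symbols. Hence, exact CSI for the downlinks can be obtained for both BS and SCs.

From (1), the signal received at the MUEs can be expressed as

$$\mathbf{y}_{\mathrm{BS}} = \mathbf{G}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} + \sum_{s=1}^{S} \big(\mathbf{G}_{\mathrm{S-M}}^{(s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \mathbf{x}_{\mathrm{SC}}^{(s)} + \mathbf{n}_{\mathrm{BS}} \tag{5}$$

where $\mathbf{G}_{\mathrm{B-M}} = \big[\mathbf{G}_{\mathrm{B-M}}^{(1)}, \mathbf{G}_{\mathrm{B-M}}^{(2)}, \ldots, \mathbf{G}_{\mathrm{B-M}}^{(K)}\big]$, $\mathbf{G}_{\mathrm{S-M}}^{(s)} = \big[\mathbf{G}_{\mathrm{S-M}}^{(s,1)}, \mathbf{G}_{\mathrm{S-M}}^{(s,2)}, \ldots, \mathbf{G}_{\mathrm{S-M}}^{(s,K)}\big]$, $\mathbf{n}_{\mathrm{BS}} = \big[\mathbf{n}_{\mathrm{BS}}^{(1)}; \mathbf{n}_{\mathrm{BS}}^{(2)}; \ldots; \mathbf{n}_{\mathrm{BS}}^{(K)}\big]$ and $\mathbf{y}_{\mathrm{BS}} = \big[\mathbf{y}_{\mathrm{BS}}^{(1)}; \mathbf{y}_{\mathrm{BS}}^{(2)}; \ldots; \mathbf{y}_{\mathrm{BS}}^{(K)}\big]$. Similarly, from (2) the signal received at the SUEs of the $s$-th SC is

$$\mathbf{y}_{\mathrm{SC}}^{(s)} = \big(\mathbf{G}_{\mathrm{B-S}}^{(s)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} + \sum_{t=1}^{S} \big(\mathbf{G}_{\mathrm{S-S}}^{(t,s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \mathbf{x}_{\mathrm{SC}}^{(t)} + \mathbf{n}_{\mathrm{SC}}^{(s)} \tag{6}$$

where $\mathbf{G}_{\mathrm{B-S}}^{(s)} = \big[\mathbf{G}_{\mathrm{B-S}}^{(s,1)}, \mathbf{G}_{\mathrm{B-S}}^{(s,2)}, \ldots, \mathbf{G}_{\mathrm{B-S}}^{(s,L_s)}\big]$, $\mathbf{G}_{\mathrm{S-S}}^{(t,s)} = \big[\mathbf{G}_{\mathrm{S-S}}^{(t,s,1)}, \mathbf{G}_{\mathrm{S-S}}^{(t,s,2)}, \ldots, \mathbf{G}_{\mathrm{S-S}}^{(t,s,L_s)}\big]$, $\mathbf{n}_{\mathrm{SC}}^{(s)} = \big[\mathbf{n}_{\mathrm{SC}}^{(s,1)}; \mathbf{n}_{\mathrm{SC}}^{(s,2)}; \ldots; \mathbf{n}_{\mathrm{SC}}^{(s,L_s)}\big]$ and $\mathbf{y}_{\mathrm{SC}}^{(s)} = \big[\mathbf{y}_{\mathrm{SC}}^{(s,1)}; \mathbf{y}_{\mathrm{SC}}^{(s,2)}; \ldots; \mathbf{y}_{\mathrm{SC}}^{(s,L_s)}\big]$.

Assume that a linear receiver is applied at each user; then

$$\hat{\mathbf{x}}_{\mathrm{BS}}^{(i)} = \mathbf{R}_{\mathrm{BS}}^{(i)} \mathbf{y}_{\mathrm{BS}}^{(i)}, \qquad \hat{\mathbf{x}}_{\mathrm{SC}}^{(s,j)} = \mathbf{R}_{\mathrm{SC}}^{(s,j)} \mathbf{y}_{\mathrm{SC}}^{(s,j)}, \quad s \in \Omega \tag{7}$$

where $\mathbf{R}_{\mathrm{BS}}^{(i)} \in \mathbb{C}^{N_S \times N_{\mathrm{UE}}}$ and $\mathbf{R}_{\mathrm{SC}}^{(s,j)} \in \mathbb{C}^{N_S \times N_{\mathrm{UE}}}$ are the receiving filter matrices of MUE $i$ and SUE $j$ in the $s$-th SC, respectively. For simplicity, in the rest of the paper (7) is rewritten as

$$\hat{\mathbf{x}}_{\mathrm{BS}} = \mathbf{R}_{\mathrm{BS}} \mathbf{y}_{\mathrm{BS}}, \qquad \hat{\mathbf{x}}_{\mathrm{SC}}^{(s)} = \mathbf{R}_{\mathrm{SC}}^{(s)} \mathbf{y}_{\mathrm{SC}}^{(s)}, \quad s \in \Omega \tag{8}$$

where $\mathbf{R}_{\mathrm{BS}} = \mathrm{bd}\big\{\mathbf{R}_{\mathrm{BS}}^{(1)}, \ldots, \mathbf{R}_{\mathrm{BS}}^{(K)}\big\}$ and $\mathbf{R}_{\mathrm{SC}}^{(s)} = \mathrm{bd}\big\{\mathbf{R}_{\mathrm{SC}}^{(s,1)}, \ldots, \mathbf{R}_{\mathrm{SC}}^{(s,L_s)}\big\}$.

III. SUM-MSE MINIMIZATION BASED PRECODING IN HETNET

In this section, the design of the precoding matrices $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$) is addressed by minimizing the total MSE (we call it sum-MSE), where each squared error term involves its corresponding receiver matrix $\mathbf{R}_{\mathrm{BS}}$ or $\mathbf{R}_{\mathrm{SC}}^{(s)}$ that can be applied by the user. This minimization is carried out subject to average power constraints on $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)}$ for $s \in \Omega$. Under these circumstances, the precoding design problem can be cast as the constrained optimization problem

$$\min_{\mathbf{W}_{\mathrm{BS}},\, \mathbf{W}_{\mathrm{SC}}^{(t)},\, \mathbf{R}_{\mathrm{BS}},\, \mathbf{R}_{\mathrm{SC}}^{(t)},\, t \in \Omega} \; E\Big\{ \|\hat{\mathbf{x}}_{\mathrm{BS}} - \mathbf{x}_{\mathrm{BS}}\|^2 + \sum_{s=1}^{S} \big\|\hat{\mathbf{x}}_{\mathrm{SC}}^{(s)} - \mathbf{x}_{\mathrm{SC}}^{(s)}\big\|^2 \Big\} \tag{9a}$$

$$\text{subject to} \quad \mathrm{tr}\big\{\mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}}\big\} = 1 \tag{9b}$$

$$\mathrm{tr}\big\{\big(\mathbf{W}_{\mathrm{SC}}^{(t)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)}\big\} = 1, \quad \text{for } t \in \Omega \tag{9c}$$

$$\mathbf{R}_{\mathrm{BS}} = \mathrm{bd}\big\{\mathbf{R}_{\mathrm{BS}}^{(1)}, \ldots, \mathbf{R}_{\mathrm{BS}}^{(K)}\big\} \tag{9d}$$

$$\mathbf{R}_{\mathrm{SC}}^{(t)} = \mathrm{bd}\big\{\mathbf{R}_{\mathrm{SC}}^{(t,1)}, \ldots, \mathbf{R}_{\mathrm{SC}}^{(t,L_t)}\big\}, \quad \text{for } t \in \Omega. \tag{9e}$$

Let $\mathbf{W}_{\mathrm{SC}} = \big[\mathbf{W}_{\mathrm{SC}}^{(1)}; \mathbf{W}_{\mathrm{SC}}^{(2)}; \ldots; \mathbf{W}_{\mathrm{SC}}^{(S)}\big]$ and $\mathbf{R}_{\mathrm{SC}} = \big[\mathbf{R}_{\mathrm{SC}}^{(1)}, \mathbf{R}_{\mathrm{SC}}^{(2)}, \ldots, \mathbf{R}_{\mathrm{SC}}^{(S)}\big]$. Noting that the transmission symbols satisfy $E\{\mathbf{x}\} = \mathbf{0}$, $E\{\mathbf{x}\mathbf{x}^H\} = \mathbf{I}$, and $\|\mathbf{x}\|^2 = \mathrm{tr}\{\mathbf{x}\mathbf{x}^H\}$, the objective function in (9a) can be rewritten to make its dependence on $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)}$ explicit as

$$f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}) = \mathrm{MSE}_{\mathrm{BS}} + \mathrm{MSE}_{\mathrm{SC}} \tag{10}$$

where

$$\mathrm{MSE}_{\mathrm{BS}} \triangleq E\big\{\|\hat{\mathbf{x}}_{\mathrm{BS}} - \mathbf{x}_{\mathrm{BS}}\|^2\big\} = \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{BS}} \Big[ \mathbf{G}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-M}} + \sum_{s=1}^{S} \big(\mathbf{G}_{\mathrm{S-M}}^{(s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \big(\mathbf{W}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{G}_{\mathrm{S-M}}^{(s)} \Big] \mathbf{R}_{\mathrm{BS}}^H - 2\mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} + \mathbf{I}_{K N_S} + \sigma^2 \mathbf{R}_{\mathrm{BS}} \mathbf{R}_{\mathrm{BS}}^H \Big\} \tag{11}$$

and

$$\mathrm{MSE}_{\mathrm{SC}} \triangleq E\Big\{ \sum_{s=1}^{S} \big\|\hat{\mathbf{x}}_{\mathrm{SC}}^{(s)} - \mathbf{x}_{\mathrm{SC}}^{(s)}\big\|^2 \Big\} = \sum_{s=1}^{S} \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{SC}}^{(s)} \Big[ \big(\mathbf{G}_{\mathrm{B-S}}^{(s)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-S}}^{(s)} + \sum_{t=1}^{S} \big(\mathbf{G}_{\mathrm{S-S}}^{(t,s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \big(\mathbf{W}_{\mathrm{SC}}^{(t)}\big)^H \mathbf{G}_{\mathrm{S-S}}^{(t,s)} \Big] \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H - 2\mathbf{R}_{\mathrm{SC}}^{(s)} \big(\mathbf{G}_{\mathrm{S-S}}^{(s,s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)} + \mathbf{I}_{L_s N_S} + \sigma^2 \mathbf{R}_{\mathrm{SC}}^{(s)} \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H \Big\}. \tag{12}$$

From (11) and (12) it follows that the sum-MSE is convex w.r.t. $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}^{(t)}$ when the receivers are fixed, and that it is also convex w.r.t. $\mathbf{R}_{\mathrm{BS}}$ and the matrices in $\mathbf{R}_{\mathrm{SC}}$ when the precoders are fixed. An essential technical difficulty in dealing with problem (9) is that its objective function is not jointly convex in all variables and the equality constraints on average power are nonconvex. To handle this, we propose an alternating convex optimization (ACO) technique, which turns out to be well suited for the precoding design problem at hand. Specifically, a significant advantage of using ACO-based techniques is that all sub-problems involved are convex, and fast algorithms for their solutions and reliable software code for implementations are available [18, 19]. In what follows we present two alternating-optimization based techniques. The first technique partitions the design variables into two subsets such that the objective becomes convex with respect to each subset of variables, and this variable partitioning is done while the constraints on average power are relaxed to their convex counterparts. The second technique carries out unconstrained alternating optimization with respect to the above-mentioned two subsets of design variables alternately, followed by a simple norm normalization step to satisfy the requirement on average power.

A. Relaxed-constraints based Alternating Optimization (RAO)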
Here we consider a variant of problem (9) obtained by a natural convex relaxation of the nonconvex constraints in (9b) and (9c), namely,

$$\min_{\mathbf{W}_{\mathrm{BS}},\, \mathbf{W}_{\mathrm{SC}}^{(t)},\, \mathbf{R}_{\mathrm{BS}},\, \mathbf{R}_{\mathrm{SC}}^{(t)},\, t \in \Omega} \; f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}) \tag{13a}$$

$$\text{subject to} \quad \mathrm{tr}\big\{\mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}}\big\} \le 1 \tag{13b}$$

$$\mathrm{tr}\big\{\big(\mathbf{W}_{\mathrm{SC}}^{(t)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)}\big\} \le 1, \quad \text{for } t \in \Omega \tag{13c}$$

$$\text{(9d), (9e)} \tag{13d}$$

As (9b) and (9c) impose conditions on the average power at the BS and SCs, their convex relaxations in (13b) and (13c) are well justified, as they limit the average power at the BS and SCs to be within the given values. As will become transparent shortly, this convex relaxation removes the only obstacle that would otherwise prevent us from applying an ACO-based technique to the precoding problem.

To solve problem (13), we begin by partitioning the design variables into two sets, namely $X_1 = \{\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}\}$ and $X_2 = \{\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}\}$. Note that $f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}})$ in (13a) is convex w.r.t. variable set $X_1$ while variable set $X_2$ is fixed, and that it is also convex w.r.t. $X_2$ while $X_1$ is fixed. Therefore, it is natural to apply an ACO approach for the solution of (13), which is outlined as follows. With the variables in $X_1$ fixed, one minimizes the convex objective function $f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}})$ w.r.t. the variables $\{\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}\}$. Clearly this is an unconstrained convex problem, because the variables $\{\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}\}$ are not involved in (13b) and (13c) and the constraints in (13d) can be removed by substituting them into the objective function. The solution of the above problem, denoted by $X_2^* = \{\mathbf{R}_{\mathrm{BS}}^*, \mathbf{R}_{\mathrm{SC}}^*\}$, is then fixed and one minimizes the convex objective function $f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}^*, \mathbf{R}_{\mathrm{SC}}^*)$ w.r.t. $\{\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}\}$ subject to constraints (13b) and (13c). This is a constrained convex problem that can be solved efficiently.
Having obtained its solution $\{\mathbf{W}_{\mathrm{BS}}^*, \mathbf{W}_{\mathrm{SC}}^*\}$, the next round of ACO starts, and the procedure continues until a norm of the variations in both variable sets obtained from the two most recent consecutive rounds is less than a prescribed tolerance, at which point the most current $\{\mathbf{W}_{\mathrm{BS}}^*, \mathbf{W}_{\mathrm{SC}}^*, \mathbf{R}_{\mathrm{BS}}^*, \mathbf{R}_{\mathrm{SC}}^*\}$ is taken as the solution of the problem. The technical details of solving the two convex sub-problems now follow.
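1) With $X_1$ fixed: In this case, $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}$ are given and the optimization problem in (13) assumes the form

$$\min_{\mathbf{R}_{\mathrm{BS}},\, \mathbf{R}_{\mathrm{SC}}} \; f(\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}) \tag{14a}$$

$$\text{subject to} \quad \text{(9d), (9e)} \tag{14b}$$

Substituting constraints (9d) and (9e) into (10), we obtain (15), where $\boldsymbol{\Psi}_{\mathrm{BS}}^{(i)}$ and $\boldsymbol{\Psi}_{\mathrm{SC}}^{(s,j)}$ are defined in (16). Hence, the global minimizers $\mathbf{R}_{\mathrm{BS}}^*$ and $\mathbf{R}_{\mathrm{SC}}^*$ can be found by solving

$$\frac{\partial f(\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}})}{\partial \mathbf{R}_{\mathrm{BS}}^{(i)}} = \mathbf{0}, \qquad \frac{\partial f(\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}})}{\partial \mathbf{R}_{\mathrm{SC}}^{(s,j)}} = \mathbf{0} \tag{17}$$

which gives

$$\mathbf{R}_{\mathrm{BS}}^{(i)*} = \big(\mathbf{W}_{\mathrm{BS}}^{(i)}\big)^H \mathbf{G}_{\mathrm{B-M}}^{(i)} \big(\boldsymbol{\Psi}_{\mathrm{BS}}^{(i)} + \sigma^2 \mathbf{I}_{N_{\mathrm{UE}}}\big)^{-1} \tag{18a}$$

$$\mathbf{R}_{\mathrm{SC}}^{(s,j)*} = \big(\mathbf{W}_{\mathrm{SC}}^{(s,j)}\big)^H \mathbf{G}_{\mathrm{S-S}}^{(s,s,j)} \big(\boldsymbol{\Psi}_{\mathrm{SC}}^{(s,j)} + \sigma^2 \mathbf{I}_{N_{\mathrm{UE}}}\big)^{-1} \tag{18b}$$

for all $i \in \mathcal{I}$, $j \in \mathcal{J}_s$ and $s \in \Omega$. As we can see, the optimal linear receivers $\mathbf{R}_{\mathrm{BS}}^*$ and $\mathbf{R}_{\mathrm{SC}}^{(s,j)*}$ ($s \in \Omega$) depend on the transmit precoding matrices $\mathbf{W}_{\mathrm{BS}}$ and $\mathbf{W}_{\mathrm{SC}}^{(s,j)}$. In this way, the optimal solution $\mathbf{R}_{\mathrm{BS}}^*$ and $\mathbf{R}_{\mathrm{SC}}^{(s)*}$ ($s \in \Omega$) for problem (14) is obtained directly from (18) under the assumption that $X_1$ is fixed.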
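The receiver update in (18) can be checked numerically with a minimal sketch. The setup below is a simplification assumed for illustration only: a single transmitter serving the user of interest plus one interfering user, random channels and precoders, and arbitrary sizes and noise power. It evaluates the per-user MSE, with the cross term written out as a Hermitian pair, and confirms that random perturbations of the closed-form receiver never lower it.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(*shape):
    """i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

N_T, N_UE, N_S, sigma2 = 6, 2, 2, 0.1   # illustrative values (assumption)
G1 = crandn(N_T, N_UE)                  # channel to the user of interest
W1 = crandn(N_T, N_S)                   # precoder of that user
W2 = crandn(N_T, N_S)                   # precoder of an interfering user
W_all = np.hstack([W1, W2])

# Noise-free received covariance, in the spirit of (16).
Psi = G1.conj().T @ W_all @ W_all.conj().T @ G1

def mse(R):
    """Per-user MSE for receiver R; the cross term appears as a Hermitian pair."""
    cross = R @ G1.conj().T @ W1
    val = np.trace(R @ Psi @ R.conj().T - cross - cross.conj().T
                   + np.eye(N_S) + sigma2 * R @ R.conj().T)
    return val.real

# Closed-form receiver with the structure of (18).
R_opt = W1.conj().T @ G1 @ np.linalg.inv(Psi + sigma2 * np.eye(N_UE))

# Smallest MSE increase over 100 random perturbations of R_opt.
worst_gap = min(mse(R_opt + 0.1 * crandn(N_S, N_UE)) - mse(R_opt) for _ in range(100))
```

Because the MSE is a strictly convex quadratic in the receiver whenever $\sigma^2 > 0$, `worst_gap` is positive for every perturbation, which is consistent with (18) being the global minimizer for fixed precoders.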
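$$f(\mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}}) = \sum_{i=1}^{K} \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{BS}}^{(i)} \boldsymbol{\Psi}_{\mathrm{BS}}^{(i)} \big(\mathbf{R}_{\mathrm{BS}}^{(i)}\big)^H - 2\mathbf{R}_{\mathrm{BS}}^{(i)} \big(\mathbf{G}_{\mathrm{B-M}}^{(i)}\big)^H \mathbf{W}_{\mathrm{BS}}^{(i)} + \mathbf{I}_{N_S} + \sigma^2 \mathbf{R}_{\mathrm{BS}}^{(i)} \big(\mathbf{R}_{\mathrm{BS}}^{(i)}\big)^H \Big\} + \sum_{s=1}^{S} \sum_{j=1}^{L_s} \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{SC}}^{(s,j)} \boldsymbol{\Psi}_{\mathrm{SC}}^{(s,j)} \big(\mathbf{R}_{\mathrm{SC}}^{(s,j)}\big)^H - 2\mathbf{R}_{\mathrm{SC}}^{(s,j)} \big(\mathbf{G}_{\mathrm{S-S}}^{(s,s,j)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s,j)} + \mathbf{I}_{N_S} + \sigma^2 \mathbf{R}_{\mathrm{SC}}^{(s,j)} \big(\mathbf{R}_{\mathrm{SC}}^{(s,j)}\big)^H \Big\} \tag{15}$$

$$\boldsymbol{\Psi}_{\mathrm{BS}}^{(i)} = \big(\mathbf{G}_{\mathrm{B-M}}^{(i)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-M}}^{(i)} + \sum_{s=1}^{S} \big(\mathbf{G}_{\mathrm{S-M}}^{(s,i)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \big(\mathbf{W}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{G}_{\mathrm{S-M}}^{(s,i)} \tag{16a}$$

$$\boldsymbol{\Psi}_{\mathrm{SC}}^{(s,j)} = \big(\mathbf{G}_{\mathrm{B-S}}^{(s,j)}\big)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-S}}^{(s,j)} + \sum_{t=1}^{S} \big(\mathbf{G}_{\mathrm{S-S}}^{(t,s,j)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \big(\mathbf{W}_{\mathrm{SC}}^{(t)}\big)^H \mathbf{G}_{\mathrm{S-S}}^{(t,s,j)} \tag{16b}$$

2) With $X_2$ fixed: In this case, $\mathbf{R}_{\mathrm{BS}}$ and $\mathbf{R}_{\mathrm{SC}}$ are fixed, and the optimization problem in (13) assumes the form

$$\min_{\mathbf{W}_{\mathrm{BS}},\, \mathbf{W}_{\mathrm{SC}}} \; f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}) \tag{19a}$$

$$\text{subject to} \quad \text{(13b), (13c)}. \tag{19b}$$

With $\lambda_0$ and $\lambda_s$ ($s \in \Omega$) as the Lagrange multipliers associated with the power constraints, the Lagrangian of problem (19) is given by [18]

$$L(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \boldsymbol{\lambda}) = f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}) + \lambda_0 \big[\mathrm{tr}\big\{\mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}}\big\} - 1\big] + \sum_{s=1}^{S} \lambda_s \Big[\mathrm{tr}\big\{\big(\mathbf{W}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)}\big\} - 1\Big] \tag{20}$$

where for notational simplicity we have defined $\boldsymbol{\lambda} = [\lambda_0, \lambda_1, \ldots, \lambda_S]^T$. Given $\mathbf{R}_{\mathrm{BS}}$, $\mathbf{R}_{\mathrm{SC}}$, $\lambda_0$ and $\lambda_s$ ($s \in \Omega$), the Lagrangian in (20) is minimized if and only if

$$\frac{\partial L(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \boldsymbol{\lambda})}{\partial \mathbf{W}_{\mathrm{BS}}} = \mathbf{0}, \qquad \frac{\partial L(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \boldsymbol{\lambda})}{\partial \mathbf{W}_{\mathrm{SC}}^{(s)}} = \mathbf{0}, \quad s \in \Omega \tag{21}$$

i.e.,

$$\mathbf{W}_{\mathrm{BS}}^* = \big(\boldsymbol{\Phi}_{\mathrm{BS}} + \lambda_0 \mathbf{I}_{N_{\mathrm{BS}}}\big)^{-1} \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \tag{22a}$$

$$\mathbf{W}_{\mathrm{SC}}^{(s)*} = \big(\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)} + \lambda_s \mathbf{I}_{N_{\mathrm{SC}}}\big)^{-1} \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H \tag{22b}$$

where $\boldsymbol{\Phi}_{\mathrm{BS}}$ and $\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)}$ are given in (23). To obtain non-negative multipliers $\lambda_0$ and $\lambda_s$ ($s \in \Omega$) in the above equations, we substitute (22) into (20) and write $L(\boldsymbol{\lambda}) = L(\mathbf{W}_{\mathrm{BS}}^*, \mathbf{W}_{\mathrm{SC}}^*, \boldsymbol{\lambda})$. From the complementarity equalities in the Karush-Kuhn-Tucker (KKT) conditions for (19), namely

$$\lambda_0 \big(\mathrm{tr}\big\{\mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}}\big\} - 1\big) = 0 \tag{24a}$$

$$\lambda_s \Big[\mathrm{tr}\big\{\big(\mathbf{W}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{W}_{\mathrm{SC}}^{(s)}\big\} - 1\Big] = 0, \quad s \in \Omega \tag{24b}$$

we see that each optimal Lagrange multiplier is either positive, in which case the corresponding equality constraint in (9b) or (9c) holds, or zero, in which case the corresponding constraint in (13b) or (13c) may hold with strict inequality. Recalling that the equality constraints in (9b) and (9c) are relaxed to convex inequalities, we first assume that all the multipliers are greater than zero so that the constraints (13b) and (13c) hold with equality. This is the same as stating that taking the partial derivatives of $L$ w.r.t. $\lambda_0$ and $\lambda_s$ ($s \in \Omega$) yields zero values.

Given that $\boldsymbol{\Phi}_{\mathrm{BS}}$ can be factorized in the form $\mathbf{S}_{\mathrm{BS}}^H \mathbf{D}_{\mathrm{BS}} \mathbf{S}_{\mathrm{BS}}$, where $\mathbf{S}_{\mathrm{BS}}^H \mathbf{S}_{\mathrm{BS}} = \mathbf{I}_{N_{\mathrm{BS}}}$ and $\mathbf{D}_{\mathrm{BS}} = \mathrm{diag}\big\{d_{\mathrm{BS}}^{(1)}, d_{\mathrm{BS}}^{(2)}, \ldots, d_{\mathrm{BS}}^{(N_{\mathrm{BS}})}\big\}$, and that each $\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)}$ can be expressed as $\big(\mathbf{S}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{D}_{\mathrm{SC}}^{(s)} \mathbf{S}_{\mathrm{SC}}^{(s)}$, with $\big(\mathbf{S}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{S}_{\mathrm{SC}}^{(s)} = \mathbf{I}_{N_{\mathrm{SC}}}$ and $\mathbf{D}_{\mathrm{SC}}^{(s)} = \mathrm{diag}\big\{d_{\mathrm{SC}}^{(s,1)}, d_{\mathrm{SC}}^{(s,2)}, \ldots, d_{\mathrm{SC}}^{(s,N_{\mathrm{SC}})}\big\}$, the Lagrangian $L(\boldsymbol{\lambda})$ can be simplified to an explicit expression in terms of $\lambda_0, \lambda_1, \ldots, \lambda_S$, see (51) in Appendix VIII. Differentiating $L(\boldsymbol{\lambda})$ in (51) w.r.t. $\lambda_0$ and $\lambda_s$ ($s \in \Omega$) and setting the results to zero yield

$$\frac{\partial L(\boldsymbol{\lambda})}{\partial \lambda_0} = \sum_{n=1}^{N_{\mathrm{BS}}} \frac{a_{\mathrm{BS}}^{(n)}}{\big(d_{\mathrm{BS}}^{(n)} + \lambda_0\big)^2} - 1 \triangleq \chi_0(\lambda_0) = 0 \tag{25a}$$

$$\frac{\partial L(\boldsymbol{\lambda})}{\partial \lambda_s} = \sum_{n=1}^{N_{\mathrm{SC}}} \frac{a_{\mathrm{SC}}^{(s,n)}}{\big(d_{\mathrm{SC}}^{(s,n)} + \lambda_s\big)^2} - 1 \triangleq \chi_s(\lambda_s) = 0, \quad s \in \Omega \tag{25b}$$

where $\mathbf{A}_{\mathrm{BS}} = \mathbf{S}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H \mathbf{S}_{\mathrm{BS}}$ is defined with its $(n, n)$-th entry denoted as $a_{\mathrm{BS}}^{(n)}$, and $\mathbf{A}_{\mathrm{SC}}^{(s)} = \big(\mathbf{S}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \big(\mathbf{G}_{\mathrm{S-S}}^{(s,s)}\big)^H \mathbf{S}_{\mathrm{SC}}^{(s)}$ is defined with its $(n, n)$-th entry denoted as $a_{\mathrm{SC}}^{(s,n)}$. Based on the equations in (25), we propose a bisection search algorithm to compute the numerical values of the optimal Lagrange multipliers $\lambda_s$ ($s \in \{0, \Omega\}$). The reader is referred to Algorithm 1 for a step-by-step description of the search method.

By substituting the optimal $\lambda_s^*$ ($s \in \{0, \Omega\}$) obtained into (22), the optimal $\mathbf{W}_{\mathrm{BS}}^*$ and $\mathbf{W}_{\mathrm{SC}}^{(s)*}$ ($s \in \Omega$) can be calculated, where $X_2$ is assumed to be fixed. As the alternating convex minimization continues, the objective function in (13a) decreases monotonically; since the objective is non-negative and hence bounded from below, the sequence of objective values converges. In practice, the alternating minimization is run a sufficient number of times so as to reach a steady state and hence a practically optimal design. The reader is referred to Algorithm 2 for a step-by-step summary of the proposed method.

B. Unconstrained Alternating Optimization with Normalization (UAON)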
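As will be demonstrated later in Section VI, the RAO algorithm described above offers superior performance, but at the cost of considerable complexity. Below we present an alternative solution for the sum-MSE problem based on unconstrained alternating convex optimization combined with a simple normalization step. More precisely, the equality constraints in (9b) and (9c) are dropped during the minimization and are then satisfied by normalizing the matrices $\bar{\mathbf{W}}_{\mathrm{BS}}$ and $\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$) obtained by minimizing the objective function without constraints. In this way, the precoders can be obtained quickly with reduced complexity relative to that of the RAO algorithm. The technical details of this approach are given below, where the matrices $\boldsymbol{\Phi}_{\mathrm{BS}}$ and $\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)}$, also used in (22), are defined as

$$\boldsymbol{\Phi}_{\mathrm{BS}} = \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H + \sum_{s=1}^{S} \mathbf{G}_{\mathrm{B-S}}^{(s)} \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \big(\mathbf{G}_{\mathrm{B-S}}^{(s)}\big)^H \tag{23a}$$

$$\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)} = \mathbf{G}_{\mathrm{S-M}}^{(s)} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \big(\mathbf{G}_{\mathrm{S-M}}^{(s)}\big)^H + \sum_{t=1}^{S} \mathbf{G}_{\mathrm{S-S}}^{(s,t)} \big(\mathbf{R}_{\mathrm{SC}}^{(t)}\big)^H \mathbf{R}_{\mathrm{SC}}^{(t)} \big(\mathbf{G}_{\mathrm{S-S}}^{(s,t)}\big)^H, \quad s \in \Omega. \tag{23b}$$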
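To make the complexity difference concrete, the following minimal sketch contrasts the two precoder updates for a single transmitter: the RAO-style update, which searches for the multiplier by bisection on $\chi(\lambda)$ as in (25) and then applies the structure of (22), and the UAON-style update, which solves the unconstrained problem and rescales to unit power. The random matrices standing in for $\boldsymbol{\Phi}$ and $\mathbf{G}\mathbf{R}^H$, the sizes, the bracket-doubling step and the fixed number of halvings are assumptions of this sketch; Algorithm 1 in the paper stops on a tolerance instead.

```python
import numpy as np

def power_gap(lam, a, d):
    """chi(lam) = sum_n a_n / (d_n + lam)^2 - 1, cf. (25)."""
    return float(np.sum(a / (d + lam) ** 2) - 1.0)

def bisect_multiplier(a, d, n_halvings=100):
    """Return lam >= 0 with chi(lam) = 0, or 0 if the power constraint is inactive."""
    if power_gap(0.0, a, d) <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while power_gap(hi, a, d) > 0.0:     # chi is decreasing: grow the bracket
        hi *= 2.0
    for _ in range(n_halvings):
        mid = 0.5 * (lo + hi)
        if power_gap(mid, a, d) >= 0.0:
            lo = mid                     # power still too high: raise lower bound
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power(W):
    return np.trace(W.conj().T @ W).real

rng = np.random.default_rng(2)
N, m = 4, 2                              # illustrative sizes (assumption)
C = 0.3 * (rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m)))  # stands in for G R^H
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Phi = C @ C.conj().T + 0.05 * B @ B.conj().T     # Hermitian positive definite stand-in

d, V = np.linalg.eigh(Phi)                       # Phi = V diag(d) V^H
a = np.real(np.diag(V.conj().T @ C @ C.conj().T @ V))

lam = bisect_multiplier(a, d)
W_rao = np.linalg.solve(Phi + lam * np.eye(N), C)    # structure of (22)

W_bar = np.linalg.solve(Phi, C)                      # unconstrained minimizer
W_uaon = W_bar / np.sqrt(power(W_bar))               # rescaled to unit power
```

Both updates return a precoder that satisfies the power budget; the RAO-style update needs an eigendecomposition and a scalar search per transmitter in every iteration, whereas the UAON-style update needs only one linear solve and a rescaling. Depending on the random draw the constraint may be inactive, in which case `lam` is zero.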
1) With $X_1$ fixed: The optimal $\mathbf{R}_{\mathrm{BS}}^*$ and $\mathbf{R}_{\mathrm{SC}}^{(s)*}$ ($s \in \Omega$) can be acquired in the same way as in the constrained alternating optimization (RAO), which results in (18).
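2) With $X_2$ fixed: Given $\mathbf{R}_{\mathrm{BS}}$ and $\mathbf{R}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$), the optimization problem becomes

$$\min_{\mathbf{W}_{\mathrm{BS}},\, \mathbf{W}_{\mathrm{SC}}} \; f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}) \tag{26}$$

where no constraints are imposed. Consequently, the global minimizers $\bar{\mathbf{W}}_{\mathrm{BS}}$ and $\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$) are obtained by solving [18]

$$\frac{\partial f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}})}{\partial \mathbf{W}_{\mathrm{BS}}} = \mathbf{0}, \qquad \frac{\partial f(\mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}})}{\partial \mathbf{W}_{\mathrm{SC}}^{(s)}} = \mathbf{0}, \quad s \in \Omega \tag{27}$$

which yield

$$\bar{\mathbf{W}}_{\mathrm{BS}} = \boldsymbol{\Phi}_{\mathrm{BS}}^{-1} \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \tag{28a}$$

$$\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)} = \big(\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)}\big)^{-1} \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \big(\mathbf{R}_{\mathrm{SC}}^{(s)}\big)^H, \quad s \in \Omega. \tag{28b}$$

Then, the normalized solutions are expressed as

$$\mathbf{W}_{\mathrm{BS}}^* = \frac{\bar{\mathbf{W}}_{\mathrm{BS}}}{\sqrt{\mathrm{tr}\big\{\bar{\mathbf{W}}_{\mathrm{BS}}^H \bar{\mathbf{W}}_{\mathrm{BS}}\big\}}} \tag{29a}$$

$$\mathbf{W}_{\mathrm{SC}}^{(s)*} = \frac{\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)}}{\sqrt{\mathrm{tr}\big\{\big(\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)}\big)^H \bar{\mathbf{W}}_{\mathrm{SC}}^{(s)}\big\}}}, \quad s \in \Omega. \tag{29b}$$

The reader is referred to Algorithm 3 for a step-by-step summary of the proposed method. We remark that although UAON is much simpler than RAO, the precoder based on UAON still requires knowledge of the channels from the nodes to both MUEs and SUEs.

Algorithm 1: Bisection search algorithm
Decide search region: Calculate $\chi_s(0)$ and decide the search region. If $\chi_s(0) > 0$, find a $\tilde{\lambda}_s$ satisfying $\chi_s(\tilde{\lambda}_s) \le 0$ and then go to the initialization step. Otherwise, output $\lambda_s^* = 0$ as the solution.
Initialize: Set $\lambda_{s,\min} = 0$, $\lambda_{s,\max} = \tilde{\lambda}_s$ and a tolerance $\varepsilon$.
Repeat:
1) Set $\lambda_s = (\lambda_{s,\min} + \lambda_{s,\max})/2$;
2) Calculate $\chi_s(\lambda_s)$;
3) Update the search region: If $\chi_s(\lambda_s) \ge 0$, set the lower bound to $\lambda_{s,\min} = \lambda_s$. If $\chi_s(\lambda_s) < 0$, set the upper bound to $\lambda_{s,\max} = \lambda_s$.
Until: $\chi_s(\lambda_{s,\min}) - \chi_s(\lambda_{s,\max}) < \varepsilon$ (search error is less than the tolerance).
Output: Output $\lambda_s^* = (\lambda_{s,\min} + \lambda_{s,\max})/2$ as the solution.

Algorithm 2: RAO
Initialize: Input initial $\mathbf{R}_{\mathrm{BS}}^{(0)}$, $\mathbf{R}_{\mathrm{SC}}^{(s)(0)}$ ($s \in \Omega$) and a maximum number of iterations $N_{\mathrm{iter}}$. Set $k = 1$.
Repeat:
1) Calculate the optimal $\lambda_s^*$ ($s \in \{0, \Omega\}$) in (25) by Algorithm 1;
2) Calculate $\mathbf{W}_{\mathrm{BS}}^{(k)}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) using (22);
3) Calculate the optimal $\mathbf{R}_{\mathrm{BS}}^{(k)}$ and $\mathbf{R}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) by substituting the $\mathbf{W}_{\mathrm{BS}}^{(k)}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) obtained in step 2) into (18);
4) Set $k = k + 1$.
Until: $k > N_{\mathrm{iter}}$.
Output: Output $\mathbf{R}_{\mathrm{BS}}^{(N_{\mathrm{iter}})}$, $\mathbf{R}_{\mathrm{SC}}^{(s)(N_{\mathrm{iter}})}$ ($s \in \Omega$), $\mathbf{W}_{\mathrm{BS}}^{(N_{\mathrm{iter}})}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(N_{\mathrm{iter}})}$ ($s \in \Omega$) as the solution.

Algorithm 3: UAON
Initialize: Set initial $\mathbf{R}_{\mathrm{BS}}^{(0)}$, $\mathbf{R}_{\mathrm{SC}}^{(s)(0)}$ ($s \in \Omega$) and a maximum number of iterations $N_{\mathrm{iter}}$. Set $k = 1$.
Repeat:
1) Calculate $\bar{\mathbf{W}}_{\mathrm{BS}}^{(k)}$ and $\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) using (28), and normalize them using (29) to obtain $\mathbf{W}_{\mathrm{BS}}^{(k)}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$);
2) Calculate the optimal $\mathbf{R}_{\mathrm{BS}}^{(k)}$ and $\mathbf{R}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) by substituting the $\mathbf{W}_{\mathrm{BS}}^{(k)}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(k)}$ ($s \in \Omega$) obtained in step 1) into (18);
3) Set $k = k + 1$.
Until: $k > N_{\mathrm{iter}}$.
Output: Output $\mathbf{R}_{\mathrm{BS}}^{(N_{\mathrm{iter}})}$, $\mathbf{R}_{\mathrm{SC}}^{(s)(N_{\mathrm{iter}})}$ ($s \in \Omega$), $\mathbf{W}_{\mathrm{BS}}^{(N_{\mathrm{iter}})}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)(N_{\mathrm{iter}})}$ ($s \in \Omega$) as the solution.

IV. SEPARATE MSE MINIMIZATION BASED TWO-LEVEL PRECODING IN HETNET

In Section III, the precoders at the BS and all the SCs are jointly designed by minimizing the sum-MSE. However, due to the non-convexity of the objective functions and constraints, no non-iterative algorithms are available for the precoder designs and intensive computation is required. In this section, a simplified solution procedure is derived based on separate MSE minimization, where block diagonalization techniques act as the first-level precoders and the second-level precoders at each node are designed separately. As shown in what follows, the separate treatment of individual precoders leads to a non-iterative algorithm.

A. MSE Minimization at the BS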
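In order to determine the precoding matrix $\mathbf{W}_{\mathrm{BS}}$ at the BS, the signal and interference associated with the BS are taken into account in a way similar to [13]. This leads to

$$\min_{\mathbf{W}_{\mathrm{BS}},\,\mathbf{R}_{\mathrm{BS}}} \; \mathrm{E}\left\{ \left\| \hat{\mathbf{x}}_{\mathrm{BS}} - \mathbf{x}_{\mathrm{BS}} \right\|^2 \right\} \tag{30a}$$
$$\text{subject to} \quad \left\| \mathbf{G}_{\mathrm{B-S}}^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} \right\|^2 \le \gamma_{\mathrm{BS}} \tag{30b}$$
$$\mathrm{tr}\left\{ \mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}} \right\} \le 1 \tag{30c}$$

where $\gamma_{\mathrm{BS}} > 0$ is a threshold parameter set to control the relative interference involved. The term in the objective (30a) is the sum of squared errors seen by the MUEs when no interference is included, i.e., $\mathbf{y}_{\mathrm{BS}} = \mathbf{G}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} \mathbf{x}_{\mathrm{BS}} + \mathbf{n}_{\mathrm{BS}}$, and the term in (30b) is the sum of squares of the interference seen by the SUEs. By tuning $\gamma_{\mathrm{BS}}$, the BS trades off the beamforming gains for its target MUEs against interference reduction to the neighboring SUEs. We stress that the objective function is not jointly convex with respect to all design variables, but that the induced interference constraint in (30b) and the average power constraint in (30c) are convex. Similar to the RAO proposed in Section III, an iterative algorithm could be used to solve (30). To obtain a non-iterative algorithm, we further simplify (30) by employing the BD technique at the BS side as a first-level precoder. Thereby, all the inter-MUE interferences are eliminated and each MUE perceives an interference-free MIMO channel, which means that the problem in (30) can be divided into $K$ independent sub-problems of the form

$$\min_{\mathbf{W}_{\mathrm{BS}}^{(i)},\,\mathbf{R}_{\mathrm{BS}}^{(i)}} \; \mathrm{E}\left\{ \left\| \hat{\mathbf{x}}_{\mathrm{BS}}^{(i)} - \mathbf{x}_{\mathrm{BS}}^{(i)} \right\|^2 \right\} \tag{31a}$$
$$\text{subject to} \quad \left\| \mathbf{G}_{\mathrm{B-S}}^H \mathbf{W}_{\mathrm{BS}}^{(i)} \mathbf{x}_{\mathrm{BS}}^{(i)} \right\|^2 \le \gamma_{\mathrm{BS}}^{(i)} \tag{31b}$$
$$\mathrm{tr}\left\{ \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \right\} \le \alpha_{\mathrm{BS}}^{(i)} \tag{31c}$$
$$\left( \bar{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} = \mathbf{0} \tag{31d}$$

where $\bar{\mathbf{G}}_{\mathrm{B-M}}^{(i)} = \left[ \mathbf{G}_{\mathrm{B-M}}^{(1)}, \ldots, \mathbf{G}_{\mathrm{B-M}}^{(i-1)}, \mathbf{G}_{\mathrm{B-M}}^{(i+1)}, \ldots, \mathbf{G}_{\mathrm{B-M}}^{(K)} \right]$, $i \in \mathcal{I}$, $\sum_{i=1}^{K} \gamma_{\mathrm{BS}}^{(i)} = \gamma_{\mathrm{BS}}$ and $\sum_{i=1}^{K} \alpha_{\mathrm{BS}}^{(i)} = 1$ with $\gamma_{\mathrm{BS}}^{(i)} > 0$ and $\alpha_{\mathrm{BS}}^{(i)} > 0$. Here, the BD constraint of (31d) is imposed to eliminate all inter-MUE interferences. By applying the SVD, we have

$$\left( \bar{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H = \mathbf{U}_{\mathrm{BS}}^{(i)} \mathbf{Z}_{\mathrm{BS}}^{(i)} \left[ \mathbf{V}^{(i,1)} \;\; \mathbf{V}^{(i,0)} \right]^H,$$

where $\mathbf{Z}_{\mathrm{BS}}^{(i)}$ is the diagonal matrix with non-negative singular values as its diagonal elements, $\mathbf{V}^{(i,1)}$ contains the singular vectors corresponding to the nonzero singular values and $\mathbf{V}^{(i,0)}$ consists of vectors corresponding to the zero singular values. Hence, $\mathbf{V}^{(i,0)}$ is an orthonormal basis for the null space of $\left( \bar{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H$. For simplicity, we suppose that $\mathbf{W}_{\mathrm{BS}}^{(i)} = \mathbf{W}_{\mathrm{BS},1}^{(i)} \mathbf{W}_{\mathrm{BS},2}^{(i)}$ for all $i \in \mathcal{I}$, with $\mathbf{W}_{\mathrm{BS},1}^{(i)} = \mathbf{V}^{(i,0)}$ to satisfy the BD constraint of (31d). In this way, we transform our focus from the design of $\mathbf{W}_{\mathrm{BS}}^{(i)}$ to that of $\mathbf{W}_{\mathrm{BS},2}^{(i)}$.

Similar to the RAO algorithm, suppose that $\mathbf{W}_{\mathrm{BS}}^{(i)}$ are fixed; then the optimal $\mathbf{R}_{\mathrm{BS}}^{(i)}$ ($i \in \mathcal{I}$) can be expressed as (32). Substituting (32) into the objective function of (31a), we obtain (33), which means that minimizing the MSE is equivalent to maximizing the term $\left\| \left( \mathbf{G}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \right\|^2$. Since the transmission symbols satisfy $\mathrm{E}\{\mathbf{x}\} = \mathbf{0}$ and $\mathrm{E}\{\mathbf{x}\mathbf{x}^H\} = \mathbf{I}$, the expected value of the left-hand side of (31b) equals $\mathrm{tr}\{ \mathbf{Q}_{\mathrm{BS}}^{(i)} \}$, where $\mathbf{Q}_{\mathrm{BS}}^{(i)} = \mathbf{B}_{\mathrm{BS}}^{(i)} \mathbf{W}_{\mathrm{BS},2}^{(i)} \left( \mathbf{W}_{\mathrm{BS},2}^{(i)} \right)^H \mathbf{B}_{\mathrm{BS}}^{(i)}$ with $\mathbf{B}_{\mathrm{BS}}^{(i)} = \left[ \tilde{\mathbf{G}}_{\mathrm{B-S}}^{(i)} \left( \tilde{\mathbf{G}}_{\mathrm{B-S}}^{(i)} \right)^H \right]^{1/2}$, and $\tilde{\mathbf{G}}_{\mathrm{B-S}}^{(i)} \triangleq \left( \mathbf{W}_{\mathrm{BS},1}^{(i)} \right)^H \mathbf{G}_{\mathrm{B-S}}$ denotes the equivalent channel matrix. Similarly, with $\tilde{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \triangleq \left( \mathbf{W}_{\mathrm{BS},1}^{(i)} \right)^H \mathbf{G}_{\mathrm{B-M}}^{(i)}$, the objective function becomes (34), where the decomposition $\left( \tilde{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-1} = \mathbf{P}_{\mathrm{BS}}^{(i)} \boldsymbol{\Sigma}_{\mathrm{BS}}^{(i)} \left( \mathbf{T}_{\mathrm{BS}}^{(i)} \right)^H$ with $\boldsymbol{\Sigma}_{\mathrm{BS}}^{(i)} = \mathrm{diag}\left\{ \sigma_{\mathrm{BS}}^{(i,1)}, \ldots, \sigma_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right\}$ is obtained by SVD in order to further simplify the problem.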
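The BD step and the receive filter described above can be checked numerically. The following is a minimal sketch rather than the authors' implementation: it assumes NumPy, channels stored as $N_{\mathrm{BS}} \times N_{\mathrm{UE}}$ complex arrays, at least two MUEs, and hypothetical function names. It builds a null-space basis satisfying (31d), forms the receiver of (32), and compares a directly evaluated MSE with the closed form (33).

```python
import numpy as np


def bd_null_basis(G_list, i, tol=1e-10):
    """Orthonormal basis V0 with (G_bar_i)^H V0 = 0, cf. (31d).

    G_list[k] is the N_BS x N_UE channel of MUE k; G_bar_i stacks all k != i.
    """
    G_bar = np.hstack([G for k, G in enumerate(G_list) if k != i])
    # Right singular vectors of G_bar^H beyond its rank span its null space.
    _, s, Vh = np.linalg.svd(G_bar.conj().T)
    rank = int(np.sum(s > tol * s[0]))
    return Vh[rank:].conj().T


def mmse_receiver(G_i, W_i, sigma2):
    """Receive filter of (32) for the link y = G_i^H W_i x + n."""
    H = G_i.conj().T @ W_i  # effective N_UE x N_S channel
    cov = H @ H.conj().T + sigma2 * np.eye(H.shape[0])
    return H.conj().T @ np.linalg.inv(cov)


def mse_direct(G_i, W_i, R_i, sigma2):
    """E||R y - x||^2 for unit-power uncorrelated symbols and white noise."""
    H = G_i.conj().T @ W_i
    E = R_i @ H - np.eye(H.shape[1])
    total = np.trace(E @ E.conj().T) + sigma2 * np.trace(R_i @ R_i.conj().T)
    return float(np.real(total))


def mse_closed_form(G_i, W_i, sigma2):
    """Closed form (33); requires G_i^H W_i W_i^H G_i to be invertible."""
    H = G_i.conj().T @ W_i
    M = H @ H.conj().T
    inner = sigma2 * np.linalg.inv(M) + np.eye(M.shape[0])
    return float(H.shape[1] - np.real(np.trace(np.linalg.inv(inner))))
```

With a precoder of the form `W_i = V0 @ W2`, where `V0 = bd_null_basis(G_list, i)` and `W2` is any second-level precoder, the product `G_list[k].conj().T @ W_i` vanishes for every `k != i`, which is the interference-free property the derivation relies on.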
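$$\mathbf{R}_{\mathrm{BS}}^{(i)*} = \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \mathbf{G}_{\mathrm{B-M}}^{(i)} \left[ \left( \mathbf{G}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \mathbf{G}_{\mathrm{B-M}}^{(i)} + \sigma^2 \mathbf{I}_{N_{\mathrm{UE}}} \right]^{-1} \tag{32}$$

$$\mathrm{MSE}_{\mathrm{BS}}^{(i)} = N_S - \mathrm{tr}\left[ \left( \sigma^2 \left( \left( \mathbf{G}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \mathbf{G}_{\mathrm{B-M}}^{(i)} \right)^{-1} + \mathbf{I}_{N_{\mathrm{UE}}} \right)^{-1} \right] \tag{33}$$

$$\left\| \left( \mathbf{G}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \right\|^2 = \mathrm{tr}\left\{ \left( \tilde{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-1} \mathbf{Q}_{\mathrm{BS}}^{(i)} \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-1} \tilde{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right\} = \mathrm{tr}\left\{ \mathbf{P}_{\mathrm{BS}}^{(i)} \boldsymbol{\Sigma}_{\mathrm{BS}}^{(i)} \left( \mathbf{T}_{\mathrm{BS}}^{(i)} \right)^H \mathbf{Q}_{\mathrm{BS}}^{(i)} \mathbf{T}_{\mathrm{BS}}^{(i)} \left( \boldsymbol{\Sigma}_{\mathrm{BS}}^{(i)} \right)^H \left( \mathbf{P}_{\mathrm{BS}}^{(i)} \right)^H \right\} \tag{34}$$

Using Hadamard's inequality (see, e.g., [20]), the optimal solution for maximizing (34) is obtained as $\mathbf{Q}_{\mathrm{BS}}^{(i)*} = \mathbf{T}_{\mathrm{BS}}^{(i)} \boldsymbol{\Lambda}_{\mathrm{BS}}^{(i)} \left( \mathbf{T}_{\mathrm{BS}}^{(i)} \right)^H$, where $\boldsymbol{\Lambda}_{\mathrm{BS}}^{(i)} = \mathrm{diag}\left\{ \lambda_{\mathrm{BS}}^{(i,1)}, \ldots, \lambda_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right\}$ with $\lambda_{\mathrm{BS}}^{(i,n)}$ ($n = 1, \ldots, N_{\mathrm{UE}}$) being the only parameters to be determined. Thus, the objective function in (31) can be transformed into $\sum_{n=1}^{N_{\mathrm{UE}}} \left( \sigma_{\mathrm{BS}}^{(i,n)} \right)^2 \lambda_{\mathrm{BS}}^{(i,n)}$, the constraint in (31b) is equivalent to $\sum_{n=1}^{N_{\mathrm{UE}}} \lambda_{\mathrm{BS}}^{(i,n)} \le \gamma_{\mathrm{BS}}^{(i)}$, and (31c) is equivalent to $\mathrm{tr}\left\{ \boldsymbol{\Lambda}_{\mathrm{BS}}^{(i)} \mathbf{X}_{\mathrm{BS}}^{(i)} \right\} = \sum_{n=1}^{N_{\mathrm{UE}}} x_{\mathrm{BS}}^{(i,n)} \lambda_{\mathrm{BS}}^{(i,n)} \le \alpha_{\mathrm{BS}}^{(i)}$ with $\mathbf{X}_{\mathrm{BS}}^{(i)} \triangleq \left( \mathbf{T}_{\mathrm{BS}}^{(i)} \right)^H \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-2} \mathbf{T}_{\mathrm{BS}}^{(i)}$, where $x_{\mathrm{BS}}^{(i,n)}$ denotes the $(n,n)$-th element of $\mathbf{X}_{\mathrm{BS}}^{(i)}$. In this way, the optimization problem (31) can be formulated as

$$\min_{\boldsymbol{\lambda}_{\mathrm{BS}}^{(i)}} \; \left( \mathbf{c}_{\mathrm{BS}}^{(i)} \right)^T \boldsymbol{\lambda}_{\mathrm{BS}}^{(i)} \tag{35a}$$
$$\text{subject to} \quad \mathbf{e}^T \boldsymbol{\lambda}_{\mathrm{BS}}^{(i)} \le \gamma_{\mathrm{BS}}^{(i)} \tag{35b}$$
$$\left( \mathbf{x}_{\mathrm{BS}}^{(i)} \right)^T \boldsymbol{\lambda}_{\mathrm{BS}}^{(i)} \le \alpha_{\mathrm{BS}}^{(i)} \tag{35c}$$
$$-\boldsymbol{\lambda}_{\mathrm{BS}}^{(i)} \le \mathbf{0} \tag{35d}$$

where $\boldsymbol{\lambda}_{\mathrm{BS}}^{(i)} = \left[ \lambda_{\mathrm{BS}}^{(i,1)}, \ldots, \lambda_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right]^T$, $\mathbf{c}_{\mathrm{BS}}^{(i)} = \left[ -\left( \sigma_{\mathrm{BS}}^{(i,1)} \right)^2, \ldots, -\left( \sigma_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right)^2 \right]^T$ and $\mathbf{x}_{\mathrm{BS}}^{(i)} = \left[ x_{\mathrm{BS}}^{(i,1)}, \ldots, x_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right]^T$. Notably, (35) is a standard linear programming (LP) problem and can easily be solved by CVX. Upon obtaining the optimal $\boldsymbol{\lambda}_{\mathrm{BS}}^{(i)*}$, the optimal precoder at the BS is found to be

$$\mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H = \mathbf{V}^{(i,0)} \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-1} \mathbf{T}_{\mathrm{BS}}^{(i)} \boldsymbol{\Lambda}_{\mathrm{BS}}^{(i)*} \left( \mathbf{T}_{\mathrm{BS}}^{(i)} \right)^H \left( \mathbf{B}_{\mathrm{BS}}^{(i)} \right)^{-1} \left( \mathbf{V}^{(i,0)} \right)^H \tag{36}$$

from which the optimal $\mathbf{W}_{\mathrm{BS}}$ can be obtained by SVD.

From the above solution procedure, it is clear that constructing an optimal precoder at the BS only requires knowledge of the channels from the BS to both MUEs and SUEs, a less stringent requirement relative to the sum-MSE minimization based precoding scheme.

B. MSE Minimization at each SC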
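Similarly, the design of the precoding vector $\mathbf{W}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$) can be handled by solving the LP problem for each SUE, given by

$$\min_{\boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)}} \; \left( \mathbf{c}_{\mathrm{SC}}^{(s,j)} \right)^T \boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)} \tag{37a}$$
$$\text{subject to} \quad \mathbf{e}^T \boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)} \le \gamma_{\mathrm{SC}}^{(s,j)} \tag{37b}$$
$$\left( \mathbf{x}_{\mathrm{SC}}^{(s,j)} \right)^T \boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)} \le \alpha_{\mathrm{SC}}^{(s,j)} \tag{37c}$$
$$-\boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)} \le \mathbf{0} \tag{37d}$$

where $j \in \mathcal{J}_s$, $\boldsymbol{\lambda}_{\mathrm{SC}}^{(s,j)} = \left[ \lambda_{\mathrm{SC}}^{(s,j,1)}, \ldots, \lambda_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right]^T$, $\mathbf{c}_{\mathrm{SC}}^{(s,j)} = \left[ -\left( \sigma_{\mathrm{SC}}^{(s,j,1)} \right)^2, \ldots, -\left( \sigma_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right)^2 \right]^T$ and $\mathbf{x}_{\mathrm{SC}}^{(s,j)} = \left[ x_{\mathrm{SC}}^{(s,j,1)}, \ldots, x_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right]^T$. Here, the vector elements are calculated based on the definitions and derivations in Subsection IV-A.

Like the precoder at the BS, only knowledge of the channels from the $s$-th SC to both MUEs and SUEs is required to construct the optimal precoder at the $s$-th SC.

V. ROBUST PRECODING DESIGN WITH IMPERFECT CSI IN HETNET

The above precoding designs require perfect CSI, which is often not practical due to channel estimation error, feedback error and quantization error. In this section, we propose more practical precoders for the HetNet with imperfect CSI known at each node.

Assume that the CSI errors of all links are stochastic and modeled as $\hat{\mathbf{G}}_* = \mathbf{G}_* + \boldsymbol{\Xi}_*$, where $* \in \{\mathrm{B-M}, \mathrm{B-S}, \mathrm{S-M}, \mathrm{S-S}\}$, $\hat{\mathbf{G}}_*$ is the estimated channel matrix, and $\boldsymbol{\Xi}_*$ denotes the channel estimation error matrix, which is assumed to be Gaussian distributed with $\mathrm{E}\{\boldsymbol{\Xi}_*\} = \mathbf{0}$ and $\mathrm{E}\left\{ \mathrm{vec}(\boldsymbol{\Xi}_*) \mathrm{vec}(\boldsymbol{\Xi}_*)^H \right\} = \sigma_h^2 \mathbf{I}$.

A. Robust RAO With Imperfect CSI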
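With imperfect CSI known at each node, the RAO problem becomes

$$\min_{\substack{\mathbf{W}_{\mathrm{BS}},\, \mathbf{W}_{\mathrm{SC}}^{(t)}, \\ \mathbf{R}_{\mathrm{BS}},\, \mathbf{R}_{\mathrm{SC}}^{(t)},\; t \in \Omega}} \; f\left( \mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}} \right) \Big|_{\hat{\mathbf{G}}_*} \tag{38a}$$
$$\text{subject to} \quad (13\mathrm{b}),\ (13\mathrm{c}),\ (13\mathrm{d}) \tag{38b}$$

where

$$f\left( \mathbf{W}_{\mathrm{BS}}, \mathbf{W}_{\mathrm{SC}}, \mathbf{R}_{\mathrm{BS}}, \mathbf{R}_{\mathrm{SC}} \right) \Big|_{\hat{\mathbf{G}}_*} = \widehat{\mathrm{MSE}}_{\mathrm{BS}} + \widehat{\mathrm{MSE}}_{\mathrm{SC}} \tag{39}$$

with

$$\widehat{\mathrm{MSE}}_{\mathrm{BS}} \triangleq \mathrm{E}\left\{ \left\| \hat{\mathbf{x}}_{\mathrm{BS}} - \mathbf{x}_{\mathrm{BS}} \right\|^2 \Big| \hat{\mathbf{G}}_* \right\} = \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{BS}} \Big[ \hat{\mathbf{G}}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \hat{\mathbf{G}}_{\mathrm{B-M}} + \sum_{s=1}^{S} \left( \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \left( \mathbf{W}_{\mathrm{SC}}^{(s)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s)} \Big] \mathbf{R}_{\mathrm{BS}}^H - \mathbf{R}_{\mathrm{BS}} \hat{\mathbf{G}}_{\mathrm{B-M}}^H \mathbf{W}_{\mathrm{BS}} - \mathbf{W}_{\mathrm{BS}}^H \hat{\mathbf{G}}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H + \mathbf{I}_{K N_S} + \sigma^2 \mathbf{R}_{\mathrm{BS}} \mathbf{R}_{\mathrm{BS}}^H \Big\} + \sigma_h^2 \bar{\omega}\, \mathrm{tr}\left\{ \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \right\} \tag{40}$$

$$\widehat{\mathrm{MSE}}_{\mathrm{SC}} \triangleq \mathrm{E}\left\{ \sum_{s=1}^{S} \left\| \hat{\mathbf{x}}_{\mathrm{SC}}^{(s)} - \mathbf{x}_{\mathrm{SC}}^{(s)} \right\|^2 \Big| \hat{\mathbf{G}}_* \right\} = \sum_{s=1}^{S} \bigg[ \mathrm{tr}\Big\{ \mathbf{R}_{\mathrm{SC}}^{(s)} \Big[ \left( \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s)} \right)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s)} + \sum_{t=1}^{S} \left( \hat{\mathbf{G}}_{\mathrm{S-S}}^{(t,s)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \left( \mathbf{W}_{\mathrm{SC}}^{(t)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-S}}^{(t,s)} \Big] \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H - \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,s)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(s)} - \left( \mathbf{W}_{\mathrm{SC}}^{(s)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H + \mathbf{I}_{L_s N_S} + \sigma^2 \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \Big\} + \sigma_h^2 \bar{\omega}\, \mathrm{tr}\left\{ \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \right\} \bigg] \tag{41}$$

and $\bar{\omega} = \sum_{t=1}^{S} \mathrm{tr}\left\{ \left( \mathbf{W}_{\mathrm{SC}}^{(t)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \right\} + \mathrm{tr}\left\{ \mathbf{W}_{\mathrm{BS}}^H \mathbf{W}_{\mathrm{BS}} \right\}$. Thus, following the steps of RAO, the key equations can be derived from the KKT conditions for problem (38) as

$$\mathbf{R}_{\mathrm{BS}}^{(i)*} = \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \left[ \hat{\boldsymbol{\Psi}}_{\mathrm{BS}}^{(i)} + \left( \sigma^2 + \sigma_h^2 \bar{\omega} \right) \mathbf{I}_{N_{\mathrm{UE}}} \right]^{-1} \tag{42a}$$
$$\mathbf{R}_{\mathrm{SC}}^{(s,j)*} = \left( \mathbf{W}_{\mathrm{SC}}^{(s,j)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,s,j)} \left[ \hat{\boldsymbol{\Psi}}_{\mathrm{SC}}^{(s,j)} + \left( \sigma^2 + \sigma_h^2 \bar{\omega} \right) \mathbf{I}_{N_{\mathrm{UE}}} \right]^{-1} \tag{42b}$$
$$\mathbf{W}_{\mathrm{BS}}^{*} = \left[ \hat{\boldsymbol{\Phi}}_{\mathrm{BS}} + \left( \hat{\lambda} + \sigma_h^2 \bar{r} \right) \mathbf{I}_{N_{\mathrm{BS}}} \right]^{-1} \hat{\mathbf{G}}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \tag{42c}$$
$$\mathbf{W}_{\mathrm{SC}}^{(s)*} = \left[ \hat{\boldsymbol{\Phi}}_{\mathrm{SC}}^{(s)} + \left( \hat{\lambda}_s + \sigma_h^2 \bar{r} \right) \mathbf{I}_{N_{\mathrm{SC}}} \right]^{-1} \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \tag{42d}$$

with $i \in \mathcal{I}$, $j \in \mathcal{J}_s$ and $s \in \Omega$, where

$$\hat{\boldsymbol{\Psi}}_{\mathrm{BS}}^{(i)} = \left( \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} + \sum_{s=1}^{S} \left( \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s,i)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(s)} \left( \mathbf{W}_{\mathrm{SC}}^{(s)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s,i)},$$
$$\hat{\boldsymbol{\Psi}}_{\mathrm{SC}}^{(s,j)} = \left( \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s,j)} \right)^H \mathbf{W}_{\mathrm{BS}} \mathbf{W}_{\mathrm{BS}}^H \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s,j)} + \sum_{t=1}^{S} \left( \hat{\mathbf{G}}_{\mathrm{S-S}}^{(t,s,j)} \right)^H \mathbf{W}_{\mathrm{SC}}^{(t)} \left( \mathbf{W}_{\mathrm{SC}}^{(t)} \right)^H \hat{\mathbf{G}}_{\mathrm{S-S}}^{(t,s,j)},$$
$$\hat{\boldsymbol{\Phi}}_{\mathrm{BS}} = \hat{\mathbf{G}}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \hat{\mathbf{G}}_{\mathrm{B-M}}^H + \sum_{s=1}^{S} \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \hat{\mathbf{G}}_{\mathrm{B-S}}^{(s)} \right)^H,$$
$$\hat{\boldsymbol{\Phi}}_{\mathrm{SC}}^{(s)} = \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s)} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \left( \hat{\mathbf{G}}_{\mathrm{S-M}}^{(s)} \right)^H + \sum_{t=1}^{S} \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,t)} \left( \mathbf{R}_{\mathrm{SC}}^{(t)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(t)} \left( \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,t)} \right)^H,$$

and $\bar{r} = \mathrm{tr}\left\{ \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \right\} + \sum_{t=1}^{S} \mathrm{tr}\left\{ \left( \mathbf{R}_{\mathrm{SC}}^{(t)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(t)} \right\}$.

Based on (42), a robust RAO algorithm can be constructed in a way similar to that in Subsection III-A, where the RAO algorithm was developed. The optimal Lagrange multipliers in the present case satisfy

$$\frac{\partial L(\hat{\boldsymbol{\lambda}})}{\partial \hat{\lambda}} = \sum_{n=1}^{N_{\mathrm{BS}}} \frac{\hat{a}_{\mathrm{BS}}^{(n)}}{\left( \hat{d}_{\mathrm{BS}}^{(n)} + \hat{\lambda} + \sigma_h^2 \bar{r} \right)^2} - 1 \triangleq \hat{\chi}(\hat{\lambda}) = 0 \tag{43a}$$
$$\frac{\partial L(\hat{\boldsymbol{\lambda}})}{\partial \hat{\lambda}_s} = \sum_{n=1}^{N_{\mathrm{SC}}} \frac{\hat{a}_{\mathrm{SC}}^{(s,n)}}{\left( \hat{d}_{\mathrm{SC}}^{(s,n)} + \hat{\lambda}_s + \sigma_h^2 \bar{r} \right)^2} - 1 \triangleq \hat{\chi}_s(\hat{\lambda}_s) = 0, \quad s \in \Omega \tag{43b}$$

where $\hat{a}_{\mathrm{BS}}^{(n)}$, $\hat{d}_{\mathrm{BS}}^{(n)}$, $\hat{a}_{\mathrm{SC}}^{(s,n)}$ and $\hat{d}_{\mathrm{SC}}^{(s,n)}$ are defined in an entirely similar way to their counterparts in Subsection III-A. Since each left-hand side in (43) decreases monotonically in its multiplier, a bisection search is applicable to (43) to identify the optimal Lagrange multipliers.

B. Robust UAON With Imperfect CSI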
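As expected, the design of robust UAON with imperfect CSI can be carried out by steps in parallel to those of Algorithm 3. Specifically, the optimal $\mathbf{R}_{\mathrm{BS}}^{*}$ and $\mathbf{R}_{\mathrm{SC}}^{(s)*}$ ($s \in \Omega$) have the same expressions as (42a) and (42b). Consequently, the global minimizers $\mathbf{W}_{\mathrm{BS}}^{*}$ and $\mathbf{W}_{\mathrm{SC}}^{(s)*}$ ($s \in \Omega$) for robust UAON can be obtained by first computing

$$\bar{\mathbf{W}}_{\mathrm{BS}} = \left( \hat{\boldsymbol{\Phi}}_{\mathrm{BS}} + \sigma_h^2 \bar{r}\, \mathbf{I}_{N_{\mathrm{BS}}} \right)^{-1} \hat{\mathbf{G}}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \tag{44a}$$
$$\bar{\mathbf{W}}_{\mathrm{SC}}^{(s)} = \left( \hat{\boldsymbol{\Phi}}_{\mathrm{SC}}^{(s)} + \sigma_h^2 \bar{r}\, \mathbf{I}_{N_{\mathrm{SC}}} \right)^{-1} \hat{\mathbf{G}}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H, \quad s \in \Omega \tag{44b}$$

followed by a norm normalization step as in (29).

C. Robust Non-iterative Algorithm With Imperfect CSI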
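With imperfect CSI, there is a robust counterpart of the non-iterative precoding developed in Section IV based on separate MSE. To see this, note that the optimal $\mathbf{R}_{\mathrm{BS}}^{(i)}$ ($i \in \mathcal{I}$) with fixed $\mathbf{W}_{\mathrm{BS}}^{(i)}$ can be expressed as (45), where $\bar{\omega}_{\mathrm{BS}}^{(i)} = \mathrm{tr}\left\{ \mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \right\}$. By substituting (45) into the imperfect CSI based objective function, it is evident that minimizing the MSE can be transformed into maximizing the term $\left\| \left( \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \right\|^2$. In this way, the optimization problem after certain transformations can be rewritten as

$$\min_{\hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)}} \; \left( \hat{\mathbf{c}}_{\mathrm{BS}}^{(i)} \right)^T \hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)} \tag{46a}$$
$$\text{subject to} \quad \mathbf{e}^T \hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)} \le \gamma_{\mathrm{BS}}^{(i)} \tag{46b}$$
$$\left( \hat{\mathbf{x}}_{\mathrm{BS}}^{(i)} \right)^T \hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)} \le \alpha_{\mathrm{BS}}^{(i)} \tag{46c}$$

where $\hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)} = \left[ \hat{\lambda}_{\mathrm{BS}}^{(i,1)}, \ldots, \hat{\lambda}_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right]^T$, $\hat{\mathbf{c}}_{\mathrm{BS}}^{(i)} = \left[ -\left( \hat{\sigma}_{\mathrm{BS}}^{(i,1)} \right)^2, \ldots, -\left( \hat{\sigma}_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right)^2 \right]^T$ and $\hat{\mathbf{x}}_{\mathrm{BS}}^{(i)} = \left[ \hat{x}_{\mathrm{BS}}^{(i,1)}, \ldots, \hat{x}_{\mathrm{BS}}^{(i,N_{\mathrm{UE}})} \right]^T$. Notably, (46) is a standard linear programming (LP) problem and can easily be solved by CVX. Upon obtaining the optimal $\hat{\boldsymbol{\lambda}}_{\mathrm{BS}}^{(i)*}$, the optimal precoder at the BS is found to be (47), where $\hat{\mathbf{B}}_{\mathrm{BS}}^{(i)}$, $\hat{\mathbf{T}}_{\mathrm{BS}}^{(i)}$ and $\hat{\mathbf{V}}^{(i,0)}$ are obtained by replacing all involved $\mathbf{G}_*$ with $\hat{\mathbf{G}}_*$. Thus, the optimal imperfect CSI based $\mathbf{W}_{\mathrm{BS}}$ can be obtained by applying SVD to (47).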
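As a rough illustration of the LP in (46) and of its perfect-CSI counterpart (35), the sketch below solves a small instance in plain Python. It is not the CVX-based procedure referred to in the text: it simply enumerates the vertices of the feasible set, which is exact here under the assumptions that the nonnegativity constraint (35d) is also imposed and that every entry of the vector $\mathbf{x}$ is positive. The function name and the numerical values are hypothetical.

```python
from itertools import combinations


def solve_stream_lp(sigma_sq, x, gamma, alpha, tol=1e-9):
    """Maximise sum_n sigma_sq[n] * lam[n] (i.e. minimise c^T lam, c = -sigma_sq)
    subject to sum(lam) <= gamma, sum_n x[n] * lam[n] <= alpha and lam >= 0.

    Assumes every x[n] > 0. A vertex of this feasible set has at most two
    nonzero entries, so all vertices can be listed explicitly.
    """
    n = len(sigma_sq)
    candidates = [[0.0] * n]

    # Vertices with a single active stream: push lam[k] to the tighter bound.
    for k in range(n):
        lam = [0.0] * n
        lam[k] = min(gamma, alpha / x[k])
        candidates.append(lam)

    # Vertices with two active streams: both resource constraints are tight.
    for k, m in combinations(range(n), 2):
        det = x[m] - x[k]
        if abs(det) < tol:
            continue  # the two constraints are parallel on this pair
        lam = [0.0] * n
        lam[k] = (gamma * x[m] - alpha) / det
        lam[m] = (alpha - gamma * x[k]) / det
        if lam[k] >= -tol and lam[m] >= -tol:
            candidates.append(lam)

    def value(lam):
        return sum(s * v for s, v in zip(sigma_sq, lam))

    best = max(candidates, key=value)
    return best, value(best)
```

For example, calling `solve_stream_lp([4.0, 2.0], [2.0, 0.5], gamma=1.0, alpha=1.0)` yields an allocation in which both the interference budget and the power budget are tight and the weight is split across the two streams. For larger problems a general-purpose LP solver would be the more natural choice.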
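$$\mathbf{R}_{\mathrm{BS}}^{(i)*} = \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \left[ \left( \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} \right)^H \mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H \hat{\mathbf{G}}_{\mathrm{B-M}}^{(i)} + \left( \sigma^2 + \sigma_h^2 \bar{\omega}_{\mathrm{BS}}^{(i)} \right) \mathbf{I}_{N_{\mathrm{UE}}} \right]^{-1} \tag{45}$$

$$\mathbf{W}_{\mathrm{BS}}^{(i)} \left( \mathbf{W}_{\mathrm{BS}}^{(i)} \right)^H = \hat{\mathbf{V}}^{(i,0)} \left( \hat{\mathbf{B}}_{\mathrm{BS}}^{(i)} \right)^{-1} \hat{\mathbf{T}}_{\mathrm{BS}}^{(i)} \hat{\boldsymbol{\Lambda}}_{\mathrm{BS}}^{(i)*} \left( \hat{\mathbf{T}}_{\mathrm{BS}}^{(i)} \right)^H \left( \hat{\mathbf{B}}_{\mathrm{BS}}^{(i)} \right)^{-1} \left( \hat{\mathbf{V}}^{(i,0)} \right)^H \tag{47}$$

Clearly, constructing an optimal precoder at the BS only requires estimated knowledge of the channels from the BS to both MUEs and SUEs, a less stringent requirement relative to the sum-MSE minimization based precoding scheme.

Similarly, the design of the precoding vector $\mathbf{W}_{\mathrm{SC}}^{(s)}$ ($s \in \Omega$) can be handled by solving the LP problem for each SUE, given by

$$\min_{\hat{\boldsymbol{\lambda}}_{\mathrm{SC}}^{(s,j)}} \; \left( \hat{\mathbf{c}}_{\mathrm{SC}}^{(s,j)} \right)^T \hat{\boldsymbol{\lambda}}_{\mathrm{SC}}^{(s,j)} \tag{48a}$$
$$\text{subject to} \quad \mathbf{e}^T \hat{\boldsymbol{\lambda}}_{\mathrm{SC}}^{(s,j)} \le \gamma_{\mathrm{SC}}^{(s,j)} \tag{48b}$$
$$\left( \hat{\mathbf{x}}_{\mathrm{SC}}^{(s,j)} \right)^T \hat{\boldsymbol{\lambda}}_{\mathrm{SC}}^{(s,j)} \le \alpha_{\mathrm{SC}}^{(s,j)} \tag{48c}$$

where $j \in \mathcal{J}_s$, $\hat{\boldsymbol{\lambda}}_{\mathrm{SC}}^{(s,j)} = \left[ \hat{\lambda}_{\mathrm{SC}}^{(s,j,1)}, \ldots, \hat{\lambda}_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right]^T$, $\hat{\mathbf{c}}_{\mathrm{SC}}^{(s,j)} = \left[ -\left( \hat{\sigma}_{\mathrm{SC}}^{(s,j,1)} \right)^2, \ldots, -\left( \hat{\sigma}_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right)^2 \right]^T$ and $\hat{\mathbf{x}}_{\mathrm{SC}}^{(s,j)} = \left[ \hat{x}_{\mathrm{SC}}^{(s,j,1)}, \ldots, \hat{x}_{\mathrm{SC}}^{(s,j,N_{\mathrm{UE}})} \right]^T$. The components of the above vectors are calculated in a way entirely similar to that in Subsection IV-A; the details are omitted due to limited space.

Like the precoder at the BS, only estimated knowledge of the channels from the $s$-th SC to both MUEs and SUEs is required to construct the optimal precoder at the $s$-th SC.

VI. SIMULATION RESULTS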
Simulations were performed for the three MSE-based precoding strategies in MIMO HetNet systems to demonstrate the efficiency and performance of the proposed precoder design schemes. In the simulations, the bandwidth was 20 MHz, the cell radii for the macro cell and small cell were set to 800 m and 100 m, respectively, and the inter-site distance between MC and SC was set to 700 m; see Table I for simulation parameters and assumption details. Throughout the simulations, a total of 1000 sets of channel realizations were utilized, with each set consisting of $(K + L \times S)$ BS-to-UE channels of size $N_{\mathrm{BS}} \times N_{\mathrm{UE}}$ and $S \times (K + L \times S)$ SC-to-UE channels of size $N_{\mathrm{SC}} \times N_{\mathrm{UE}}$, and quadrature phase-shift keying (QPSK) symbols were transmitted from the BS and each SC node under each channel realization to obtain the BER performance. In all comparisons, unless specified otherwise, the normalized channel estimation error defined by $\bar{\sigma}_h^2 \triangleq \sigma_h^2 / \sigma^2$ was set to the value given in the figure captions.

TABLE I: SIMULATION PARAMETERS

| Parameters | Setting |
|---|---|
| Bandwidth | 20 MHz |
| Cell radius | MC: 800 m, SC: 100 m |
| Inter-site distance | 700 m |
| Transmit power | BS: 46 ∼ 56 dBm, SC: 24 dBm |
| Noise power density | -174 dBm/Hz |
| Number of antennas | $N_{\mathrm{BS}} = 36$, $N_{\mathrm{SC}} = 8$ |
| Number of UEs | $K = 8 \sim$ , $L = 4$ |
| Pathloss model (BS) | $\theta_{\mathrm{BS}}(d) = 128.1 + 37.6 \log_{10}(d)$, $d$ (km) [17] |
| Pathloss model (SC) | $\theta_{\mathrm{SC}}(d) = 140.7 + 36.7 \log_{10}(d)$, $d$ (km) [17] |
| Penetration loss | $\zeta_{\mathrm{BS}} = \zeta_{\mathrm{SC}} = 20$ dB [17] |

Using the proposed sum-MSE based precoding schemes with perfect and imperfect CSI, respectively, Fig. 2 plots the average MSE learning curves over 100 runs via alternating optimization. For comparison purposes, the red lines in Fig. 2 depict the average MSE per data stream obtained by the non-iterative algorithm based on separate MSE. From the curves in the figure, it is observed that, between the sum-MSE based precoding schemes, RAO offers better performance with a lower average MSE than UAON, but its convergence is always slower than that of the simpler UAON. Also note that the MSE performance of the separate MSE based precoding scheme obtained from the non-iterative algorithm is inferior to that of RAO. Moreover, the performance curves in Fig. 2 reveal that when imperfect CSI is utilized, the MSE differences between the separate MSE based and sum-MSE based precoding are more pronounced relative to those in the perfect CSI case across the three proposed schemes, and the convergence of the robust UAON and robust RAO appears to be slower than that of their perfect CSI counterparts.

Fig. 2: The average MSE per data stream learning curve over 100 runs ($K = 8$, $P_{\mathrm{BS}} = 46$ dBm, $\bar{\sigma}_h^2 = 1$). Curves: sum-MSE based precoding (RAO and UAON) and separate MSE based precoding, each with perfect and imperfect CSI.

Efforts were made to investigate how the average MSE is related to the transmission power. With a fixed $P_{\mathrm{SC}} = 24$ dBm, Fig. 3 shows that, with perfect CSI, the average MSE for MUEs of the iterative algorithms decreases gradually as the transmit power at the BS increases, while the average MSE for SUEs increases slightly due to the increased inter-cell interference. Furthermore, the sum-MSE based RAO offers the smallest MSE gap between MUE and SUE, indicating better user fairness, while the separate MSE based precoding has the largest one under low transmit power. As for the separate MSE based non-iterative precoding scheme, the average MSE for MUEs approaches that of RAO as the transmit power at the BS increases, while the average MSE curve for SUEs goes up gradually. Subsequently, Fig. 4 illustrates the corresponding BER performance of the proposed schemes, indicating the same relationships as those of the average MSE performance revealed in Fig. 3.

Fig. 3: The average MSE per data stream for MUE/SUE versus transmit power at BS, 46 to 56 dBm ($K = 8$, Perfect CSI). Curves: RAO, UAON and separate MSE based precoding, each for MUE and SUE.

Fig. 4: The BER per data stream for MUE/SUE versus transmit power at BS, 46 to 56 dBm ($K = 8$, Perfect CSI). Curves: RAO, UAON and separate MSE based precoding, each for MUE and SUE.

To further illustrate the factors that affect the MSE and BER performance, Fig. 5 provides the average MSE curves for the three different precoding schemes as the number of MUEs $K$ increases, under both perfect and imperfect CSI cases. It can be seen that the average MSE increases as the number of MUEs gets larger, indicating higher interference from other MUEs, and that the sum-MSE based RAO always outperforms both the UAON algorithm and the separate MSE based precoding on the MSE performance under different configurations. Also note that the MSE performance gaps between the separate MSE and sum-MSE based schemes become larger as the number of MUEs increases, i.e., as the macro-BS has fewer antennas relative to the number of MUEs. Furthermore, the BER performances of the three proposed schemes are given in Fig. 6, showing a trend consistent with that of Fig. 5. We remark that the BER reported here was averaged over all users in the MC and SCs.

Fig. 5: The average MSE per data stream versus the number of MUEs $K$ ($P_{\mathrm{BS}} = 46$ dBm, $\bar{\sigma}_h^2 = 1$). Curves: RAO, UAON and separate MSE based precoding, each with perfect and imperfect CSI.

Fig. 6: The BER per data stream for MUE/SUE versus the number of MUEs $K$ ($P_{\mathrm{BS}} = 46$ dBm, $\bar{\sigma}_h^2 = 1$). Curves: RAO, UAON and separate MSE based precoding, each with perfect and imperfect CSI.

In Fig. 7, the MSE performance of the three proposed schemes with imperfect CSI is depicted versus the normalized channel estimation error $\bar{\sigma}_h^2$, where $K = 8$ and the transmit power at the BS was fixed to $P_{\mathrm{BS}} = 46$ dBm. As expected, the average MSE deteriorates as the channel estimation error increases. Similarly, the BER curves in Fig. 8 are consistent with those in Fig. 7.

Fig. 7: The average MSE per data stream versus normalized channel estimation error $\bar{\sigma}_h^2$ ($K = 8$, $P_{\mathrm{BS}} = 46$ dBm, Imperfect CSI). Curves: RAO, UAON and separate MSE based precoding.

Fig. 8: The BER per data stream for MUE/SUE versus normalized channel estimation error $\bar{\sigma}_h^2$ ($K = 8$, $P_{\mathrm{BS}} = 46$ dBm, Imperfect CSI). Curves: RAO, UAON and separate MSE based precoding.

In summary, the sum-MSE based precoding scheme RAO proposed in Section III outperforms the separate MSE based precoding scheme described in Section IV in terms of the average MSE per user. On the other hand, RAO requires the information of all channels in the HetNet, and its superior performance is achieved at the cost of increased computational complexity relative to that of the non-iterative separate MSE based precoding. Furthermore, when the macro-BS has a large number of antennas relative to the number of MUEs, the performance gap between these two schemes shrinks. As a tradeoff algorithm, the sum-MSE based UAON is much simpler and faster than RAO, with a performance slightly better than that of the separate MSE based scheme in most configurations with a reasonable number of BS antennas.

VII. CONCLUSION
This paper has developed three new MSE-based precoding schemes for MIMO downlinks in a HetNet architecture consisting of a macro tier overlaid with a second tier of SCs. The first two are both based on the same sum-MSE minimization problem, focusing on the joint design of a set of BS and SC transmit precoding matrices or vectors by minimizing the total user MSE under individual transmit power constraints at each cell. We have also proposed a separate MSE minimization based two-level precoder obtained by a non-iterative algorithm, in which the BD technique is employed as the first-level precoder and each cell designs its own second-level precoder separately without the need to exchange user data or channel state information over the backhaul. On the basis of the estimated imperfect CSI, corresponding robust precoding schemes have been proposed. Simulation results have shown that the sum-MSE based RAO algorithm always outperforms UAON and the separate MSE-based precoding on the MSE performance. When the number of antennas at the macro-BS is large enough relative to the number of MUEs, the average MSE of the low-complexity separate MSE-based precoding can come close to those of RAO and UAON. Furthermore, the UAON algorithm has a higher convergence rate and lower computational complexity compared to RAO, and is thus a worthwhile trade-off between efficiency and performance.

VIII. PROOF OF EQ. (25)

To obtain the non-negative multipliers $\lambda$ and $\lambda_s$ ($s \in \Omega$) in the above equations, we substitute (22) into (20) and write

$$L(\boldsymbol{\lambda}) = -\mathrm{tr}\left\{ \left( \boldsymbol{\Phi}_{\mathrm{BS}} + \lambda \mathbf{I}_{N_{\mathrm{BS}}} \right)^{-1} \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H \right\} - \lambda + \kappa - \sum_{s=1}^{S} \left[ \mathrm{tr}\left\{ \left( \boldsymbol{\Phi}_{\mathrm{SC}}^{(s)} + \lambda_s \mathbf{I}_{N_{\mathrm{SC}}} \right)^{-1} \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \right)^H \right\} + \lambda_s \right] \tag{49}$$

where $\kappa = \sigma^2 \mathrm{tr}\left\{ \mathbf{R}_{\mathrm{BS}} \mathbf{R}_{\mathrm{BS}}^H \right\} + \sum_{s=1}^{S} \sigma^2 \mathrm{tr}\left\{ \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \right\} + K N_S + \sum_{s=1}^{S} L_s N_S$ is independent of $\boldsymbol{\lambda}$. Then, we start from the eigendecompositions $\boldsymbol{\Phi}_{\mathrm{BS}} = \mathbf{S}_{\mathrm{BS}} \mathbf{D}_{\mathrm{BS}} \mathbf{S}_{\mathrm{BS}}^H$, where $\mathbf{S}_{\mathrm{BS}}^H \mathbf{S}_{\mathrm{BS}} = \mathbf{I}_{N_{\mathrm{BS}}}$ and $\mathbf{D}_{\mathrm{BS}} = \mathrm{diag}\left\{ d_{\mathrm{BS}}^{(1)}, d_{\mathrm{BS}}^{(2)}, \ldots, d_{\mathrm{BS}}^{(N_{\mathrm{BS}})} \right\}$, and $\boldsymbol{\Phi}_{\mathrm{SC}}^{(s)} = \mathbf{S}_{\mathrm{SC}}^{(s)} \mathbf{D}_{\mathrm{SC}}^{(s)} \left( \mathbf{S}_{\mathrm{SC}}^{(s)} \right)^H$ ($s \in \Omega$), where $\left( \mathbf{S}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{S}_{\mathrm{SC}}^{(s)} = \mathbf{I}_{N_{\mathrm{SC}}}$ and $\mathbf{D}_{\mathrm{SC}}^{(s)} = \mathrm{diag}\left\{ d_{\mathrm{SC}}^{(s,1)}, d_{\mathrm{SC}}^{(s,2)}, \ldots, d_{\mathrm{SC}}^{(s,N_{\mathrm{SC}})} \right\}$. By substituting the above two expressions into (49), we obtain

$$L(\boldsymbol{\lambda}) = -\mathrm{tr}\left\{ \left( \mathbf{D}_{\mathrm{BS}} + \lambda \mathbf{I}_{N_{\mathrm{BS}}} \right)^{-1} \mathbf{S}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H \mathbf{S}_{\mathrm{BS}} \right\} - \lambda + \kappa - \sum_{s=1}^{S} \left[ \mathrm{tr}\left\{ \left( \mathbf{D}_{\mathrm{SC}}^{(s)} + \lambda_s \mathbf{I}_{N_{\mathrm{SC}}} \right)^{-1} \left( \mathbf{S}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \right)^H \mathbf{S}_{\mathrm{SC}}^{(s)} \right\} + \lambda_s \right]. \tag{50}$$

Defining $\mathbf{A}_{\mathrm{BS}} = \mathbf{S}_{\mathrm{BS}}^H \mathbf{G}_{\mathrm{B-M}} \mathbf{R}_{\mathrm{BS}}^H \mathbf{R}_{\mathrm{BS}} \mathbf{G}_{\mathrm{B-M}}^H \mathbf{S}_{\mathrm{BS}}$ with the $(n,n)$-th entry denoted as $a_{\mathrm{BS}}^{(n)}$, and $\mathbf{A}_{\mathrm{SC}}^{(s)} = \left( \mathbf{S}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \left( \mathbf{R}_{\mathrm{SC}}^{(s)} \right)^H \mathbf{R}_{\mathrm{SC}}^{(s)} \left( \mathbf{G}_{\mathrm{S-S}}^{(s,s)} \right)^H \mathbf{S}_{\mathrm{SC}}^{(s)}$ with the $(n,n)$-th entry denoted as $a_{\mathrm{SC}}^{(s,n)}$, we have

$$L(\boldsymbol{\lambda}) = -\sum_{n=1}^{N_{\mathrm{BS}}} \frac{a_{\mathrm{BS}}^{(n)}}{d_{\mathrm{BS}}^{(n)} + \lambda} - \lambda - \sum_{s=1}^{S} \left( \sum_{n=1}^{N_{\mathrm{SC}}} \frac{a_{\mathrm{SC}}^{(s,n)}}{d_{\mathrm{SC}}^{(s,n)} + \lambda_s} + \lambda_s \right) + \kappa. \tag{51}$$

Using (51), computing the partial derivatives of $L$ with respect to $\lambda$ and $\lambda_s$ ($s \in \Omega$) becomes straightforward, which completes the proof of (25).

REFERENCES

[1] 3GPP, TR 36.814, v9.0.0, “Evolved Universal Terrestrial Radio Access (E-UTRA), Further Advancements for E-UTRA Physical Layer Aspects,” Mar. 2010.
[2] S. Brueck, “Heterogeneous networks in LTE-Advanced,” in International Symposium on Wireless Communication Systems (ISWCS), pp. 171–175, Nov. 2011.
[3] A. Damnjanovic, J. Montojo, Y. Wei, T. Ji, T. Luo, M. Vajapeyam, T. Yoo, O. Song, and D. Malladi, “A survey on 3GPP heterogeneous networks,” IEEE Wireless Communications Magazine, vol. 18, no. 3, pp. 10–21, June 2011.
[4] J. G. Andrews, S. Buzzi, W. Choi, S. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang, “What will 5G be?” arXiv preprint arXiv:1405.2957, 2014.
[5] D. Gesbert, S. Hanly, H. Huang, S. Shamai, O. Simeone, and W. Yu, “Multi-cell MIMO cooperative networks: A new look at interference,” IEEE Journal on Selected Areas in Communications, vol. 28, no. 9, pp. 1380–1408, 2010.
[6] J. Zhu and H. Yang, “Interference control with beamforming coordination for two-tier femtocell networks and its performance analysis,” in IEEE International Conference on Communications (ICC), June 2011.
[7] Y. Dai, S. Jin, L. Pan, X. Gao, L. Jiang, and M. Lei, “Interference control based on beamforming coordination for heterogeneous network with RRH deployment,” IEEE Systems Journal, vol. 99, no. 99, pp. 1–7, Apr. 2013.
[8] M. Hong, Q. Li, and Y. F. Liu, “Decomposition by successive convex approximation: A unifying approach for linear transceiver design in heterogeneous networks,” arXiv preprint arXiv:1210.1507, 2013.
[9] Z. Xu, C. Yang, G. Y. Li, Y. Liu, and S. Xu, “Energy-efficient CoMP precoding in heterogeneous networks,” IEEE Transactions on Signal Processing, vol. 62, no. 4, pp. 1005–1017, Feb. 2014.
[10] H. Shen, B. Li, M. Tao, and X. Wang, “MSE-based transceiver designs for the MIMO interference channel,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3480–3489, Nov. 2010.
[11] R. Wang, M. Tao, and Y. Huang, “Linear precoding designs for amplify-and-forward multiuser two-way relay systems,” IEEE Transactions on Wireless Communications, vol. 11, no. 12, pp. 4457–4469, Dec. 2012.
[12] R. Wang and M. Tao, “Joint source and relay precoding designs for MIMO two-way relaying based on MSE criterion,” IEEE Transactions on Signal Processing, vol. 60, no. 3, pp. 1352–1365, Mar. 2012.
[13] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilot contamination and precoding in multi-cell TDD systems,” IEEE Transactions on Wireless Communications, vol. 10, no. 8, pp. 2640–2651, Aug. 2011.
[14] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector perturbation technique for near capacity multi-antenna multi-user communication – Part I: Channel inversion and regularization,” IEEE Transactions on Communications, vol. 53, no. 1, pp. 195–202, Jan. 2005.
[15] J. Hoydis, K. Hosseini, S. T. Brink, and M. Debbah, “Making smart use of excess antennas: massive MIMO, small cells and TDD,” Bell Labs Technical Journal, vol. 18, no. 2, pp. 5–21, 2013.
[16] R. Zhang, “Cooperative multi-cell block diagonalization with per-base-station power constraints,” IEEE Journal on Selected Areas in Communications, vol. 28, no. 9, pp. 1435–1445, 2010.
[17] 3GPP, “Evolved Universal Terrestrial Radio Access (E-UTRA), Further Advancements for E-UTRA Physical Layer Aspects,” Mar. 2010 [Online]. Available: ftp.3gpp.org, TR 36.814 v9.0.0.
[18] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[19] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming (web page and software),” http://stanford.edu/boyd/cvx, Feb. 2008.
[20] T. Cover and J. Thomas,