Cloud Radio Access Networks: Uplink Channel Estimation and Downlink Precoding

Abstract

The gains afforded by cloud radio access network (C-RAN) in terms of savings in capital and operating expenses, flexibility, interference management and network densification rely on the presence of high-capacity, low-latency fronthaul connectivity between the remote radio heads (RRHs) and the baseband unit (BBU). In light of the non-uniform and limited availability of fiber-optic cables, the bandwidth constraints on the fronthaul network call, on the one hand, for the development of advanced baseband compression strategies and, on the other hand, for a closer investigation of the optimal functional split between RRHs and BBU. In this chapter, after a brief introduction to signal processing challenges in C-RAN, this optimal functional split is studied at the physical (PHY) layer as it pertains to two key baseband signal processing steps, namely channel estimation in the uplink and channel encoding/linear precoding in the downlink. Joint optimization of baseband fronthaul compression and of baseband signal processing is tackled under different PHY functional splits, whereby uplink channel estimation and downlink channel encoding/linear precoding are carried out either at the RRHs or at the BBU. The analysis, based on information-theoretic arguments and numerical results, yields insight into the configurations of network architecture and fronthaul capacities in which different functional splits are advantageous. The treatment also emphasizes the versatility of deterministic and stochastic successive convex approximation strategies for the optimization of C-RANs.


11 Cloud Radio Access Networks: Uplink Channel Estimation and Downlink Precoding

Osvaldo Simeone, Jinkyu Kang, Joonhyuk Kang and Shlomo Shamai (Shitz)

I. INTRODUCTION

The gains afforded by cloud radio access network (C-RAN) in terms of savings in capital and operating expenses, flexibility, interference management and network densification rely on the presence of high-capacity, low-latency fronthaul connectivity between the remote radio heads (RRHs) and the baseband unit (BBU). In light of the non-uniform and limited availability of fiber-optic cables, the bandwidth constraints on the fronthaul network call, on the one hand, for the development of advanced baseband compression strategies and, on the other hand, for a closer investigation of the optimal functional split between RRHs and BBU. In this chapter, after a brief introduction to signal processing challenges in C-RAN, this optimal functional split is studied at the physical (PHY) layer as it pertains to two key baseband signal processing steps, namely channel estimation in the uplink and channel encoding/linear precoding in the downlink. Joint optimization of baseband fronthaul compression and of baseband signal processing is tackled under different PHY functional splits, whereby uplink channel estimation and downlink channel encoding/linear precoding are carried out either at the RRHs or at the BBU. The analysis, based on information-theoretic arguments and numerical results, yields insight into the configurations of network architecture and fronthaul capacities in which different functional splits are advantageous. The treatment also emphasizes the versatility of deterministic and stochastic successive convex approximation strategies for the optimization of C-RANs.

O. Simeone is with the Center for Wireless Information Processing (CWIP), ECE Department, New Jersey Institute of Technology (NJIT), Newark, NJ 07102, USA (Email: [email protected]). Jinkyu Kang is with the School of Engineering and Applied Sciences (SEAS), Harvard University, Cambridge, MA 02138, USA (Email: [email protected]). Joonhyuk Kang is with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea (Email: [email protected]). S. Shamai (Shitz) is with the Department of Electrical Engineering, Technion, Haifa, 32000, Israel (Email: [email protected]).

II. TECHNOLOGY BACKGROUND

In a C-RAN architecture, the base station (BS) functionalities, from the PHY layer to higher layers, are implemented in a virtualized fashion on centralized general-purpose processors rather than on the local hardware of the base stations or access points. This results in a novel cellular architecture in which low-cost wireless access points (the RRHs), which retain only radio functionalities, are centrally managed by a reconfigurable centralized "cloud", the BBU. At a high level, the C-RAN concept can be seen as an instance of network function virtualization and hence as the RAN counterpart of the separation of control and data planes proposed for the core network in software-defined networking [1].

The C-RAN architecture has the following key advantages, which make it a key contender for inclusion in a 5G standard:

• Reduced capital expense due to the possibility to substitute full-fledged base stations with RRHs with reduced space and energy requirements;
• Statistical multiplexing gain thanks to the flexible allocation of radio and computing resources across all the connected RRHs;
• Easier implementation of coordinated and cooperative transmission/reception strategies, such as Enhanced Inter-Cell Interference Coordination (eICIC) and Coordinated MultiPoint (CoMP) in Long Term Evolution Advanced (LTE-A), to mitigate multi-cell interference;
• Simplified network upgrades and maintenance owing to the centralization of RAN functionalities.

The C-RAN architecture depends on a network of so-called fronthaul links to enable the virtualization of BS functionalities at a BBU. This is because in the uplink, the RRHs are required to convey their respective received signals, either in analog format or in the form of digitized baseband samples, to the BBU for processing. Moreover, in a dual fashion, in a C-RAN downlink, each RRH needs to receive from the BBU either directly the analog radio signal to be transmitted on the radio interface, or a digitized version of the corresponding baseband samples. The RRH-BBU bidirectional links that carry such information are referred to as fronthaul links, in contrast to the backhaul links connecting the BBU to the core network.

The analog transport solution is typically implemented on fronthaul links by means of radio-over-fiber [2]. Instead, the digital transmission of baseband, or IQ, samples is currently carried out by following the Common Public Radio Interface (CPRI) standard [3], which most commonly requires fiber-optic fronthaul links as well. The digital approach appears to be favored due to the traditional advantages of digital solutions, including resilience to noise and hardware impairments and flexibility in the transport options [4].

A. Signal Processing Challenges in C-RAN

The main roadblock to the realization of the mentioned promises of C-RAN hinges on the inherent restrictions on bandwidth and latency of the fronthaul links that may limit the advantages of centralized processing at the BBU.

1) Fronthaul capacity limitations:

When the CPRI standard is implemented, the bit rate required for base stations that serve multiple cell sectors with carrier aggregation and with multiple antennas exceeds the 10 Gbit/s provided by standard fiber-optic links [4], [5]. This problem is even more pronounced for networks in which fiber-optic links are not available due to the large expense required for their deployment or lease, as for heterogeneous networks with smaller RRHs [6]. The capacity limitations of the fronthaul link call for the development of compression strategies that reduce the fronthaul rate with minor or no degradation in the quality of the quantized baseband signal. Typical solutions are based on filtering, per-block scaling, lossless compression and predictive quantization; see [7]–[12].

When quantization and compression are not sufficient, as reported in [13], [14], the bottleneck on the performance of C-RANs due to the capacity limitations of the fronthaul links can be alleviated by implementing a more flexible separation of functionalities between RRHs and BBU, rather than performing all baseband processing at the BBU. Examples of baseband operations that can be carried out at the RRH include Fast Fourier Transform and Inverse Fast Fourier Transform (FFT and IFFT), demapping, synchronization, channel estimation, precoding and channel encoding. Note that [13] also investigates the possibility to implement functions at higher layers, such as error detection, at the RRHs. We will elaborate on important aspects of the functional split between RRH and BBU below.
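To give a feel for the fronthaul rates mentioned above, the following minimal sketch estimates the line rate of uncompressed IQ transport. The parameter values (15-bit I and Q components, a 16/15 control-word overhead, 8b/10b line coding, and the 3-sector, 2-carrier, 4-antenna deployment) are commonly quoted figures used here as assumptions; they are not taken from this chapter, and the function name `cpri_rate_bps` is hypothetical.

```python
def cpri_rate_bps(sample_rate_hz, bits_per_component, n_antenna_carriers,
                  control_overhead=16 / 15, line_coding=10 / 8):
    """Approximate line rate for uncompressed IQ transport.

    Each complex baseband sample carries an I and a Q component. The two
    overhead factors (control words and 8b/10b line coding) are assumed
    values, not figures stated in the text.
    """
    iq_bits_per_sample = 2 * bits_per_component
    payload = sample_rate_hz * iq_bits_per_sample * n_antenna_carriers
    return payload * control_overhead * line_coding


# Assumed example: 20 MHz LTE carriers sampled at 30.72 Msample/s,
# 3 sectors x 2 aggregated carriers x 4 antennas = 24 antenna-carriers.
single = cpri_rate_bps(30.72e6, 15, n_antenna_carriers=1)
rate = cpri_rate_bps(30.72e6, 15, n_antenna_carriers=3 * 2 * 4)
print(f"per antenna-carrier: {single / 1e6:.1f} Mbit/s")
print(f"whole site:          {rate / 1e9:.2f} Gbit/s")
```

Under these assumptions a single site already needs roughly three times the 10 Gbit/s of a standard fiber link, which is the motivation for the compression and functional-split strategies discussed next.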

2) Fronthaul latency limitations:

Two of the communication protocols that are most affected by fronthaul delays are uplink hybrid automatic repeat request (HARQ) and random access [13]. For HARQ, the problem is that the outcome of decoding at the BBU may only become available at the RRH after the time required for

• the transfer of the baseband signals from the RRH to the BBU;
• the processing at the BBU;
• the transmission of the decoding outcome from the BBU to the RRH.

This delay may seriously affect the throughput achievable by uplink HARQ. For example, in LTE with frequency division multiplexing, the feedback latency should be less than 8 ms in order not to disrupt the operation of the system [13]. Similar issues impair the implementation of random access.

B. Chapter Overview

In this chapter, we explore the problem of optimal functional split between RRHs and BBU at the PHY layer by focusing on the two key baseband operations of channel estimation and channel encoding/precoding. We recall that alternative functional splits are envisaged to be potentially advantageous in the presence of significant fronthaul capacity constraints.

For the uplink, we compare the standard implementation in which all baseband processing, including channel estimation, is performed at the BBU, with an alternative architecture in which channel estimation, along with the necessary frame synchronization and resource demapping, is instead implemented at the RRHs. This is discussed in Sec. III.

The downlink is discussed in Sec. IV, where we contrast the standard C-RAN implementation with an alternative one in which channel encoding and precoding are applied at the RRHs, while the BBU retains the function of designing the precoding matrices based on the available channel state information.

Throughout, we take an information-theoretic approach in order to evaluate analytical expressions for the achievable performance that illuminate the impact of different design choices. The analysis is corroborated by extensive numerical results that provide insight into the performance comparisons highlighted above. The chapter is concluded in Sec. V.

III. UPLINK: WHERE TO PERFORM CHANNEL ESTIMATION?

In this section, we study the uplink and address the potential advantages that could be accrued by performing channel estimation at the RRHs rather than at the BBU. The rationale for the exploration of this functional split is that communicating the digitized signal received within the training portion of the received signal, as done in the conventional implementation, may impose a more significant burden on the fronthaul network than communicating directly the estimated channel state information (CSI). This split is also supported by the known information-theoretic optimality of separate estimation and compression [15]. In particular, we compare two different approaches:

• the conventional approach, in which the RRHs quantize the training signals and CSI estimation takes place at the BBU;
• channel estimation at the RRHs, in which the RRHs perform CSI estimation and forward a quantized version of the CSI to the BBU.

Note that the conventional approach was the subject of an earlier study [16] and that this section is adapted from our earlier work [17], to which we refer for proofs and additional considerations.

We start by discussing the system model in Sec. III-A and then elaborate on the two approaches in Sec. III-B and Sec. III-C. Finally, we present numerical results in Sec. III-D.

A. System Model

We study the uplink of a cellular system consisting of $N_U$ User Equipments (UEs), $N_R$ RRHs and a BBU, as shown in Fig. 1. We denote the set of all UEs, or mobile users, as $\mathcal{N}_U = \{1, \ldots, N_U\}$ and the set of all RRHs as $\mathcal{N}_R = \{1, \ldots, N_R\}$. Each $i$-th UE has $N_{t,i}$ transmit antennas, while each $j$-th RRH is equipped with $N_{r,j}$ receive antennas. We define the number of total transmit antennas as $N_t = \sum_{i=1}^{N_U} N_{t,i}$. Each $j$-th RRH is connected to the BBU via a fronthaul link of capacity $\bar{C}_j$. All rates, including $\bar{C}_j$, are normalized to the bandwidth available on the uplink channel from the UEs to the RRHs and are measured in bits/s/Hz. We assume that coding is performed across a large number of channel coherence blocks, for example over many resource blocks of an LTE system operating on a channel with significant time-frequency diversity. This implies that the ergodic capacity describes the system performance in terms of achievable rates (see, e.g., [18]).

Fig. 1. Uplink of a C-RAN system consisting of $N_U$ UEs and $N_R$ RRHs. Each $j$-th RRH is connected to the BBU with a fronthaul link of capacity $\bar{C}_j$.

Each channel coherence block, of length $T$ channel uses, is split into a phase for channel training of length $T_p$ channel uses and a phase for data transmission of length $T_d$ channel uses, with

$$T_p + T_d = T. \tag{1}$$

The signal transmitted by the $i$-th UE is given by a $N_{t,i} \times T$ complex matrix $\mathbf{X}_i$, where each column corresponds to the signal transmitted by the $N_{t,i}$ antennas in a channel use. This signal is divided into the $N_{t,i} \times T_p$ pilot signal $\mathbf{X}_{p,i}$ and the $N_{t,i} \times T_d$ data signal $\mathbf{X}_{d,i}$. We assume that the transmit signal $\mathbf{X}_i$ has a total per-block power constraint $T^{-1} E[\|\mathbf{X}_i\|^2] = \bar{P}_i$, and we define $T_p^{-1} E[\|\mathbf{X}_{p,i}\|^2] = P_{p,i}$ and $T_d^{-1} E[\|\mathbf{X}_{d,i}\|^2] = P_{d,i}$ as the powers used for training and data, respectively, by the $i$-th UE. Note that $E[\cdot]$ refers throughout to the expectation operator. In terms of pilot and data signal powers, the power constraint is hence expressed as

$$\frac{T_p}{T} P_{p,i} + \frac{T_d}{T} P_{d,i} = \bar{P}_i. \tag{2}$$

For simplicity, we assume equal transmit power allocation for all UEs, and hence we have $\bar{P}_i = \bar{P}$, $P_{d,i} = P_d$ and $P_{p,i} = P_p$ for all $i \in \mathcal{N}_U$. Finally, we collect in matrices $\mathbf{X}_p$ and $\mathbf{X}_d$ all the pilot signals and the data signals transmitted by all UEs, respectively, i.e., $\mathbf{X}_p = [\mathbf{X}_{p,1}^T, \ldots, \mathbf{X}_{p,N_U}^T]^T$ and $\mathbf{X}_d = [\mathbf{X}_{d,1}^T, \ldots, \mathbf{X}_{d,N_U}^T]^T$.

The training signal is $\mathbf{X}_p = \sqrt{P_p/N_t}\,\mathbf{S}_p$, where $\mathbf{S}_p$ is a $N_t \times T_p$ matrix with orthogonal rows and unit-power entries corresponding to the orthogonal training sequences transmitted from each antenna by all UEs (as in, e.g., [16]). Note that this implies that each training sequence is transmitted with power $P_p/N_t$ and that the condition $T_p \geq N_t$ holds. During the data phase, the UEs transmit independent space-time codewords without precoding. Using random coding arguments, we write $\mathbf{X}_d = \sqrt{P_d/N_t}\,\mathbf{S}_d$, where $\mathbf{S}_d$ is a $N_t \times T_d$ matrix of independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ variables.

The $N_{r,j} \times T$ signal $\mathbf{Y}_j$ received by the $j$-th RRH in a given coherence block, where each column corresponds to the signal received by the $N_{r,j}$ antennas in a channel use, can be split into the $N_{r,j} \times T_p$ received pilot signal $\mathbf{Y}_{p,j}$ and the $N_{r,j} \times T_d$ data signal $\mathbf{Y}_{d,j}$. The signal received at the $j$-th RRH is then given by

$$\mathbf{Y}_{p,j} = \sqrt{\frac{P_p}{N_t}}\,\mathbf{H}_j \mathbf{S}_p + \mathbf{Z}_{p,j} \tag{3a}$$

and

$$\mathbf{Y}_{d,j} = \sqrt{\frac{P_d}{N_t}}\,\mathbf{H}_j \mathbf{S}_d + \mathbf{Z}_{d,j}, \tag{3b}$$

where $\mathbf{Z}_{p,j}$ and $\mathbf{Z}_{d,j}$ are respectively the $N_{r,j} \times T_p$ and $N_{r,j} \times T_d$ matrices of i.i.d. complex Gaussian noise variables with zero mean and unit variance, i.e., $\mathcal{CN}(0, 1)$. The $N_{r,j} \times N_t$ channel matrix $\mathbf{H}_j$ collects all the $N_{r,j} \times N_{t,i}$ channel matrices $\mathbf{H}_{ji}$ from the $i$-th UE to the $j$-th RRH as $\mathbf{H}_j = [\mathbf{H}_{j1}, \ldots, \mathbf{H}_{jN_U}]$.

The channel matrix $\mathbf{H}_{ji}$ is modeled as having i.i.d. $\mathcal{CN}(0, \alpha_{ji})$ entries, where $\alpha_{ji}$ is the path loss coefficient between the $i$-th UE and the $j$-th RRH, given as

$$\alpha_{ji} = \frac{1}{1 + \left(\frac{d_{ji}}{d_0}\right)^{\eta}}, \tag{4}$$

where $d_{ji}$ is the distance between the $i$-th UE and the $j$-th RRH, $d_0$ is a reference distance, and $\eta$ is the path loss exponent. The channel matrices are assumed to be constant during each channel coherence block and to change according to an ergodic process from block to block.

B. Conventional Approach

With the conventional approach, the RRH quantizes and compresses both its received pilot signal in Eq. (3a) and its received data signal in Eq. (3b), and forwards the compressed signals to the BBU on the fronthaul link. The BBU then estimates the CSI on the basis of the received quantized pilot signals and performs coherent decoding of the data signal. In the rest of Sec. III, we limit the analytical treatment to the case of a single UE and a single RRH, i.e., $N_U = 1$ and $N_R = 1$, for simplicity of presentation. We henceforth remove the subscripts indicating UE and RRH indices. A more general discussion can be found elsewhere [17].
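For the single-UE, single-RRH case just described, the signal model of Eqs. (3) and (4) can be sketched numerically as follows. This is only an illustration under stated assumptions: the pilot matrix is built from DFT rows (one of many choices with orthogonal rows and unit-power entries), and the function names and parameter values are hypothetical.

```python
import numpy as np


def path_loss(d, d0, eta):
    """Path loss coefficient of Eq. (4): alpha = 1 / (1 + (d / d0) ** eta)."""
    return 1.0 / (1.0 + (d / d0) ** eta)


def complex_normal(rng, shape, var=1.0):
    """i.i.d. circularly symmetric complex Gaussian CN(0, var) samples."""
    scale = np.sqrt(var / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def dft_pilots(n_t, t_p):
    """N_t x T_p pilot matrix with orthogonal rows and unit-modulus entries.

    DFT rows are an assumption; any S_p with S_p S_p^H = T_p * I would do.
    """
    if t_p < n_t:
        raise ValueError("orthogonal pilots require T_p >= N_t")
    k = np.arange(n_t)[:, None]
    t = np.arange(t_p)[None, :]
    return np.exp(-2j * np.pi * k * t / t_p)


def uplink_block(rng, n_t, n_r, t_p, t_d, p_p, p_d, alpha):
    """Draw one coherence block (H, S_p, S_d, Y_p, Y_d) following Eq. (3)."""
    h = complex_normal(rng, (n_r, n_t), var=alpha)
    s_p = dft_pilots(n_t, t_p)
    s_d = complex_normal(rng, (n_t, t_d))
    y_p = np.sqrt(p_p / n_t) * h @ s_p + complex_normal(rng, (n_r, t_p))
    y_d = np.sqrt(p_d / n_t) * h @ s_d + complex_normal(rng, (n_r, t_d))
    return h, s_p, s_d, y_p, y_d


rng = np.random.default_rng(0)
alpha = path_loss(d=50.0, d0=50.0, eta=3.0)  # UE placed at the reference distance
h, s_p, s_d, y_p, y_d = uplink_block(rng, n_t=4, n_r=4, t_p=4, t_d=6,
                                     p_p=10.0, p_d=10.0, alpha=alpha)
print(alpha, y_p.shape, y_d.shape)
```

The returned `y_p` and `y_d` are the unquantized signals that the RRH would compress before forwarding them on the fronthaul link.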

1) Training Phase:

During the training phase, the vector of received training signals $\mathbf{Y}_p$ in Eq. (3a) across all coherence times is quantized. In order to account for quantization and compression, throughout this chapter, we use the standard additive quantization noise model that follows conventional information-theoretic arguments based on random coding [19]. Accordingly, the quantized pilot signal can be written as

$$\hat{\mathbf{Y}}_p = \mathbf{Y}_p + \mathbf{Q}_p, \tag{5}$$

where the compression noise matrix $\mathbf{Q}_p$ is assumed to have i.i.d. $\mathcal{CN}(0, \sigma_p^2)$ entries. Note that the assumption of Gaussian i.i.d. quantization noises is made here for simplicity of analysis without claim of optimality. On a practical note, Gaussian quantization noise can be realized by high-dimensional vector quantizers such as trellis-coded quantization [20]. The quantization noise variance $\sigma_p^2$ dictates the accuracy of the quantization and depends on the fronthaul capacity via standard information-theoretic identities [19], as further discussed below.

Based on Eq. (5), the channel matrix $\mathbf{H}$ from the UE to the RRH is estimated at the BBU by the minimum mean square error (MMSE) method. Hence, it can be expressed as

$$\mathbf{H} = \hat{\mathbf{H}} + \mathbf{E}, \tag{6}$$

where the estimated channel $\hat{\mathbf{H}}$ is a complex Gaussian matrix with i.i.d. $\mathcal{CN}(0, \sigma_{\hat{h}}^2)$ entries, and the estimation error $\mathbf{E}$ has i.i.d. $\mathcal{CN}(0, \sigma_e^2)$ entries, with $\sigma_{\hat{h}}^2 = \alpha - \sigma_e^2$ and $\sigma_e^2 = \alpha N_t (1 + \sigma_p^2) / (T_p P_p + N_t (1 + \sigma_p^2))$, respectively [18], [21], where we recall that $\alpha$ is the power gain for the channel between UE and RRH.
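The effect of the pilot quantization noise on the estimation error in Eq. (6) can be checked by simulation. The sketch below is a Monte Carlo illustration restricted to unit channel gain ($\alpha = 1$); the DFT pilots, the parameter values and the function names are assumptions made for the example.

```python
import numpy as np


def mmse_error_variance(alpha, n_t, t_p, p_p, sigma_p2):
    """Estimation error variance sigma_e^2 as written below Eq. (6)."""
    noise = n_t * (1.0 + sigma_p2)
    return alpha * noise / (t_p * p_p + noise)


def simulate_error_variance(n_t, n_r, t_p, p_p, sigma_p2, n_blocks=2000, seed=0):
    """Empirical per-entry MSE of the LMMSE estimate from quantized pilots (alpha = 1)."""
    rng = np.random.default_rng(seed)

    def cn(shape, var):
        return np.sqrt(var / 2) * (rng.standard_normal(shape)
                                   + 1j * rng.standard_normal(shape))

    k = np.arange(n_t)[:, None]
    t = np.arange(t_p)[None, :]
    s_p = np.exp(-2j * np.pi * k * t / t_p)  # orthogonal rows: S_p S_p^H = T_p I
    gain = np.sqrt(p_p / n_t)
    # After correlating with the pilots, each channel entry is observed as
    # gain * T_p * h + noise of variance T_p * (1 + sigma_p^2); this is the
    # scalar LMMSE weight for a unit-variance h.
    weight = gain * t_p / (gain**2 * t_p**2 + t_p * (1.0 + sigma_p2))

    sq_err = 0.0
    for _ in range(n_blocks):
        h = cn((n_r, n_t), 1.0)
        # Quantized pilots, Eq. (5): received signal plus quantization noise.
        y_hat = gain * h @ s_p + cn((n_r, t_p), 1.0) + cn((n_r, t_p), sigma_p2)
        h_hat = weight * (y_hat @ s_p.conj().T)
        sq_err += np.mean(np.abs(h - h_hat) ** 2)
    return sq_err / n_blocks


theory = mmse_error_variance(alpha=1.0, n_t=4, t_p=4, p_p=10.0, sigma_p2=0.5)
empirical = simulate_error_variance(n_t=4, n_r=4, t_p=4, p_p=10.0, sigma_p2=0.5)
print(f"sigma_e^2: formula {theory:.4f}, simulation {empirical:.4f}")
```

Increasing `sigma_p2` (a coarser quantizer, i.e., less fronthaul rate spent on pilots) raises both numbers, which is the trade-off that the fronthaul allocation has to balance.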

2) Data Phase:

The quantized data signal received at the BBU can be similarly expressed as $\hat{\mathbf{Y}}_d = \mathbf{Y}_d + \mathbf{Q}_d$, where the quantization noise $\mathbf{Q}_d$ is assumed to have i.i.d. $\mathcal{CN}(0, \sigma_d^2)$ entries. Moreover, it can be written as the sum of a useful term $\hat{\mathbf{H}} \mathbf{X}_d$ and of the equivalent noise $\mathbf{N}_d = \mathbf{E} \mathbf{X}_d + \mathbf{Z}_d + \mathbf{Q}_d$, namely

$$\hat{\mathbf{Y}}_d = \hat{\mathbf{H}} \mathbf{X}_d + \mathbf{N}_d, \tag{7}$$

where the equivalent noise $\mathbf{N}_d$ has i.i.d. entries with zero mean and power $1 + \sigma_d^2 + P_d \sigma_e^2$. We observe that $\mathbf{N}_d$ is not Gaussian distributed and is not independent of $\mathbf{X}_d$. Further discussion can be found in the literature [17], [18].
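The power of the equivalent noise in Eq. (7) follows by adding the contributions of the three terms $\mathbf{E}\mathbf{X}_d$, $\mathbf{Z}_d$ and $\mathbf{Q}_d$. The short Monte Carlo sketch below compares the resulting value, $1 + \sigma_d^2 + P_d \sigma_e^2$, with an empirical average; the parameter values and function names are hypothetical.

```python
import numpy as np


def equivalent_noise_power(p_d, sigma_d2, sigma_e2):
    """Per-entry power of N_d = E X_d + Z_d + Q_d with unit-variance Z_d."""
    return 1.0 + sigma_d2 + p_d * sigma_e2


def simulate_noise_power(n_t, n_r, t_d, p_d, sigma_d2, sigma_e2,
                         n_blocks=2000, seed=1):
    """Empirical per-entry power of the equivalent noise over many blocks."""
    rng = np.random.default_rng(seed)

    def cn(shape, var):
        return np.sqrt(var / 2) * (rng.standard_normal(shape)
                                   + 1j * rng.standard_normal(shape))

    total = 0.0
    for _ in range(n_blocks):
        e = cn((n_r, n_t), sigma_e2)                  # channel estimation error
        x_d = np.sqrt(p_d / n_t) * cn((n_t, t_d), 1.0)  # data signal
        n_d = e @ x_d + cn((n_r, t_d), 1.0) + cn((n_r, t_d), sigma_d2)
        total += np.mean(np.abs(n_d) ** 2)
    return total / n_blocks


expected = equivalent_noise_power(p_d=10.0, sigma_d2=0.5, sigma_e2=0.13)
measured = simulate_noise_power(n_t=4, n_r=4, t_d=6, p_d=10.0,
                                sigma_d2=0.5, sigma_e2=0.13)
print(f"noise power: formula {expected:.3f}, simulation {measured:.3f}")
```

The simulation only checks the second moment; as noted in the text, the entries of $\mathbf{N}_d$ are neither Gaussian nor independent of $\mathbf{X}_d$.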

3) Ergodic Rate:

As mentioned, we adopt as the performance criterion of interest the ergodic rate, which, under the assumption of Gaussian codebooks, is given by the mutual information $T^{-1} I(\mathbf{X}_d; \hat{\mathbf{Y}}_d \,|\, \hat{\mathbf{H}})$ [bits/s/Hz] (see, e.g., [19, Ch. 3]). This quantity can be lower-bounded by the following expression [17]:

$$R = \frac{T_d}{T} E\left[\log_2 \det\left(\mathbf{I}_{N_r} + \rho_{\mathrm{eff}}\, \hat{\mathbf{H}} \hat{\mathbf{H}}^{\dagger}\right)\right], \tag{8}$$

with $\rho_{\mathrm{eff}} = P_d / (N_t (1 + \sigma_d^2 + P_d \sigma_e^2))$ being the effective signal-to-noise ratio (SNR), which accounts for the effects of quantization and channel estimation, and $\hat{\mathbf{H}}$ being distributed as in Eq. (6). The rate in Eq. (8) is hence an achievable ergodic rate [17]. Moreover, let us define as $C_p$ the fronthaul rate allocated to transmit information about the pilot signals and as $C_d$ the fronthaul rate for the data, with $C_p + C_d = \bar{C}$. Then, if the conditions

$$C_p = \frac{T_p N_r}{T} \log_2\left(1 + \frac{P_p \alpha + 1}{\sigma_p^2}\right) \tag{9a}$$

and

$$C_d = \frac{T_d N_r}{T} \log_2\left(1 + \frac{P_d \alpha + 1}{\sigma_d^2}\right) \tag{9b}$$

are satisfied, a quantization (and compression) scheme exists that guarantees the desired quantization errors $(\sigma_d^2, \sigma_p^2)$ [17].

The ergodic achievable rate in Eq. (8) can now be optimized over the fronthaul allocation $(C_p, C_d)$ under the fronthaul constraint $\bar{C} = C_p + C_d$, with $C_p$ and $C_d$ in Eq. (9), by maximizing the effective SNR $\rho_{\mathrm{eff}}$ in Eq. (8). This non-convex problem can be tackled using a line search method [22] in a bounded interval (e.g., over $C_p$ in the interval $[0, \bar{C}]$).

C. Channel Estimation at the RRHs

With the mentioned alternative functional split, each RRH estimates the CSI on the basis of its received pilot signal in Eq. (3a), and then quantizes and compresses both its estimated CSI and its received data signal in Eq. (3b) for transmission on the fronthaul.

1) Training Phase:

The RRH performs the MMSE estimate of the channel $\mathbf{H}$ given the observation $\mathbf{Y}_p$ in Eq. (3a). As a result, similar to Eq. (6), we can decompose the channel matrix $\mathbf{H}$ into the MMSE estimate $\tilde{\mathbf{H}}$ and the independent estimation error $\mathbf{E}$, as

$$\mathbf{H} = \tilde{\mathbf{H}} + \mathbf{E}, \tag{10}$$

where the error $\mathbf{E}$ has i.i.d. $\mathcal{CN}(0, \sigma_e^2)$ entries with $\sigma_e^2 = \alpha N_t / (T_p P_p + N_t)$ and $\tilde{\mathbf{H}}$ has i.i.d. $\mathcal{CN}(0, \sigma_{\tilde{h}}^2)$ entries with $\sigma_{\tilde{h}}^2 = \alpha - \sigma_e^2$.

The sequence of channel estimates $\tilde{\mathbf{H}}$ for all coherence times in the coding block is compressed by the RRH and forwarded to the BBU on the fronthaul link. The compressed channel $\hat{\mathbf{H}}$ is related to the estimate $\tilde{\mathbf{H}}$ as

$$\tilde{\mathbf{H}} = \hat{\mathbf{H}} + \mathbf{Q}_p, \tag{11}$$

where the $N_r \times N_t$ quantization noise matrix $\mathbf{Q}_p$ has i.i.d. $\mathcal{CN}(0, \sigma_p^2)$ entries.

2) Data Phase:

During the data phase, the RRH quantizes the signal $\mathbf{Y}_d$ in Eq. (3b) and sends it to the BBU on the fronthaul link. The signal obtained at the BBU is related to $\mathbf{Y}_d$ as

$$\hat{\mathbf{Y}}_d = \mathbf{Y}_d + \mathbf{Q}_d, \tag{12}$$

where $\mathbf{Q}_d$ is independent of $\mathbf{Y}_d$ and represents the quantization noise matrix with i.i.d. $\mathcal{CN}(0, \sigma_d^2)$ entries. Separating the desired signal and the noise in Eq. (12), the quantized signal $\hat{\mathbf{Y}}_d$ can be expressed as

$$\hat{\mathbf{Y}}_d = \hat{\mathbf{H}} \mathbf{X}_d + \mathbf{N}_d, \tag{13}$$

where $\mathbf{N}_d$ denotes the equivalent noise $\mathbf{N}_d = (\mathbf{Q}_p + \mathbf{E}) \mathbf{X}_d + \mathbf{Z}_d + \mathbf{Q}_d$, which has i.i.d. zero-mean entries with power

$$\sigma_n^2 = 1 + P_d \left(\sigma_p^2 + \sigma_e^2\right) + \sigma_d^2. \tag{14}$$

We observe that, as in Eq. (7), $\mathbf{N}_d$ is not Gaussian distributed and is not independent of $\mathbf{X}_d$.

3) Ergodic Rate:

Let $C_p$ and $C_d$ denote respectively the fronthaul rates allocated for the transmission of the quantized channel estimates in Eq. (11) and of the quantized received signals in Eq. (12) on the fronthaul link from the RRH to the BBU. An achievable ergodic rate is given as [17]:

$$R = \frac{T_d}{T} E\left[\log_2 \det\left(\mathbf{I}_{N_r} + \rho_{\mathrm{eff}}\, \hat{\mathbf{H}} \hat{\mathbf{H}}^{\dagger}\right)\right], \tag{15}$$

with the effective SNR

$$\rho_{\mathrm{eff}} = \frac{P_d}{N_t \sigma_n^2} = \frac{P_d}{N_t \left(1 + \sigma_d^2 + P_d \left(\sigma_p^2 + \sigma_e^2\right)\right)}; \tag{16}$$

$\hat{\mathbf{H}}$ being distributed as in Eq. (11); and with $\sigma_e^2$ in Eq. (10). Moreover, if the conditions

$$C_p = \frac{N_r N_t}{T} \log_2\left(\frac{\alpha - \sigma_e^2}{\sigma_p^2}\right) \tag{17a}$$

and

$$C_d = \frac{N_r T_d}{T} \log_2\left(1 + \frac{\alpha P_d + 1}{\sigma_d^2}\right) \tag{17b}$$

are satisfied, then a quantization scheme exists that guarantees the desired quantization errors $(\sigma_p^2, \sigma_d^2)$ [17].

The ergodic achievable rate in Eq. (15) can now be optimized over the fronthaul allocation $(C_p, C_d)$ under the fronthaul constraint $\bar{C} = C_p + C_d$, with $C_p$ and $C_d$ in Eq. (17), by maximizing the effective SNR $\rho_{\mathrm{eff}}$ in Eq. (16) using a line search [22] in a bounded interval.

Fig. 2. Set-up under consideration for the numerical results, where RRHs and UEs are located in a square with side $\delta$. All RRHs are connected to the same BBU. (The figure marks the distance $d_{ji}$ between RRH $j$ and UE $i$.)
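The fronthaul allocation described above, for both the conventional split and channel estimation at the RRHs, can be sketched as a one-dimensional grid search over $C_p$. The code below is a minimal illustration, not the authors' implementation: it assumes unit channel gain by default, reads the rate conditions as $C = (\cdot)\log_2(1 + (\alpha P + 1)/\sigma^2)$ for Eqs. (9) and (17b) and $C_p = (N_r N_t / T)\log_2((\alpha - \sigma_e^2)/\sigma_p^2)$ for Eq. (17a), and replaces the line search of [22] with a plain uniform grid. All function names and parameter values are hypothetical.

```python
import math


def conventional_snr(c_p, c_bar, t, t_p, n_t, n_r, p_p, p_d, alpha=1.0):
    """Effective SNR of Eq. (8) when C_p of the fronthaul rate carries pilots."""
    t_d = t - t_p
    c_d = c_bar - c_p
    # Invert the assumed form of Eq. (9) for the quantization noise variances.
    sigma_p2 = (alpha * p_p + 1.0) / (2.0 ** (c_p * t / (t_p * n_r)) - 1.0)
    sigma_d2 = (alpha * p_d + 1.0) / (2.0 ** (c_d * t / (t_d * n_r)) - 1.0)
    noise = n_t * (1.0 + sigma_p2)
    sigma_e2 = alpha * noise / (t_p * p_p + noise)  # expression below Eq. (6)
    return p_d / (n_t * (1.0 + sigma_d2 + p_d * sigma_e2))


def rrh_estimation_snr(c_p, c_bar, t, t_p, n_t, n_r, p_p, p_d, alpha=1.0):
    """Effective SNR of Eq. (16) when C_p carries the quantized CSI."""
    t_d = t - t_p
    c_d = c_bar - c_p
    sigma_e2 = alpha * n_t / (t_p * p_p + n_t)      # expression below Eq. (10)
    # Invert the assumed form of Eq. (17) for the quantization noise variances.
    sigma_p2 = (alpha - sigma_e2) * 2.0 ** (-c_p * t / (n_r * n_t))
    sigma_d2 = (alpha * p_d + 1.0) / (2.0 ** (c_d * t / (t_d * n_r)) - 1.0)
    return p_d / (n_t * (1.0 + sigma_d2 + p_d * (sigma_p2 + sigma_e2)))


def line_search(snr_fn, c_bar, n_grid=999, **params):
    """Uniform grid search over C_p in the open interval (0, C_bar)."""
    grid = [c_bar * (k + 1) / (n_grid + 1) for k in range(n_grid)]
    best_c_p = max(grid, key=lambda c_p: snr_fn(c_p, c_bar, **params))
    return best_c_p, snr_fn(best_c_p, c_bar, **params)


params = dict(t=10, t_p=4, n_t=4, n_r=4, p_p=10.0, p_d=10.0)
c_bar = 6.0
c_conv, snr_conv = line_search(conventional_snr, c_bar, **params)
c_alt, snr_alt = line_search(rrh_estimation_snr, c_bar, **params)
print(f"conventional:   C_p = {c_conv:.3f}, rho_eff = {snr_conv:.4f}")
print(f"RRH estimation: C_p = {c_alt:.3f}, rho_eff = {snr_alt:.4f}")
```

The search maximizes $\rho_{\mathrm{eff}}$ only, as stated in the text; evaluating the rates in Eqs. (8) and (15) would additionally require averaging the log-determinant over realizations of $\hat{\mathbf{H}}$.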

4) Adaptive quantization:

The alternative functional split studied here enables the RRHs to perform adaptive quantization of the data as a function of the estimated CSI in each coherence block. Specifically, rather than performing separate quantization of CSI and data, the data is quantized in each coherence period with a different accuracy depending on the corresponding CSI: a better channel quality calls for a more accurate quantization of the data field, and vice versa for worse CSI. We note that this is not possible in the conventional approach, in which CSI is not estimated at the RRHs. Further details can be found elsewhere [17].

D. Numerical Results

In this section, we evaluate the performance of the discussed conventional and alternative strategies for the uplink. For the latter, we consider both the basic and adaptive implementations mentioned in the previous section. To this end, we consider a system with $N_R = N_U = 2$ RRHs and UEs with $N_t = N_r = 4$ antennas. The positions of the RRHs and the UEs are fixed in the area with side $\delta = 500$ m as in Fig. 2. In the path loss formula Eq. (4), we set the reference distance to $d_0 = 50$ m and the path loss exponent to $\eta = 3$. Throughout, we assume that each RRH has the same fronthaul capacity $\bar{C}$, that is, $\bar{C}_j = \bar{C}$ for $j \in \mathcal{N}_R$. We optimize over the power allocation $(P_p, P_d)$ and we set $T_p = N_t$, which was shown to be optimal in [18] for a point-to-point link with no fronthaul limitation.

Footnote: The positions of the RRHs are set as $\mathbf{p}_{R,1}$ = [307 . 50 233 . ]$^T$ and $\mathbf{p}_{R,2}$ = [430 . . ]$^T$, where $\mathbf{p}_{R,i}$ is the position of the $i$-th RRH with coordinate origin at the lower left corner, and the positions of the UEs as $\mathbf{p}_{U,1}$ = [363 . . ]$^T$ and $\mathbf{p}_{U,2}$ = [438 . 17 107 . ]$^T$, where $\mathbf{p}_{U,j}$ is the position of the $j$-th UE.

Fig. 3. Ergodic achievable sum-rate vs. coherence time ($N_R = N_U = 2$, $N_t = N_r = 4$, $\bar{C} = 6$ bits/s/Hz, and $\bar{P} = 10$ dB). Horizontal axis: coherence time $T$; vertical axis: ergodic achievable sum-rate (bits/s/Hz); curves: Alternative - adaptive, Alternative, Conventional.

Fig. 4. Ergodic achievable sum-rate vs. fronthaul capacity ($N_R = N_U = 2$, $N_t = N_r = 4$, $\bar{P} = 10$ dB, and $T = 10$). Horizontal axis: fronthaul capacity $\bar{C}$ (bits/s/Hz); vertical axis: ergodic achievable sum-rate (bits/s/Hz); curves: Alternative - adaptive, Alternative, Conventional.

The effect of an increase of the coherence time on the ergodic achievable sum-rate is investigated in Fig. 3 with fronthaul capacity $\bar{C} = 6$ bits/s/Hz and power $\bar{P} = 10$ dB. As expected from information-theoretic considerations, Fig. 3 demonstrates that the alternative approach is advantageous, although most of the gains are accrued by means of adaptive quantization. Moreover, it is observed that the performance of the conventional approach approaches that of the alternative approach without adaptive quantization as the coherence time $T$ increases. This is because, for large coherence time $T$, the fraction of fronthaul capacity devoted to training becomes negligible and hence accurate CSI can be obtained at the BBU.

In Fig. 4, we set the power as $\bar{P} = 10$ dB and the coherence time as $T = 10$, and we plot the ergodic achievable sum-rate versus the fronthaul capacity $\bar{C}$. The main conclusions are consistent with those discussed above for Fig. 3. Moreover, it is seen that the performance gain of the alternative functional split is relevant as long as $\bar{C}$ is not too large, in which case the performance is limited by the uplink SNR and not by the limited fronthaul capacity.

IV. DOWNLINK: WHERE TO PERFORM CHANNEL ENCODING AND PRECODING?

In this section, we turn to the downlink and address the issue of whether it is more advantageous to implement channel encoding and precoding at the RRHs rather than at the BBU as in the conventional implementation. Specifically, we compare the following two approaches:

• the conventional approach, in which the BBU performs channel coding and precoding and then quantizes and forwards the resulting baseband signals on the fronthaul links to the RRHs;
• channel encoding and precoding at the RRHs, in which the BBU does not perform precoding but rather forwards separately the information messages of a subset of UEs, along with the quantized precoding matrices, to all the RRHs, which then perform channel encoding and precoding.

The conventional approach has been studied under a simplified quasi-static, rather than ergodic, channel model [23], [24], while the alternative functional split was investigated by Park et al. [25]. This section is adapted from our earlier paper [26], to which we refer for further details and proofs. We also note that we focus here on linear precoding, or beamforming, and separate quantization for each RRH, and that related discussion on non-linear precoding and joint fronthaul quantization can be found in the literature [24].

We start by detailing the system model in Sec. IV-A. In Sec. IV-B, we study the conventional approach, while the alternative functional split mentioned above is studied in Sec. IV-C. In Sec. IV-D, numerical results are presented.

Fig. 5. Downlink of a C-RAN system consisting of $N_R$ RRHs and $N_U$ UEs. The BBU is connected to each $i$-th RRH with a fronthaul link of capacity $\bar{C}_i$.

A. System Model

We consider the counterpart downlink C-RAN model of the uplink set-up studied in Sec. III, in which a cluster of $N_R$ RRHs provides wireless service to $N_U$ UEs as illustrated in Fig. 5. Most of the baseband processing for all the RRHs in the cluster is carried out at a BBU that is connected to each $i$-th RRH via a fronthaul link of finite capacity $\bar{C}_i$. Each $i$-th RRH has $N_{t,i}$ transmit antennas and each $j$-th UE has $N_{r,j}$ receive antennas. We denote the set of all RRHs as $\mathcal{N}_R = \{1, \ldots, N_R\}$ and the set of all UEs as $\mathcal{N}_U = \{1, \ldots, N_U\}$, and we define the number of total transmit antennas as $N_t = \sum_{i=1}^{N_R} N_{t,i}$ and of total receive antennas as $N_r = \sum_{j=1}^{N_U} N_{r,j}$. Moreover, we adopt a block-ergodic channel model in which the fading channels are constant within a coherence period but vary in an ergodic fashion across a large number of coherence periods.

Within each channel coherence period of duration $T$ channel uses, the baseband signal transmitted by the $i$-th RRH is given by a $N_{t,i} \times T$ complex matrix $\mathbf{X}_i$, where each column corresponds to the signal transmitted from the $N_{t,i}$ antennas in a channel use. The $N_{r,j} \times T$ signal $\mathbf{Y}_j$ received by the $j$-th UE in a given channel coherence period, where each column corresponds to the signal received by the $N_{r,j}$ antennas in a channel use, is given by

$$\mathbf{Y}_j = \mathbf{H}_j \mathbf{X} + \mathbf{Z}_j, \tag{18}$$

where $\mathbf{Z}_j$ is the $N_{r,j} \times T$ noise matrix, which consists of i.i.d. $\mathcal{CN}(0, 1)$ entries; $\mathbf{H}_j = [\mathbf{H}_{j1}, \ldots, \mathbf{H}_{jN_R}]$ denotes the $N_{r,j} \times N_t$ channel matrix for the $j$-th UE, where $\mathbf{H}_{ji}$ is the $N_{r,j} \times N_{t,i}$ channel matrix from the $i$-th RRH to the $j$-th UE; and $\mathbf{X}$ is the collection of the signals transmitted by all the RRHs, i.e., $\mathbf{X} = [\mathbf{X}_1^T, \ldots, \mathbf{X}_{N_R}^T]^T$.

We consider the scenario in which the BBU has instantaneous information about the channel matrix $\mathbf{H}$ as well as the case in which the BBU is only aware of the distribution of the channel matrix $\mathbf{H}$, i.e., it has stochastic CSI. Instead, the UEs always have full CSI about their corresponding channel matrices, as we will state more precisely in the next sections. The transmit signal $\mathbf{X}_i$ has a power constraint given as $T^{-1} E[\|\mathbf{X}_i\|^2] \leq \bar{P}_i$.

While the analysis applies more generally, in order to elaborate on the CSI requirements of the BBU, we consider as a specific channel model of interest the standard Kronecker model, in which the channel matrix $\mathbf{H}_{ji}$ is written as

$$\mathbf{H}_{ji} = \boldsymbol{\Sigma}_{R,ji}^{1/2}\, \tilde{\mathbf{H}}_{ji}\, \boldsymbol{\Sigma}_{T,ji}^{1/2}, \tag{19}$$

where the $N_{t,i} \times N_{t,i}$ matrix $\boldsymbol{\Sigma}_{T,ji}$ and the $N_{r,j} \times N_{r,j}$ matrix $\boldsymbol{\Sigma}_{R,ji}$ are the transmit-side and receive-side spatial correlation matrices, respectively, and the $N_{r,j} \times N_{t,i}$ random matrix $\tilde{\mathbf{H}}_{ji}$ has i.i.d. $\mathcal{CN}(0, 1)$ entries and accounts for the small-scale multipath fading [27]. With this model, stochastic CSI entails that the BBU is only aware of the correlation matrices $\boldsymbol{\Sigma}_{T,ji}$ and $\boldsymbol{\Sigma}_{R,ji}$. Moreover, when the RRHs are placed in a higher location than the UEs, one can assume that the receive-side fading is uncorrelated, i.e., $\boldsymbol{\Sigma}_{R,ji} = \mathbf{I}_{N_{r,j}}$, while the transmit-side covariance matrix $\boldsymbol{\Sigma}_{T,ji}$ is determined by the one-ring scattering model (see [27] and references therein). In particular, if the RRHs are equipped with $\lambda/2$-spaced uniform linear arrays, we have $\boldsymbol{\Sigma}_{T,ji} = \boldsymbol{\Sigma}_T(\theta_{ji}, \Delta_{ji})$ for the $j$-th UE and the $i$-th RRH located at a relative angle of arrival $\theta_{ji}$ and having angular spread $\Delta_{ji}$, where the element $(m, n)$ of matrix $\boldsymbol{\Sigma}_T(\theta_{ji}, \Delta_{ji})$ is given by

$$\left[\boldsymbol{\Sigma}_T(\theta_{ji}, \Delta_{ji})\right]_{m,n} = \frac{\alpha_{ji}}{2\Delta_{ji}} \int_{\theta_{ji} - \Delta_{ji}}^{\theta_{ji} + \Delta_{ji}} \exp\left(-j\pi (m - n) \sin(\phi)\right) d\phi, \tag{20}$$

with the path loss coefficient $\alpha_{ji}$ between the $j$-th UE and the $i$-th RRH being given as in Eq. (4).

Fig. 6. Downlink: Conventional approach ("Q" represents fronthaul compression). The BBU block comprises channel coding of the data streams, precoding, and precoding design based on CSI; the quantized precoded signals are sent over the fronthaul links to the RRHs.

B. Conventional Approach

We first describe the conventional approach in Sec. IV-B1. Then, we discuss the joint optimization of fronthaul quantization and precoding with perfect instantaneous channel knowledge at the BBU in Sec. IV-B2 and under the assumption of stochastic CSI at the BBU in Sec. IV-B3.

1) Problem Formulation:

With the conventional scheme illustrated in Fig. 6, the BBU performs channel coding and precoding, and then quantizes the resulting baseband signals so that they can be forwarded on the fronthaul links to the corresponding RRHs. Specifically, channel coding is performed separately for the information stream intended for each UE. This step produces the data signal $\mathbf{S} = [\mathbf{S}_1^\dagger, \ldots, \mathbf{S}_{N_U}^\dagger]^\dagger$ for each coherence block, where $\mathbf{S}_j$ is the $M_j \times T$ matrix containing, as rows, the $M_j \leq N_{r,j}$ encoded data streams for the $j$-th UE. We define the total number of data streams as $M = \sum_{j=1}^{N_U} M_j$ and assume the condition $M \leq N_t$. Following standard random coding arguments, we take all the entries of matrix $\mathbf{S}$ to be i.i.d. $\mathcal{CN}(0,1)$. The encoded data $\mathbf{S}$ is further processed to obtain the transmitted signals $\mathbf{X}$ as detailed below.

The precoded data signal computed by the BBU for any given coherence time can be written as $\tilde{\mathbf{X}} = \mathbf{W}\mathbf{S}$, where $\mathbf{W}$ is the $N_t \times M$ precoding matrix. With instantaneous CSI, a different precoding matrix $\mathbf{W}$ is used for different coherence times in the coding block, while, with stochastic CSI, the same precoding matrix $\mathbf{W}$ is used for all coherence times.

In both cases, the precoded data signal $\tilde{\mathbf{X}}$ can be divided into the $N_{t,i} \times T$ signals $\tilde{\mathbf{X}}_i$ corresponding to the $i$-th RRH for all $i \in \mathcal{N}_R$ as $\tilde{\mathbf{X}} = [\tilde{\mathbf{X}}_1^T, \ldots, \tilde{\mathbf{X}}_{N_R}^T]^T$, with $\tilde{\mathbf{X}}_i = \mathbf{W}_i^r \mathbf{S}$, where $\mathbf{W}_i^r$ is the $N_{t,i} \times M$ precoding matrix for the $i$-th RRH, which is obtained by properly selecting the rows of matrix $\mathbf{W}$ (as indicated by the superscript “$r$” for “rows”). Specifically, $\mathbf{W}_i^r = \mathbf{D}_i^{rT} \mathbf{W}$, with the $N_t \times N_{t,i}$ matrix $\mathbf{D}_i^r$ having all zero elements except for the rows from $\sum_{k=1}^{i-1} N_{t,k} + 1$ to $\sum_{k=1}^{i} N_{t,k}$, which contain an $N_{t,i} \times N_{t,i}$ identity matrix.

The BBU quantizes each sequence of baseband signals $\tilde{\mathbf{X}}_i$ independently for transmission on the $i$-th fronthaul link to the $i$-th RRH. We write the compressed signal $\mathbf{X}_i$ for the $i$-th RRH as

$$\mathbf{X}_i = \tilde{\mathbf{X}}_i + \mathbf{Q}_{x,i}, \tag{21}$$

where the quantization noise matrix $\mathbf{Q}_{x,i}$ is assumed to have i.i.d. $\mathcal{CN}(0, \sigma_{x,i}^2)$ entries. Note that the advantages of joint quantization across multiple RRHs are explored in [24] for static channels. Based on Eq. (21), the design of the fronthaul compression reduces to the optimization of the quantization noise variances $\sigma_{x,1}^2, \ldots, \sigma_{x,N_R}^2$. The power transmitted by the $i$-th RRH is computed as

$$P_i\left(\mathbf{W}, \sigma_{x,i}^2\right) = \frac{1}{T} \mathrm{E}\left[\|\mathbf{X}_i\|^2\right] = \mathrm{tr}\left(\mathbf{D}_i^{rT} \mathbf{W}\mathbf{W}^\dagger \mathbf{D}_i^r + \sigma_{x,i}^2 \mathbf{I}\right), \tag{22}$$

where we have emphasized the dependence of the power $P_i(\mathbf{W}, \sigma_{x,i}^2)$ on the precoding matrix $\mathbf{W}$ and on the quantization noise variance $\sigma_{x,i}^2$. Moreover, using standard rate-distortion arguments, the rate required on the fronthaul between the BBU and the $i$-th RRH in a given coherence interval can be quantified by $I(\tilde{\mathbf{X}}_i; \mathbf{X}_i)/T$ (see, e.g., [19, Ch. 3]), yielding [26]

$$C_i\left(\mathbf{W}, \sigma_{x,i}^2\right) = \log\det\left(\mathbf{D}_i^{rT} \mathbf{W}\mathbf{W}^\dagger \mathbf{D}_i^r + \sigma_{x,i}^2 \mathbf{I}\right) - N_{t,i} \log\left(\sigma_{x,i}^2\right), \tag{23}$$

so that the fronthaul capacity constraint is $C_i(\mathbf{W}, \sigma_{x,i}^2) \leq \bar{C}_i$.

We assume that the $j$-th UE is aware of the effective receive channel matrices $\tilde{\mathbf{H}}_{jk} = \mathbf{H}_j \mathbf{W}_k^c$ for all $k \in \mathcal{N}_U$ at all coherence times, where $\mathbf{W}_k^c$ is the $N_t \times M_k$ precoding matrix corresponding to the $k$-th UE, which is obtained from the precoding matrix $\mathbf{W}$ by properly selecting the columns as $\mathbf{W} = [\mathbf{W}_1^c, \ldots, \mathbf{W}_{N_U}^c]$. We collect the effective channels in the matrix $\tilde{\mathbf{H}}_j = [\tilde{\mathbf{H}}_{j1}, \ldots, \tilde{\mathbf{H}}_{jN_U}] = \mathbf{H}_j \mathbf{W}$. The effective channel $\tilde{\mathbf{H}}_j$ can be estimated at the UEs via downlink training.

Under these assumptions, the ergodic achievable rate for the $j$-th UE is computed as $\mathrm{E}[R_j^{\mathrm{conv}}(\mathbf{H}, \mathbf{W}, \boldsymbol{\sigma}_x^2)]$, with $R_j^{\mathrm{conv}}(\mathbf{H}, \mathbf{W}, \boldsymbol{\sigma}_x^2) = I_{\mathbf{H}}(\mathbf{S}_j; \mathbf{Y}_j)/T$, where $I_{\mathbf{H}}(\mathbf{S}_j; \mathbf{Y}_j)$ represents the mutual information for a fixed realization of the channel matrix $\mathbf{H}$, the expectation is taken with respect to $\mathbf{H}$, and

$$R_j^{\mathrm{conv}}\left(\mathbf{H}, \mathbf{W}, \boldsymbol{\sigma}_x^2\right) = \log\det\left(\mathbf{I} + \mathbf{H}_j \left(\mathbf{W}\mathbf{W}^\dagger + \boldsymbol{\Omega}_x\right) \mathbf{H}_j^\dagger\right) - \log\det\left(\mathbf{I} + \mathbf{H}_j \left(\sum_{k \in \mathcal{N}_U \setminus j} \mathbf{W}_k^c \mathbf{W}_k^{c\dagger} + \boldsymbol{\Omega}_x\right) \mathbf{H}_j^\dagger\right). \tag{24}$$

In Eq. (24), the covariance matrix $\boldsymbol{\Omega}_x$ is block diagonal, with $\boldsymbol{\Omega}_x = \mathrm{diag}([\sigma_{x,1}^2 \mathbf{I}, \ldots, \sigma_{x,N_R}^2 \mathbf{I}])$, and $\boldsymbol{\sigma}_x^2 = [\sigma_{x,1}^2, \ldots, \sigma_{x,N_R}^2]^T$.

The ergodic achievable weighted sum-rate can be optimized over the precoding matrix $\mathbf{W}$ and the compression noise variances $\boldsymbol{\sigma}_x^2$ under fronthaul capacity and power constraints. In the next subsections, we consider separately the cases with instantaneous and stochastic CSI.
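As a numerical illustration of Eqs. (22)-(24), the following minimal sketch evaluates the per-RRH transmit power, the per-RRH fronthaul rate and the rate of one UE for a fixed channel realization. It is only a sketch under stated assumptions: the function names, the 0-based indexing, the toy dimensions and the randomly drawn precoder are hypothetical choices made for illustration, and logarithms are taken in base 2 so that rates are in bits.

```python
import numpy as np


def rrh_selector(n_t_list, i):
    """Return D_i^r: an N_t x N_{t,i} 0/1 matrix selecting the rows of RRH i (0-based)."""
    start = sum(n_t_list[:i])
    D = np.zeros((sum(n_t_list), n_t_list[i]))
    D[start:start + n_t_list[i], :] = np.eye(n_t_list[i])
    return D


def logdet2(A):
    """log2 det(A) for a Hermitian positive definite matrix A."""
    return np.linalg.slogdet(A)[1] / np.log(2)


def tx_power(W, sigma2, D):
    """Eq. (22): transmit power of one RRH."""
    cov = D.T @ W @ W.conj().T @ D + sigma2 * np.eye(D.shape[1])
    return float(np.trace(cov).real)


def fronthaul_rate(W, sigma2, D):
    """Eq. (23): fronthaul rate of one RRH, in bits per channel use."""
    n = D.shape[1]
    cov = D.T @ W @ W.conj().T @ D + sigma2 * np.eye(n)
    return float(logdet2(cov) - n * np.log2(sigma2))


def rate_conv(H_j, W_blocks, sigma2_list, n_t_list, j):
    """Eq. (24): achievable rate of UE j for one channel realization."""
    omega = np.diag(np.repeat(sigma2_list, n_t_list))  # block-diagonal Omega_x
    total = sum(Wk @ Wk.conj().T for Wk in W_blocks)
    interference = total - W_blocks[j] @ W_blocks[j].conj().T
    eye = np.eye(H_j.shape[0])
    first = eye + H_j @ (total + omega) @ H_j.conj().T
    second = eye + H_j @ (interference + omega) @ H_j.conj().T
    return float(logdet2(first) - logdet2(second))


# Toy setup (assumed values): 2 RRHs with 2 antennas each, 2 single-antenna UEs.
rng = np.random.default_rng(0)
n_t_list, sigma2_list = [2, 2], [0.1, 0.1]
W_blocks = [0.5 * (rng.standard_normal((4, 1)) + 1j * rng.standard_normal((4, 1)))
            for _ in range(2)]
W = np.hstack(W_blocks)
H_0 = (rng.standard_normal((1, 4)) + 1j * rng.standard_normal((1, 4))) / np.sqrt(2)

for i in range(2):
    D = rrh_selector(n_t_list, i)
    print(f"RRH {i}: power = {tx_power(W, sigma2_list[i], D):.3f}, "
          f"fronthaul rate = {fronthaul_rate(W, sigma2_list[i], D):.3f} bits")
print(f"UE 0 rate = {rate_conv(H_0, W_blocks, sigma2_list, n_t_list, 0):.3f} bits/s/Hz")
```

Increasing a quantization noise variance in this sketch lowers the fronthaul rate of the corresponding RRH but raises its transmit power and degrades the UE rates, which is the trade-off that the optimization problems of the next subsections balance.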

2) Instantaneous CSI:

In the case of instantaneous channel knowledge at the BBU, the design of the precoding matrix $\mathbf{W}$ and of the compression noise variances $\boldsymbol{\sigma}_x^2$ is adapted to the channel realization $\mathbf{H}$ for each coherence block. To emphasize this fact, we use the notation $\mathbf{W}(\mathbf{H})$ and $\boldsymbol{\sigma}_x^2(\mathbf{H})$. The problem of optimizing the ergodic weighted achievable sum-rate with given weights $\mu_j \geq 0$ for $j \in \mathcal{N}_U$ is then formulated as

$$\begin{aligned}
\underset{\mathbf{W}(\mathbf{H}),\, \boldsymbol{\sigma}_x^2(\mathbf{H})}{\text{maximize}} \quad & \sum_{j \in \mathcal{N}_U} \mu_j \mathrm{E}\left[R_j^{\mathrm{conv}}\left(\mathbf{H}, \mathbf{W}(\mathbf{H}), \boldsymbol{\sigma}_x^2(\mathbf{H})\right)\right] && \text{(25a)} \\
\text{s.t.} \quad & C_i\left(\mathbf{W}(\mathbf{H}), \sigma_{x,i}^2(\mathbf{H})\right) \leq \bar{C}_i, && \text{(25b)} \\
& P_i\left(\mathbf{W}(\mathbf{H}), \sigma_{x,i}^2(\mathbf{H})\right) \leq \bar{P}_i, && \text{(25c)}
\end{aligned}$$

where Eq. (25b)-(25c) apply for all $i \in \mathcal{N}_R$ and all channel realizations $\mathbf{H}$. Due to the separability of the fronthaul and power constraints across the channel realizations $\mathbf{H}$, the problem in Eq. (25) can be solved for each $\mathbf{H}$ independently. Note that the achievable rate in Eq. (25a) and the fronthaul constraint in Eq. (25b) are non-convex. However, the functions $R_j^{\mathrm{conv}}(\mathbf{H}, \mathbf{W}(\mathbf{H}), \boldsymbol{\sigma}_x^2(\mathbf{H}))$ and $C_i(\mathbf{W}(\mathbf{H}), \sigma_{x,i}^2(\mathbf{H}))$ are difference of convex (DC) functions of the covariance matrices $\mathbf{V}_j(\mathbf{H}) = \mathbf{W}_j^c(\mathbf{H}) \mathbf{W}_j^{c\dagger}(\mathbf{H})$ for all $j \in \mathcal{N}_U$ and of the variances $\boldsymbol{\sigma}_x^2(\mathbf{H})$. The resulting rank-relaxed problem can be tackled via the Majorization-Minimization (MM) algorithm as detailed in [24], from which a feasible solution of the problem in Eq. (25) can be obtained. We refer to [24] for details.
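The DC structure mentioned above is typically exploited by linearizing one log det term around the current iterate. The following sketch, which is not the algorithm of [24] but only a check of the underlying inequality, verifies numerically that the first-order Taylor expansion of the concave log det function is a global upper bound that is tight at the expansion point. The function names and the randomly generated matrices are hypothetical and chosen for illustration only.

```python
import numpy as np


def logdet2(A):
    """log2 det(A) for a Hermitian positive definite matrix A."""
    return np.linalg.slogdet(A)[1] / np.log(2)


def linearized_logdet2(A, B):
    """First-order Taylor expansion of log2 det(B) around the point A."""
    return logdet2(A) + np.trace(np.linalg.solve(A, B - A)).real / np.log(2)


def random_psd(rng, n):
    """Random Hermitian positive semidefinite matrix G G^H (illustrative only)."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G @ G.conj().T


rng = np.random.default_rng(1)
A = np.eye(3) + random_psd(rng, 3)  # expansion point, e.g. the previous iterate
for _ in range(5):
    B = np.eye(3) + random_psd(rng, 3)
    gap = linearized_logdet2(A, B) - logdet2(B)
    # Concavity of log det: the linearization never falls below the true value.
    assert gap >= -1e-9
print("gap at the expansion point:", linearized_logdet2(A, A) - logdet2(A))
```

Replacing the subtracted log det term of the rate by this linearization yields a concave lower bound on the rate, while applying it to the log det term of the fronthaul rate yields a convex upper bound on that constraint function, so that each MM iteration solves a convex problem whose solution remains feasible.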

3) Stochastic CSI:

With only stochastic CSI at the BBU, in contrast to the case with instantaneous CSI, the same precoding matrix $\mathbf{W}$ and compression noise variances $\boldsymbol{\sigma}_x^2$ are used for all the coherence blocks. Accordingly, the problem of optimizing the ergodic weighted achievable sum-rate can be reformulated as

$$\begin{aligned}
\underset{\mathbf{W},\, \boldsymbol{\sigma}_x^2}{\text{maximize}} \quad & \sum_{j \in \mathcal{N}_U} \mu_j \mathrm{E}\left[R_j^{\mathrm{conv}}\left(\mathbf{H}, \mathbf{W}, \boldsymbol{\sigma}_x^2\right)\right] && \text{(26a)} \\
\text{s.t.} \quad & C_i\left(\mathbf{W}, \sigma_{x,i}^2\right) \leq \bar{C}_i, && \text{(26b)} \\
& P_i\left(\mathbf{W}, \sigma_{x,i}^2\right) \leq \bar{P}_i, && \text{(26c)}
\end{aligned}$$

where Eq. (26b)-(26c) apply to all $i \in \mathcal{N}_R$. In order to tackle this problem, we adopt the Stochastic Successive Upper-bound Minimization (SSUM) method [28], whereby, at each step, a stochastic lower bound of the objective function is maximized around the current iterate. (An alternative method to attack the problem is the strategy introduced in [29].) To this end, similar to [24], we recast the optimization over the covariance matrices $\mathbf{V}_j = \mathbf{W}_j^c \mathbf{W}_j^{c\dagger}$ for all $j \in \mathcal{N}_U$, instead of the precoding matrices $\mathbf{W}_j^c$ for all $j \in \mathcal{N}_U$. We observe that, with this choice, the objective function is expressed as the average of DC functions, while the constraint in Eq. (26b) is also a DC function, with respect to the covariance $\mathbf{V} = [\mathbf{V}_1 \ldots \mathbf{V}_{N_U}]$ and the quantization noise variances $\boldsymbol{\sigma}_x^2$. Due to the DC structure, locally tight (stochastic) concave lower bounds can be calculated for the objective function in Eq. (26a), and locally tight convex upper bounds for the left-hand side of the constraint in Eq. (26b) (see, e.g., [30]).

The algorithm proposed in [26] is based on SSUM [28] and contains two nested loops. At each outer iteration $n$, a new channel matrix realization $\mathbf{H}^{(n)} = [\mathbf{H}_1^{(n)T}, \ldots, \mathbf{H}_{N_U}^{(n)T}]^T$ is drawn based on the stochastic CSI available at the BBU. For example, with the model in Eq. (19), the channel matrices are generated based on the knowledge of the spatial correlation matrices. Following the SSUM scheme, the outer loop aims at maximizing a stochastic lower bound on the objective function, given as

$$\frac{1}{n} \sum_{l=1}^{n} \tilde{R}_j^{\mathrm{conv}}\left(\mathbf{H}^{(l)}, \mathbf{V}, \boldsymbol{\sigma}_x^2 \,\middle|\, \mathbf{V}^{(l-1)}, \boldsymbol{\sigma}_x^{2(l-1)}\right), \tag{27}$$

where $\tilde{R}_j^{\mathrm{conv}}(\mathbf{H}^{(l)}, \mathbf{V}, \boldsymbol{\sigma}_x^2 \,|\, \mathbf{V}^{(l-1)}, \boldsymbol{\sigma}_x^{2(l-1)})$ is a locally tight concave lower bound on $R_j^{\mathrm{conv}}(\mathbf{H}, \mathbf{W}, \boldsymbol{\sigma}_x^2)$ around the solution $\mathbf{V}^{(l-1)}, \boldsymbol{\sigma}_x^{2(l-1)}$ obtained at the $(l-1)$-th outer iteration when the channel realization is $\mathbf{H}^{(l)}$. This can be calculated as (see, e.g., [28])

$$\tilde{R}_j^{\mathrm{conv}}\left(\mathbf{H}^{(l)}, \mathbf{V}, \boldsymbol{\sigma}_x^2 \,\middle|\, \mathbf{V}^{(l-1)}, \boldsymbol{\sigma}_x^{2(l-1)}\right) \triangleq \log\det\left(\mathbf{I} + \mathbf{H}_j^{(l)} \left(\sum_{k=1}^{N_U} \mathbf{V}_k + \boldsymbol{\Omega}_x\right) \mathbf{H}_j^{(l)\dagger}\right) - f\left(\mathbf{I} + \mathbf{H}_j^{(l)} \boldsymbol{\Lambda}_j^{(l-1)} \mathbf{H}_j^{(l)\dagger},\; \mathbf{I} + \mathbf{H}_j^{(l)} \boldsymbol{\Lambda}_j \mathbf{H}_j^{(l)\dagger}\right), \tag{28}$$

where $\boldsymbol{\Lambda}_j = \sum_{k=1, k \neq j}^{N_U} \mathbf{V}_k + \boldsymbol{\Omega}_x$, $\boldsymbol{\Lambda}_j^{(l-1)} = \sum_{k=1, k \neq j}^{N_U} \mathbf{V}_k^{(l-1)} + \boldsymbol{\Omega}_x^{(l-1)}$, the covariance matrix $\boldsymbol{\Omega}_x^{(l)}$ is a block-diagonal matrix given as $\mathrm{diag}([\sigma_{x,1}^{2(l)} \mathbf{I}, \ldots, \sigma_{x,N_R}^{2(l)} \mathbf{I}])$, and the linearized function $f(\mathbf{A}, \mathbf{B})$ is obtained from the first-order Taylor expansion of the log det function as

$$f(\mathbf{A}, \mathbf{B}) \triangleq \log\det(\mathbf{A}) + \frac{1}{\ln 2} \mathrm{tr}\left(\mathbf{A}^{-1} (\mathbf{B} - \mathbf{A})\right). \tag{29}$$

Since the maximization of Eq. (27) is subject to the non-convex DC constraint in Eq. (26b), the inner loop tackles the problem via the MM algorithm, i.e., by successively applying locally tight convex upper bounds to the left-hand side of the constraint in Eq. (26b) [31]. Specifically, given the solution $\mathbf{V}^{(n,r-1)}$ and $\boldsymbol{\sigma}_x^{2(n,r-1)}$ at the $(r-1)$-th inner iteration of the $n$-th outer iteration, the fronthaul constraint in Eq. (26b) at the $r$-th inner iteration can be locally approximated as

$$\tilde{C}_i\left(\mathbf{V}, \sigma_{x,i}^2 \,\middle|\, \mathbf{V}^{(n,r-1)}, \sigma_{x,i}^{2(n,r-1)}\right) \triangleq f\left(\sum_{k=1}^{N_U} \mathbf{D}_i^{rT} \mathbf{V}_k^{(n,r-1)} \mathbf{D}_i^r + \sigma_{x,i}^{2(n,r-1)} \mathbf{I},\; \sum_{k=1}^{N_U} \mathbf{D}_i^{rT} \mathbf{V}_k \mathbf{D}_i^r + \sigma_{x,i}^2 \mathbf{I}\right) - N_{t,i} \log\left(\sigma_{x,i}^2\right). \tag{30}$$

The resulting combination of SSUM and MM for the solution of the problem in Eq. (26) is summarized in Table I.

TABLE I
DESIGN OF FRONTHAUL COMPRESSION AND PRECODING: CONVENTIONAL APPROACH WITH STOCHASTIC CSI

Initialization: Initialize the covariance matrices $\mathbf{V}^{(0)}$ and the quantization noise variances $\boldsymbol{\sigma}_x^{2(0)}$, and set $n = 0$.
repeat (outer loop)
    $n \leftarrow n + 1$
    Generate a channel matrix realization $\mathbf{H}^{(n)}$ using the available stochastic CSI.
    Initialization: Initialize $\mathbf{V}^{(n,0)} = \mathbf{V}^{(n-1)}$ and $\boldsymbol{\sigma}_x^{2(n,0)} = \boldsymbol{\sigma}_x^{2(n-1)}$, and set $r = 0$.
    repeat (inner loop)
        $r \leftarrow r + 1$
        Solve $\max_{\mathbf{V}, \boldsymbol{\sigma}_x^2} \frac{1}{n} \sum_{l=1}^{n} \sum_{j \in \mathcal{N}_U} \mu_j \tilde{R}_j^{\mathrm{conv}}\left(\mathbf{H}^{(l)}, \mathbf{V}, \boldsymbol{\sigma}_x^2 \,\middle|\, \mathbf{V}^{(l-1)}, \boldsymbol{\sigma}_x^{2(l-1)}\right)$
        s.t. $\tilde{C}_i\left(\mathbf{V}, \sigma_{x,i}^2 \,\middle|\, \mathbf{V}^{(n,r-1)}, \sigma_{x,i}^{2(n,r-1)}\right) \leq \bar{C}_i$ and $P_i\left(\mathbf{V}, \sigma_{x,i}^2\right) \leq \bar{P}_i$, for all $i \in \mathcal{N}_R$.
        Update $\mathbf{V}^{(n,r)} \leftarrow \mathbf{V}$ and $\boldsymbol{\sigma}_x^{2(n,r)} \leftarrow \boldsymbol{\sigma}_x^2$.
    until a convergence criterion is satisfied.
    Update $\mathbf{V}^{(n)} \leftarrow \mathbf{V}^{(n,r)}$ and $\boldsymbol{\sigma}_x^{2(n)} \leftarrow \boldsymbol{\sigma}_x^{2(n,r)}$.
until a convergence criterion is satisfied.
Solution: Calculate the precoding matrix $\mathbf{W}$ from the covariance matrices $\mathbf{V}^{(n)}$ via rank reduction as $\mathbf{W}_j = \gamma_j \nu_{\max}^{(M_j)}(\mathbf{V}_j^{(n)})$ for all $j \in \mathcal{N}_U$, where $\gamma_j$ is obtained by imposing $P_i(\mathbf{W}, \sigma_{x,i}^2) = \bar{P}_i$ using Eq. (22).

The algorithm is completed by calculating, from the obtained solution $\mathbf{V}^*$ of the relaxed problem, the precoding matrix $\mathbf{W}$ by using the standard rank-reduction approach [32], which is given as $\mathbf{W}_j^* = \gamma_j \nu_{\max}^{(M_j)}(\mathbf{V}_j^*)$ with the normalization factor $\gamma_j$ selected so as to satisfy the power constraint with equality, namely $P_i(\mathbf{W}, \sigma_{x,i}^2) = \bar{P}_i$.

We finally note that, since the approximation in Eq. (28) is a local lower bound on the rate and the approximation in Eq. (30) is a local upper bound on the fronthaul rate, the algorithm provides a feasible solution of the relaxed problem at each inner and outer iteration (see, e.g., [28]).

Fig. 7. Downlink: Alternative functional split (“Q” and “Q$^{-1}$” represent fronthaul compression and decompression, respectively).

C. Channel Encoding and Precoding at the RRHs

With this alternative functional split, the BBU calculates the precoding matrices, but does not perform precoding. Instead, as illustrated in Fig. 7, it uses the fronthaul links to communicate the information messages of a given subset of UEs to each RRH, along with the corresponding compressed precoding matrices. Each RRH can then encode and precode the messages of the given UEs based on the information received from the fronthaul link. As will be discussed, with this approach, a preliminary clustering step, whereby each UE is assigned to a subset of RRHs, is generally advantageous. In the following, we first describe the strategy in Sec. IV-C1. Then we discuss the design problem for fronthaul quantization and precoding under instantaneous CSI in Sec. IV-C2 and with stochastic CSI in Sec. IV-C3.

1) Problem Formulation:

As shown in Fig. 7, the precoding matrix $\tilde{\mathbf{W}}$ and the information streams are separately transmitted from the BBU to the RRHs, and the received information bits are encoded and precoded at each RRH using the received precoding matrix. Note that, with this scheme, the transmission overhead over the fronthaul depends on the number of UEs supported by an RRH, since the RRHs should receive all the corresponding information streams.

Given the above, we allow for a preliminary clustering step at the BBU whereby each RRH is assigned a subset of the UEs. We denote the set of UEs assigned to the $i$-th RRH as $\mathcal{M}_i \subseteq \mathcal{N}_U$ for all $i \in \mathcal{N}_R$. This implies that the $i$-th RRH only needs the information streams intended for the UEs in the set $\mathcal{M}_i$. We also denote the set of RRHs that serve the $j$-th UE as $\mathcal{B}_j = \{i \,|\, j \in \mathcal{M}_i\} \subseteq \mathcal{N}_R$ for all $j \in \mathcal{N}_U$. We use the notation $\mathcal{M}_i[k]$ and $\mathcal{B}_j[m]$ to denote the $k$-th UE and the $m$-th RRH in the sets $\mathcal{M}_i$ and $\mathcal{B}_j$, respectively. We define the total number of transmit antennas of the RRHs that serve the $j$-th UE as $N_{t,\mathcal{B}_j}$. We assume here that the sets of UEs assigned to each RRH are given and not subject to optimization (see Sec. IV-D for further details).

The precoding matrix $\tilde{\mathbf{W}}$ is constrained to have zeros in the positions that correspond to RRH-UE pairs such that the UE is not served by the given RRH. This constraint can be represented as

$$\tilde{\mathbf{W}} = \left[\mathbf{E}_1^c \tilde{\mathbf{W}}_1^c, \ldots, \mathbf{E}_{N_U}^c \tilde{\mathbf{W}}_{N_U}^c\right], \tag{31}$$

where $\tilde{\mathbf{W}}_j^c$ is the $N_{t,\mathcal{B}_j} \times N_{r,j}$ precoding matrix intended for the $j$-th UE and the RRHs in the cluster $\mathcal{B}_j$, and the $N_t \times N_{t,\mathcal{B}_j}$ constant matrix $\mathbf{E}_j^c$ (whose entries are all either 0 or 1) defines the association between the RRHs and the UEs as $\mathbf{E}_j^c = [\mathbf{D}_{\mathcal{B}_j[1]}^c, \ldots, \mathbf{D}_{\mathcal{B}_j[|\mathcal{B}_j|]}^c]$, with the $N_r \times N_{r,j}$ matrix $\mathbf{D}_j^c$ having all zero elements except for the rows from $\sum_{k=1}^{j-1} N_{r,k} + 1$ to $\sum_{k=1}^{j} N_{r,k}$, which contain an $N_{r,j} \times N_{r,j}$ identity matrix.

The sequence of the $N_{t,i} \times N_{r,\mathcal{M}_i}$ precoding matrices $\tilde{\mathbf{W}}_i^r$ intended for the $i$-th RRH for all coherence times in the coding block is compressed by the BBU and forwarded over the fronthaul link to the $i$-th RRH. The compressed precoding matrix $\mathbf{W}_i^r$ for the $i$-th RRH is given by

$$\mathbf{W}_i^r = \tilde{\mathbf{W}}_i^r + \mathbf{Q}_{w,i}, \tag{32}$$

where the $N_{t,i} \times N_{r,\mathcal{M}_i}$ quantization noise matrix $\mathbf{Q}_{w,i}$ is assumed to have zero-mean i.i.d. $\mathcal{CN}(0, \sigma_{w,i}^2)$ entries and to be independent across the index $i$. Overall, the $N_t \times N_r$ compressed precoding matrix $\mathbf{W}$ for all RRHs is represented as

$$\mathbf{W} = \tilde{\mathbf{W}} + \mathbf{Q}_w, \tag{33}$$

where $\mathbf{W} = [\mathbf{E}_1^{r\dagger} \mathbf{W}_1^{r\dagger}, \ldots, \mathbf{E}_{N_R}^{r\dagger} \mathbf{W}_{N_R}^{r\dagger}]^\dagger$, and $\tilde{\mathbf{W}}$ and $\mathbf{Q}_w$ are similarly defined.

Similar to Eq. (24), an ergodic rate achievable for the $j$-th UE can be written as $\mathrm{E}[R_j^{\mathrm{alt}}(\mathbf{H}, \tilde{\mathbf{W}}, \boldsymbol{\sigma}_w^2)]$, where

$$R_j^{\mathrm{alt}}\left(\mathbf{H}, \tilde{\mathbf{W}}, \boldsymbol{\sigma}_w^2\right) = \frac{1}{T} I_{\mathbf{H}}(\mathbf{S}_j; \mathbf{Y}_j) = \log\det\left(\mathbf{I} + \mathbf{H}_j \left(\tilde{\mathbf{W}} \tilde{\mathbf{W}}^\dagger + \boldsymbol{\Omega}_w\right) \mathbf{H}_j^\dagger\right) - \log\det\left(\mathbf{I} + \mathbf{H}_j \left(\sum_{k \in \mathcal{N}_U \setminus j} \tilde{\mathbf{W}}_k^c \tilde{\mathbf{W}}_k^{c\dagger} + \boldsymbol{\Omega}_w\right) \mathbf{H}_j^\dagger\right). \tag{34}$$
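The clustering notation above can be made concrete with a short sketch that derives the sets $\mathcal{B}_j$ from given sets $\mathcal{M}_i$ and builds the zero pattern that the constraint in Eq. (31) imposes on the $N_t \times N_r$ precoding matrix. The function names, the 0-based indices and the toy assignment are assumptions made for illustration; they are not part of the chapter.

```python
import numpy as np


def serving_rrhs(ue_sets, n_ues):
    """B_j = {i : j in M_i}: RRHs serving each UE, given the per-RRH UE sets M_i."""
    return [[i for i, ues in enumerate(ue_sets) if j in ues] for j in range(n_ues)]


def precoder_mask(ue_sets, n_t_list, n_r_list):
    """0/1 mask of the N_t x N_r precoder: 1 where an entry may be non-zero."""
    row_start = np.concatenate(([0], np.cumsum(n_t_list)))
    col_start = np.concatenate(([0], np.cumsum(n_r_list)))
    mask = np.zeros((row_start[-1], col_start[-1]), dtype=int)
    for i, ues in enumerate(ue_sets):
        for j in ues:
            # Block of RRH i (rows) and UE j (columns) is allowed to be non-zero.
            mask[row_start[i]:row_start[i + 1], col_start[j]:col_start[j + 1]] = 1
    return mask


# Toy assignment (assumed): RRH 0 serves UEs {0, 1}, RRH 1 serves UEs {1, 2}.
ue_sets = [{0, 1}, {1, 2}]
print(serving_rrhs(ue_sets, n_ues=3))
print(precoder_mask(ue_sets, n_t_list=[2, 2], n_r_list=[1, 1, 1]))
```

In this toy case the middle UE is served by both RRHs, so only its column of the mask is fully populated, whereas the other two columns contain a zero block for the RRH that does not serve the corresponding UE.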

2) Instantaneous CSI:

With perfect CSI at the BBU, as discussed in Sec. IV-B2, one can adapt the precoding matrix $\tilde{\mathbf{W}}(\mathbf{H})$, the user rates $\{R_j(\mathbf{H})\}$ and the quantization noise variances $\boldsymbol{\sigma}_w^2(\mathbf{H})$ to the current channel realization at each coherence block. The rate required to transmit precoding information on the $i$-th fronthaul link for a given channel realization $\mathbf{H}$ is given by $C_i(\mathbf{H}, \tilde{\mathbf{W}}_i^r, \sigma_{w,i}^2)/T$, with

$$\frac{1}{T} C_i\left(\mathbf{H}, \tilde{\mathbf{W}}_i^r, \sigma_{w,i}^2\right) = \frac{1}{T} I_{\mathbf{H}}\left(\tilde{\mathbf{W}}_i^r; \mathbf{W}_i^r\right) = \frac{1}{T} \left\{\log\det\left(\mathbf{D}_i^{rT} \tilde{\mathbf{W}} \tilde{\mathbf{W}}^\dagger \mathbf{D}_i^r + \sigma_{w,i}^2 \mathbf{I}\right) - N_{t,i} \log\left(\sigma_{w,i}^2\right)\right\}, \tag{35}$$

where the rate $C_i(\tilde{\mathbf{W}}_i^r, \sigma_{w,i}^2)$ required on the $i$-th fronthaul link is defined in Eq. (23). Note that the normalization by $T$ is needed since only a single precoding matrix is needed for each channel coherence interval. Then, under the fronthaul capacity constraint, the remaining fronthaul capacity that can be used to convey precoding information corresponding to the $i$-th RRH is $\bar{C}_i - \sum_{j \in \mathcal{M}_i} R_j$. As a result, the optimization problem of interest can be formulated as

$$\begin{aligned}
\underset{\tilde{\mathbf{W}}(\mathbf{H}),\, \boldsymbol{\sigma}_w^2(\mathbf{H}),\, \{R_j(\mathbf{H})\}}{\text{maximize}} \quad & \sum_{j \in \mathcal{N}_U} \mu_j R_j(\mathbf{H}) && \text{(36a)} \\
\text{s.t.} \quad & R_j(\mathbf{H}) \leq R_j^{\mathrm{alt}}\left(\mathbf{H}, \tilde{\mathbf{W}}(\mathbf{H}), \boldsymbol{\sigma}_w^2(\mathbf{H})\right), && \text{(36b)} \\
& \frac{1}{T} C_i\left(\mathbf{H}, \tilde{\mathbf{W}}_i^r(\mathbf{H}), \sigma_{w,i}^2(\mathbf{H})\right) \leq \bar{C}_i - \sum_{j \in \mathcal{M}_i} R_j(\mathbf{H}), && \text{(36c)} \\
& P_i\left(\tilde{\mathbf{W}}_i^r(\mathbf{H}), \sigma_{w,i}^2(\mathbf{H})\right) \leq \bar{P}_i, && \text{(36d)}
\end{aligned}$$

where the constraints apply to all channel realizations, Eq. (36b) applies to all $j \in \mathcal{N}_U$, Eq. (36c)-(36d) apply to all $i \in \mathcal{N}_R$, and the transmit power $P_i(\tilde{\mathbf{W}}_i^r(\mathbf{H}), \sigma_{w,i}^2(\mathbf{H}))$ at the $i$-th RRH is defined in Eq. (22). Similar to Sec. IV-B2, the problem in Eq. (36) can be solved for each channel realization $\mathbf{H}$ independently. In addition, each subproblem can be tackled by using the MM algorithm [24].
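To illustrate how the constraint in Eq. (36c) splits the fronthaul capacity between precoding information and data, the sketch below evaluates the overhead of Eq. (35) for one RRH and the capacity left for the information streams as the coherence time $T$ grows. The function names, the all-ones precoder block and the numerical values are hypothetical, and base-2 logarithms are assumed.

```python
import numpy as np


def precoding_overhead(W_i, sigma2, T):
    """Eq. (35): fronthaul rate spent on precoding information, normalized by T.

    W_i plays the role of the rows of the precoder that belong to one RRH.
    """
    n = W_i.shape[0]
    cov = W_i @ W_i.conj().T + sigma2 * np.eye(n)
    bits = np.linalg.slogdet(cov)[1] / np.log(2) - n * np.log2(sigma2)
    return float(bits) / T


def data_budget(W_i, sigma2, T, c_bar):
    """Capacity left for the data streams of the UEs served by this RRH, from Eq. (36c)."""
    return c_bar - precoding_overhead(W_i, sigma2, T)


W_i = np.ones((2, 2))  # assumed precoder block of a 2-antenna RRH
for T in (5, 10, 20, 40):
    print(T, round(data_budget(W_i, sigma2=0.5, T=T, c_bar=2.0), 3))
```

The budget available for data increases with $T$ because a single precoding matrix is conveyed per coherence interval, which is the amortization effect that the numerical results of Sec. IV-D discuss.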

3) Stochastic CSI:

With stochastic CSI at the BBU, the same precoding matrix is used for all the coherence blocks and hence the rate required to convey the precoding matrix $\tilde{\mathbf{W}}_i^r$ to the $i$-th RRH becomes negligible. As a result, we can neglect the effect of the quantization noise and set $\sigma_{w,i}^2 = 0$ for all $i \in \mathcal{N}_R$. Accordingly, the fronthaul capacity can be used to transfer the information streams under the constraint $\sum_{j \in \mathcal{M}_i} R_j \leq \bar{C}_i$, for all $i \in \mathcal{N}_R$. Based on the above considerations, the optimization problem of interest is formulated as

$$\begin{aligned}
\underset{\tilde{\mathbf{W}},\, \{R_j\}}{\text{maximize}} \quad & \sum_{j \in \mathcal{N}_U} \mu_j R_j && \text{(37a)} \\
\text{s.t.} \quad & R_j \leq \mathrm{E}\left[R_j^{\mathrm{alt}}\left(\mathbf{H}, \tilde{\mathbf{W}}, \mathbf{0}\right)\right], && \text{(37b)} \\
& \sum_{j \in \mathcal{M}_i} R_j \leq \bar{C}_i, && \text{(37c)} \\
& P_i\left(\tilde{\mathbf{W}}_i^r, 0\right) \leq \bar{P}_i, && \text{(37d)}
\end{aligned}$$

where Eq. (37b) applies to all $j \in \mathcal{N}_U$, Eq. (37c)-(37d) apply to all $i \in \mathcal{N}_R$, and the transmit power $P_i(\tilde{\mathbf{W}}_i^r, \sigma_{w,i}^2)$ at the $i$-th RRH is defined in Eq. (22). In the problem in Eq. (37), the constraint in Eq. (37b) is not only non-convex but also stochastic. Similar to Sec. IV-B3, the functions $R_j^{\mathrm{alt}}(\mathbf{H}, \tilde{\mathbf{W}})$ are DC functions of the covariance matrices $\tilde{\mathbf{V}}_j = \tilde{\mathbf{W}}_j^c \tilde{\mathbf{W}}_j^{c\dagger}$ for all $j \in \mathcal{N}_U$, hence opening up the possibility to develop a solution based on SSUM. We refer to [26] for details on the resulting algorithm.

D. Numerical Results

In this section, we compare the performance of the conventional approach and the alternative split. To this end, we consider RRHs and UEs to be randomly located in a square area with side $\delta = 500$ m as in Fig. 2. As in Sec. III-D, in the path loss formula Eq. (4), we set the reference distance to 50 m and the path loss exponent to $\eta = 3$. We assume the spatial correlation model in Eq. (20) with the angular spread $\Delta_{ji} = \arctan(r_s / d_{ji})$, with the scattering radius $r_s = 10$ m and with $d_{ji}$ being the Euclidean distance between the $i$-th RRH and the $j$-th UE. Throughout, we assume that every RRH is subject to the same power constraint $\bar{P}$ and has the same fronthaul capacity $\bar{C}$; that is, $\bar{P}_i = \bar{P}$ and $\bar{C}_i = \bar{C}$ for $i \in \mathcal{N}_R$. Moreover, in the alternative split scheme, the UE-to-RRH assignment is carried out by choosing, for each RRH, the $N_c$ UEs that have the largest instantaneous channel norms for instantaneous CSI and the largest average channel matrix norms for stochastic CSI. Note that this assignment is done for each coherence block in the former case, while in the latter the same assignment holds for all coherence blocks. Note also that a given UE is generally assigned to multiple RRHs.

Fig. 8. Ergodic achievable sum-rate vs. the fronthaul capacity $\bar{C}$ ($N_R = N_U = 4$, $N_{t,i} = 2$, $N_{r,j} = 1$, $\bar{P} = 10$ dB, $T = 20$, and $\mu_j = 1$). Axes: fronthaul capacity $\bar{C}$ (bits/s/Hz) against ergodic achievable sum-rate (bits/s/Hz). Curves: conventional and alternative ($N_c = 2$, $N_c = 4$), each with instantaneous and stochastic CSI.

The effect of the fronthaul capacity limitations on the ergodic achievable sum-rate is investigated in Fig. 8, where the number of RRHs and UEs is $N_R = N_U = 4$, the number of transmit antennas is $N_{t,i} = 2$ for all $i \in \mathcal{N}_R$, the number of receive antennas is $N_{r,j} = 1$ for all $j \in \mathcal{N}_U$, the power is $\bar{P} = 10$ dB, and the coherence time is $T = 20$. We first observe that, with instantaneous CSI, the conventional approach is uniformly better than the alternative split as long as the fronthaul capacity is sufficiently large. This is due to the enhanced interference mitigation capabilities of the conventional approach resulting from its ability to coordinate all the RRHs via joint baseband processing without requiring the transmission of all messages on all fronthaul links. Note, in fact, that, with the alternative split, only $N_c$ UEs are served by each RRH, and that making $N_c$ larger entails a significant increase in the fronthaul capacity requirements. We will later see that this advantage of the conventional approach is offset by the higher fronthaul efficiency of the alternative split in transmitting precoding information for large coherence periods $T$ (see Fig. 9). With stochastic CSI, instead, the alternative split is generally advantageous in the low fronthaul capacity regime, due to the additional gain that is accrued by amortizing the precoding overhead over the entire coding block. Another observation is that, for small $\bar{C}$, the alternative split schemes with progressively smaller $N_c$ have better performance thanks to the reduced fronthaul overhead. Moreover, for large $\bar{C}$, the performance of the alternative split scheme with $N_c = N_U$, whereby each RRH serves all UEs, approaches that of the conventional scheme.

Fig. 9. Ergodic achievable sum-rate vs. the coherence time $T$ ($N_R = N_U = 4$, $N_{t,i} = 2$, $N_{r,j} = 1$, $\bar{C} = 2$ bits/s/Hz, $\bar{P} = 20$ dB, and $\mu_j = 1$). Axes: coherence time $T$ against ergodic achievable sum-rate (bits/s/Hz). Curves: conventional and alternative ($N_c = 2$, $N_c = 3$), each with instantaneous and stochastic CSI.

Fig. 9 shows the ergodic achievable sum-rate as a function of the coherence time $T$, with $N_R = N_U = 4$, $N_{t,i} = 2$, $N_{r,j} = 1$, $\bar{C} = 2$ bits/s/Hz, and $\bar{P} = 20$ dB. As anticipated, with instantaneous CSI, the alternative split is seen to benefit from a larger coherence time $T$, since the fronthaul overhead required to transmit precoding information gets amortized over a larger period. This is in contrast to the conventional approach, for which such overhead scales proportionally to the coherence time $T$ and hence the conventional scheme is not affected by the coherence time. As a result, the alternative split can outperform the conventional approach for sufficiently large $T$ in the presence of instantaneous CSI. With stochastic CSI, the effect is even more pronounced due to the additional advantage that is accrued by amortizing the precoding overhead over the entire coding block.

Fig. 10. Ergodic achievable sum-rate vs. the number of UEs $N_U$ ($N_R = 4$, $N_{t,i} = 2$, $N_{r,j} = 1$, $\bar{C} = 4$ bits/s/Hz, $\bar{P} = 10$ dB, $T = 10$, and $\mu_j = 1$). Axes: number of UEs $N_U$ against ergodic achievable sum-rate (bits/s/Hz). Curves: conventional and alternative ($N_c = 2$, $N_c = 3$), each with instantaneous and stochastic CSI.

Finally, in Fig. 10, the ergodic achievable sum-rate is plotted versus the number of UEs $N_U$ for $N_R = 4$, $N_{t,i} = 2$, $N_{r,j} = 1$, $\bar{C} = 4$ bits/s/Hz, $\bar{P} = 10$ dB and $T = 10$. It is observed that the enhanced interference mitigation capabilities of the conventional approach, without the overhead associated with the transmission of all messages on the fronthaul links, yield performance gains for denser C-RANs, i.e., for larger values of $N_U$. This remains true for both the instantaneous and stochastic CSI cases.

V. CONCLUDING REMARKS

In this chapter, we have investigated two important aspects that pertain to the optimal functional split between RRH and BBU at the PHY layer, namely whether uplink channel estimation and downlink encoding/precoding should be implemented at the RRH or at the BBU. The analysis, based on information-theoretical arguments, and numerical results, built on proposed efficient design algorithms, yield insight into the configurations of network architecture, channel variability and fronthaul capacities in which different functional splits are advantageous. Among the main conclusions, we have argued that the alternative functional split in which uplink channel estimation is performed at the RRH is to be preferred for low or moderate values of the coherence period and fronthaul capacity, mostly for its capability to enable adaptive quantization based on the channel conditions. Moreover, the alternative functional split in which downlink encoding and precoding are carried out at the RRH is beneficial for lightly loaded networks in the presence of slowly changing channels, particularly under the assumption of stochastic CSI, due to its reduced fronthaul overhead.

We close this chapter with some remarks on further related topics and open issues. For the uplink, an aspect that deserves further study is the integration of distributed source coding techniques (or Wyner-Ziv coding) with fronthaul processing for the joint transfer of CSI and data (see [24] for some initial discussion). Analogously, for the downlink, the impact of joint, or multivariate, compression, as proposed in [24], on the optimal functional split in the presence of different degrees of CSI at the BBU is an interesting open problem. Finally, the analysis of alternative RRH-BBU functional splits in conjunction with structured coding, or compute-and-forward, techniques calls for further attention (see [33] and references therein).

REFERENCES

[1] H. Bo, V. Gopalakrishnan, L. Ji, and S. Lee, “Network function virtualization: Challenges and opportunities for innovations,” IEEE Comm. Mag., vol. 53, no. 2, pp. 90–97, Feb. 2015.
[2] H. Al-Raweshidy and S. Komaki, “Radio over fiber technologies for mobile communications networks,” Artech House, 2002.
[3] Ericsson AB, Huawei Technologies, NEC Corporation, Alcatel Lucent, and Nokia Siemens Networks, “Common public radio interface (CPRI); interface specification,” CPRI specification v5.0, Sep. 2011.
[4] A. Checko, H. L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. S. Berger, and L. Dittmann, “Cloud RAN for mobile networks - a technology overview,” IEEE Communications Surveys and Tutorials, vol. 17, no. 1, pp. 405–426, First quarter 2015.
[5] Integrated Device Technology, “Front-haul compression for emerging C-RAN and small cell networks,” White Paper, Integrated Device Technology, Inc, Apr. 2013.
[6] Fujitsu, “The benefits of cloud-RAN architecture in mobile network expansion,” 2015.
[7] D. Samardzija, J. Pastalan, M. MacDonald, S. Walker, and R. Valenzuela, “Compressed transport of baseband signals in radio access networks,” IEEE Trans. Wireless Comm., vol. 11, no. 9, pp. 3216–3225, Sep. 2012.
[8] B. Guo, W. Cao, A. Tao, and D. Samardzija, “CPRI compression transport for LTE and LTE-A signal in C-RAN,” Proc. Int. ICST Conf. CHINACOM, pp. 843–839, 2012.
[9] K. F. Nieman and B. L. Evans, “Time-domain compression of complex-baseband LTE signals for cloud radio access networks,” Proc. IEEE Glob. Conf. on Sig. and Inf. Proc., pp. 1198–1201, Dec. 2013.
[10] J. Lorca and L. Cucala, “Lossless compression technique for the fronthaul of LTE/LTE-advanced cloud-RAN architectures,” Proc. IEEE Int. Symp. World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1–9, 2013.
[11] S. Grieger, S. Boob, and G. Fettweis, “Large scale field trial results on frequency domain compression for uplink joint detection,” Proc. IEEE Glob. Comm. Conf., pp. 1128–1133, 2012.
[12] A. Vosoughi, M. Wu, and J. R. Cavallaro, “Baseband signal compression in wireless base stations,” Proc. IEEE Glob. Comm. Conf., pp. 4505–4511, 2012.
[13] U. Dotsch, M. Doll, H. P. Mayer, F. Schaich, J. Segel, and P. Sehier, “Quantitative analysis of split base station processing and determination of advantageous architectures for LTE,” Bell Labs Technical Journal, vol. 18, no. 1, pp. 105–128, 2013.
[14] D. Wubben, P. Rost, J. Bartelt, M. Lalam, V. Savin, M. Gorgoglione, A. Dekorsy, and G. Fettweis, “Benefits and impact of cloud computing on 5G signal processing: Flexible centralization through cloud-RAN,” IEEE Sig. Proc. Mag., vol. 31, no. 6, pp. 35–44, Nov. 2014.
[15] H. S. Witsenhausen, “Indirect rate distortion problems,” IEEE Trans. Info. Th., vol. 26, no. 5, pp. 518–521, Sep. 1980.
[16] J. Hoydis, M. Kobayashi, and M. Debbah, “Optimal channel training in uplink network MIMO systems,” IEEE Trans. Sig. Proc., vol. 59, no. 6, pp. 2824–2833, Jun. 2011.
[17] J. Kang, O. Simeone, J. Kang, and S. Shamai, “Joint signal and channel state information compression for the backhaul of uplink network MIMO systems,” IEEE Trans. Wireless Comm., vol. 13, no. 3, pp. 1555–1567, Mar. 2014.
[18] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Info. Th., vol. 49, no. 4, pp. 951–963, Apr. 2003.
[19] A. E. Gamal and Y.-H. Kim, Network Information Theory. Cambridge University Press, 2011.
[20] R. Zamir and M. Feder, “On lattice quantization noise,” IEEE Trans. Info. Th., vol. 42, no. 4, pp. 1152–1159, Jul. 1996.
[21] E. Bjornson and B. E. Ottersten, “A framework for training-based estimation in arbitrarily correlated Rician MIMO channels with Rician disturbance,” IEEE Trans. Sig. Proc., vol. 58, no. 3, pp. 1807–1820, Mar. 2010.
[22] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[23] O. Simeone, O. Somekh, H. V. Poor, and S. Shamai, “Downlink multicell processing with limited-backhaul capacity,” EURASIP Jour. Adv. Sig. Proc., 2009.
[24] S.-H. Park, O. Simeone, O. Sahin, and S. Shamai, “Joint precoding and multivariate backhaul compression for the downlink of cloud radio access networks,” IEEE Trans. Sig. Proc., vol. 61, no. 22, pp. 5646–5658, Nov. 2013.
[25] S. Park, C.-B. Chae, and S. Bahk, “Before/after precoded massive MIMO in cloud radio access networks,” Proc. IEEE Int. Conf. on Comm., Jun. 2013.
[26] J. Kang, O. Simeone, J. Kang, and S. Shamai, “Fronthaul compression and precoding design for C-RANs over ergodic fading channel,” arXiv:1412.7713.
[27] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing: The large-scale array regime,” IEEE Trans. Info. Th., vol. 59, no. 10, pp. 6441–6463, Oct. 2014.
[28] M. Razaviyayn, M. Sanjabi, and Z.-Q. Luo, “A stochastic successive minimization method for nonsmooth nonconvex optimization with applications to transceiver design in wireless communication networks,” arXiv:1307.4457.
[29] Y. Yang, G. Scutari, and D. P. Palomar, “Parallel stochastic decomposition algorithms for multi-agent systems,” Proc. IEEE Workshop on Sign. Proc. Adv. in Wireless Comm., pp. 180–184, Jun. 2013.
[30] D. R. Hunter and K. Lange, “A tutorial on MM algorithms,” The American Statistician, vol. 58, no. 1, pp. 30–37, Feb. 2004.
[31] A. Beck and M. Teboulle, “Gradient-based algorithms with applications to signal recovery problems,” in Convex Optimization in Signal Processing and Communications, Y. Eldar and D. Palomar, editors, pp. 42–48, Cambridge University Press, 2010.
[32] L. Vandenberghe and S. Boyd, “Semidefinite relaxation of quadratic optimization problems,” SIAM Rev., vol. 38, no. 1, pp. 49–95, 1996.
[33] B. Nazer, V. Cadambe, V. Ntranos, and G. Caire, “Expanding the compute-and-forward framework: Unequal powers, signal levels, and multiple linear combinations,” arXiv:1504.01690.