Deep Learning Based FDD Non-Stationary Massive MIMO Downlink Channel Reconstruction

Abstract

This paper proposes a model-driven deep learning-based downlink channel reconstruction scheme for frequency division duplexing (FDD) massive multi-input multi-output (MIMO) systems. The spatial non-stationarity, which is the key feature of the future extremely large aperture massive MIMO system, is considered. Instead of the channel matrix, the channel model parameters are learned by neural networks to save overhead and improve the accuracy of channel reconstruction. By viewing the channel as an image, we introduce You Only Look Once (YOLO), a powerful neural network for object detection, to enable rapid estimation of the model parameters, including the detection of angles and delays of the paths and the identification of visibility regions of the scatterers. The deep learning-based scheme avoids the complicated iterative process introduced by algorithm-based parameter extraction methods. A low-complexity algorithm-based refiner further refines the YOLO estimates toward high accuracy. Given the efficiency of model-driven deep learning and the combination of neural network and algorithm, the proposed scheme can rapidly and accurately reconstruct the non-stationary downlink channel. Moreover, the proposed scheme is also applicable to the widely studied stationary systems and achieves reconstruction accuracy comparable to that of an algorithm-based method with greatly reduced time consumption.



Yu Han, Mengyuan Li, Student Member, IEEE, Shi Jin, Senior Member, IEEE, Chao-Kai Wen, Member, IEEE, and Xiaoli Ma, Fellow, IEEE


Index Terms — Deep learning, FDD massive MIMO, non-stationary, visibility region.

I. INTRODUCTION

The acquisition of downlink channel state information (CSI) in frequency division duplex (FDD) massive multi-input multi-output (MIMO) systems has been a long-standing problem that troubles the mobile communication industry [1]–[3]. Without reciprocity between the uplink and downlink, the downlink CSI has to be obtained through downlink training and feedback, causing a large amount of overhead. Recently, studies have suggested utilizing the spatial reciprocity [4] to reduce the cost of downlink CSI acquisition. The uplink and downlink channels have a similar spatial domain given that they share the same space and scatterers. Therefore, part of the downlink CSI can be derived from the uplink CSI.

A. Related work

Many related works have been developed to estimate or reconstruct the FDD massive MIMO downlink channels under different channel models. For clustering channels, where a continuous spatial region has distinct power, the correlation matrix is introduced to describe the power distribution of the channel in the spatial domain [5]–[7]. The downlink correlation matrix can be derived from the uplink, and only the downlink instantaneous CSI needs to be estimated in the downlink. For limited scattering channels, the angle and delay of each propagation path are common to the uplink and downlink, and only the downlink gains need to be estimated in the downlink [8]–[11]. These spatial reciprocity-based methods effectively ease the burden of downlink training and feedback and have great potential for future use. However, spatial reciprocity does not imply that the downlink CSI can be completely derived from the uplink; some downlink training and feedback overhead is still required.

In recent years, the rapid development of deep learning techniques has stimulated their wide application to various areas, including localization [12] and FDD downlink channel estimation or prediction [13]–[19]. Most of these methods are based on the assumption that a mapping function exists between the uplink and the downlink channels, which can be conveniently learned by deep networks instead of traditional algorithms. With the mapping function, the downlink channel matrix can be directly predicted from the uplink channel matrix [13]–[17]. The channel matrix of the massive MIMO multicarrier system can also be illustrated as an image. Treating the channel as an image is an interesting strategy [20], [21], which enables the application of advanced deep learning-based image processing methods. For the downlink channel prediction problem, the uplink and downlink channel images are stacked together into a large image [15]. With the uplink channel, the base station (BS) draws one half of the image. The other half of the image, which is initially blank, is the downlink channel to be predicted. The downlink channel prediction method works as a painter that completes the other half of the image through image processing methods, such as generative adversarial networks. The methods in [13]–[17] do not require feedback, thereby attracting the interest of the industry. However, the assumption of a mapping function between the uplink and the downlink channels is invalid in complicated multipath propagation scenarios [9].

To address this problem, the downlink channel is estimated with minor feedback in [18]. The downlink channel matrix obtained at the user side is first encoded, sent to the BS, and then decoded with the aid of the uplink channel matrix. Besides, the downlink subchannel on subarray A (denoted by $\mathbf{H}_A$) is correlated with that on subarray B (denoted by $\mathbf{H}_B$) when spatial stationarity exists. If $\mathbf{H}_B$ is obtained, then $\mathbf{H}_A$ can be learned from $\mathbf{H}_B$, at the cost of the downlink training and feedback overhead needed to acquire $\mathbf{H}_B$ [19]. These methods are applicable in practice. However, they ignore the channel model and directly predict the channel matrix. Under multipath propagation conditions, the accuracy of estimation is affected if the compression rate is low or the scale of subarray B is much smaller than that of subarray A. Conversely, scaling up the compression rate or the size of subarray B further increases the overhead. This contradiction limits the performance of these data-driven methods. Therefore, referring to the channel model is necessary to increase the efficiency of deep learning-based channel estimation.

In massive MIMO systems, where the scale of the antenna array is extremely large, the signal reflected by a scatterer does not arrive at the entire array, and the channel begins to show spatial non-stationarity [22]–[24]. Non-stationarity is a distinct feature of future massive MIMO systems, where the array may be widely spread over the wall of a building. The non-stationary channel is more complicated than the stationary one because the visibility region of each scatterer, that is, the part of the array that can receive signals from the scatterer, should also be considered in the channel model. Thus, estimating the downlink channel of an FDD non-stationary massive MIMO system is challenging. The methods in [5]–[10], [13]–[19] are designed for FDD stationary massive MIMO systems. The study of the estimation of FDD non-stationary massive MIMO downlink channels is limited.

Author affiliations: Y. Han, M. Li, and S. Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing, 210096, P. R. China. C.-K. Wen is with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan. X. Ma is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA.

B. Contribution of this paper

We focus on the downlink channel reconstruction of FDD non-stationary massive MIMO systems. In accordance with the multipath channel model, the downlink channel can be reconstructed from the downlink gains, angles, delays, and visibility regions of the propagation paths. We acquire the frequency-independent parameters, including the angles, delays, and visibility regions, from the uplink and then estimate the downlink gains from the downlink given the spatial reciprocity. If iteration-based algorithms are applied to estimate the frequency-independent parameters, then the complexity grows explosively. To tackle this problem, we propose a model-driven deep learning-based downlink channel reconstruction scheme, which has the following advantages.

1) Power of using You Only Look Once (YOLO):

YOLO, a fast object detection neural network that detects all the objects by looking at the image only once, can effectively tackle the problem of explosive complexity. We introduce YOLO to detect each path with much reduced processing time compared with using iteration-based algorithms [9]. With the bounding boxes designed in this study, the frequency-independent parameters of each path can be conveniently obtained.

2) Efficiency of model-driven deep learning:

We do not follow the data-driven methods that learn the downlink channel matrix; instead, we learn the parameters of the paths in the channel. Driven by the channel model, the number of coefficients to be estimated is much smaller than in the data-driven methods. Accordingly, the downlink training and feedback overhead is greatly reduced, and the accuracy of the reconstruction is guaranteed.

3) Ability to identify visibility regions:

The visibility region of each scatterer consists of one or several subarrays that receive the signal reflected by the scatterer. Two algorithms are proposed to identify the visibility regions with different approaches. Either approach achieves a success ratio of more than 98%.

4) Refinement of estimates:

A low-complexity refinement module is introduced to refine the estimates of angles and delays. After refinement, the normalized mean square error (NMSE) of uplink channel reconstruction is reduced from − dB to − dB at signal-to-noise ratio (SNR) = 0 dB.

5) Applicability to stationary cases:

The proposed scheme also works in the reduced case of FDD stationary massive MIMO systems, on which most existing works focus, and its NMSE performance is close to that of the algorithm-based reconstruction. On the basis of the stationarity of each subarray, an alternative scheme for non-stationary systems is further formulated and evaluated. Nevertheless, the proposed scheme is shown to be more efficient than the alternative one.

In the following section, we first introduce the system model. The rationale of deep learning-based parameter estimation and the working steps of the proposed scheme are provided in Sections III and IV, respectively. Section V evaluates the scheme and Section VI concludes the paper.

Notations: We denote scalars by letters in normal fonts, and use uppercase and lowercase boldface letters to represent matrices and vectors, respectively. The superscripts $(\cdot)^*$, $(\cdot)^T$, and $(\cdot)^H$ indicate conjugate, transpose, and conjugate transpose, respectively. $\mathbb{E}\{\cdot\}$ denotes the expectation with respect to the random variables inside the brackets. $\odot$ and $\otimes$ denote the Hadamard and Kronecker products, respectively. We also denote the absolute value and modulus operations by $|\cdot|$ and $\|\cdot\|$, and use $\lfloor\cdot\rfloor$ and $\lceil\cdot\rceil$ to round a decimal number to its nearest lower and higher integers, respectively. $[\mathbf{A}]_{i,:}$, $[\mathbf{A}]_{:,j}$, and $[\mathbf{A}]_{i,j}$ represent the $i$th row, the $j$th column, and the $(i,j)$th entry of matrix $\mathbf{A}$.

II. SYSTEM MODEL

In a cell of the FDD massive MIMO system, the BS is located at the cell center and equipped with an $M$-element uniform linear array (ULA), where $M$ is large. The distance between two adjacent ULA elements is $d$. Single-antenna users are randomly distributed in the cell. The reconstruction of each user channel is conducted independently; therefore, we focus on a single user.

The system works in the FDD duplexing mode. The uplink and downlink carrier frequencies are $f^{\rm ul}$ and $f^{\rm dl}$, respectively. The uplink and downlink carrier wavelengths are approximately equal and unified as $\lambda$ given $|f^{\rm ul} - f^{\rm dl}| \ll f^{\rm ul}, f^{\rm dl}$. Orthogonal frequency division multiplexing (OFDM) is applied. The band has $N$ subcarriers with spacing $\Delta f$ between two adjacent subcarriers.

Fig. 1. Spatial non-stationarity. Path 1 arrives at subarrays 1 and 2, whereas path 2 arrives at subarray $S$.

A. Non-stationarity

The channel between the BS and the user comprises $L$ paths, corresponding to $L$ scatterers. For the line-of-sight path, the scatterer is the user antenna itself. The ULA at the BS experiences spatial non-stationarity due to the large aperture of the array. Signals reflected by a scatterer may arrive at the entire ULA or at a part of the ULA, as shown in Fig. 1.

The ULA is uniformly segmented into $S$ subarrays, each with $M/S$ elements. The set of adjacent subarrays that can see scatterer $l$ is defined as the visibility region of scatterer $l$, denoted as follows:

$$\Phi_l = \{ s_{l,\mathrm{start}},\ s_{l,\mathrm{start}} + 1,\ \ldots,\ s_{l,\mathrm{end}} - 1,\ s_{l,\mathrm{end}} \}, \tag{1}$$

where $s_{l,\mathrm{start}}$ and $s_{l,\mathrm{end}}$ are the first and last subarrays that can receive signals reflected from scatterer $l$, respectively, satisfying $1 \le s_{l,\mathrm{start}} \le s_{l,\mathrm{end}} \le S$.

Similarly, the visibility region of subarray $s$ includes the scatterers that can see subarray $s$, denoted as follows:

$$\Psi_s = \{ l_{s,1},\ l_{s,2},\ \ldots,\ l_{s,L_s} \}, \tag{2}$$

where $1 \le l_{s,i} \le L$ holds for $i = 1, \ldots, L_s$ and $L_s$ is the number of scatterers that can reflect signals to subarray $s$, satisfying $0 \le L_s \le L$.

The example of Fig. 1 illustrates the visibility regions. For the scatterers, $\Phi_1 = \{1, 2\}$ and $\Phi_2 = \{S\}$, and for the subarrays, $\Psi_1 = \Psi_2 = \{1\}$ and $\Psi_S = \{2\}$.

B. Spatial reciprocity

Although the uplink and downlink channels are in different frequency bands, they share the space and the scatterers. On the basis of this spatial reciprocity, the delay and angle of the $l$th path, as well as the visibility region of the $l$th scatterer, are frequency-independent and identical in the uplink and downlink. We denote $\tau_l$ and $\theta_l$ as the delay and angle of the $l$th path, respectively, satisfying $0 \le \tau_l \le 1/\Delta f$ and $0 \le \theta_l \le \pi$. The frequency-independent parameters are $\tau_l$, $\theta_l$, and $\Phi_l$, where $l = 1, \ldots, L$.

Footnote: In practical systems, $\tau_l$ should not be greater than the cyclic-prefix length. Here, we relax this restriction by assuming that the cyclic-prefix length is equal to the symbol length.

When reflection or scattering occurs, the phase shift differs between the uplink and downlink due to the different carrier frequencies. Consequently, the complex gains are frequency-dependent and different in the uplink and downlink. We denote $g_l^{\rm ul}$ and $g_l^{\rm dl}$ as the uplink and downlink complex gains of the $l$th path, respectively, which are different from each other.

C. Channel model

In the baseband, the frequency of the first subcarrier of the downlink OFDM module is regarded as 0 Hz, and that of the uplink OFDM module is $f^{\rm ul} - f^{\rm dl}$. The non-stationary downlink channel between the BS and the user across all antennas and subcarriers is modeled as

$$\mathbf{H}^{\rm dl} = \sum_{l=1}^{L} g_l^{\rm dl} \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right) \mathbf{q}^T(\Gamma_l), \tag{3}$$

where $\mathbf{H}^{\rm dl} \in \mathbb{C}^{M \times N}$ is in the antenna subcarrier domain,

$$\Theta_l = \frac{d}{\lambda} \sin\theta_l, \qquad \Gamma_l = \Delta f\, \tau_l, \tag{4}$$

simplify the expressions and are frequency-independent, satisfying $0 \le \Theta_l \le 1$ and $0 \le \Gamma_l \le 1$,

$$\mathbf{a}(\Theta) = \left[ 1,\ e^{j2\pi\Theta},\ \ldots,\ e^{j2\pi(M-1)\Theta} \right]^T \tag{5}$$

is the steering vector of the ULA, $\mathbf{p}(\Phi) \in \mathbb{Z}^{M \times 1}$ selects the ULA elements that are in the subarrays in $\Phi$, with $m$th entry

$$[\mathbf{p}(\Phi)]_m = \begin{cases} 1, & \text{if } \lceil mS/M \rceil \in \Phi, \\ 0, & \text{else}, \end{cases} \tag{6}$$

and

$$\mathbf{q}(\Gamma) = \left[ 1,\ e^{j2\pi\Gamma},\ \ldots,\ e^{j2\pi(N-1)\Gamma} \right]^T \tag{7}$$

is the phase shift vector across the OFDM subcarriers.

Given the spatial reciprocity, the uplink baseband channel is expressed as

$$\mathbf{H}^{\rm ul} = \sum_{l=1}^{L} g_l^{\rm ul} e^{j2\pi(f^{\rm ul} - f^{\rm dl})\tau_l} \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right) \mathbf{q}^T(\Gamma_l), \tag{8}$$

where $\mathbf{H}^{\rm ul} \in \mathbb{C}^{M \times N}$ is in the antenna subcarrier domain. We further define

$$\alpha_l = g_l^{\rm ul} e^{j2\pi(f^{\rm ul} - f^{\rm dl})\tau_l} \tag{9}$$

as the effective uplink gain of the $l$th path and simplify the uplink channel model (8) as

$$\mathbf{H}^{\rm ul} = \sum_{l=1}^{L} \alpha_l \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right) \mathbf{q}^T(\Gamma_l). \tag{10}$$

III. ACQUIRE MODEL PARAMETERS THROUGH LEARNING

We focus on the reconstruction of the downlink channel $\mathbf{H}^{\rm dl}$, which is a fundamental requirement to harvest the spatial multiplexing gain of the FDD massive MIMO downlink. Given the channel model (3), we can reconstruct the downlink channel with the model parameters, i.e., $\Theta_l$, $\Gamma_l$, $\Phi_l$, and $g_l^{\rm dl}$ of each path. Therefore, the acquisition of these model parameters becomes the primary task of downlink channel reconstruction. Notably, the number of paths (i.e., $L$) is also unknown.

On the basis of the spatial reciprocity, the model parameters are divided into two categories, that is, the frequency-independent parameters (i.e., $\Theta_l$, $\Gamma_l$, and $\Phi_l$) and the frequency-dependent parameters (i.e., $g_l^{\rm dl}$). We estimate the frequency-independent parameters in the uplink and acquire the frequency-dependent parameters through downlink training and feedback [9]. This method greatly relaxes the overhead requirement on the downlink training and reduces the feedback amount from $MN$ to $L$ complex numbers compared with traditional linear channel estimation methods, such as least squares (LS) and linear minimum mean square error (LMMSE) estimators, which can also be regarded as data-driven methods.

Fig. 2. Residues of pilots after each iteration of the NOMP algorithm, panels (a)–(d), where the horizontal and vertical axes represent delay and angle, respectively. Only one path is detected at each iteration. If three paths exist, then the NOMP algorithm requires three iterations to find all the paths.

The frequency-independent parameters are estimated during the uplink sounding phase. The uplink all-one pilots received by the BS across all antennas and subcarriers are expressed as

$$\mathbf{Y}^{\rm ul} = \sqrt{P^{\rm ul}} \sum_{l=1}^{L} \alpha_l \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right) \mathbf{q}^T(\Gamma_l) + \mathbf{Z}^{\rm ul}, \tag{11}$$

where $\mathbf{Y}^{\rm ul} \in \mathbb{C}^{M \times N}$ is in the antenna subcarrier domain, $P^{\rm ul}$ is the transmit power of the user, and $\mathbf{Z}^{\rm ul} \in \mathbb{C}^{M \times N}$ is the uplink complex Gaussian noise whose elements are independent and identically distributed (i.i.d.) with zero mean and unit variance. $\mathbf{Y}^{\rm ul}$ is a noisy mixture composed of the pilot components that travel along the $L$ paths and the additive Gaussian noise. We aim to extract $\Theta_l$, $\Gamma_l$, and $\Phi_l$ from the noisy mixture $\mathbf{Y}^{\rm ul}$.

This section formulates two key problems in the extraction of these frequency-independent parameters by reviewing the authors' previous work [9] for FDD stationary massive MIMO systems, and then introduces deep learning to tackle these problems.

A. Problem formulation

The Newton orthogonal matching pursuit (NOMP) algorithm [25] is adopted in [9] to estimate $\Theta_l$, $\Gamma_l$, and $\alpha_l$ from $\mathbf{Y}^{\rm ul}$ successively. In the $l$th iteration, NOMP estimates $\Theta_l$ and $\Gamma_l$ of the $l$th path and removes the pilot component along this path from $\mathbf{Y}^{\rm ul}$. The residues of $\mathbf{Y}^{\rm ul}$ after each iteration are illustrated in Fig. 2, where $M = N = 64$ and $L = 3$. Figs. 2(a) and (b) show that the NOMP algorithm can recognize only the pilot component with the largest power. Subsequently, this strongest component is removed and the updated residue contains two components. After three iterations, all the components are removed, that is, all the paths are detected, thereby leaving only the noise in the residue, as illustrated in Fig. 2(d). The iteration-based algorithm requires $L$ rounds of detection to recognize all the paths. The complexity of the NOMP algorithm is $O(LMN \log(MN))$. If $M$, $N$, and $L$ grow large, then the processing time is considerably long. To avoid the latency caused by using high-complexity algorithms, we raise the first question as follows:

• Q1: Can we rapidly recognize the angles and delays of all the paths?

The scheme proposed in [9] was designed for spatially stationary systems and cannot identify $\Phi_l$, which is also frequency-independent in non-stationary systems. One solution is to estimate $\Phi_l$ together with $\Theta_l$ and $\Gamma_l$ at the $l$th iteration of the NOMP algorithm. With more parameters to be estimated, the computational complexity of the updated algorithm increases further. All possible solutions of $\Phi_l$, the search for which reduces to the search for $s_{l,\mathrm{start}}$ and $s_{l,\mathrm{end}}$, are exhaustively tested. The complexity of the updated algorithm is $1 + 2 + \cdots + S = (S^2 + S)/2$ times that of the NOMP algorithm, thereby resulting in an extremely long processing time, which is unacceptable in practice. Thus, we raise the second question as follows:

• Q2: How can the visibility regions be identified efficiently?

B. Sparse image of uplink pilots

When observing Fig. 2(a), which shows significant sparsity, we can rapidly determine the three paths in the channel. This process is fast and requires neither iterations nor figures of residues, and it can be imitated by artificial intelligence. Therefore, prior to answering the two questions, we first investigate the sparse image of uplink pilots.

In massive MIMO OFDM systems, $L \ll MN$ typically holds. After transforming $\mathbf{Y}^{\rm ul}$ from the antenna subcarrier domain to the angular temporal domain, the pilots show sparsity. The angular and temporal transformation matrices are defined as

$$\mathbf{U}_a = \left[ \mathbf{a}(0),\ \mathbf{a}\!\left( -\frac{1}{\gamma_a M} \right),\ \ldots,\ \mathbf{a}\!\left( -\frac{\gamma_a M - 1}{\gamma_a M} \right) \right] \tag{12}$$

and

$$\mathbf{U}_t = \left[ \mathbf{q}(0),\ \mathbf{q}\!\left( -\frac{1}{\gamma_t N} \right),\ \ldots,\ \mathbf{q}\!\left( -\frac{\gamma_t N - 1}{\gamma_t N} \right) \right], \tag{13}$$

respectively, where $\gamma_a$ and $\gamma_t$ are oversampling rates. The uplink received pilots in the angular temporal domain are then calculated as

$$\bar{\mathbf{Y}}^{\rm ul} = \mathbf{U}_a^H \mathbf{Y}^{\rm ul} \mathbf{U}_t, \tag{14}$$

where $\bar{\mathbf{Y}}^{\rm ul} \in \mathbb{C}^{\gamma_a M \times \gamma_t N}$ is a sparse matrix. We normalize the modulus of each entry of $\bar{\mathbf{Y}}^{\rm ul}$ and obtain a new real-valued matrix $\tilde{\mathbf{Y}}^{\rm ul}$, whose $(m, n)$th entry is

$$[\tilde{\mathbf{Y}}^{\rm ul}]_{m,n} = \frac{\eta\, |[\bar{\mathbf{Y}}^{\rm ul}]_{m,n}|}{\max_{i = 1, \ldots, \gamma_a M,\ k = 1, \ldots, \gamma_t N} |[\bar{\mathbf{Y}}^{\rm ul}]_{i,k}|}. \tag{15}$$

The maximal entry of $\tilde{\mathbf{Y}}^{\rm ul}$ is thus normalized to $\eta$. This normalization avoids the wide color range of the images in an extremely high SNR regime.

Fig. 3. (a) Image of uplink pilots and the coordinate system of the image. (b) Sinc-function pattern of a column of $\bar{\mathbf{Y}}^{\rm ul}$ and the width of a dark spot (supposing that only one path exists). (c) Coordinate system of the network and the bounding box.

An example image of $\tilde{\mathbf{Y}}^{\rm ul}$, drawn in MATLAB, is shown in Fig. 3(a), where $M = 64$, $N = 32$, $S = 4$, and $L = 2$. In the image, the horizontal axis represents delay (i.e., $\Gamma$) ranging from 0 to 1, and the vertical axis represents angle (i.e., $\Theta$) ranging from 0 to 1. In the coordinate system of the image, the upper left vertex is the origin (0,0).

The image has $L$ cross-style patterns, each corresponding to a path. The darkness of the cross-style pattern is determined by the gain of the path. Each cross-style pattern is composed of a strong dark spot at the center and four dotted tails that stretch upward, downward, leftward, and rightward. Each dark spot has a semi-square or semi-rectangular shape and holds the following two properties.

Property 1: In the coordinate system of the image, the coordinates of the center of the $l$th dark spot are exactly the delay and angle of the $l$th path, i.e., $(\Gamma_l, \Theta_l)$.

Proof: Refer to Appendix A.
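Property 1 can be checked numerically on a small example. The sketch below is illustrative only and is not the authors' code: it builds noiseless single-path pilots following (5), (7), and (11), correlates them with every cell of an oversampled angle-delay grid, and normalizes the magnitudes as in (15). The function names, the value of `eta`, and the sign convention of the grid (chosen here so that the peak falls directly at $(\Gamma, \Theta)$) are assumptions.

```python
import cmath

def steering(n_elems, x):
    """Phase ramp [1, e^{j2*pi*x}, ..., e^{j2*pi*(n-1)*x}], as in (5) and (7)."""
    return [cmath.exp(2j * cmath.pi * k * x) for k in range(n_elems)]

def single_path_pilots(M, N, theta, gamma, alpha=1.0):
    """Noiseless uplink pilots of one path that is visible to the whole array."""
    a, q = steering(M, theta), steering(N, gamma)
    return [[alpha * a[m] * q[n] for n in range(N)] for m in range(M)]

def pilot_image(Y, oversample_a=2, oversample_t=2, eta=1.0):
    """Normalized magnitude image: rows are angle cells, columns are delay cells.

    Each cell holds |a^H(theta_k) Y q^*(gamma_i)|, scaled so the maximum is eta.
    The grid sign convention is an assumption made for this sketch.
    """
    M, N = len(Y), len(Y[0])
    rows, cols = oversample_a * M, oversample_t * N
    mag = []
    for k in range(rows):
        a = steering(M, k / rows)
        # Project every subcarrier column onto the k-th angle cell.
        u = [sum(a[m].conjugate() * Y[m][n] for m in range(M)) for n in range(N)]
        row = []
        for i in range(cols):
            q = steering(N, i / cols)
            row.append(abs(sum(u[n] * q[n].conjugate() for n in range(N))))
        mag.append(row)
    peak = max(max(row) for row in mag)
    return [[eta * v / peak for v in row] for row in mag]

def peak_cell(image):
    """Return (row, column) of the largest entry of the image."""
    _, k, i = max((v, k, i) for k, row in enumerate(image) for i, v in enumerate(row))
    return k, i
```

With $M = 16$, $N = 8$, an on-grid path at $\Theta = 0.25$, $\Gamma = 0.5$, and oversampling rate 2 in both domains, `peak_cell(pilot_image(single_path_pilots(16, 8, 0.25, 0.5)))` lands on row 8 and column 8, that is, $8/32 = 0.25$ on the angle axis and $8/16 = 0.5$ on the delay axis.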

Property 2: The width $w_l$ and height $h_l$ of the $l$th dark spot in Fig. 3(a) are given as follows:

$$w_l = \frac{2}{N}, \qquad h_l = \frac{2S}{(s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1)M}. \tag{16}$$
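As a worked instance of (16), take the setting of Fig. 3(a), $M = 64$, $N = 32$, and $S = 4$, and assume for illustration a path whose visibility region spans two subarrays, so that $s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1 = 2$:

```latex
w_l = \frac{2}{N} = \frac{2}{32} = 0.0625, \qquad
h_l = \frac{2S}{(s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1)M}
    = \frac{2 \cdot 4}{2 \cdot 64} = 0.0625 .
```

Under the same assumptions, a path seen by all four subarrays gives $h_l = 8/(4 \cdot 64) = 0.03125$, and a path seen by a single subarray gives $h_l = 8/64 = 0.125$. The smaller the visibility region, the taller the dark spot, which is what makes the height informative about the size of $\Phi_l$.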

Proof: Refer to Appendix B.

The proofs show that the cross-style pattern results from the sinc-function pattern of $\bar{\mathbf{Y}}^{\rm ul}$. We extract one column of $\bar{\mathbf{Y}}^{\rm ul}$ and illustrate it in Fig. 3(b). The sinc-function pattern in the angular domain is exactly the array pattern of the ULA.

The two properties indicate that the information on $\Gamma_l$, $\Theta_l$, and $\Phi_l$ is directly illustrated in the image of uplink pilots. By observing the dark spots in the image, we can easily obtain these frequency-independent parameters.

C. Power of YOLO network

With the two properties, we regard the dark spots as the objects and tackle the problems raised in Section III-A with a powerful neural network for object detection, that is, YOLO.

Fast: As the name suggests, YOLO can find all the objects that the network knows in an image by observing the image only once. According to [26], YOLO can process 45 large images in a second, thereby demonstrating its rapid processing ability.

Ability to bound objects: YOLO can locate the objects and estimate the size of each object through the bounding boxes that frame the objects. If the bounding boxes can be learned to exactly bound the dark spots, then we can answer the two questions as follows.

• Answer to Q1: $\Gamma_l$ and $\Theta_l$ can be rapidly estimated by calculating the center of the $l$th bounding box.

• Answer to Q2: The size of $\Phi_l$ can be estimated by observing the height of the $l$th bounding box, thereby simplifying the identification of $\Phi_l$.

YOLO has advanced to version 3 [27], which has a more complicated network structure but a greatly enhanced detection success ratio for small objects. Therefore, this version is adopted in this study. We maintain the original structure and the input and output settings of the YOLO network to the greatest extent. However, we perform the necessary modifications to satisfy the requirements of parameter estimation.

The image in Fig. 3(a) illustrates the input of the YOLO network. Only a small amount of data is needed to train the network because all the input images of YOLO have strong similarities. YOLO has its own coordinate system, where the top left vertex of an input image is regarded as the origin (0,0), as shown in Fig. 3(c). The $x$ and $y$ axes stretch rightward and downward, respectively. These settings coincide with the coordinate system of the image of uplink pilots. Each axis in the coordinate system of the network ranges from 0 to 938, and the coordinates take integer values.

Footnote: 938 is double the resolution of the network.

Here, the network outputs $5\hat{L}$ parameters after processing the image, where $\hat{L}$ denotes the number of detected paths and is an estimate of $L$. Five parameters are provided to describe the $l$th detected path, which are denoted as

$$\{ C_l,\ x_{l,\min},\ y_{l,\min},\ x_{l,\max},\ y_{l,\max} \}, \tag{17}$$

where $C_l$ indicates the confidence level of the detection of the $l$th path, satisfying $0 < C_l \le 1$. When $C_l$ grows large, the probability of a successful detection increases. Generally, if $C_l$ is less than 0.5, then the $l$th detected path may be fake. False alarms generally happen in a low SNR regime, where the noise is falsely identified as a path. Only one class of object (the path) should be recognized; thus, the class indicator in the original network is no longer provided in the output.

Specifically, $(x_{l,\min}, y_{l,\min})$ and $(x_{l,\max}, y_{l,\max})$ are the coordinates of the top left and the bottom right vertices of the bounding box, respectively, thereby satisfying $0 \le x_{l,\min}, y_{l,\min} < 938$ and $0 < x_{l,\max}, y_{l,\max} \le 938$, as shown in Fig. 3(c). The bounding box exactly bounds the dark spot as suggested. Because the $x$ axis of the network corresponds to delay and the $y$ axis to angle, when generating the labels of the training data, we set

$$x_{l,\min} = \left\lceil 938 \left( \Gamma_l - \frac{w_l}{2} \right) \right\rceil, \quad y_{l,\min} = \left\lceil 938 \left( \Theta_l - \frac{h_l}{2} \right) \right\rceil, \quad x_{l,\max} = \left\lceil 938 \left( \Gamma_l + \frac{w_l}{2} \right) \right\rceil, \quad y_{l,\max} = \left\lceil 938 \left( \Theta_l + \frac{h_l}{2} \right) \right\rceil. \tag{18}$$

Under this setting of coordinates, $\Gamma_l$, $\Theta_l$, and $\Phi_l$ can be estimated efficiently.

D. YOLO-based parameter estimation

Based on the Answer to Q1, $\Theta_l$ and $\Gamma_l$ are derived from the center of the bounding box, i.e.,

$$\tilde{\Theta}_l = \frac{y_{l,\min} + y_{l,\max}}{2 \times 938} \tag{19}$$

and

$$\tilde{\Gamma}_l = \frac{x_{l,\min} + x_{l,\max}}{2 \times 938}, \tag{20}$$

where $\tilde{\Theta}_l$ and $\tilde{\Gamma}_l$ are the coarse estimates of $\Theta_l$ and $\Gamma_l$, respectively.

According to Property 2, the size of $\Phi_l$ determines the height of the $l$th bounding box, which, normalized to the 0-to-1 range of the image axes, is calculated as

$$\hat{h}_l = \frac{y_{l,\max} - y_{l,\min}}{938}. \tag{21}$$

For a scatterer that can see $s$ subarrays, the height of the bounding box should be equal to

$$H(s) = \frac{2S}{sM}, \tag{22}$$

where $s = 1, \ldots, S$. We estimate the size of $\Phi_l$ by exhaustively searching $H(1), \ldots, H(S)$ for the one that has the closest value to $\hat{h}_l$, i.e.,

$$S_l = \arg\min_{s = 1, \ldots, S} | \hat{h}_l - H(s) |. \tag{23}$$

The identification of $\Phi_l$ is greatly simplified with the knowledge of $S_l$, because we are required to identify only the first or the last of the $S_l$ adjacent subarrays. Two pointers, denoted as $i_{l,\mathrm{start}}$ and $i_{l,\mathrm{end}}$, are the indicators of $s_{l,\mathrm{start}}$ and $s_{l,\mathrm{end}}$, respectively, as shown in Fig. 4. We initially set $i_{l,\mathrm{start}} = 1$ and $i_{l,\mathrm{end}} = S$. We can determine the $S_l$ subarrays by moving the two pointers for a total of $S - S_l$ steps.

Fig. 4. Pointers are used to identify the non-stationarity. The gray blocks represent the subarrays in $\Phi_l$.

We decide how to move the pointers by observing the projection power. If a subarray can see a scatterer, then the uplink pilots on this subarray obtain distinct projection power on this path. The projection power from path $l$ to the uplink pilots on subarray $s$ is defined as

$$P_{l,s} = \left| \left( \mathbf{a}(\tilde{\Theta}_l) \odot \mathbf{p}(\{s\}) \right)^H \mathbf{Y}^{\rm ul}\, \mathbf{q}^*(\tilde{\Gamma}_l) \right|^2. \tag{24}$$

The following property provides the approximation of $P_{l,s}$.

Property 3: If $\tilde{\Theta}_l \approx \Theta_l$, $\tilde{\Gamma}_l \approx \Gamma_l$, and the size of each subarray is large, then $P_{l,s}$ can be approximated by

$$P_{l,s} \approx \begin{cases} P^{\rm ul} |\alpha_l|^2 M^2 N^2 / S^2 + MN/S, & \text{if } s \in \Phi_l, \\ MN/S, & \text{else}. \end{cases} \tag{25}$$
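The size search (23), the projection power (24), and the two-pointer idea can be sketched in a few lines. The example below is a minimal illustration and not the authors' implementation: it assumes a single noiseless path (so the $MN/S$ noise term of (25) is absent), uses the exact angle and delay in place of the coarse estimates, and applies the pointer comparison rule stated after the proof of Property 3. All function names are hypothetical.

```python
import cmath

def steering(n_elems, x):
    """Phase ramp [1, e^{j2*pi*x}, ..., e^{j2*pi*(n-1)*x}], as in (5) and (7)."""
    return [cmath.exp(2j * cmath.pi * k * x) for k in range(n_elems)]

def mask(M, S, region):
    """Selection vector p(Phi) of (6); antenna index m runs from 1 to M."""
    return [1 if -(-m * S // M) in region else 0 for m in range(1, M + 1)]

def pilots(M, N, S, theta, gamma, region, alpha=1.0, p_ul=1.0):
    """Noiseless single-path uplink pilots following (11), without Z^ul."""
    a, q, p = steering(M, theta), steering(N, gamma), mask(M, S, region)
    g = (p_ul ** 0.5) * alpha
    return [[g * a[m] * p[m] * q[n] for n in range(N)] for m in range(M)]

def projection_power(Y, S, s, theta, gamma):
    """P_{l,s} of (24) for the 1-based subarray index s."""
    M, N = len(Y), len(Y[0])
    a, q, p = steering(M, theta), steering(N, gamma), mask(M, S, {s})
    acc = sum((a[m] * p[m]).conjugate() * Y[m][n] * q[n].conjugate()
              for m in range(M) for n in range(N))
    return abs(acc) ** 2

def region_size(box_height, M, S):
    """S_l of (23): the s whose ideal height H(s) = 2S/(sM) is closest."""
    return min(range(1, S + 1), key=lambda s: abs(box_height - 2 * S / (s * M)))

def identify_region(powers, size):
    """Move the two pointers inward until they span `size` subarrays."""
    start, end = 1, len(powers)
    while end - start > size - 1:
        if powers[start - 1] >= powers[end - 1]:
            end -= 1    # the end pointer is more likely outside the region
        else:
            start += 1  # the start pointer is more likely outside the region
    return set(range(start, end + 1))
```

For $M = 16$, $N = 8$, $S = 4$, and a unit-gain path visible to subarrays $\{2, 3\}$, the projection powers on subarrays 2 and 3 equal $M^2N^2/S^2 = 1024$, those on subarrays 1 and 4 are zero, and `identify_region` with size 2 returns `{2, 3}`. With noise, the comparison operates on the approximate values of (25) rather than on these exact ones.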

Proof: See Appendix C.

According to Property 3, $P_{l,s_1} \approx P_{l,s_2}$ holds for subarrays $s_1, s_2 \in \Phi_l$. Meanwhile, $P_{l,s_1} \gg P_{l,s_2}$ holds for subarrays $s_1 \in \Phi_l$ and $s_2 \notin \Phi_l$. That is, the subarrays in $\Phi_l$ have similar values of projection power on path $l$, and these values are much larger than those of the subarrays that are not in $\Phi_l$.

For the two pointers, if $P_{l,i_{l,\mathrm{start}}} \ge P_{l,i_{l,\mathrm{end}}}$, then the probability that $i_{l,\mathrm{end}} \notin \Phi_l$ is high. We move the pointer $i_{l,\mathrm{end}}$ backward by one step, i.e., $i_{l,\mathrm{end}} = i_{l,\mathrm{end}} - 1$. Otherwise, we move the pointer $i_{l,\mathrm{start}}$ forward by one step, i.e., $i_{l,\mathrm{start}} = i_{l,\mathrm{start}} + 1$. We continue to move the pointers until $i_{l,\mathrm{end}} - i_{l,\mathrm{start}} = S_l - 1$. Thereafter, we set

$$\hat{s}_{l,\mathrm{start}} = i_{l,\mathrm{start}}, \qquad \hat{s}_{l,\mathrm{end}} = i_{l,\mathrm{end}}, \tag{26}$$

and obtain the estimate of $\Phi_l$ as follows:

$$\hat{\Phi}_l = \{ \hat{s}_{l,\mathrm{start}},\ \hat{s}_{l,\mathrm{start}} + 1,\ \ldots,\ \hat{s}_{l,\mathrm{end}} \}. \tag{27}$$

The non-stationarity identification algorithm that utilizes $S_l$ is named the bounding box-based algorithm.

IV. DOWNLINK CHANNEL RECONSTRUCTION SCHEME

On the basis of the channel model and the power of YOLO,we propose a model-driven deep learning-based scheme toreconstruct the non-stationary downlink channel. Fig. 5 illus-trates the diagram of the proposed non-stationary downlinkchannel reconstruction scheme. The scheme functions succes-sively in the following ﬁve modules.Module 1: The angle and delay detector at the BS obtainscoarse estimates ˜Θ l and ˜Γ l by applying (19) and (20).Module 2: The non-stationarity identiﬁer at the BS obtains ˆΦ l through the bounding box-based algorithm or the algorithmdescribed in the following section that uses ˜Θ l , ˜Γ l , and Y ul .Module 3: The angle and delay reﬁner at the BS obtains ˆΘ l and ˆΓ l , which are the reﬁned estimates of Θ l and Γ l ,respectively, by utilizing ˜Θ l , ˜Γ l , ˆΦ l , and Y ul . angle & delay detector non-stationarity identifier angle & delay refiner downlink gain estimator downlink channel reconstructorBS BSuser ˆ ˆ, l l   ˆ l  downlink pilots reconstructed downlink channeluplink pilots , l l     , l l     ˆ l  ˆ ˆ ˆ, , l l l    dl ˆ l g ul Y Fig. 5. Modules of the proposed downlink channel reconstruction scheme. The modules in gray are based on deep learning. The symbols above the arroware the outputs of the left module of the arrow.

Module 4: The downlink gain estimator at the user obtains $\hat{g}_l^{\mathrm{dl}}$, which is the estimate of $g_l^{\mathrm{dl}}$, by utilizing the downlink pilots and $\hat{\Phi}_l$, $\hat{\Theta}_l$, and $\hat{\Gamma}_l$, and sends $\hat{g}_l^{\mathrm{dl}}$ to the BS.

Module 5: The downlink channel reconstructor at the BS reconstructs the downlink channel by applying $\hat{\Phi}_l$, $\hat{\Theta}_l$, $\hat{\Gamma}_l$, and $\hat{g}_l^{\mathrm{dl}}$ to (3) as follows:

$$\hat{\mathbf{H}}^{\mathrm{dl}} = \sum_{l=1}^{\hat{L}} \hat{g}_l^{\mathrm{dl}} \left( \mathbf{a}(\hat{\Theta}_l) \odot \mathbf{p}(\hat{\Phi}_l) \right) \mathbf{q}^{T}(\hat{\Gamma}_l). \tag{28}$$

The proposed scheme has low overhead and low complexity. In comparison with [9], the present work can rapidly identify the non-stationarity in addition to detecting delays and angles. In the following subsections, modules 2–4 are described in detail, and the reduced case in the stationary scenario is further discussed.

A. Non-stationarity identifier

The bounding box-based algorithm utilizes the deep learning results but is sensitive to the accuracy of $y_{l,\max}$ and $y_{l,\min}$, especially when the size of $\Phi_l$ is smaller than $S$ and the power of this path is much smaller than the largest power of a path in the channel (i.e., $|\alpha_l| \ll \max_k |\alpha_k|$). To enhance the accuracy of the non-stationarity identifier, we further propose a projection power-based algorithm.

This algorithm is also based on Property 3, but identifies $\Phi_l$ by comparing the projection power from path $l$ to the uplink pilots on each subarray. We initially determine the subarray with the maximal projection power on path $l$,

$$\bar{s}_l = \arg\max_{s = 1, \ldots, S} P_{l,s}. \tag{29}$$

Subsequently, we find the subarrays whose projection power is similar to $P_{l,\bar{s}_l}$.

We again introduce two pointers and initialize them by $j_{l,\mathrm{start}} = 1$ and $j_{l,\mathrm{end}} = S$. We move the pointer $j_{l,\mathrm{start}}$ forward until $P_{l,j_{l,\mathrm{start}}} \ge \delta P_{l,\bar{s}_l}$, where $0 < \delta < 1$. We set $\delta \in$ [0 . , . , considering the estimation error of $\tilde{\Theta}_l$ and $\tilde{\Gamma}_l$ and the existence of noise. Afterward, we move the pointer $j_{l,\mathrm{end}}$ backward until $P_{l,j_{l,\mathrm{end}}} \ge \delta P_{l,\bar{s}_l}$. Finally, the estimated indices of the first and last subarrays in $\Phi_l$ are

$$\hat{s}_{l,\mathrm{start}} = j_{l,\mathrm{start}}, \quad \hat{s}_{l,\mathrm{end}} = j_{l,\mathrm{end}}, \tag{30}$$

and $\hat{\Phi}_l$ is derived by applying (30) to (27).

B. Angle and delay refiner

With $\tilde{\Theta}_l$, $\tilde{\Gamma}_l$, and $\hat{\Phi}_l$, the angle and delay refiner then calculates the refined estimates of angles and delays. Before describing the refinement method, we first explain why this module is introduced.

1) Reasons for introducing the refiner:

The angle and delay refiner is introduced because the accuracy of $\tilde{\Theta}_l$ and $\tilde{\Gamma}_l$ is impacted by the following factors of YOLO.

Image resolution: Each image is generated by a finite-dimensional angular temporal domain pilot matrix. The values of $\gamma_a M$ and $\gamma_t N$ are large but not infinite, thereby resulting in the on-grid effect. Then, the coordinates of the $l$th dark spot center are close to but not equal to $(\Theta_l, \Gamma_l)$. One solution is to increase the values of $\gamma_a$ and $\gamma_t$. However, scaling the oversampling rates results in multiplied complexity and extended running time to generate the images.

Network resolution: The maximal coordinates in the coordinate system of the network are (938,938). For any input image, the network initially rescales the size of the image to × . That is, a × dimensional matrix is processed in the network, instead of the original $\gamma_a M \times \gamma_t N$ dimensional matrix. Once < $\gamma_a M$ or < $\gamma_t N$ holds, the resolution is decreased.

Integer labels: We set the coordinates of the bounding boxes as integers to maintain the settings of the original YOLO network and guarantee the accuracy of detection. Using integer coordinates also results in the on-grid effect.

Detection error: Although the network is well trained, detection error is inevitable. The bounding box may deviate from the ideal one. The minimum deviation amount is 1, thereby resulting in the error amount of / . Moreover, false alarms and missed detections may occur in a low SNR regime. Therefore, the network detection error is the most critical factor that harms the accuracy.

Consequently, the accuracy of $\tilde{\Theta}_l$ and $\tilde{\Gamma}_l$ is limited by these factors. Especially when $M$ and $N$ are large, a small error of angle and delay results in sharp degradation of channel reconstruction accuracy. Therefore, further processing these coarse estimates is necessary.
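The on-grid effect described above can be illustrated with a small numerical sketch. It assumes that a coordinate in [0, 1) is snapped to the nearest point of a uniform grid, so the worst-case offset is half a grid step and a one-pixel deviation of a bounding box shifts the estimate by one full step. The function names and the value of `network_rows` are hypothetical placeholders for illustration; `M` and `gamma_a` take the values quoted in the numerical results.

```python
def nearest_grid_point(value: float, num_points: int) -> float:
    """Snap a coordinate in [0, 1) to the nearest point k / num_points."""
    return round(value * num_points) / num_points


def on_grid_error(value: float, num_points: int) -> float:
    """Absolute offset between a coordinate and its nearest grid point."""
    return abs(value - nearest_grid_point(value, num_points))


M, gamma_a = 128, 16          # array size and oversampling rate from Section V
image_rows = gamma_a * M      # rows of the oversampled image
network_rows = 512            # hypothetical network input size, NOT the paper's value

for rows in (image_rows, network_rows):
    half_step = 0.5 / rows    # worst-case on-grid error
    one_pixel = 1.0 / rows    # shift caused by a one-pixel box deviation
    print(f"{rows} rows: half step {half_step:.2e}, one pixel {one_pixel:.2e}")
```

A coarser grid enlarges both quantities, which is why reducing the resolution inside the network limits the accuracy of the coarse estimates regardless of how finely the image was generated.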

2) Refining the estimates:

The inputs of the angle and delay refiner are

$$\left\{ \mathbf{Y}^{\mathrm{ul}}, \tilde{\Theta}_1, \tilde{\Gamma}_1, \hat{\Phi}_1, \ldots, \tilde{\Theta}_{\hat{L}}, \tilde{\Gamma}_{\hat{L}}, \hat{\Phi}_{\hat{L}} \right\}. \tag{31}$$

The outputs are the refined angles and delays, as follows:

$$\left\{ \hat{\Theta}_1, \hat{\Gamma}_1, \ldots, \hat{\Theta}_{\hat{L}}, \hat{\Gamma}_{\hat{L}} \right\}. \tag{32}$$

Recalling the NOMP algorithm, within each iteration, NOMP refines all the extracted paths through the Newton refinement method, which can effectively refine the estimates of delays and angles toward their real values. However, the original Newton refinement method is designed for stationary systems. Here, we adjust the method to fit the non-stationary cases.

The Newton method refines the paths one by one in decreasing order of the path power to guarantee the effectiveness of refinement. We initially calculate the coarse estimates of the uplink effective gains of these $\hat{L}$ paths by

$$\left[ \tilde{\alpha}_1, \ldots, \tilde{\alpha}_{\hat{L}} \right]^{T} = \left( \mathbf{A}^{\mathrm{ul}\,H} \mathbf{A}^{\mathrm{ul}} \right)^{-1} \mathbf{A}^{\mathrm{ul}\,H} \mathbf{y}^{\mathrm{ul}}, \tag{33}$$

where $\mathbf{A}^{\mathrm{ul}} \in \mathbb{C}^{MN \times \hat{L}}$, the $l$th column of $\mathbf{A}^{\mathrm{ul}}$ is

$$[\mathbf{A}^{\mathrm{ul}}]_{:,l} = \mathbf{q}(\tilde{\Gamma}_l) \otimes \left( \mathbf{a}(\tilde{\Theta}_l) \odot \mathbf{p}(\hat{\Phi}_l) \right), \tag{34}$$

and $\mathbf{y}^{\mathrm{ul}}$ is obtained by stacking all the columns of $\mathbf{Y}^{\mathrm{ul}}$ into a vector. Thereafter, we sort these paths in decreasing order of $\|\tilde{\alpha}_l \mathbf{p}(\hat{\Phi}_l)\|$. To simplify the expression, we retain the notation of the coarse estimates in (31), which now satisfy $\|\tilde{\alpha}_1 \mathbf{p}(\hat{\Phi}_1)\| \ge \ldots \ge \|\tilde{\alpha}_{\hat{L}} \mathbf{p}(\hat{\Phi}_{\hat{L}})\|$. The residue is calculated as

$$\mathbf{Y}^{\mathrm{ul}}_{\mathrm{res}} = \mathbf{Y}^{\mathrm{ul}} - \sum_{l=1}^{\hat{L}} \sqrt{P^{\mathrm{ul}}}\, \tilde{\alpha}_l \left( \mathbf{a}(\tilde{\Theta}_l) \odot \mathbf{p}(\hat{\Phi}_l) \right) \mathbf{q}^{T}(\tilde{\Gamma}_l). \tag{35}$$

The Newton method refines the angles and delays by minimizing the residue power.

We describe the Newton method by taking the first path as an example. We initially define

$$\mathbf{Y}^{\mathrm{ul}}_{\mathrm{res},+1} = \mathbf{Y}^{\mathrm{ul}}_{\mathrm{res}} + \sqrt{P^{\mathrm{ul}}}\, \tilde{\alpha}_1 \left( \mathbf{a}(\tilde{\Theta}_1) \odot \mathbf{p}(\hat{\Phi}_1) \right) \mathbf{q}^{T}(\tilde{\Gamma}_1). \tag{36}$$

Only the uplink pilots on the subarrays in $\hat{\Phi}_1$ are utilized in the refinement of $\tilde{\Theta}_1$ and $\tilde{\Gamma}_1$. The refined estimates obtained by the Newton method, that is, $\hat{\alpha}_1$, $\hat{\Theta}_1$, and $\hat{\Gamma}_1$, achieve the minimum residue power, i.e.,

$$(\hat{\alpha}_1, \hat{\Theta}_1, \hat{\Gamma}_1) = \arg\min_{\alpha, \Theta, \Gamma} \left\| \left[ \mathbf{Y}^{\mathrm{ul}}_{\mathrm{res},+1} - \sqrt{P^{\mathrm{ul}}}\, \alpha\, \mathbf{a}(\Theta) \mathbf{q}^{T}(\Gamma) \right]_{\mathbf{r}(\hat{\Phi}_1),:} \right\|_{F}, \tag{37}$$

where the row-selection vector is defined as

$$\mathbf{r}(\Phi) = \left[ \frac{M}{S}(s_{\mathrm{start}} - 1) + 1, \ldots, \frac{M}{S} s_{\mathrm{end}} \right], \tag{38}$$

and $s_{\mathrm{start}}$ and $s_{\mathrm{end}}$ represent the indices of the first and last subarrays in $\Phi$, respectively. The derivations of $\hat{\alpha}_1$, $\hat{\Theta}_1$, and $\hat{\Gamma}_1$ are the same as in the original Newton refinement method in [9]; thus, they are omitted here. Having refined the estimates of the first path, we update the residue by

$$\mathbf{Y}^{\mathrm{ul}}_{\mathrm{res}} = \mathbf{Y}^{\mathrm{ul}}_{\mathrm{res},+1} - \sqrt{P^{\mathrm{ul}}}\, \hat{\alpha}_1 \left( \mathbf{a}(\hat{\Theta}_1) \odot \mathbf{p}(\hat{\Phi}_1) \right) \mathbf{q}^{T}(\hat{\Gamma}_1). \tag{39}$$

Thereafter, the estimates of the other paths are refined following a similar approach starting from (36). The angle and delay refiner repeats the above refinement for $R_c$ rounds. The refinement has a low complexity of $O(R_c \hat{L} M N)$.

After the refiner completes its work, all the frequency-independent parameters are acquired by the BS. The BS then reconstructs the uplink channel by

$$\hat{\mathbf{H}}^{\mathrm{ul}} = \sum_{l=1}^{\hat{L}} \hat{\alpha}_l \left( \mathbf{a}(\hat{\Theta}_l) \odot \mathbf{p}(\hat{\Phi}_l) \right) \mathbf{q}^{T}(\hat{\Gamma}_l). \tag{40}$$

C. Downlink gain estimator

The estimated frequency-independent parameters, including $\hat{\Theta}_l$, $\hat{\Gamma}_l$, and $\hat{\Phi}_l$, are sent to the downlink gain estimator. This module functions at the user equipment. As suggested in [9], the downlink pilots are beamformed along the angles of the paths to enhance the received power at the user equipment and improve the estimation accuracy of downlink gains. Thus, all-one downlink pilots occupy $\hat{L}$ OFDM symbols. The pilots received by the user on OFDM symbol $t$ are expressed as

$$\mathbf{y}^{\mathrm{dl}}_t = \sum_{l=1}^{L} \sqrt{P^{\mathrm{dl}}}\, g_l^{\mathrm{dl}}\, \mathbf{q}(\Gamma_l) \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right)^{T} \mathbf{b}_t + \mathbf{z}^{\mathrm{dl}}_t, \tag{41}$$

where $t = 1, \ldots, \hat{L}$, $\mathbf{y}^{\mathrm{dl}}_t \in \mathbb{C}^{N \times 1}$, $P^{\mathrm{dl}}$ is the transmitted power of the BS,

$$\mathbf{b}_t = \sqrt{\frac{S}{(\hat{s}_{t,\mathrm{end}} - \hat{s}_{t,\mathrm{start}} + 1) M}}\; \mathbf{a}^{*}(\hat{\Theta}_t) \odot \mathbf{p}^{T}(\hat{\Phi}_t) \tag{42}$$

is the beamforming vector for the downlink pilots on OFDM symbol $t$, and $\mathbf{z}^{\mathrm{dl}}_t \in \mathbb{C}^{N \times 1}$ is the downlink complex Gaussian noise whose elements are i.i.d. with zero mean and unit variance. $\|\mathbf{b}_t\| = 1$ due to the power constraint. The design in (42) indicates that on OFDM symbol $t$, the transmitted power is allocated only to the subarrays in $\hat{\Phi}_t$.

Afterwards, $\hat{\Theta}_l$, $\hat{\Gamma}_l$, and $\hat{\Phi}_l$ are applied in (41) to replace $\Theta_l$, $\Gamma_l$, and $\Phi_l$, respectively. The downlink gains of the $\hat{L}$ paths are estimated by

$$\left[ \hat{g}^{\mathrm{dl}}_1, \ldots, \hat{g}^{\mathrm{dl}}_{\hat{L}} \right]^{T} = \left( \mathbf{A}^{\mathrm{dl}\,H} \mathbf{A}^{\mathrm{dl}} \right)^{-1} \mathbf{A}^{\mathrm{dl}\,H} \mathbf{y}^{\mathrm{dl}}, \tag{43}$$

where

$$\mathbf{A}^{\mathrm{dl}} = \left[ \mathbf{A}_1^{\mathrm{dl}\,T}, \ldots, \mathbf{A}_{\hat{L}}^{\mathrm{dl}\,T} \right]^{T}, \tag{44}$$

the $l$th column of the $t$th submatrix $\mathbf{A}^{\mathrm{dl}}_t \in \mathbb{C}^{N \times \hat{L}}$ is expressed as

$$[\mathbf{A}^{\mathrm{dl}}_t]_{:,l} = \mathbf{q}(\hat{\Gamma}_l) \left( \mathbf{a}(\hat{\Theta}_l) \odot \mathbf{p}(\hat{\Phi}_l) \right)^{T} \mathbf{b}_t, \tag{45}$$

and $\mathbf{y}^{\mathrm{dl}}$ is obtained by stacking $\mathbf{y}^{\mathrm{dl}}_1, \ldots, \mathbf{y}^{\mathrm{dl}}_{\hat{L}}$ into a vector. The estimated downlink gains are fed back to the BS.

Finally, the proposed scheme is completed with $\hat{\mathbf{H}}^{\mathrm{dl}}$ being reconstructed by the downlink channel reconstructor at the BS.

D. Discussions

The proposed scheme can efficiently reconstruct the FDD non-stationary massive MIMO downlink channel by identifying the mapping between scatterers and subarrays. Moreover, it can be reduced to fit stationary systems, which in turn suggests an alternative scheme for reconstructing the non-stationary downlink channel.

Fig. 6. YOLO detection results of the images of uplink pilots when S = 4, (a) M = N = 32, SNR = 0 dB, (b) M = N = 32, SNR = 10 dB, (c) M = N = 128, SNR = 0 dB, and (d) M = N = 128, SNR = 10 dB. The values above the bounding boxes are the detection confidences.

1) Reducing to stationary cases:

In stationary massive MIMO systems, each scatterer can see all the subarrays. Equivalently, the ULA is segmented into only $S = 1$ subarray, and $\Phi_1 = \ldots = \Phi_L = \{1\}$ holds. Under this condition, the following changes occur for the proposed scheme.

First, in the image of uplink pilots, when $M$ and $N$ are fixed, all the dark spots have the same shape, with a uniform height and width. With objects of equal size, the bounding boxes can frame the dark spots more accurately.

Second, the outputs of the non-stationarity identifier become $\hat{\Phi}_1 = \ldots = \hat{\Phi}_{\hat{L}} = \{1\}$ with probability 1.

Third, the angle and delay refiner and the downlink gain estimator are reduced to the versions for stationary systems.

Therefore, in widely studied FDD stationary massive MIMO systems, the proposed downlink channel reconstruction scheme also works, and even works better than in non-stationary systems with the same array scale.
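The reduction to the stationary case can be made concrete with a short sketch that synthesizes a channel with the structure of (28) and shows that, with $S = 1$, the visibility mask is all ones. The forms of `steering` and `delay_response` (including their sign conventions) and all function names are illustrative assumptions and are not taken from the paper's definitions of $\mathbf{a}(\cdot)$, $\mathbf{p}(\cdot)$, and $\mathbf{q}(\cdot)$.

```python
import numpy as np


def steering(theta: float, M: int) -> np.ndarray:
    """Assumed ULA response with unit-modulus entries (illustrative form)."""
    return np.exp(2j * np.pi * theta * np.arange(M))


def delay_response(gamma: float, N: int) -> np.ndarray:
    """Assumed phase ramp across N subcarriers (sign convention is an assumption)."""
    return np.exp(-2j * np.pi * gamma * np.arange(N))


def visibility_mask(region, M: int, S: int) -> np.ndarray:
    """0/1 mask over M antennas; `region` holds 1-based visible subarray indices."""
    mask = np.zeros(M)
    width = M // S  # antennas per subarray
    for s in region:
        mask[(s - 1) * width : s * width] = 1.0
    return mask


def reconstruct_channel(gains, thetas, gammas, regions, M, N, S):
    """Sum of rank-one path terms g * (a(theta) ⊙ p(region)) q(gamma)^T."""
    H = np.zeros((M, N), dtype=complex)
    for g, theta, gamma, region in zip(gains, thetas, gammas, regions):
        masked = steering(theta, M) * visibility_mask(region, M, S)
        H += g * np.outer(masked, delay_response(gamma, N))  # outer() does not conjugate
    return H


# Non-stationary example: one path seen only by subarrays 1 and 2 out of S = 4.
H_ns = reconstruct_channel([1.0], [0.3], [0.2], [{1, 2}], M=32, N=8, S=4)
# Stationary example: S = 1, so the only visibility region is {1}.
H_st = reconstruct_channel([1.0], [0.3], [0.2], [{1}], M=32, N=8, S=1)
```

In the non-stationary example, the rows belonging to subarrays 3 and 4 are zero, whereas in the stationary example every antenna carries the path, which is the sense in which the identifier output becomes trivial.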

2) Alternative scheme for non-stationary systems:

The applicability of the proposed scheme in stationary systems inspires an alternative scheme to reconstruct the downlink non-stationary channel. A subarray is the smallest unit to describe the non-stationarity. If one subarray is considered individually, then stationarity exists in the subsystem formed by this subarray. We can apply the proposed scheme in each subsystem individually. Under this condition, the following changes should be performed.

First, the uplink pilots are divided as $\mathbf{Y}^{\mathrm{ul}} = \left[ \mathbf{Y}_1^{\mathrm{ul}\,T}, \ldots, \mathbf{Y}_S^{\mathrm{ul}\,T} \right]^{T}$, where $\mathbf{Y}^{\mathrm{ul}}_s \in \mathbb{C}^{M/S \times N}$.

Second, for subsystem $s$, the image is generated from $\mathbf{Y}^{\mathrm{ul}}_s$. The refined angles and delays of the paths that subarray $s$ can see are $\hat{\Theta}_{s,l}$ and $\hat{\Gamma}_{s,l}$, respectively, where $l = 1, \ldots, \hat{L}_s$, and $\hat{L}_s$ is an estimate of $L_s$.

Third, a total of $\sum_{s=1}^{S} \hat{L}_s$ paths are estimated by the alternative scheme, requiring $\sum_{s=1}^{S} \hat{L}_s$ OFDM symbols for downlink pilots. Afterwards, the downlink gains, denoted as $\hat{g}^{\mathrm{dl}}_{s,l}$, are sent back to the BS, where $l = 1, \ldots, \hat{L}_s$, $s = 1, \ldots, S$.

Fourth, the stationary downlink channel in subsystem $s$ is reconstructed using $\hat{\Theta}_{s,l}$, $\hat{\Gamma}_{s,l}$, and $\hat{g}^{\mathrm{dl}}_{s,l}$, where $l = 1, \ldots, \hat{L}_s$. Thereafter, the large-scale non-stationary channel is obtained by stacking all the stationary downlink channels together.

The alternative scheme requires a large amount of downlink training and feedback overhead because $\sum_{s=1}^{S} \hat{L}_s > L$. Moreover, when Ψ = Ψ holds, the alternative scheme cannot identify this equivalence. The estimation accuracy of angles and delays further degrades when using a reduced number of antennas. Therefore, the proposed scheme is more efficient than the alternative.

V. NUMERICAL RESULTS

In this section, we evaluate the performance of the proposed non-stationary downlink channel reconstruction scheme. In the FDD system, $f^{\mathrm{ul}}$ = 2 . GHz, and $f^{\mathrm{dl}}$ = 2 . GHz. The OFDM subcarrier spacing is $\Delta f = 15$ kHz. The number of paths $L$ is uniformly distributed in [1 , . For the $l$th path, $\Theta_l$ and $\Gamma_l$ are uniformly distributed in [0 , . The effective uplink gain satisfies $\alpha_l = \beta_l e^{j\phi^{\mathrm{ul}}_l}$, where $\beta_l$ is uniformly distributed in [0 . , , and $\phi^{\mathrm{ul}}_l$ is uniformly distributed in [0 , π ). The downlink gain is $g^{\mathrm{dl}}_l = \beta_l e^{j\phi^{\mathrm{dl}}_l}$, where $\phi^{\mathrm{dl}}_l$ is i.i.d. with $\phi^{\mathrm{ul}}_l$.

YOLO is implemented on a computer with one Nvidia GeForce GTX 1080 Ti GPU. The Keras deep learning library running on top of TensorFlow is used. In the training phase, we generate 3,000 groups of data under each set of $M$, $N$, and $S$. Each group of training data consists of an image of uplink pilots and a label vector, which is denoted as

$$\{0, x_{l,\min}, y_{l,\min}, x_{l,\max}, y_{l,\max}\}, \tag{46}$$

where the first parameter (0) indicates the object class. We set $\gamma_a = \gamma_t = 16$, and $\eta = 255$. $L$ is randomly and uniformly distributed in [1 , . SNR ranges from 0 dB to 10 dB. The number of epochs is 300, and the batch size is 4. The training and testing data are generated by following the same procedure described in Section III and are not biased from each other. Therefore, overfitting issues do not exist.

A. Evaluation of deep learning-based estimation

We initially test the performance of the angle and delay detector, especially the detection accuracy of the YOLO network. For any input image, the ratio of successful detection of objects is increased when the sizes of objects are large. Thus, we start from the large dark spot cases, where $M = N = 32$ and $S = 4$. The channel is composed of two paths, satisfying $\Theta_1$ = 0 . , $\Gamma_1$ = 0 . , and $\Phi_1$ = { , } for path 1 and $\Theta_2$ = 0 . , $\Gamma_2$ = 0 . , and $\Phi_2$ = { , } for path 2. Fig. 6(a) illustrates the detection result under the condition of SNR = 0 dB. The figure shows that the network can successfully recognize the actual dark spots from the noisy image with confidence levels of 0.97 and 0.95. The coarse estimates of angles and delays are $\tilde{\Theta}_1$ = 0 . , $\tilde{\Gamma}_1$ = 0 . , and $\tilde{\Theta}_2$ = 0 . , $\tilde{\Gamma}_2$ = 0 . , which are close to the actual values. However, in a low SNR regime, the noise is distinct in the image and appears to be similar to the dark spots, thereby resulting in a false alarm with a confidence level of 0.58. Fig. 6(b) shows the detection result when the SNR is increased to 10 dB. The cross-style patterns can be clearly observed from the image, and the network detects the dark spots with confidence levels of 0.98 and 0.97. The confidences increase with SNR, and a false alarm is avoided. However, the coarse estimates of angles and delays are $\tilde{\Theta}_1$ = 0 . , $\tilde{\Gamma}_1$ = 0 . , and $\tilde{\Theta}_2$ = 0 . , $\tilde{\Gamma}_2$ = 0 . , whose accuracy is not improved accordingly.

Fig. 7. Successful ratios of the algorithms in a non-stationarity identifier. (Axes: SNR (dB) versus successful ratio; curves: projection power-based identifier, bounding box-based identifier.)

Thereafter, the detection accuracy is tested under the small dark spot condition, which is the actual massive MIMO condition. We set $M = N = 128$ and $S = 4$. Figs. 6(c) and (d) illustrate the results of SNR = 0 and 10 dB, respectively. In massive MIMO systems, the channel becomes sparse, and the noise power is no longer comparable with that of the dark spots even in a low SNR regime.
The images of uplink pilots appear the same under SNR = 0 and 10 dB. Thus, Figs. 6(c) and (d) show similar detection results. The sizes of the bounding boxes are much smaller than those in Figs. 6(a) and (b), achieving accurate coarse estimates of angles and delays. We take the first path with a confidence of 1.00 as an example. The actual values are $\Theta$ = 0 . and $\Gamma$ = 0 . . The coarse estimates are $\tilde{\Theta}$ = 0 . and $\tilde{\Gamma}$ = 0 . under SNR = 0 dB. The accuracy is enhanced compared with that of $M = N = 32$. However, the large-aperture array is sensitive to the error of angles. Thus, the refinement of angles and delays is essential.

Fig. 8. Effectiveness of the angle and delay refiner. (Axes: SNR (dB) versus NMSE; curves: LS, LMMSE, YOLO w/o refine, YOLO w refine.)

We evaluate the non-stationarity identifier by examining the successful ratio of visibility region identification. For path $l$, the visibility region is successfully identified if $\hat{\Phi}_l = \Phi_l$. The successful identification ratios of the bounding box-based and the projection power-based algorithms are illustrated in Fig. 7, where $M = N = 128$, $S = 4$, $L = 10$, and $\delta$ = 0 . . The two algorithms can successfully identify the visibility regions with a probability higher than 0.98. As expected, the accuracy of the bounding box-based algorithm is sensitive to the detection errors of bounding boxes, whereas the projection power-based algorithm is more robust. Therefore, the projection power-based algorithm achieves a high successful identification ratio. In the following simulations, we adopt this algorithm in the non-stationarity identifier.
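The two identification algorithms compared here can be sketched as two-pointer searches over the per-subarray projection powers of one path. This is a minimal sketch under stated assumptions: the powers $P_{l,s}$ are taken as already computed and passed in as a list, the function names are hypothetical, and the value of `delta` in the usage lines is illustrative only (the paper's setting is not reproduced here).

```python
def bounding_box_identifier(power, num_visible):
    """Shrink [start, end] from both sides until it spans `num_visible` subarrays.

    power: projection powers P_{l,s} for s = 1..S (list index 0 is subarray 1).
    num_visible: S_l, the number of visible subarrays implied by the bounding box.
    Returns 1-based (s_start, s_end).
    """
    start, end = 0, len(power) - 1
    while end - start > num_visible - 1:
        if power[start] >= power[end]:
            end -= 1      # the last pointer is more likely outside the region
        else:
            start += 1    # the first pointer is more likely outside the region
    return start + 1, end + 1


def projection_power_identifier(power, delta):
    """Keep the span between the first and last subarray whose power reaches
    delta times the maximum projection power, with 0 < delta < 1."""
    threshold = delta * max(power)
    start = 0
    while power[start] < threshold:
        start += 1
    end = len(power) - 1
    while power[end] < threshold:
        end -= 1
    return start + 1, end + 1


# Example with S = 4 subarrays, of which subarrays 2 and 3 see the path.
powers = [0.1, 5.0, 4.8, 0.2]
print(bounding_box_identifier(powers, num_visible=2))   # (2, 3)
print(projection_power_identifier(powers, delta=0.5))   # (2, 3)
```

The first function depends on $S_l$, which comes from the detected bounding box, so a box error propagates directly into the result; the second depends only on the measured powers and the threshold, which is consistent with the robustness observed in Fig. 7.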

B. Evaluation of the refinement

We examine the effectiveness of the angle and delay refiner by testing the NMSE performance of the reconstructed uplink channel. The refiner works for $R_c = 3$ rounds. We introduce two widely used channel estimation algorithms, i.e., LS and LMMSE, as the benchmarks. Notably, LS and LMMSE are not realized through deep learning and do not involve network training. The non-stationary massive MIMO system is still considered, where $M = N = 128$ and $S = 4$. The NMSE is calculated by averaging the NMSEs of the reconstructed or estimated uplink channel across all antennas and on one subcarrier as

$$\mathrm{NMSE} = E\left\{ \frac{1}{N} \sum_{n=1}^{N} \frac{\left\| [\hat{\mathbf{H}}^{\mathrm{ul}}]_{:,n} - [\mathbf{H}^{\mathrm{ul}}]_{:,n} \right\|^{2}}{\left\| [\mathbf{H}^{\mathrm{ul}}]_{:,n} \right\|^{2}} \right\}. \tag{47}$$

Fig. 8 illustrates the results. In non-stationary systems, the LS algorithm performs worse than in stationary systems because some subarrays may not see any path, and the channel across these subarrays is zero. However, the LS algorithm still results in a nonzero estimated channel, which is the noise. The LMMSE algorithm identifies the noise through multiplying the covariance matrix of the channel. Thus, the LMMSE algorithm has accurate channel estimation results even in non-stationary systems. For the proposed reconstruction scheme, if we directly apply the coarse estimates in the uplink channel model, then the NMSE of the reconstructed channel is worse. Fig. 6 shows that even though the coarse estimates of the angles and delays are very close to their actual values, their estimation errors are large and unacceptable in massive MIMO systems, where a small angle offset dramatically impacts the channel reconstruction accuracy. Moreover, the accuracy of the reconstruction without refinement remains − . (i.e., − dB) and cannot be improved with the increase of SNR because the bounding boxes remain unchanged [Figs. 6(c) and (d)]. Fortunately, the angle and delay refiner significantly improves the NMSE of the reconstructed uplink channel, for example, − . (i.e., − dB) at SNR = 0 dB. Moreover, the NMSE can be further decreased with the increase in SNR, demonstrating the effectiveness of the refiner.

Fig. 9. NMSEs of downlink channel estimation schemes under stationarity. (Axes: SNR (dB) versus NMSE; curves: NOMP M=N=32, YOLO M=N=32, NOMP M=N=128, YOLO M=N=128, LMMSE M=N=128, CS M=N=128, BT M=N=128.)

C. Comparison of the proposed and the alternative schemes

We further evaluate the proposed downlink channel reconstruction scheme when reduced to stationary conditions. The NOMP, LMMSE, beam tracking (BT), and compressed sensing (CS)-based downlink channel estimation schemes are introduced as benchmarks. The three latter schemes do not utilize spatial reciprocity and simply rely on downlink training and feedback [28], [29]. Here, we consider the upper bound cases of the BT and CS-based schemes. The BT-based scheme adopts full-set discrete Fourier transform beams at the BS, and therefore requires $M$ orthogonal downlink pilots. Subsequently, the user feeds back the received pilots that occupy more than 99% of the total received power and their beam indices. The CS-based scheme also uses $M$ orthogonal downlink pilots to distinguish different BS antennas and adopts the OMP algorithm to estimate the downlink gains on the extracted orthogonal paths, which are then fed back to the BS. The feedback amounts of the two schemes are larger than that of the proposed scheme because of the on-grid effect. In the stationary system, $S = 1$, $L \in$ [1 , , and we set $M = N = 32$ and $M = N = 128$, respectively. Fig. 9 presents the NMSEs of the reconstructed or estimated downlink channels. Under the same system settings, the proposed scheme achieves nearly the same accuracy as the NOMP-based scheme with greatly reduced time consumption. In particular, when $M = N = 128$ and $L = 10$, NOMP consumes more than 5 minutes, whereas YOLO takes less than 2 seconds to determine all the paths. Moreover, despite its much lower cost of downlink training and feedback, the proposed scheme still significantly outperforms the LMMSE, BT, and CS-based schemes. Therefore, the proposed deep-learning scheme is more efficient than the existing algorithm-based schemes. On the other hand, the accuracy improves as $M$ and $N$ increase. When SNR = 10 dB, the NMSE approximates − and − under the conditions of $M = N = 32$ and $M = N = 128$, respectively, serving as the lower and upper NMSE bounds of the proposed non-stationary channel reconstruction scheme.

Fig. 10. Number of paths estimated by the proposed and alternative schemes. (Axes: SNR (dB) versus number of paths; curves: true, alternative, proposed.)

Finally, we compare the proposed scheme with the alternative scheme under non-stationary conditions of $S = 4$, $M = N = 128$, and $L \in$ [1 , . From the image drawn by $\tilde{\mathbf{Y}}^{\mathrm{ul}}$, YOLO can detect all the paths without missed detections or false alarms, demonstrating the accuracy of the proposed scheme. The total number of paths estimated by the alternative scheme is more than the number of actual paths and that of the paths estimated by the proposed scheme even though all the paths are accurately estimated. In a subsystem generated by a subarray, when the SNR is low, the noise seriously disturbs the detection of YOLO, thereby causing an extremely high false alarm rate, as illustrated in Fig. 10. Most of the paths estimated by the alternative scheme are fake paths. With the increase in SNR, the probability of false alarm decreases. Nevertheless, the overhead amount still quadruples that of the proposed scheme when SNR = 10 dB.

For the alternative scheme, with fewer measurements in each subsystem, the estimation accuracy of the angles and delays of actual paths is lower than that of the proposed scheme. Therefore, the NMSE of the uplink channel reconstructed by the alternative scheme is worse than that of the proposed scheme, as shown by the curve labeled as “UL alternative” in Fig. 11. Moreover, the large amount of fake paths causes the alternative scheme to outperform the LMMSE method by integrating these paths together, thereby compensating the estimation error of actual paths and achieving a high global accuracy.
When reconstructing the downlink channel, the amount of downlink training and feedback overhead incurred by the downlink gain estimation module of the alternative scheme is much larger than that of the proposed scheme. The alternative scheme has a good NMSE performance in reconstructing the downlink channel. Nevertheless, the NMSE of the alternative scheme is still inferior to that of the proposed scheme. The proposed scheme harvests the multi-subarray gain and achieves almost equivalent NMSE performance in reconstructing the uplink and downlink channels, demonstrating the accuracy of frequency-independent parameter estimation. Moreover, the NMSE performance of the two schemes under non-stationary condition is between that under the stationary conditions of $M = N = 32$ and $M = N = 128$. This phenomenon is in accordance with the assumption of the lower and upper bounds in Fig. 9.

Fig. 11. NMSEs of the proposed and the alternative schemes. (Axes: SNR (dB) versus NMSE; curves: LMMSE, UL alternative, UL proposed, DL alternative, DL proposed.)

We further compare the performance of the two schemes in practical systems. The spectral efficiency in the downlink is evaluated under the condition of maximal ratio transmitting. The signal received by the user can be expressed as

$$\mathbf{r} = \sqrt{P}\, \mathrm{diag}\left\{ \frac{\mathbf{H}^{\mathrm{dl}} \hat{\mathbf{H}}^{\mathrm{dl}\,H}}{\| \hat{\mathbf{H}}^{\mathrm{dl}\,H} \|} \right\} \mathbf{x} + \mathbf{z}^{\mathrm{dl}}, \tag{48}$$

where $\mathbf{r} \in \mathbb{C}^{N \times 1}$ with the $n$th entry as the received signal on the $n$th subcarrier, $\mathbf{x} \in \mathbb{C}^{N \times 1}$ is the transmitted signal across all subcarriers, satisfying $E\{\mathbf{x}\mathbf{x}^{H}\} = \mathbf{I}$, and $\mathbf{z}^{\mathrm{dl}} \in \mathbb{C}^{N \times 1}$ is the noise whose elements are i.i.d. with zero mean and unit variance. When perfect downlink CSI is available at the user, the spectral efficiency can be calculated as

$$\mathrm{SE} = E\left\{ \frac{1}{N} \sum_{n=1}^{N} \log_2 \left( 1 + P \left| \left[ \frac{\mathbf{H}^{\mathrm{dl}} \hat{\mathbf{H}}^{\mathrm{dl}\,H}}{\| \hat{\mathbf{H}}^{\mathrm{dl}\,H} \|} \right]_{n,n} \right|^{2} \right) \right\}. \tag{49}$$

Fig. 12 illustrates the Monte-Carlo results of the spectral efficiency. When the refiner is applied, the proposed scheme, the alternative scheme, and the LMMSE method have nearly the same spectral efficiency because their NMSEs are lower than − . LMMSE requires 128 OFDM symbols for downlink training and feeds back × complex numbers, whereas the proposed scheme only costs 1 to 10 OFDM symbols for downlink training and feeds back 1 to 10 complex numbers.
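The two metrics used in this comparison can be sketched numerically as follows. This is an illustrative sketch, not the authors' evaluation code. It assumes that column $n$ of an $M \times N$ matrix is the channel on subcarrier $n$, that the NMSE uses squared norms as is conventional, that the maximal ratio transmission beam on each subcarrier is the conjugate of the estimated column normalized per subcarrier, and that the noise has unit variance. The function names are hypothetical, and a single realization is evaluated rather than the expectation.

```python
import numpy as np


def nmse(H_hat: np.ndarray, H: np.ndarray) -> float:
    """Per-subcarrier normalized squared error, averaged over the N columns."""
    err = np.linalg.norm(H_hat - H, axis=0) ** 2
    ref = np.linalg.norm(H, axis=0) ** 2
    return float(np.mean(err / ref))


def mrt_spectral_efficiency(H: np.ndarray, H_hat: np.ndarray, power: float) -> float:
    """Average of log2(1 + P |h_n^T w_n|^2) with w_n = conj(h_hat_n) / ||h_hat_n||."""
    w = np.conj(H_hat) / np.linalg.norm(H_hat, axis=0, keepdims=True)
    gain = np.abs(np.sum(H * w, axis=0)) ** 2   # |h_n^T w_n|^2 for each subcarrier
    return float(np.mean(np.log2(1.0 + power * gain)))


rng = np.random.default_rng(0)
M, N = 16, 4
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
H_hat = H + 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))

print("NMSE:", nmse(H_hat, H))
print("SE with estimate:", mrt_spectral_efficiency(H, H_hat, power=1.0))
print("SE with perfect CSI:", mrt_spectral_efficiency(H, H, power=1.0))
```

Because each beam has unit norm, the effective gain on a subcarrier is bounded by the channel norm, so the value obtained with the estimate never exceeds the perfect-CSI value. A small misalignment lowers the beamforming gain only slightly, which is one way to read the observation that the spectral efficiency is less sensitive to estimation errors than the NMSE.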
If the refiner is absent, then the proposed scheme achieves relatively lower spectral efficiency than the alternative scheme because of the smaller number of estimated paths, as well as the greatly reduced amount of downlink training and feedback overhead. On the other hand, although the NMSE performance is poor, the spectral efficiency is not badly impacted without the refinement module. Therefore, directly applying the learning-based estimates of parameters is acceptable in single-user systems.

Fig. 12. Spectral efficiency of the proposed and the alternative schemes. (Axes: SNR (dB) versus spectral efficiency (b/s/Hz); curves: LMMSE, alternative w refine, proposed w refine, alternative w/o refine, proposed w/o refine.)

VI. CONCLUSION

This study considered the FDD non-stationary massive MIMO system and proposed a deep learning-based scheme to reconstruct the downlink channel. Two key problems, namely the processing time and the non-stationarity identification, were tackled with YOLO. The proposed downlink channel reconstruction scheme was designed to function in five modules given the power of YOLO. The visibility regions were detected by the non-stationarity identifier, and the estimation accuracy of angles and delays was improved by the angle and delay refiner. Moreover, the reduced case for stationary systems was discussed, and an alternative scheme for non-stationary systems was further analyzed. The numerical results verified the efficiency of the proposed scheme, and demonstrated that its NMSE was comparable to that of the NOMP-based scheme in FDD stationary massive MIMO systems at greatly reduced time consumption.

VII. APPENDIX

A. Proof of Property 1

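The image of uplink pilots is generated by $\tilde{\mathbf{Y}}^{\mathrm{ul}}$, which is further obtained from $\bar{\mathbf{Y}}^{\mathrm{ul}}$. The $(m, n)$th entry of $\bar{\mathbf{Y}}^{\mathrm{ul}}$ is expressed as

$$[\bar{\mathbf{Y}}^{\mathrm{ul}}]_{m,n} = \sum_{l=1}^{L} \alpha_l \kappa_{a,l} \kappa_{t,l}, \tag{50}$$

where

$$\kappa_{a,l} = \mathbf{a}^{H}\left( -\frac{m}{\gamma_a M} \right) \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\Phi_l) \right), \tag{51}$$

and

$$\kappa_{t,l} = \mathbf{q}^{T}(\Gamma_l)\, \mathbf{q}\left( -\frac{n}{\gamma_t N} \right). \tag{52}$$

We initially derive the expression of $\kappa_{a,l}$. In accordance with (5) and (6), (51) can be expressed by

$$\kappa_{a,l} = e^{j2\pi m_{l,\mathrm{start}} \left( \Theta_l + \frac{m}{\gamma_a M} \right)} + \cdots + e^{j2\pi m_{l,\mathrm{end}} \left( \Theta_l + \frac{m}{\gamma_a M} \right)}, \tag{53}$$

where $m_{l,\mathrm{start}} = (s_{l,\mathrm{start}} - 1) M/S + 1$ and $m_{l,\mathrm{end}} = s_{l,\mathrm{end}} M/S$. Utilizing the sum of a geometric progression, we can further express (53) by

$$\kappa_{a,l} = \frac{1 - e^{j2\pi (m_{l,\mathrm{end}} - m_{l,\mathrm{start}} + 1) \left( \Theta_l + \frac{m}{\gamma_a M} \right)}}{1 - e^{j2\pi \left( \Theta_l + \frac{m}{\gamma_a M} \right)}}\, e^{j2\pi m_{l,\mathrm{start}} \left( \Theta_l + \frac{m}{\gamma_a M} \right)}. \tag{54}$$

In addition,

$$1 - e^{j2\pi \left( \Theta_l + \frac{m}{\gamma_a M} \right)} = -2j\, e^{j\pi \left( \Theta_l + \frac{m}{\gamma_a M} \right)} \sin\left( \pi \left( \Theta_l + \frac{m}{\gamma_a M} \right) \right). \tag{55}$$

Thereafter, we can obtain the modulus of $\kappa_{a,l}$ as

$$|\kappa_{a,l}| = \left| \frac{\sin\left( \pi (m_{l,\mathrm{end}} - m_{l,\mathrm{start}} + 1) \left( \Theta_l + \frac{m}{\gamma_a M} \right) \right)}{\sin\left( \pi \left( \Theta_l + \frac{m}{\gamma_a M} \right) \right)} \right|, \tag{56}$$

which is the sinc-like function shown in Fig. 3(b). In accordance with (56), $|\kappa_{a,l}|$ achieves its maximal value, i.e., $m_{l,\mathrm{end}} - m_{l,\mathrm{start}} + 1$, when $\Theta_l + m/(\gamma_a M) = 0$. The center of the $l$th dark spot has the maximal value. Thus, the vertical coordinate of the dark spot center is

$$y = -\frac{m}{\gamma_a M} = \Theta_l. \tag{57}$$

Similarly, we calculate the modulus of $\kappa_{t,l}$ as

$$|\kappa_{t,l}| = \left| \frac{\sin\left( \pi N \left( \Gamma_l - \frac{n}{\gamma_t N} \right) \right)}{\sin\left( \pi \left( \Gamma_l - \frac{n}{\gamma_t N} \right) \right)} \right|. \tag{58}$$

In accordance with (58), $|\kappa_{t,l}|$ achieves its maximal value, i.e., $N$, when $\Gamma_l - n/(\gamma_t N) = 0$. Thus, the horizontal coordinate of the dark spot center of the $l$th cross-style pattern is

$$x = \frac{n}{\gamma_t N} = \Gamma_l. \tag{59}$$

B. Proof of Property 2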

Given that $m_{l,\mathrm{end}} - m_{l,\mathrm{start}} + 1 = (s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1) M/S$, (56) can be further expressed by

$$|\kappa_{a,l}| = \left| \frac{\sin\left( \pi \frac{M}{S} (s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1) \left( \Theta_l + \frac{m}{\gamma_a M} \right) \right)}{\sin\left( \pi \left( \Theta_l + \frac{m}{\gamma_a M} \right) \right)} \right|. \tag{60}$$

Equation (60) shows that $|\kappa_{a,l}|$ achieves its minimum value, i.e., 0, when

$$\frac{M}{S} (s_{l,\mathrm{end}} - s_{l,\mathrm{start}} + 1) \left( \Theta_l + \frac{m}{\gamma_a M} \right) = q, \tag{61}$$

where $q$ is a nonzero integer. The vertical coordinates of the white points along the vertical central line of the dark spot are

$$y = -\frac{m}{\gamma_a M} = \Theta_l + \frac{q h_l}{2}, \tag{62}$$

and $h_l$ is defined in (16). The vertical coordinate of the dark spot is $\Theta_l$; thus, the half-height of the dark spot is

$$\left| \left( \Theta_l - \frac{h_l}{2} \right) - \Theta_l \right| = \frac{h_l}{2}. \tag{63}$$

Similarly, (58) shows that $|\kappa_{t,l}|$ achieves its minimum value, i.e., 0, when

$$\Gamma_l - \frac{n}{\gamma_t N} = \frac{q w_l}{2}. \tag{64}$$

Accordingly, the half-width of the dark spot is $w_l/2$.

C. Proof of Property 3

If $\tilde{\Theta}_l \approx \Theta_l$ and $\tilde{\Gamma}_l \approx \Gamma_l$, then $P_{l,s}$ is approximated by $P_{l,s} \approx P^{\mathrm{ul}} \sum_{k=1}^{L} \eta_{1,k} + \eta_2$, where

$$\eta_{1,k} = \left| \alpha_k \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\{s\}) \right)^{H} \left( \mathbf{a}(\Theta_k) \odot \mathbf{p}(\Phi_k) \right) \mathbf{q}^{T}(\Gamma_k) \mathbf{q}^{*}(\Gamma_l) \right|^{2}, \tag{65}$$

and

$$\eta_2 = \left| \left( \mathbf{a}(\Theta_l) \odot \mathbf{p}(\{s\}) \right)^{H} \mathbf{Z}^{\mathrm{ul}} \mathbf{q}^{*}(\Gamma_l) \right|^{2}. \tag{66}$$

If $M/K$ is large, then $P_{l,s} \approx E\{P_{l,s}\}$,

$$E\{\eta_{1,k}\} = \begin{cases} P^{\mathrm{ul}} |\alpha_l|^{2} M^{2} N^{2} / S^{2}, & \text{if } k = l \text{ and } s \in \Phi_l, \\ 0, & \text{if } k \neq l, \end{cases} \tag{67}$$

and $E\{\eta_2\} \approx MN/S$ hold. Finally, we obtain (25) by applying (66) and (67).

REFERENCES

[1] D. C. Araújo, T. Maksymyuk, A. L. F. de Almeida, T. Maciel, J. C. M. Mota, and M. Jo, “Massive MIMO: Survey and future research topics,” IET Commun., vol. 10, no. 15, pp. 1938-1946, Oct. 2016.
[2] O. Elijah, C. Y. Leow, T. A. Rahman, S. Nunoo, and S. Z. Iliya, “A comprehensive survey of pilot contamination in massive MIMO-5G system,” IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 905-923, 2nd Quart. 2016.
[3] D. Fan, F. Gao, G. Wang, Z. Zhong, and A. Nallanathan, “Angle domain signal processing-aided channel estimation for indoor 60-GHz TDD/FDD massive MIMO systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1948-1961, Sep. 2017.
[4] K. Hugl, K. Kalliola, and J. Laurila, “Spatial reciprocity of uplink and downlink radio channels in FDD systems,” in COST 273 TD(02)066, 2002.
[5] H. Xie, F. Gao, S. Jin, J. Fang, and Y.-C. Liang, “Channel estimation for TDD/FDD massive MIMO systems with channel covariance computing,” IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4206-4218, Jun. 2018.
[6] S. Haghighatshoar, M. B. Khalilsarai, and G. Caire, “Multi-band covariance interpolation with applications in massive MIMO,” in Proc. IEEE ISIT, 2018, pp. 386-390.
[7] M. B. Khalilsarai, S. Haghighatshoar, X. Yi, and G. Caire, “FDD massive MIMO via UL/DL channel covariance extrapolation and active channel sparsification,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 121-135, Jan. 2019.
[8] X. Zhang, L. Zhong, and A. Sabharwal, “Directional training for FDD massive MIMO,” IEEE Trans. Wireless Commun., vol. 17, no. 8, pp. 5183-5197, Aug. 2018.
[9] Y. Han, T.-H. Hsu, C.-K. Wen, K.-K. Wong, and S. Jin, “Efficient downlink channel reconstruction for FDD multi-antenna systems,” IEEE Trans. Wireless Commun., vol. 18, no. 6, pp. 3161-3176, Jun. 2019.
[10] Y. Han, Q. Liu, C.-K. Wen, S. Jin, and K.-K. Wong, “FDD massive MIMO based on efficient downlink channel reconstruction,” IEEE Trans. Commun., vol. 67, no. 6, pp. 4020-4034, Jun. 2019.
[11] Y. Han, Q. Liu, C. Wen, M. Matthaiou, and X. Ma, “Tracking FDD massive MIMO downlink channels by exploiting delay and angular reciprocity,” IEEE J. Sel. Topics Signal Process., vol. 13, no. 5, pp. 1062-1076, Sep. 2019.
[12] U. Ihsan, S. Yan, and R. Malaney, “Location verification for emerging wireless vehicular networks,” IEEE Internet Things J., vol. 6, no. 6, pp. 10261-10272, Dec. 2019.
[13] M. Alrabeiah and A. Alkhateeb, “Deep learning for TDD and FDD massive MIMO: Mapping channels in space and frequency,” arXiv:1905.03761v1, May 2019.
[14] M. Arnold, S. Dörner, S. Cammerer, S. Yan, J. Hoydis, and S. ten Brink, “Enabling FDD massive MIMO through deep learning-based channel prediction,” arXiv:1901.03664v1, Jan. 2019.
[15] M. Safari, V. Pourahmadi, “Deep UL2DL: Channel knowledge transfer from uplink to downlink,” arXiv:1812.07518, Dec. 2018.
[16] J. Wang, Y. Ding, S. Bian, Y. Peng, M. Liu, and G. Gui, “UL-CSI data driven deep learning for predicting DL-CSI in cellular FDD systems,”

IEEE Access , vol. 7, pp. 96105-96112, Jul. 2019.[17] Y. Yang, F. Gao, G. Y. Li, and M. Jian, “Deep Learning-Based DownlinkChannel Prediction for FDD Massive MIMO System,”

IEEE Commun.Lett. , vol. 23, no. 11, pp. 1994-1998, Nov. 2019.[18] Z. Liu, L. Zhang, and Z. Ding, “Exploiting bi-directional channelreciprocity in deep learning for low rate massive MIMO CSI feedback,”

IEEE Wireless Commun. Lett. , vol. 8, no. 3, pp. 889-892, Jul. 2019.[19] P. Dong, H. Zhang, and G. Y. Li, “Machine learning prediction basedCSI acquisition FDD massive MIMO downlink,” in

Proc. IEEE Globe-com , Dec. 2018, pp. 1-6.[20] C.-K. Wen, W.-T. Shih, and S. Jin, “Deep learning for massive MIMOCSI feedback,”

IEEE Wireless Commun. Lett. , vol. 7, no. 5, pp. 748-751,Oct. 2018.[21] T. Wang, C.-K. Wen, S. Jin, and G. Y. Li, “Deep learning-based CSIfeedback approach for time-varying massive MIMO channels,”

IEEEWireless Commun. Lett. , vol. 8, no. 2, pp. 416-419, Apr. 2019.[22] E. D. Carvalho, A. Ali, A. Amiri, M. Angjelichinoski, and R. W.Heath Jr., “Non-stationarities in extra-large scale massive MIMO,” arXiv:1903.03085 , Mar. 2019.[23] A. Ali, E. D. Carvalho, and R. W. Heath Jr., “Linear receivers innon-stationary massive MIMO channels with visibility regions,”

IEEEWireless Commun. Lett. , vol. 8, no. 3, pp. 885-888, Jun. 2019.[24] A. Amiri, M. Angjelichinoski, E. D. Carvalho, and R. W. Heath Jr.,“Extremely large aperture massive MIMO: Low complexity receiverarchitectures,” in

Proc. IEEE Globecom Workshops , Dec. 2018, pp. 1-6.[25] B. Mamandipoor, D. Ramasamy, and U. Madhow, “Newtonized orthog-onal matching pursuit: Frequency estimation over the continuum,”

IEEETrans. Signal Process. , vol. 64, no. 19, pp. 5066-5081, Oct. 2016.[26] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only lookcnce: Uniﬁed, real-time object detection,” arXiv:1506.02640 , Jun. 2015.[27] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv:1804.02767 , Apr. 2018.[28] X. Rao and V. K. N. Lau, “Distributed compressive CSIT estimationand feedback for FDD multi-user massive MIMO systems,”

IEEE Trans.Signal Process. , vol. 62, no. 12, pp. 3261-3271, Jun. 2014.[29] Z. Gao, L. Dai, Z. Wang, and S. Chen, “Spatially common sparsity basedadaptive channel estimation and feedback for FDD massive MIMO,”