Deep Learning-based Phase Reconfiguration for Intelligent Reflecting Surfaces
Özgecan Özdoğan, Emil Björnson, Department of Electrical Engineering (ISY), Linköping University, Sweden
Abstract: Intelligent reflecting surfaces (IRSs), consisting of reconfigurable metamaterials, have recently attracted attention as a promising cost-effective technology that can bring new features to wireless communications. These surfaces can be used to partially control the propagation environment and can potentially provide a power gain that is proportional to the square of the number of IRS elements when configured properly. However, configuring the local phase matrix at an IRS can be quite challenging: IRSs are purposely designed without active components and are therefore unable to process pilot signals. In addition, a large number of IRS elements may create a huge training overhead. In this paper, we present a deep learning (DL) approach for phase reconfiguration at an IRS in order to learn and make use of the local propagation environment. The proposed method uses the received pilot signals reflected through the IRS to train a deep feedforward network. The performance of the proposed approach is evaluated and numerical results are presented.
I. INTRODUCTION
An intelligent reflecting surface (IRS), also known under the names reconfigurable intelligent surface [1] and software-controlled metasurface [2], is a thin two-dimensional metasurface that is used to aid communications [3]. Depending on the application of interest, an IRS can control and transform the electromagnetic waves that impinge on it. The technology has recently received massive attention from academia and is sometimes marketed as one of the key enabling technologies for next-generation wireless communication systems.

Bringing such a technology into reality requires addressing many practical challenges. For instance, the proper configuration of an IRS critically depends on accurate channel state information (CSI). However, there are two main issues that complicate channel acquisition with an IRS [4]. First, the IRS is not inherently equipped with transceiver chains and therefore cannot sense the pilot signals. Second, introducing an IRS into an existing setup increases the number of channel coefficients proportionally to the number of IRS elements.

In the literature, some deep learning (DL) solutions have been discussed to tackle these problems [5]. In [6], a supervised learning approach is presented where two identical convolutional neural networks (CNNs) are trained to estimate the direct and cascaded channels. In [7], a feedforward neural network is proposed to unveil the mapping between the measured user coordinates and the optimal phase matrix at the IRS that maximizes the targeted user's signal strength. Another approach is to equip the IRS with a small number of active elements with sensing capabilities. The data collected from the active elements are utilized during the training of deep neural networks (DNNs) in [8], [9], and the underlying channel structure is exploited to learn the entire channel.
There are also deep reinforcement learning based methods that aim to solve the joint optimization of the IRS phases and the transmit beamforming assuming perfect CSI [10], [11].

In this paper, we propose a novel DL approach for phase configuration in an IRS-assisted MIMO system. We design two DNNs that are fed by the received pilot signals to directly find the mapping between the pilot signals and the optimum phase matrix and downlink transmit beamforming vector, thereby bypassing the conventional intermediate step of estimating the channels, which is prone to error propagation. In the first method, we send full-length pilot sequences and compare our results with a conventional least-squares (LS) estimator based scheme. In the second method, our goal is to reduce the pilot overhead. We train the DNN with shorter pilot sequences and predict the optimum phases and beamforming vector at the online stage.
Notation: Lower and upper case boldface letters are used for vectors and matrices, respectively. The transpose and Hermitian transpose of a matrix $\mathbf{A}$ are written as $\mathbf{A}^T$ and $\mathbf{A}^H$, respectively. The superscript $(\cdot)^*$ denotes the complex conjugate. The operation $\mathbf{A} = \mathrm{diag}(\mathbf{a})$ with $\mathbf{a} \in \mathbb{C}^{N \times 1}$ returns the matrix $\mathbf{A} \in \mathbb{C}^{N \times N}$ with $\mathbf{a}$ on the diagonal. The operator $\otimes$ denotes the Kronecker product. The Euclidean norm is denoted by $\|\cdot\|$.

II. SYSTEM MODEL WITH IRS-SUPPORTED TRANSMISSION
We consider communication from an $M$-antenna base station (BS) to a single-antenna user equipment (UE) as shown in Fig. 1. A planar IRS with $N$ elements (with $N_H$ elements per row and $N_V$ per column) is located in between to assist. The locations of the BS and IRS are fixed, whereas the UE can be in different locations. Each element of the IRS can introduce a phase shift to an incoming narrowband signal. The phase is adjusted by an IRS controller that enables manipulation of the impinging wave. The IRS controller is connected to the BS over a backhaul link to coordinate between the IRS and BS. The CSI is crucial for configuring the IRS elements. Since the IRS is not equipped with radio frequency chains, we assume that the channel estimation is performed at the BS side.

A. Channel Estimation
We assume quasi-static flat-fading channels and that the system operates in time division duplex (TDD) mode. Pilot-based channel training is utilized to estimate the channels at the BS.

arXiv: […] [eess.SP], Sep. […]

Fig. 1: Illustration of an IRS-assisted communication system.

During the channel estimation phase, the UE sends the pilot signal $x_t \in \mathbb{C}$ at time slot $t$. The received pilot signal at the BS is modeled as [12]

$$\mathbf{y}_t = \left(\mathbf{h}_d + \mathbf{H}_{br}\,\mathrm{diag}(\boldsymbol{\phi}_t)\,\mathbf{h}_{ru}\right) x_t + \mathbf{n}_t, \tag{1}$$

where $\mathbf{n}_t \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ is the additive white Gaussian noise (AWGN), and $\mathbf{h}_d \in \mathbb{C}^{M \times 1}$, $\mathbf{H}_{br} \in \mathbb{C}^{M \times N}$, $\mathbf{h}_{ru} \in \mathbb{C}^{N \times 1}$ are the channels between the BS and UE, the BS and IRS, and the IRS and UE, respectively. The phase configuration at the IRS at time slot $t$ is denoted by $\boldsymbol{\phi}_t = [e^{j\phi_{t,1}}, \ldots, e^{j\phi_{t,N}}]^T \in \mathbb{C}^{N \times 1}$, where $\phi_{t,n} \in [0, 2\pi)$ is the phase shift of the $n$th element.

We assume that the BS is equipped with a horizontal uniform linear array (ULA) placed on the $x$-axis. Unlike the UE, the IRS and BS typically have fixed locations once they are deployed. Therefore, $\mathbf{H}_{br}$ is represented by a static line-of-sight (LoS) channel as $\mathbf{H}_{br} = \sqrt{\beta_{br}}\,\mathbf{a}_{BS}(\varphi_{BS}, \theta_{BS})\,\mathbf{a}_{IRS}(\varphi_{IRS}, \theta_{IRS})^H$, where $\beta_{br}$ is the pathloss coefficient,

$$\mathbf{a}_{BS}(\varphi_{BS}, \theta_{BS}) = \left[1, \ldots, e^{j 2\pi (M-1) d_H \cos(\varphi_{BS}) \cos(\theta_{BS})}\right]^T \tag{2}$$

is the BS's array response vector, $\varphi_{BS}$ and $\theta_{BS}$ are the azimuth and elevation angles of arrival (AoA) to the IRS seen from the BS, and $d_H$ is the antenna spacing measured in number of wavelengths. The array response of the IRS (placed on the $yz$-plane) is denoted by

$$\mathbf{a}_{IRS}(\varphi_{IRS}, \theta_{IRS}) = \left[e^{j \mathbf{k}(\varphi_{IRS}, \theta_{IRS})^T \mathbf{u}_1}, \ldots, e^{j \mathbf{k}(\varphi_{IRS}, \theta_{IRS})^T \mathbf{u}_N}\right]^T \tag{3}$$

where $\varphi_{IRS}$ and $\theta_{IRS}$ are the azimuth and elevation angles of departure (AoD) to the BS seen from the IRS, respectively. Recall that we consider a planar IRS. The wave vector is

$$\mathbf{k}(\varphi_{IRS}, \theta_{IRS}) = \frac{2\pi}{\lambda_c} \begin{bmatrix} \cos(\varphi_{IRS}) \cos(\theta_{IRS}) \\ \sin(\varphi_{IRS}) \cos(\theta_{IRS}) \\ \sin(\theta_{IRS}) \end{bmatrix}, \tag{4}$$

and the indexing vector is $\mathbf{u}_n = [0, \, i(n) d_r \lambda_c, \, j(n) d_r \lambda_c]^T$, where $\lambda_c$ is the wavelength at the carrier frequency, and $i(n) = \mathrm{mod}(n-1, N_H)$ and $j(n) = \lfloor (n-1)/N_H \rfloor$ describe the location of each IRS element [13, Sec. 7.3]. The parameter $d_r$ denotes the element spacing at the IRS, in both the horizontal and vertical directions. Notice that the ULA array response in (2) is a special case of the planar array response in (3), where $[\mathbf{a}_{BS}(\varphi_{BS}, \theta_{BS})]_m = e^{j \mathbf{k}(\varphi_{BS}, \theta_{BS})^T \mathbf{u}_m}$ with $\mathbf{u}_m = [(m-1) d_H \lambda_c, 0, 0]^T$.

To account for the assumed limited scattering environment, the channels $\mathbf{h}_d$ and $\mathbf{h}_{ru}$ are represented by the Saleh-Valenzuela (SV) model [6], [14]. We assume that there are $L_d$ and $L_{ru}$ paths, respectively. Thus, the direct channel is modeled as

$$\mathbf{h}_d = \sqrt{\frac{1}{L_d}} \sum_{l=1}^{L_d} \alpha_d^l \, \mathbf{a}_{BS}(\varphi_{BS}^l, \theta_{BS}^l) \tag{5}$$

where $\alpha_d^l$ is the complex channel gain and $\varphi_{BS}^l$, $\theta_{BS}^l$ are the azimuth and elevation AoAs associated with the $l$th path. Similarly, the channel between the IRS and UE is

$$\mathbf{h}_{ru} = \sqrt{\frac{1}{L_{ru}}} \sum_{l=1}^{L_{ru}} \alpha_{ru}^l \, \mathbf{a}_{IRS}(\varphi_{IRS}^l, \theta_{IRS}^l) \tag{6}$$

where $\alpha_{ru}^l$ is the complex channel gain and $\varphi_{IRS}^l$, $\theta_{IRS}^l$ are the azimuth and elevation AoAs associated with the $l$th path.

At time slot $t$, we can rewrite (1) as

$$\mathbf{y}_t = (\mathbf{h}_d + \mathbf{V} \boldsymbol{\phi}_t) x_t + \mathbf{n}_t \tag{7}$$

where $\mathbf{V} = \mathbf{H}_{br}\,\mathrm{diag}(\mathbf{h}_{ru}) = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N] \in \mathbb{C}^{M \times N}$ is the cascaded BS-IRS-UE channel. The pilot signals are sent $T$ times by the UE. We assume that the channels are fixed during the estimation period and that $\boldsymbol{\phi}_t$ is reconfigured at each time slot $t$. The collection of all the pilot signals at the BS, $\mathbf{y}_p = [\mathbf{y}_1^T, \mathbf{y}_2^T, \ldots, \mathbf{y}_T^T]^T \in \mathbb{C}^{TM \times 1}$, can be written as

$$\mathbf{y}_p = \mathbf{X} (\boldsymbol{\Phi} \otimes \mathbf{I}_M) \mathbf{h} + \mathbf{n} \tag{8}$$

where the pilot signal is $\mathbf{X} = \mathrm{diag}([x_1 \mathbf{1}_M, \ldots, x_T \mathbf{1}_M]) \in \mathbb{C}^{TM \times TM}$ with $\mathbf{1}_M$ the all-ones vector of length $M$, and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{TM})$. The channels are stacked into $\mathbf{h} = [\mathbf{h}_d^T, \mathbf{v}_1^T, \ldots, \mathbf{v}_N^T]^T \in \mathbb{C}^{(N+1)M \times 1}$.
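The channel model in (1) through (7) can be checked numerically. The following is a minimal sketch, assuming NumPy, a noise-free observation with $x_t = 1$, and small illustrative values for $M$, $N_H$, $N_V$, the spacings, the angles, and $\beta_{br}$ (none of these values come from the paper). The function names are hypothetical, and the path gains and angles are drawn from simple placeholder distributions rather than the distributions used in the simulations.

```python
import numpy as np


def wave_vector(azimuth, elevation, wavelength):
    """Wave vector k(azimuth, elevation) as in (4)."""
    return (2 * np.pi / wavelength) * np.array([
        np.cos(azimuth) * np.cos(elevation),
        np.sin(azimuth) * np.cos(elevation),
        np.sin(elevation),
    ])


def irs_positions(n_h, n_v, d_r, wavelength):
    """IRS element positions u_n on the yz-plane, one row per element."""
    n = np.arange(n_h * n_v)      # zero-based index, i.e. n - 1
    i, j = n % n_h, n // n_h      # i(n) and j(n)
    zeros = np.zeros(n.size)
    return np.stack([zeros, i * d_r * wavelength, j * d_r * wavelength], axis=1)


def bs_positions(m, d_h, wavelength):
    """ULA antenna positions u_m on the x-axis, one row per antenna."""
    idx = np.arange(m)
    return np.stack([idx * d_h * wavelength, np.zeros(m), np.zeros(m)], axis=1)


def array_response(k, positions):
    """Entries exp(j k^T u) for every position u, as in (3)."""
    return np.exp(1j * positions @ k)


def sv_channel(rng, n_paths, positions, wavelength):
    """Sum of n_paths plane waves as in (5)-(6); gains and angles are placeholders."""
    h = np.zeros(positions.shape[0], dtype=complex)
    for _ in range(n_paths):
        gain = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        az, el = rng.uniform(-np.pi / 2, np.pi / 2, size=2)
        h += gain * array_response(wave_vector(az, el, wavelength), positions)
    return np.sqrt(1 / n_paths) * h


# Illustrative parameters only (assumptions, not the paper's simulation setup).
rng = np.random.default_rng(0)
M, N_H, N_V = 4, 3, 2
N = N_H * N_V
wavelength, d_r, d_h, beta_br = 0.1, 0.5, 0.5, 1e-3
az_bs, el_bs, az_irs, el_irs = 0.3, 0.1, -0.4, 0.2

u_bs = bs_positions(M, d_h, wavelength)
u_irs = irs_positions(N_H, N_V, d_r, wavelength)
a_bs = array_response(wave_vector(az_bs, el_bs, wavelength), u_bs)
a_irs = array_response(wave_vector(az_irs, el_irs, wavelength), u_irs)
H_br = np.sqrt(beta_br) * np.outer(a_bs, a_irs.conj())   # rank-one LoS channel

h_d = sv_channel(rng, 5, u_bs, wavelength)
h_ru = sv_channel(rng, 5, u_irs, wavelength)
phi_t = np.exp(1j * rng.uniform(0, 2 * np.pi, N))        # unit-modulus phases

V = H_br @ np.diag(h_ru)                                  # cascaded channel
y_eq1 = h_d + H_br @ np.diag(phi_t) @ h_ru                # (1), noise-free, x_t = 1
y_eq7 = h_d + V @ phi_t                                   # (7), noise-free, x_t = 1
print(np.allclose(y_eq1, y_eq7))
```

The two noise-free observations agree because $\mathrm{diag}(\boldsymbol{\phi}_t)\mathbf{h}_{ru} = \mathrm{diag}(\mathbf{h}_{ru})\boldsymbol{\phi}_t$, which is the step that turns (1) into (7).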
All the phase configurations at the IRS are collected in $\boldsymbol{\Phi} = [\bar{\boldsymbol{\phi}}_1, \ldots, \bar{\boldsymbol{\phi}}_T]^T \in \mathbb{C}^{T \times (N+1)}$, where $\bar{\boldsymbol{\phi}}_t = [1, \boldsymbol{\phi}_t^T]^T \in \mathbb{C}^{(N+1) \times 1}$ is the extended reflection pattern accounting for both the direct and cascaded channels. Notice that the first column of $\boldsymbol{\Phi}$ is set to an all-ones vector to estimate the direct channel.

The IRS phase configuration during the channel estimation period, $\boldsymbol{\Phi}$, mimics a discrete Fourier transform (DFT) matrix as in [12], [15]. More precisely, each element of the phase matrix can be written as

$$[\boldsymbol{\Phi}]_{t,n} = e^{-j \frac{2\pi (t-1)(n-1)}{N+1}} \tag{9}$$

where $\boldsymbol{\Phi}$ cannot contain more than $N+1$ unique values around the unit circle. Note that this specific selection of $\boldsymbol{\Phi}$ guarantees that $\mathrm{rank}(\boldsymbol{\Phi}) = \min\{T, N+1\}$ and that each element satisfies the unit-modulus constraint. Besides, the first column of $\boldsymbol{\Phi}$ is equal to an all-ones vector. The property $|[\boldsymbol{\Phi}]_{t,n}| = 1$ is particularly important since implementing different amplitudes at each IRS element can be costlier and harder. Another potential choice of $\boldsymbol{\Phi}$ that satisfies the same constraints is a truncated Hadamard matrix [15].

Assuming that $T \geq N+1$, based on the pilot signal $\mathbf{y}_p$, the channels can be estimated by the LS estimator as [12]

$$\hat{\mathbf{h}} = \arg\min_{\mathbf{h}} \|\mathbf{P}\mathbf{h} - \mathbf{y}_p\|^2 = \left(\mathbf{P}^H \mathbf{P}\right)^{-1} \mathbf{P}^H \mathbf{y}_p \tag{10}$$

where $\mathbf{P} = \mathbf{X}(\boldsymbol{\Phi} \otimes \mathbf{I}_M)$ is the observation matrix. The BS can utilize these channel estimates to compute the downlink transmit beamforming vector and the optimum phase configuration at the IRS. Then, the BS can send the $N$ optimum phases to the IRS via the backhaul link.

B. IRS Phase Reconfiguration and Downlink Spectral Efficiency
If the BS has perfect CSI, it can compute the optimal phases and the beamforming vector using the alternating optimization method in [16] as

$$\phi_n^{opt} = \arg\left(\mathbf{h}_d^H \mathbf{w}\right) - \arg\left(\mathbf{v}_n^H \mathbf{w}\right), \tag{11}$$

$$\mathbf{w}^{opt} = \frac{\mathbf{h}_d + \mathbf{V} \left(\boldsymbol{\phi}^{opt}\right)^*}{\left\|\mathbf{h}_d + \mathbf{V} \left(\boldsymbol{\phi}^{opt}\right)^*\right\|} \tag{12}$$

where $\boldsymbol{\phi}^{opt} = [e^{j\phi_1^{opt}}, \ldots, e^{j\phi_N^{opt}}]^T \in \mathbb{C}^{N \times 1}$. We initialize the beamforming vector as $\mathbf{w} = \frac{1}{\sqrt{M}} [1, \ldots, 1]^T$. Note that the optimized phases are obtained by phase-aligning the direct and cascaded channels. Besides, for any given phase configuration, the optimum transmit beamforming is equal to the maximum ratio precoding vector.

During the downlink transmission, the UE receives

$$y_r = \left(\mathbf{h}_d^H + \mathbf{h}_{ru}^H \,\mathrm{diag}\left(\boldsymbol{\phi}^{opt}\right) \mathbf{H}_{br}^H\right) \mathbf{w}^{opt} s + n \tag{13}$$

where $s$ is the data signal and $n \sim \mathcal{CN}(0, 1)$ is the additive noise. Alternatively, we can rewrite (13) as

$$y_r = \left(\mathbf{h}_d^H + \left(\boldsymbol{\phi}^{opt}\right)^T \mathbf{V}^H\right) \mathbf{w}^{opt} s + n. \tag{14}$$

If the channels are fixed throughout the transmission, the rate is

$$R = \log_2\left(1 + \gamma \left|\left(\mathbf{h}_d^H + \left(\boldsymbol{\phi}^{opt}\right)^T \mathbf{V}^H\right) \mathbf{w}^{opt}\right|^2\right) \tag{15}$$

$$= \log_2\left(1 + \gamma \left\|\mathbf{h}_d^H + \left(\boldsymbol{\phi}^{opt}\right)^T \mathbf{V}^H\right\|^2\right) \tag{16}$$

where $\gamma$ is the signal-to-noise ratio (SNR). If the BS utilizes the LS estimator, it treats the estimated channels as the true channels and calculates $\boldsymbol{\phi}^{opt}$ and $\mathbf{w}^{opt}$ based on $\hat{\mathbf{h}}$ in (10). Then, the phase configuration computed from the LS estimate is sent to the IRS over the backhaul link.

III. DEEP LEARNING-BASED PHASE CONFIGURATION
According to the universal approximation theorem, a DNN has the capability of approximating any continuous function [17]. In supervised learning, DNNs are trained using a training dataset that is given as input-output pairs. The goal of the proposed DNNs is to find the mapping between the received pilot signals and the optimum phase configuration and downlink transmit beamforming vector. The pilot signals go through all the channels before reaching the BS. Therefore, they capture important information for the phase and beamforming setting. The relation between the optimal phases and the channel coefficients is nonlinear, but a properly designed DNN can learn it. The problem is therefore to train the weights and biases of the DNN effectively so that it learns a nearly optimal mapping between the received pilots and the phases. A test dataset that is generated separately from the training data is used to evaluate the performance of the DNNs. During the online phase, the trained DNNs compute the required phases and beamforming vector.

As mentioned earlier, a main challenge of channel acquisition with an IRS is that the number of channel coefficients increases proportionally to $N$. Conventional methods such as the LS estimator in (10) require a pilot training period with $T \geq N+1$. When applying an LS estimator and then treating the estimate as perfect, there is an information loss, which is not the case when we directly obtain the phase shifts and beamforming vector. Besides, the LS estimator is unaware of the underlying propagation conditions, while a DNN can learn them. Hence, it is possible for a DNN to outperform the conventional LS method. In this paper, we present two DNNs with different $T$ values, as described in the following subsections.

A. Deep Learning Method 1
In the first method, to train the DNN, we set $T = N+1$ and use the input-output pairs $\{\mathbf{y}_p, \boldsymbol{\Omega}\}$ that are generated during the preamble stage. The output is formed by stacking the optimum phases and beamforming vector into $\boldsymbol{\Omega} = \left[(\boldsymbol{\phi}^{opt})^T, (\mathbf{w}^{opt})^T\right]^T \in \mathbb{C}^{(N+M) \times 1}$. Both the input and output vectors contain complex numbers. To feed them into the DNN, the real and imaginary parts of each entry are separated. Thus, the input has size $2TM \times 1$ and the output has size $2(N+M) \times 1$. Using a training set of $n_{train}$ samples consisting of different realizations, the DNN emulates the mapping by adjusting the weights and bias terms.

The proposed DNN (DL method 1) is composed of 3 fully connected hidden layers. The details are presented in Table I. The input data is scaled using the Standard Scaler function in the Python environment, which removes the mean and normalizes the input data to unit variance. We use the Adam optimizer with adaptive learning rates starting from […]. The learning rate is halved when there is no improvement in the last 5 epochs. As loss function, we select the mean square error (MSE). The batch size is chosen as […], and an early stopping criterion is applied that stops the training when the validation accuracy does not improve in 10 consecutive epochs. The maximum number of epochs is set to 200.

B. Deep Learning Method 2
In the second DNN, we set $T < N+1$ to reduce the pilot overhead, and the intention is that the DNN will learn how to reconstruct the channel despite the reduced dimensionality. The input-output pairs $\{\mathbf{y}_p, \boldsymbol{\Omega}\}$ are generated during the preamble stage. Note that the input $\mathbf{y}_p$ is shorter in this case. As in DL method 1, the real and imaginary parts of each entry are separated.

TABLE I: Layout of the proposed DL method 1 where $T = N+1$.

Layers            Size        Activation Function
Input             2TM         elu
Layer 1 (Dense)   […]         elu
Layer 2 (Dense)   […]         elu
Layer 3 (Dense)   […]         elu
Output            2(N+M)      linear

TABLE II: Layout of the proposed DL method 2 where $T < N+1$.

Layers            Size        Activation Function
Input             2TM         elu
Layer 1 (Dense)   […]         elu
Layer 2 (Dense)   […]         elu
Layer 3 (Dense)   […]         elu
Layer 4 (Dense)   […]         elu
Output            2(N+M)      linear

IV. NUMERICAL RESULTS

In this section, we evaluate the performance of the proposed DNNs where $M = 10$ and $N = 100$. For each data sample, the location of the UE with height […] m is drawn from a uniform distribution over a […] × […] square-meter room. The numbers of paths are set as $L_d = L_{ru} = 5$. The downlink transmit power is […] dBm and the pilot power is […] dBm, unless otherwise stated. The receiver noise power is −[…] dBm where the bandwidth is […] MHz.

The pathloss coefficient of the BS-IRS channel is calculated as $\beta_{br} = \frac{NA}{[\ldots]\,\pi d\,[\ldots]}$, where $A = (d_r \lambda_c)^2$ is the area of one IRS element with $d_r = 0.[\ldots]$ and $\lambda_c = 0.[\ldots]$ m, and $d_{br} = 292$ m is the distance between the BS and IRS. The antenna spacing at the BS is $d_H = 0.[\ldots]$.

The other pathloss parameters are set based on [18], [19] as $\alpha_d^l = \sqrt{\beta (d_{bu}/d)^{-[\ldots]}}\, e^{-j 2\pi f_c \tau_d^l}$ and $\alpha_{ru}^l = \sqrt{\beta (d_{ru}/d)^{-[\ldots]}}\, e^{-j 2\pi f_c \tau_{ru}^l}$, where $d = 1$ m, $\beta = -[\ldots]$ dB is the reference pathloss, and $d_{bu}$ and $d_{ru}$ are the distances between the BS and UE and between the IRS and UE, respectively. The associated path delays in nanoseconds are $\tau_d^l \sim U[0, [\ldots]]$ and $\tau_{ru}^l \sim U[0, [\ldots]]$. The minimum allowed distance is $d_{ru} = 7$ m.

The DNN was trained based on a dataset of $n_{train} = 80000$ training samples. In particular, […] of the samples were used for training and […] for validation. Another […] samples formed the test dataset, which is independent of the training dataset but drawn from the same distribution.
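Each training label $\boldsymbol{\Omega}$ in such a dataset comes from the perfect-CSI alternating optimization in (11) and (12). The following is a minimal sketch of that label computation, assuming NumPy, a fixed number of alternating iterations (`n_iter`, which the text does not specify), and i.i.d. complex Gaussian stand-ins for $\mathbf{h}_d$ and $\mathbf{V}$ instead of the SV model. The function names, the small dimensions, the SNR value, and the real-then-imaginary stacking order are illustrative assumptions.

```python
import numpy as np


def alternating_optimization(h_d, V, n_iter=20):
    """Alternate between the phase update (11) and the beamformer update (12)."""
    M = h_d.size
    w = np.ones(M, dtype=complex) / np.sqrt(M)     # initialization stated in the text
    for _ in range(n_iter):
        # (11): align each cascaded term v_n^H w with the direct term h_d^H w.
        angles = np.angle(np.vdot(h_d, w)) - np.angle(V.conj().T @ w)
        phi = np.exp(1j * angles)                  # unit-modulus reflection vector
        # (12): maximum ratio beamformer for the current phases.
        g = h_d + V @ phi.conj()
        w = g / np.linalg.norm(g)
    return phi, w


def spectral_efficiency(h_d, V, phi, w, snr):
    """Rate in (15) for a given phase vector and beamformer."""
    effective = np.vdot(h_d, w) + phi @ (V.conj().T @ w)   # (h_d^H + phi^T V^H) w
    return np.log2(1 + snr * np.abs(effective) ** 2)


def crandn(rng, *shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Illustrative stand-in channels (assumptions, not the paper's channel model).
rng = np.random.default_rng(1)
M, N, snr = 4, 16, 10.0
h_d = crandn(rng, M)
V = crandn(rng, M, N)

phi_opt, w_opt = alternating_optimization(h_d, V)
rate = spectral_efficiency(h_d, V, phi_opt, w_opt, snr)

# Label Omega = [phi_opt^T, w_opt^T]^T with real and imaginary parts separated.
omega = np.concatenate([phi_opt, w_opt])
label = np.concatenate([omega.real, omega.imag])   # length 2(N + M)
print(label.shape, rate)
```

Each update maximizes the effective channel gain over one variable while the other is held fixed, so the rate cannot decrease from one iteration to the next. Because the returned beamformer is the maximum ratio vector for the returned phases, evaluating (15) with these outputs gives the same value as (16).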
The training process takes around 1 hour and the online testing requires approximately 0.2 ms for both methods in Python on a Windows 10 personal computer with an Intel i7-6600U CPU at 2.81 GHz and an Intel HD Graphics 520 GPU.

The normalized mean-squared error (NMSE) of the phase configuration is calculated as

$$\mathrm{NMSE} = \frac{1}{n_{test}} \sum_{s=1}^{n_{test}} \frac{\left\|\boldsymbol{\phi}_s^{opt} - \hat{\boldsymbol{\phi}}_s^{x}\right\|^2}{\left\|\boldsymbol{\phi}_s^{opt}\right\|^2} \tag{17}$$

where $\boldsymbol{\phi}_s^{opt}$ is the optimum phase configuration based on perfect CSI and $\hat{\boldsymbol{\phi}}_s^{x}$ is either the output of one of the DNNs or calculated based on LS-based estimation, i.e., $x \in \{\text{DL method 1}, \text{DL method 2}, \text{LS-based method}\}$. Notice that $\|\boldsymbol{\phi}_s^{opt}\|^2 = \|\hat{\boldsymbol{\phi}}_s^{x}\|^2 = N$.

Fig. 2: Cumulative distribution function of the downlink spectral efficiency.

Fig. 3: NMSE versus pilot transmit powers.

Fig. 4: Cumulative distribution function of beamforming mismatch for different methods.

Fig. 2 compares the cumulative distributions of the downlink spectral efficiencies that are calculated based on (15) for different cases. The “Direct Path” label represents the case when there is no IRS in the system. “Random φ” denotes the setting where the phase configuration at the IRS is set randomly and the downlink transmit beamforming vector is calculated based on these phases for each test sample. We observe that DL method 1 performs better than the classical LS-based method for almost all of the samples. It is very close to “Optimum φ”, in which the phase configuration and the beamforming vector are computed based on perfect CSI. Note that in both DL method 1 and the LS-based method, we used the same pilot length $T = N+1 = 101$. Moreover, DL method 2, in which we used $T = 64$, also performs better than the LS-based method for most of the test data. The pilot overhead is reduced by […] in DL method 2 compared to DL method 1 and the LS-based method.
The DL methods perform better because the DNNs are able to find the direct mapping between the received pilot signals and the optimum phases and beamformer, whereas the LS-based method treats the estimates as the true channels, which causes an information loss. Besides, the LS estimator does not have any prior information on the channel, whereas the DNNs can learn the features of the channel from the datasets.

In Fig. 3, we compare the NMSEs of the presented methods for different pilot transmit powers. During the preamble stage, the training data is generated for different pilot transmit powers while keeping the other parameters fixed. Then, the DNNs are trained with these received pilots. The results show that the DL methods provide better performance for practical pilot powers, whereas the LS-based method outperforms the DL approaches for high pilot powers. Another DNN with wider hidden layers could potentially be designed and trained for high pilot powers to increase the accuracy. A potential pitfall with this approach is overfitting, which causes the DNN to memorize the training set.

In Fig. 4, we compare the accuracy of the downlink transmit beamforming vectors that are designed at the BS side based on the presented methods. More precisely, the beamforming mismatch is computed as $\|\mathbf{w}^{opt} - \mathbf{w}^{x}\|$ where $x \in \{\text{DL method 1}, \text{DL method 2}, \text{LS-based method}\}$. Notice that $\|\mathbf{w}^{opt}\| = \|\mathbf{w}^{x}\| = 1$. We observe that the DL methods give very similar accuracy and that they are superior to the LS-based approach.

V. CONCLUSIONS

This paper proposes a DNN framework for the reconfiguration of IRS elements based on the available pilot signals. We showed that a properly trained feedforward DNN is able to learn how to configure the IRS phases and downlink beamforming vector.
DL method 1 outperforms the classical LS estimator based method for practical pilot transmit powers. Its performance is close to that of the perfect CSI based approach. In addition, DL method 2 reduces the pilot overhead and has a performance similar to that of the LS-based method.

The framework could be further improved by considering multiple users, by grouping IRS elements to reduce the pilot overhead further, or by using quantized IRS phases. Besides, measured channels could be used for DNN training.

REFERENCES

[1] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in wireless communication,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, 2019.
[2] C. Liaskos, S. Nie, A. Tsioliaridou, A. Pitsillides, S. Ioannidis, and I. Akyildiz, “A new wireless communication paradigm through software-controlled metasurfaces,” IEEE Commun. Mag., vol. 56, no. 9, pp. 162–169, 2018.
[3] Q. Wu and R. Zhang, “Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network,” IEEE Commun. Mag., vol. 58, no. 1, pp. 106–112, 2020.
[4] E. Björnson, O. Özdogan, and E. G. Larsson, “Reconfigurable intelligent surfaces: Three myths and two critical questions,” IEEE Commun. Mag., 2020, to appear.
[5] A. M. Elbir and K. V. Mishra, “A survey of deep learning architectures for intelligent reflecting surfaces,” 2020. [Online]. Available: https://arxiv.org/abs/2009.02540
[6] A. M. Elbir, A. Papazafeiropoulos, P. Kourtessis, and S. Chatzinotas, “Deep channel learning for large intelligent surfaces aided mm-Wave massive MIMO systems,” IEEE Wireless Communications Letters, vol. 9, no. 9, pp. 1447–1451, 2020.
[7] C. Huang, G. C. Alexandropoulos, C. Yuen, and M. Debbah, “Indoor signal focusing with deep learning designed reconfigurable intelligent surfaces,” in IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 2019, pp. 1–5.
[8] A. Taha, M. Alrabeiah, and A. Alkhateeb, “Deep learning for large intelligent surfaces in millimeter wave and massive MIMO systems,” in IEEE Global Communications Conference (GLOBECOM), 2019, pp. 1–6.
[9] F. Jiang, L. Yang, D. B. da Costa, and Q. Wu, “Channel estimation via direct calculation and deep learning for RIS-Aided mmWave systems,” 2020. [Online]. Available: https://arxiv.org/abs/2008.04704
[10] C. Huang, R. Mo, and C. Yuen, “Reconfigurable intelligent surface assisted multiuser MISO systems exploiting deep reinforcement learning,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 8, pp. 1839–1850, 2020.
[11] K. Feng, Q. Wang, X. Li, and C. Wen, “Deep reinforcement learning based intelligent reflecting surface optimization for MISO communication systems,” IEEE Wireless Communications Letters, vol. 9, no. 5, pp. 745–749, 2020.
[12] T. L. Jensen and E. De Carvalho, “An optimal channel estimation scheme for intelligent reflecting surfaces based on a minimum variance unbiased estimator,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 5000–5004.
[13] E. Björnson, J. Hoydis, and L. Sanguinetti, “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends® in Signal Processing, vol. 11, no. 3-4, pp. 154–655, 2017.
[14] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, 2014.
[15] C. You, B. Zheng, and R. Zhang, “Intelligent reflecting surface with discrete phase shifts: Channel estimation and passive beamforming,” in IEEE International Conference on Communications (ICC), 2020, pp. 1–6.
[16] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394–5409, Nov. 2019.
[17] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning […]
[18] […] IEEE Journal on Selected Areas in Communications, vol. 5, no. 2, pp. 128–137, 1987.
[19] Z. Wang, L. Liu, and S. Cui, “Channel estimation for intelligent reflecting surface assisted multiuser communications,” in