Deep Learning-based Power Control for Cell-Free Massive MIMO Networks
Nuwanthika Rajapaksha, K. B. Shashika Manosha, Nandana Rajatheva, Matti Latva-aho
Centre for Wireless Communications, University of Oulu, Finland
E-mail: {nuwanthika.rajapaksha, nandana.rajatheva, matti.latva-aho}@oulu.fi, [email protected]

Abstract—A deep learning (DL)-based power control algorithm that solves the max-min user fairness problem in a cell-free massive multiple-input multiple-output (MIMO) system is proposed. A max-min rate optimization problem is formulated for the cell-free massive MIMO uplink, where the user power allocations are optimized in order to maximize the minimum user rate. Instead of modeling the problem using mathematical optimization theory and solving it with iterative algorithms, our proposed solution approach uses DL. Specifically, we model a deep neural network (DNN) and train it in an unsupervised manner to learn the optimum user power allocations which maximize the minimum user rate. This novel unsupervised learning-based approach does not require the optimal power allocations to be known during model training, as previously used supervised learning techniques do; hence it has a simpler and more flexible model training stage. Numerical results show that the proposed DNN achieves a performance-complexity trade-off, with around 400 times faster implementation and performance comparable to the optimization-based algorithm. An online learning stage is also introduced, which results in near-optimal performance with 4-6 times faster processing.
Index Terms—cell-free massive MIMO, max-min user fairness, power control, deep learning, unsupervised learning
I. INTRODUCTION
Massive MIMO, where a base station with a large number of antennas simultaneously serves many users, has become a key technology in fifth generation (5G) networks due to its high throughput and reliability [1]-[3]. Cell-free massive MIMO combines massive MIMO and distributed MIMO, and has the potential of providing uniformly good throughput for all users [4]-[6]. In cell-free massive MIMO, a large number of distributed access points (APs) serve a much smaller number of users distributed over a wide area, and there are no cells or cell boundaries [4]. All the APs are connected to a central processing unit (CPU) via the backhaul network, and they coherently cooperate to serve all users on the same time-frequency resources via time-division duplex (TDD) operation [5]. Different scalable cell-free massive MIMO architectures and receiver/combiner schemes for uplink/downlink signal processing are studied in [5]-[8].

Proper power allocation helps control the inter-user interference and optimize the network performance, and often involves advanced optimization techniques [9]. In [5], the authors show that max-min power control enables cell-free massive MIMO to provide uniformly good service to all users,
This work was supported by the Academy of Finland 6Genesis Flagship (grant no. 318927).

regardless of their locations. The channel hardening property of cell-free massive MIMO allows the small-scale fading to be neglected, so that the power control only needs to track the long-term (large-scale) fading [9]. The high time-complexity of optimization-based power control makes it challenging to meet even these relaxed timing constraints, and limits its practical implementation.

Owing to the universal function approximation property of artificial neural networks (ANNs) [10], DL-based techniques have enabled radio resource allocation with a lower complexity than traditional optimization-based approaches. Several studies have proposed DL-based power control for cellular and cell-free massive MIMO systems [9], [11]-[14]. Most of the existing studies focus on a supervised learning approach, where a DNN is trained to learn the mapping between the inputs (user locations or channel statistics) and the optimal power allocations obtained by an optimization algorithm. The unsupervised learning algorithm proposed in [13] for the K-user interference channel power control problem eliminates the need to know the optimal power allocations during model training. In this study, we are interested in such an unsupervised learning algorithm for cell-free massive MIMO power control, which simplifies the data preparation and model training stages.

The contributions of the paper are as follows:
• In this paper, we consider the max-min rate problem in a cell-free massive MIMO system. We propose, design and implement a DNN that learns the user power allocations from channel statistics to achieve max-min user fairness in an unsupervised manner. The method consists of an offline model training stage and an online prediction stage.
• In contrast to previous work on supervised learning-based power control, we introduce unsupervised learning for cell-free massive MIMO power control.
It does not require the optimal power allocations to be known during model training as in supervised learning, which makes the data preparation and model training simpler, more practical and flexible, since the DNN can easily be retrained in a changing environment over time.
• Furthermore, we introduce an online training stage to improve the performance. The model is customized and fine-tuned for each channel realization during the online implementation in order to improve the minimum user rate.
• Simulation results show that the proposed DNN achieves performance close to the conventional optimization-based max-min power control, with a significantly lower time-complexity. The performance-complexity trade-off of the proposed DL-based approach makes it a potential candidate for practical implementation.

II. SYSTEM MODEL
We consider a cell-free massive MIMO system with $M$ single-antenna APs and $K$ single-antenna users randomly distributed in a $D \times D$ geographic area. The APs are connected to a CPU via backhaul connections. The channel coefficient between the $k$-th user and the $m$-th AP is modeled as $g_{mk} = \sqrt{\beta_{mk}}\, h_{mk}$ [5]. Here, $\beta_{mk}$ is the large-scale fading consisting of pathloss and shadowing, and $h_{mk} \sim \mathcal{CN}(0, 1)$ represents the small-scale fading between the $k$-th user and the $m$-th AP. The uplink of the network is considered, which consists of pilot transmission, channel estimation, and uplink data transmission phases.

A. Pilot Transmission and Channel Estimation
Initially, all the users undergo a pilot transmission phase in order to estimate the uplink channel coefficients. During this stage, all $K$ users simultaneously transmit their pilot sequences of length $\tau$ symbols to the APs. Let $\sqrt{\tau}\,\boldsymbol{\phi}_k \in \mathbb{C}^{\tau \times 1}$ be the pilot sequence assigned to the $k$-th user, with $\|\boldsymbol{\phi}_k\|^2 = 1$. The received signal at the $m$-th AP is then given by
$$\mathbf{y}_{p,m} = \sqrt{\tau \rho_p} \sum_{k=1}^{K} g_{mk}\, \boldsymbol{\phi}_k + \mathbf{w}_{p,m}, \quad (1)$$
where $\mathbf{w}_{p,m} \in \mathbb{C}^{\tau \times 1}$ is the additive noise at the $m$-th AP with i.i.d. $\mathcal{CN}(0, 1)$ elements. Then, the $m$-th AP estimates the channel $g_{mk}, \forall k$, by projecting the received signal $\mathbf{y}_{p,m}$ onto the pilot sequence $\boldsymbol{\phi}_k^H$ as $\tilde{y}_{p,mk} = \boldsymbol{\phi}_k^H \mathbf{y}_{p,m}$ [5]. Thus,
$$\tilde{y}_{p,mk} = \sqrt{\tau \rho_p} \Big( g_{mk} + \sum_{k' \neq k} g_{mk'}\, \boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'} \Big) + \boldsymbol{\phi}_k^H \mathbf{w}_{p,m}, \quad (2)$$
and the linear minimum mean-squared error (MMSE) estimate of $g_{mk}$ given $\tilde{y}_{p,mk}$ is
$$\hat{g}_{mk} = \frac{\mathbb{E}\{\tilde{y}_{p,mk}^{*}\, g_{mk}\}}{\mathbb{E}\{|\tilde{y}_{p,mk}|^2\}}\, \tilde{y}_{p,mk} = c_{mk}\, \tilde{y}_{p,mk}, \quad (3)$$
where $c_{mk}$ is obtained as [5]
$$c_{mk} = \frac{\sqrt{\tau \rho_p}\, \beta_{mk}}{\tau \rho_p \sum_{k'=1}^{K} \beta_{mk'}\, |\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'}|^2 + 1}. \quad (4)$$

B. Uplink Data Transmission
After the training phase, the actual uplink data transmission begins, where all the users simultaneously send their signals to the APs. Let $x_k = \sqrt{\rho q_k}\, s_k$ be the transmit signal from the $k$-th user, where $s_k$ is the transmit symbol with $\mathbb{E}\{|s_k|^2\} = 1$. The normalized uplink SNR is denoted by $\rho$, and $q_k$ is the power control coefficient of the $k$-th user, where $0 \leq q_k \leq 1$. The received signal at the $m$-th AP from all the users is given by
$$y_m = \sum_{k=1}^{K} g_{mk}\, x_k + w_m = \sqrt{\rho} \sum_{k=1}^{K} g_{mk} \sqrt{q_k}\, s_k + w_m, \quad (5)$$
where $w_m \sim \mathcal{CN}(0, 1)$ is the additive noise at the $m$-th AP. Then matched filtering is done at each AP using the locally obtained channel estimate $\hat{g}_{mk}$, and the scaled received signals are sent to the CPU for joint detection. The aggregated received signal $r_k$ in (6) is used at the CPU to detect $s_k$. We assume the large-scale fading $\beta_{mk}$ to be known [5].
$$r_k = \sqrt{\rho} \sum_{k'=1}^{K} \sum_{m=1}^{M} \hat{g}_{mk}^{*}\, g_{mk'} \sqrt{q_{k'}}\, s_{k'} + \sum_{m=1}^{M} \hat{g}_{mk}^{*}\, w_m. \quad (6)$$

C. Max-Min User Rate Scheme
In this work, we consider a max-min user fairness scheme where the objective is to maximize the minimum user rate by optimizing the user power allocations. We assume that only the knowledge of channel statistics is used at the CPU when deriving the achievable rate of each user. The uplink rate of the $k$-th user can be derived as (7) [5], where $\gamma_{mk} = \mathbb{E}\{|\hat{g}_{mk}|^2\} = \sqrt{\tau \rho_p}\, \beta_{mk}\, c_{mk}$. Therefore, we can see that the achievable rate in (7) is a function of only the large-scale fading coefficients $\beta_{mk}$ and the user transmit power coefficients $q_k$, and does not involve instantaneous channel values. The max-min rate problem can then be formulated as
$$\mathrm{P1:} \quad \max_{\{q_k\}} \ \min_{k = 1, 2, \dots, K} R_k^{\mathrm{UP}}, \quad \text{s.t.} \quad 0 \leq q_k \leq 1, \ k = 1, 2, \dots, K. \quad (8)$$
In [5], an algorithm using bisection and solving a sequence of linear feasibility problems is proposed to solve problem P1. In [15], a less complex algorithm is proposed by reformulating the original problem as a geometric programming (GP) problem and solving it with convex optimization software to obtain the optimum power allocations. Instead of using such an analytical method, we propose a data-driven approach to learn the optimum solutions of the max-min problem.

III. DEEP LEARNING-BASED POWER CONTROL
$$R_k^{\mathrm{UP}} = \log_2\!\left( 1 + \frac{\rho\, q_k \left( \sum_{m=1}^{M} \gamma_{mk} \right)^2}{\rho \sum_{k' \neq k} q_{k'} \left( \sum_{m=1}^{M} \gamma_{mk} \frac{\beta_{mk'}}{\beta_{mk}} \right)^2 |\boldsymbol{\phi}_k^H \boldsymbol{\phi}_{k'}|^2 + \rho \sum_{k'=1}^{K} q_{k'} \sum_{m=1}^{M} \gamma_{mk}\, \beta_{mk'} + \sum_{m=1}^{M} \gamma_{mk}} \right). \quad (7)$$

As mentioned earlier, we are interested in an unsupervised learning-based approach which does not require optimal ground-truth outputs for model training. For the considered power control problem, a supervised learning approach complicates data preparation and model training due to the cost of generating ground-truth power allocations using an optimization algorithm, especially when $M$ and $K$ are large. In contrast, an unsupervised DNN can be directly fed with inputs in order to learn the optimum solutions by minimizing a given loss function during the training process. Such an approach is more flexible and adaptable for practical implementation in a changing wireless communications environment.

We propose a feedforward DNN, illustrated in Fig. 1, to address the power control problem P1 in (8). The network consists of $L + 1$ layers that are sequentially connected to produce the mapping $f(\mathbf{x}_0; \theta): \mathbb{R}^{N_0 \times 1} \to \mathbb{R}^{N_L \times 1}$ of an input vector $\mathbf{x}_0 \in \mathbb{R}^{N_0 \times 1}$ to an output vector $\mathbf{x}_L \in \mathbb{R}^{N_L \times 1}$ through $L$ iterative processing steps
$$\mathbf{x}_l = f_l(\mathbf{x}_{l-1}; \theta_l), \quad l = 1, 2, \dots, L. \quad (9)$$
Here $f_l(\mathbf{x}_{l-1}; \theta_l): \mathbb{R}^{N_{l-1} \times 1} \to \mathbb{R}^{N_l \times 1}$ is the mapping performed by the $l$-th layer, which depends on the output vector $\mathbf{x}_{l-1}$ of the previous layer and the set of learnable parameters $\theta_l$ in the $l$-th layer. The set $\theta = \{\theta_1, \theta_2, \dots, \theta_L\}$ denotes all the parameters of the network, which are learnt through model training.

The input to the network is the set of large-scale channel coefficients $\beta_{mk}$ of all the APs and users, aligned as a column vector denoted by $\boldsymbol{\beta}$ with dimension $N_0 = MK$. Thus, the DNN input is $\mathbf{x}_0 = \boldsymbol{\beta} \in \mathbb{R}^{MK \times 1}$. The model outputs the estimated power allocation vector $\mathbf{q} = [q_1, q_2, \dots, q_K]^T \in \mathcal{A}^{K \times 1}$, where $\mathcal{A} = \{a \in \mathbb{R}: 0 \leq a \leq 1\}$.
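As an illustration, this input-output mapping can be sketched as a plain NumPy forward pass. The helper names (`init_params`, `forward`) and the random initialization are our own, and the layer widths passed to `init_params` are placeholders; the sigmoid on the last layer is what keeps every output coefficient inside $[0, 1]$:

```python
import numpy as np

def elu(x):
    # exponential linear unit; clip the exponent argument to avoid overflow
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(layer_dims, rng):
    """Random weights and biases for a fully connected network,
    e.g. layer_dims = [M*K, K, M, K] (illustrative widths)."""
    return [(rng.standard_normal((layer_dims[l + 1], layer_dims[l]))
             / np.sqrt(layer_dims[l]),
             np.zeros(layer_dims[l + 1]))
            for l in range(len(layer_dims) - 1)]

def forward(beta_vec, params):
    """Map the large-scale fading vector x0 = beta (shape (M*K,)) to
    q in [0,1]^K, applying (9)-(10): eLU hidden layers, sigmoid output."""
    x = beta_vec
    for l, (W, b) in enumerate(params):
        z = W @ x + b
        x = sigmoid(z) if l == len(params) - 1 else elu(z)
    return x
```

With randomly initialized `params` the output is merely a feasible (not optimized) power allocation; training adjusts `params` to minimize the loss defined later in this section.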
We have implemented a fully connected neural network with dense layers, where $f_l(\mathbf{x}_{l-1}; \theta_l)$ has the form
$$f_l(\mathbf{x}_{l-1}; \theta_l) = \sigma(\mathbf{W}_l \mathbf{x}_{l-1} + \mathbf{b}_l), \quad l = 1, 2, \dots, L, \quad (10)$$
where $\mathbf{W}_l \in \mathbb{R}^{N_l \times N_{l-1}}$ is the weight matrix and $\mathbf{b}_l \in \mathbb{R}^{N_l \times 1}$ is the bias vector. The set of learnable parameters is then $\theta_l = \{\mathbf{W}_l, \mathbf{b}_l\}$. In (10), $\sigma(\cdot)$ is an activation function, such as ReLU, eLU or Sigmoid, which introduces non-linearity to the network. For the $L$ hidden layers of the model, we have used the eLU (exponential linear unit) activation function. For the output layer, the Sigmoid activation function is used to guarantee that the outputs are in the range $[0, 1]$, adhering to the transmit power constraints $0 \leq q_k \leq 1, \forall k$.

We implemented a DNN with 4 layers ($L = 3$) which has $\{MK, K, M, K\}$ neurons in the layers and $\{\mathrm{eLU}, \mathrm{eLU}, \mathrm{eLU}, \mathrm{Sigmoid}\}$ activations, respectively. This DNN has a considerably simpler structure than the DNN proposed in [12] in terms of network dimensions. Therefore, it has a lower training complexity and can produce outputs with a lower online complexity.

Given that the goal of problem P1 in (8) is to maximize the minimum user rate, we apply the following loss function for model training:
$$\mathrm{loss} = -\mathbb{E}_{\boldsymbol{\beta}}\left[ R(\boldsymbol{\beta}, \theta)_{\min} \right], \quad (11)$$
where $\theta$ denotes the set of trainable parameters of the model and $R(\boldsymbol{\beta}, \theta)_{\min} = \min_{k=1,2,\dots,K} R(\boldsymbol{\beta}, \theta)_k^{\mathrm{UP}}$ is the minimum user rate among all $K$ users for a given channel realization with large-scale fading $\boldsymbol{\beta}$ and given $\theta$. For each user $k$, $R(\boldsymbol{\beta}, \theta)_k^{\mathrm{UP}}$ is calculated from (7) using $\boldsymbol{\beta}$ and $\mathbf{q}(\theta)$, where $\mathbf{q}(\theta)$ is the DNN output for the given $\theta$. This loss function is differentiable with respect to $\theta$, which allows training the network via stochastic gradient descent (SGD). We adopt a mini-batch gradient descent approach to reduce the complexity of SGD.
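To make the loss concrete, the following sketch evaluates the rate expression (7) and the per-realization term inside (11). It assumes orthogonal pilots with $\tau = K$ (so the pilot contamination term of (7) vanishes) and defaults $\rho = \rho_p = 1$; the function names are our own:

```python
import numpy as np

def user_rates(beta, q, rho=1.0, rho_p=1.0):
    """Per-user uplink rates of (7), assuming orthogonal pilots (tau = K).

    beta : (M, K) large-scale fading coefficients
    q    : (K,)   power control coefficients in [0, 1]
    """
    M, K = beta.shape
    tau = K
    # (4) under orthogonal pilots, then gamma_mk = sqrt(tau*rho_p)*beta_mk*c_mk
    c = np.sqrt(tau * rho_p) * beta / (tau * rho_p * beta + 1.0)
    gamma = np.sqrt(tau * rho_p) * beta * c
    g = gamma.sum(axis=0)                    # sum_m gamma_mk, shape (K,)
    # multi-user interference + noise; the first denominator term of (7)
    # is zero because the pilots are orthogonal
    denom = rho * (gamma.T @ beta) @ q + g   # (K,)
    return np.log2(1.0 + rho * q * g ** 2 / denom)

def loss_single(beta, q):
    # the per-realization term inside the loss (11): minus the minimum rate
    return -user_rates(beta, q).min()
```

Averaging `loss_single` over a mini-batch of sampled realizations gives the Monte-Carlo training loss; in the actual implementation this computation is expressed in TensorFlow so that gradients with respect to $\theta$ flow through $\mathbf{q}(\theta)$.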
In each iteration of the training, a set of channel realizations is generated from the channel distribution. Thus, the training loss is approximated as
$$\mathrm{loss} \approx -\frac{1}{|\mathcal{B}|} \sum_{\boldsymbol{\beta} \in \mathcal{B}} R(\boldsymbol{\beta}, \theta)_{\min}, \quad (12)$$
Fig. 1. Model layout of the power control DNN: input layer, eLU-activated hidden layers, and sigmoid-activated output layer.

where $\mathcal{B}$ denotes the set of channel realizations in each iteration and $|\mathcal{B}|$ is the mini-batch size. Thus, during training, the model learns the parameters $\theta$ that minimize the loss in (12), which maximizes the minimum user rate, as expected.

IV. SIMULATIONS AND RESULTS
In this section, we present numerical simulations that evaluate the performance of the proposed DL-based max-min fairness scheme in comparison with existing optimization-based techniques.
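As a concrete picture of the kind of simulation setup evaluated below, the wrap-around geometry and the shadow-fading model can be sketched as follows. This is an illustrative stand-in only: it uses a single-slope log-distance pathloss instead of the three-slope model of [5], and the constants (`PL0`, the exponent 3.5, the 8 dB shadowing default) are hypothetical:

```python
import numpy as np

def torus_dist(ap_xy, ue_xy, D=1000.0):
    """Pairwise AP-user distances on a D x D area wrapped at the edges,
    emulating a boundary-free (infinite) network."""
    diff = np.abs(ap_xy[:, None, :] - ue_xy[None, :, :])  # (M, K, 2)
    diff = np.minimum(diff, D - diff)                     # wrap-around
    return np.sqrt((diff ** 2).sum(axis=-1))              # (M, K)

def large_scale_fading(ap_xy, ue_xy, sigma_sh_db=8.0, D=1000.0, rng=None):
    """beta_mk = PL_mk * 10^(sigma_sh * z_mk / 10), in the spirit of the
    paper's model; single-slope pathloss used here only for illustration."""
    rng = np.random.default_rng() if rng is None else rng
    d = np.maximum(torus_dist(ap_xy, ue_xy, D), 1.0)      # avoid d -> 0
    PL0 = 1e-3                                            # hypothetical reference gain
    pathloss = PL0 * d ** (-3.5)
    z = rng.standard_normal(d.shape)                      # z_mk ~ N(0, 1)
    return pathloss * 10.0 ** (sigma_sh_db * z / 10.0)

# one random layout: M APs and K users uniform on the area
rng = np.random.default_rng(0)
M, K, D = 30, 5, 1000.0
beta = large_scale_fading(rng.uniform(0, D, (M, 2)),
                          rng.uniform(0, D, (K, 2)), rng=rng)
```

Repeating the last three lines per sample is one way to generate the kind of dataset of `beta` matrices that the DNN is trained and tested on.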
A. Simulation Parameters
We consider a cell-free massive MIMO system in a 1 × 1 km simulation area with different numbers of APs and users. This square area is wrapped around at the edges to avoid boundary effects and to emulate a cell-free network with an infinite area [5]. We refer to [5] and Table I for more details about the simulation parameters.

The large-scale fading coefficient $\beta_{mk}$ from the $k$-th user to the $m$-th AP is given by [5]
$$\beta_{mk} = \mathrm{PL}_{mk} \cdot 10^{\frac{\sigma_{sh} z_{mk}}{10}}, \quad (13)$$
where $\mathrm{PL}_{mk}$ is the pathloss from the $k$-th user to the $m$-th AP, calculated using the three-slope model of [5] with the parameters in Table I, and $10^{\sigma_{sh} z_{mk}/10}$ is the shadow fading with standard deviation $\sigma_{sh}$ and $z_{mk} \sim \mathcal{N}(0, 1)$.

When evaluating the spectral efficiencies, per-user net throughputs are considered, accounting for the channel estimation overhead as well:
$$R_k^{\mathrm{net}} = B\, \frac{1 - \tau/\tau_c}{2}\, R_k^{\mathrm{UP}}, \quad (14)$$
where $R_k^{\mathrm{UP}}$ is the per-user rate in (7), $B$ is the spectral bandwidth, and $\tau_c$ is the coherence interval in samples. We have used $\tau_c = 200$, corresponding to a coherence bandwidth of 200 kHz and a coherence time of 1 ms. Orthogonal pilot assignment is considered in all simulations, with $\tau = K$, so that each user is assigned a unique orthogonal pilot sequence.

TABLE I
SIMULATION PARAMETERS

Parameter                          Value
Carrier frequency (f)              1.9 GHz
Simulation area length (D)         1 km
AP antenna height (h_AP)           15 m
User antenna height (h_u)          1.65 m
d_0, d_1                           10, 50 m
Bandwidth (B)                      20 MHz
Noise figure                       9 dB
Shadowing std. deviation (σ_sh)    see [5]
ρ̄_p, ρ̄                             100, 100 mW

B. Simulation Setup
DNN model implementation, training and testing are done in TensorFlow [16]. The optimization-based baseline implementation is done in Matlab using the CVX convex optimization software package [17], [18]. Both implementations run on the same platform, a 4-core Intel(R) Core(TM) i5-8250U CPU at 1.6 GHz.

We used three different datasets for DNN model training, validation and testing, with 1000 validation and 1000 test samples. For each sample, different AP and user distributions were considered, with randomly generated large-scale fading channel coefficients. The input to the DNN is normalized using the training dataset mean and variance. The network is trained for 10000 iterations, using mini-batch gradient descent along with the ADAM optimizer with learning rate 0.01. In each iteration, a random mini-batch of size 100 is selected from the training dataset. Every 50 iterations, the validation dataset is used to evaluate the model, and the model parameters corresponding to the minimum validation loss are preserved throughout the training. After training, performance is evaluated on the test dataset, where the trained model produces the power allocations used to calculate the per-user rates via (7).

The GP optimization algorithm proposed in [15] is used as the baseline for performance comparison and solved using CVX. Maximum power transmission, where all users transmit with full power (i.e., $q_k = 1, \forall k$), is also considered. CVX-based max-min power control results and maximum power transmission results for the test dataset are denoted as "baseline" and "maximum-power", respectively, in the results section.

C. Results and Discussion
For performance comparison, we consider two scenarios, with $M = 30, K = 5$ and $M = 50, K = 10$ APs and users randomly distributed over the 1 × 1 km simulation area. The obtained cumulative distribution curves are presented in Fig. 2. The DNN performs close to the baseline in the lower net throughput range, and both the baseline and the DNN have almost the same 95%-likely per-user net throughput in both network configurations. However, the difference between the baseline and DNN performance suggests that the DNN may have learnt a sub-optimal power allocation scheme. In order to assess how well the DNN has achieved the desired objective of max-min user fairness, we have also evaluated the minimum user rate performance, shown in Fig. 3. Even though the DNN has a lower minimum user rate than the optimal baseline solution, it improves significantly over the maximum power scenario.

Fig. 2. Cumulative distribution of the per-user net throughput for M = 30, K = 5 (solid lines) and M = 50, K = 10 (dashed lines).

Fig. 3. Cumulative distribution of the minimum user rate for M = 30, K = 5 (solid lines) and M = 50, K = 10 (dashed lines).

Furthermore, exploiting the unsupervised learning capability of the model, we have introduced online training to improve the performance of the DNN. Here, the originally trained model is retrained for 100 iterations with learning rate 0.01 for each input sample in the test set. Online training allows further customization and fine-tuning of the model parameters based on the large-scale channel inputs of each channel realization, further improving the minimum user rate. Fig. 4 and Fig. 5 show the improved per-user rate and minimum user rate cumulative distributions with online training. It can be seen that the minimum user rates are significantly improved with online training, bringing the worst-case user performance much closer to the optimal. The average minimum user rates obtained over the training and test datasets for the baseline and DNN implementations in all simulation scenarios are summarized in Table II.

Fig. 4. Cumulative distribution of the per-user net throughput for M = 30, K = 5, with online training.

Fig. 5. Cumulative distribution of the minimum user rate for M = 30, K = 5, with online training.

We also evaluated the performance of the proposed DNN model with a fixed AP setup and moving users, for $M = 30, K = 5$. The same DNN architecture and training process is used, but with a different dataset. Here, we consider APs located in a regular grid. Each user starts at a random initial position, uniformly distributed in the coverage area, and moves in a random direction (left, right, up or down) at a random speed uniformly distributed between 0 and 20 m/s. The moving speed and direction change every 5 s. If a user reaches the boundary of the coverage area, the direction is reversed so that it remains inside the area. A dataset of 12000 samples, corresponding to 12000 s, is generated in this manner and divided into 10000, 1000 and 1000 samples used as the training, validation and test sets, respectively. Fig. 6 and Fig. 7 show the per-user rate and minimum user rate cumulative distributions obtained for this setup. The DNN implementations show better results than in the earlier random AP/user scenario, achieving near-optimal minimum user rate performance with online training. Note that this performance was achieved using a training set of only 10000 samples and less than 10 minutes of offline model training. This shows the potential of the proposed unsupervised DNN for quick and easy deployment in a practical setup, owing to its low-complexity offline training and online processing and its comparable performance.

Fig. 6. Cumulative distribution of the per-user net throughput for M = 30, K = 5, with fixed APs and moving users.

Fig. 7. Cumulative distribution of the minimum user rate for M = 30, K = 5, with fixed APs and moving users.

V. COMPLEXITY ANALYSIS
Here we compare the computational complexity of the baseline methods and the proposed DNN implementation for solving problem P1. The bisection-based algorithm proposed in [5] has a complexity of $\log_2\!\big((t_{\max} - t_{\min})/\epsilon\big)$ times the complexity of the linear feasibility problem solved at each bisection step [15]. The GP-based low-complexity approach proposed in [15] has polynomial complexity in $K$. The newly proposed DNN for approximating solutions of problem P1 has a complexity of $\mathcal{O}(K^2 M)$, considering the dimensions of the proposed DNN model, since the first layer's $MK \times K$ weight matrix dominates the forward pass.

The recorded CPU times for the CVX solver and for the DNN with and without online training, to produce outputs for 100 channel realizations with $M = 30, K = 5$, are 46.47 s, 10.61 s and 0.12 s, respectively. For $M = 50, K = 10$, the respective computation times are 78.95 s, 12.87 s and 0.17 s. Thus, the DNN is around 400 times faster than the baseline. Furthermore, the DNN with online training, which has near-optimal rate performance, is still around 4-6 times faster than the baseline. The computational complexity of the DNNs can be significantly reduced further by GPU-aided parallel processing, which is commonly used in DL implementations. Moreover, the fast processing of the DNN implementations becomes more significant with increasing network dimensions $M$ and $K$.

TABLE II
AVERAGE MINIMUM USER RATE COMPARISON FOR DNN AND BASELINE

System setup      Baseline         DNN              DNN with online training
M = 30, K = 5     Train: 1.0218    Train: 0.8747    Train: 0.8747
                  Test: 1.0221     Test: 0.8653     Test: 0.9817
M = 50, K = 10    Train: 0.9897    Train: 0.7966    Train: 0.7966
                  Test: 0.9940     Test: 0.7816     Test: 0.9199
M = 30, K = 5     Train: 0.9889    Train: 0.8842    Train: 0.8842
(moving users)    Test: 0.9854     Test: 0.8473     Test: 0.9664

VI. CONCLUSION
In this study, we have proposed an unsupervised DL-based algorithm for max-min power control in the uplink of a cell-free massive MIMO system. The proposed DNN produces sub-optimal power allocations whose per-user net throughput performance is close to that of the GP-based optimal solution. Performing online training to customize the learnt model parameters for each channel realization further improves the max-min performance, yielding near-optimal per-user and minimum user rate performance at the expense of processing complexity. Nevertheless, the proposed DNN implementations are much less computationally complex than the GP-based optimization algorithm, especially for larger AP and user configurations. Furthermore, the proposed unsupervised learning approach has a lower training complexity than the supervised learning implementation in [12], and also a much lower online complexity due to its simpler network structure.

While this is the first time unsupervised learning-based power control has been implemented for cell-free massive MIMO, it should be noted that we have analyzed the performance for a fairly simple network configuration with fewer APs and users than in a practical network setup. Therefore, further investigations are needed to identify the best DNN architectures and hyper-parameters for the proposed unsupervised learning approach in such complex setups. Nevertheless, DL-based power control in cell-free massive MIMO has research potential, especially in complex scenarios such as joint AP selection/user assignment and power control, where conventional approaches might be sub-optimal.

REFERENCES

[1] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, "Five disruptive technology directions for 5G," IEEE Communications Magazine, vol. 52, no. 2, pp. 74-80, 2014.
[2] T. L. Marzetta, "Noncooperative cellular wireless with unlimited numbers of base station antennas," IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590-3600, 2010.
[3] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, "Scaling up MIMO: Opportunities and challenges with very large arrays," IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40-60, 2013.
[4] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, "Cell-free massive MIMO: Uniformly great service for everyone," in , 2015, pp. 201-205.
[5] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, "Cell-free massive MIMO versus small cells," IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1834-1850, 2017.
[6] E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, "Precoding and power optimization in cell-free massive MIMO systems," IEEE Transactions on Wireless Communications, vol. 16, no. 7, pp. 4445-4459, 2017.
[7] E. Björnson and L. Sanguinetti, "Making cell-free massive MIMO competitive with MMSE processing and centralized implementation," IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 77-90, 2020.
[8] E. Björnson and L. Sanguinetti, "Scalable cell-free massive MIMO systems," IEEE Transactions on Communications, vol. 68, no. 7, pp. 4247-4261, 2020.
[9] Y. Zhao, I. G. Niemegeers, and S. H. De Groot, "Power allocation in cell-free massive MIMO: A deep learning method," IEEE Access, vol. 8, pp. 87185-87200, 2020.
[10] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.
[11] L. Sanguinetti, A. Zappone, and M. Debbah, "Deep learning power allocation in massive MIMO," in , 2018, pp. 1257-1261.
[12] C. D'Andrea, A. Zappone, S. Buzzi, and M. Debbah, "Uplink power control in cell-free massive MIMO via deep learning," in , 2019, pp. 554-558.
[13] F. Liang, C. Shen, W. Yu, and F. Wu, "Towards optimal power control via ensembling deep neural networks," IEEE Transactions on Communications, vol. 68, no. 3, pp. 1760-1776, 2020.
[14] T. Van Chien, T. Nguyen Canh, E. Björnson, and E. G. Larsson, "Power control in cellular massive MIMO with varying user activity: A deep learning solution," IEEE Transactions on Wireless Communications, vol. 19, no. 9, pp. 5732-5748, 2020.
[15] M. Bashar, K. Cumanan, A. G. Burr, M. Debbah, and H. Q. Ngo, "On the uplink max-min SINR of cell-free massive MIMO systems," IEEE Transactions on Wireless Communications, vol. 18, no. 4, pp. 2021-2036, 2019.
[16] M. Abadi et al.