An Efficient Active Set Algorithm for Covariance Based Joint Data and Activity Detection for Massive Random Access with Massive MIMO
Ziyue Wang⋆,§, Zhilin Chen†, Ya-Feng Liu§, Foad Sohrabi†, and Wei Yu†
⋆ School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
† Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada
§ LSEC, ICMSEC, AMSS, Chinese Academy of Sciences, Beijing, China
Email: [email protected], {zchen, fsohrabi, weiyu}@ece.utoronto.ca, yafl[email protected]
ABSTRACT
This paper proposes a computationally efficient algorithm to solve the joint data and activity detection problem for massive random access with massive multiple-input multiple-output (MIMO). The base station (BS) identifies the active devices and recovers their data by detecting the transmitted preassigned nonorthogonal signature sequences. This paper employs a covariance based approach that formulates the detection problem as a maximum likelihood estimation (MLE) problem. To solve the problem efficiently, this paper designs a novel iterative algorithm with low complexity in the regime where the device activity pattern is sparse, a key feature that existing algorithmic designs have not previously exploited for reducing complexity. Specifically, at each iteration, the proposed algorithm focuses on only a small subset of all potential sequences, namely the active set, which contains a few most likely active sequences (i.e., the sequences transmitted by the active devices), and performs the detection for the sequences in the active set. The active set is carefully selected at each iteration based on the current detection result and the first-order optimality condition of the MLE problem. Simulation results show that the proposed active set algorithm enjoys significantly better computational efficiency (in terms of the CPU time) than the state-of-the-art algorithms.
Index Terms: Active set, joint data and activity detection, massive MIMO, massive random access.
1. INTRODUCTION
Massive machine-type communication (mMTC) is a main use case in the fifth generation (5G) cellular systems [1]. A challenging task in mMTC is uncoordinated random access, in which a large number of sporadically active devices wish to send small data payloads to the base station (BS) in the uplink [2]. To meet the low-latency requirement in mMTC, the grant-free random access scheme could be a promising solution [3, 4], in which each device is preassigned multiple signature sequences from a large set of nonorthogonal sequences, and an active device selects one of its assigned sequences to transmit. The data and the device identity are embedded in the sequence selection. The BS then detects the active devices and decodes their data by detecting which sequences are transmitted.

By exploiting the sporadic device traffic, the joint data and activity detection problem has been formulated as a compressed sensing problem [4], in which the data and the device activity are recovered along with the instantaneous channel state information (CSI) via the approximate message passing (AMP) algorithm. Similar methods have also been used for the scenario where each device is associated with only one sequence for the purpose of device activity detection [5, 6, 7]. However, if the CSI is not needed, it is actually possible to recover the data and activities without recovering the channel coefficients using a covariance based approach [8, 9], which outperforms the AMP method, especially in massive multiple-input multiple-output (MIMO) systems. In the covariance based method, the detection problem is formulated as a maximum likelihood estimation (MLE) problem, in which the channel coefficients are treated as random samples and averaged out in computing the covariance.
This covariance based method was first suggested in [10] for device activity detection, and it has also been used for a few related data/activity detection problems, e.g., data decoding for unsourced random access [11], cooperative activity detection in cell-free systems [12], and activity detection with interference [13].

The coordinate descent (CD) algorithm, which iteratively updates the sequence selection for each device, is commonly used to solve the detection problem in the covariance based approach and achieves excellent detection performance; see [8, 10, 13] for more details. A possible reason for the popularity of the CD algorithm is that its subproblem (i.e., the original problem with respect to only one variable) admits a closed-form solution [10], which makes it easy to implement. To further speed up the convergence of the CD algorithm, a new coordinate sampling strategy is proposed in [14]. Other algorithms for solving the detection problem include the expectation maximization/minimization (EM) algorithm (i.e., sparse Bayesian learning) [15] and the SPICE algorithm [16]. However, none of the above-mentioned solutions takes advantage of the sparsity of the true solution of the detection problem to lower its algorithmic complexity, so they become less computationally efficient when the problem dimension is huge, which is the case in mMTC.

In this paper, we propose a computationally efficient algorithm that carefully exploits the sparsity of the true solution to solve the joint data and activity detection problem in the covariance based approach. Specifically, we propose an iterative algorithm that attacks the original large-scale problem by solving a sequence of small-size problems. At each iteration we focus on only a small subset of all sequences, termed the active set, which contains only the most likely active sequences and can be seen as an approximation of the set of active sequences.
We perform joint data and activity detection for only the sequences in the active set using a low-complexity spectral projected gradient (PG) algorithm [17]. We carefully update the active set at each iteration based on the current detection result and the first-order optimality condition of the joint detection problem. We also establish the convergence of the proposed active set algorithm. Simulation results show that, compared with the commonly used CD algorithm in the covariance based approach, the proposed active set algorithm has much higher computational efficiency (in terms of the CPU time).

(Footnote: Active sequences in this paper refer to the sequences transmitted by all active devices in the joint data and activity detection problem.)
2. SYSTEM MODEL AND PROBLEM FORMULATION
2.1. System Model
Consider an uplink single-cell system with one BS equipped with $M \gg 1$ antennas and $N$ devices, each equipped with a single antenna. Assume a quasi-static narrow-band channel model, where the wireless channels remain unchanged within each transmission block but may vary over different blocks. Let $\sqrt{g_n}\,\mathbf{h}_n \in \mathbb{C}^{M \times 1}$ denote the channel vector from device $n$ to the BS, where $g_n \geq 0$ is the large-scale fading component (depending on the device's location), and $\mathbf{h}_n \in \mathbb{C}^{M \times 1}$ is the Rayleigh fading component following the i.i.d. complex Gaussian distribution.

In each coherence block, only $K \ll N$ devices are active (due to the sporadic traffic), and each active device wishes to transmit $J$ bits of data to the BS, where $J$ is a small value in the mMTC scenario. Assume that each device $n$ has a unique signature sequence set $\mathcal{S}_n = \{\mathbf{s}_{n,1}, \mathbf{s}_{n,2}, \ldots, \mathbf{s}_{n,Q}\}$, where $\mathbf{s}_{n,q} \in \mathbb{C}^{L \times 1}$, $1 \leq q \leq Q \triangleq 2^J$, and $L$ is the signature sequence length. When device $n$ is active and needs to send $J$ bits of data, it selects one sequence from $\mathcal{S}_n$ to transmit. Finally, let $\chi_{n,q} \in \{0, 1\}$ indicate whether or not sequence $q$ of device $n$ (i.e., $\mathbf{s}_{n,q}$) is transmitted. Since at most one sequence is selected by each device, $\chi_{n,q}$ satisfies $\sum_{q=1}^{Q} \chi_{n,q} \in \{0, 1\}$, where $\sum_{q=1}^{Q} \chi_{n,q} = 0$ indicates that device $n$ is inactive, and $\sum_{q=1}^{Q} \chi_{n,q} = 1$ indicates that device $n$ is active.

Assume that the sequences transmitted by active devices are perfectly synchronized. Then the received signal $\mathbf{Y} \in \mathbb{C}^{L \times M}$ at the BS, which is a superposition of the transmitted signals from all active devices, can be expressed as

$$\mathbf{Y} = \sum_{n=1}^{N} \sum_{q=1}^{Q} \chi_{n,q}\, \mathbf{s}_{n,q} \sqrt{g_n}\, \mathbf{h}_n^T + \mathbf{W}, \tag{1}$$

where $\mathbf{W} \in \mathbb{C}^{L \times M}$ is the effective i.i.d. Gaussian noise whose variance $\sigma_w^2$ is the background noise power normalized by the device transmit power.

To obtain a more compact expression of the received signal in (1), we define $\mathbf{S}_n = [\mathbf{s}_{n,1}, \ldots, \mathbf{s}_{n,Q}] \in \mathbb{C}^{L \times Q}$, $\mathbf{D}_n = \sqrt{g_n}\, \mathrm{diag}\{\chi_{n,1}, \ldots, \chi_{n,Q}\} \in \mathbb{C}^{Q \times Q}$, and $\mathbf{H}_n = [\mathbf{h}_n, \ldots, \mathbf{h}_n]^T \in \mathbb{C}^{Q \times M}$ for all $n$. Based on them, we further define $\mathbf{S} = [\mathbf{S}_1, \ldots, \mathbf{S}_N] \in \mathbb{C}^{L \times NQ}$, $\boldsymbol{\Gamma}^{1/2} = \mathrm{diag}\{\mathbf{D}_1, \ldots, \mathbf{D}_N\} \in \mathbb{C}^{NQ \times NQ}$, and $\mathbf{H} = [\mathbf{H}_1^T, \ldots, \mathbf{H}_N^T]^T \in \mathbb{C}^{NQ \times M}$. Then, the received signal in (1) can be compactly expressed as

$$\mathbf{Y} = \mathbf{S} \boldsymbol{\Gamma}^{1/2} \mathbf{H} + \mathbf{W}. \tag{2}$$

Let $\boldsymbol{\gamma} \in \mathbb{C}^{NQ \times 1}$ denote the diagonal entries of $\boldsymbol{\Gamma}$, i.e., $\boldsymbol{\gamma} = [\boldsymbol{\gamma}_1^T, \ldots, \boldsymbol{\gamma}_N^T]^T$, where $\boldsymbol{\gamma}_n = [\gamma_{n,1}, \ldots, \gamma_{n,Q}]^T \in \mathbb{C}^{Q \times 1}$ with $\gamma_{n,q} = g_n \chi_{n,q}$. In the following, we will use $\boldsymbol{\gamma}$ and $\boldsymbol{\Gamma}$ interchangeably.

The joint activity and data detection problem is to detect the variables $\gamma_{n,q}$, which indicate both the activity of device $n$ and its data (if it is active), from the received signal $\mathbf{Y}$ based on the knowledge of the signature sequence matrix $\mathbf{S}$. Specifically, if $\gamma_{n,q} > 0$, then device $n$ is active and it transmits sequence $\mathbf{s}_{n,q}$; if $\gamma_{n,q} = 0$ for all $q$, then device $n$ is inactive.

As shown in [10, 8], the above joint activity and data detection problem can be mathematically formulated as an MLE problem. Specifically, it can be observed from (2) that, given $\boldsymbol{\gamma}$, the columns of $\mathbf{Y}$, denoted by $\mathbf{y}_m \in \mathbb{C}^{L \times 1}$, $1 \leq m \leq M$, can be seen as independent samples from a complex Gaussian distribution:

$$\mathbf{y}_m \sim \mathcal{CN}\left(\mathbf{0},\ \mathbf{S} \boldsymbol{\Gamma}^{1/2} \boldsymbol{\Lambda} \boldsymbol{\Gamma}^{1/2} \mathbf{S}^H + \sigma_w^2 \mathbf{I}\right), \tag{3}$$

where the covariance matrix is obtained by computing $\mathbb{E}[\mathbf{y}_m \mathbf{y}_m^H]$ based on (2), $\boldsymbol{\Lambda}$ is a block diagonal matrix with each block being the all-one matrix $\mathbf{E} \in \mathbb{R}^{Q \times Q}$, and $\mathbf{I}$ is an identity matrix. Since there is at most one non-zero entry in each diagonal block $\mathbf{D}_n$ of $\boldsymbol{\Gamma}^{1/2}$, the covariance matrix in (3) can be simplified as

$$\mathbf{S} \boldsymbol{\Gamma}^{1/2} \boldsymbol{\Lambda} \boldsymbol{\Gamma}^{1/2} \mathbf{S}^H + \sigma_w^2 \mathbf{I} = \mathbf{S} \boldsymbol{\Gamma} \mathbf{S}^H + \sigma_w^2 \mathbf{I}.$$

Given $\boldsymbol{\gamma}$, we have $p(\mathbf{Y} \mid \boldsymbol{\gamma}) = \prod_{m=1}^{M} p(\mathbf{y}_m \mid \boldsymbol{\gamma})$.
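Based on this and (3), the minimization of $-\frac{1}{M} \log p(\mathbf{Y} \mid \boldsymbol{\gamma})$, which is equivalent to the maximization of $p(\mathbf{Y} \mid \boldsymbol{\gamma})$, can be formulated as

$$\min_{\boldsymbol{\gamma}} \ \log \left| \mathbf{S} \boldsymbol{\Gamma} \mathbf{S}^H + \sigma_w^2 \mathbf{I} \right| + \mathrm{Tr}\left( \left( \mathbf{S} \boldsymbol{\Gamma} \mathbf{S}^H + \sigma_w^2 \mathbf{I} \right)^{-1} \hat{\boldsymbol{\Sigma}} \right) \tag{4a}$$

$$\text{s.t.} \ \boldsymbol{\gamma} \geq \mathbf{0}, \tag{4b}$$

where $\hat{\boldsymbol{\Sigma}} = \mathbf{Y} \mathbf{Y}^H / M$ is the sample covariance matrix computed by averaging over different antennas, and $\boldsymbol{\gamma} \geq \mathbf{0}$ is due to the fact that $\gamma_{n,q} = g_n \chi_{n,q} \geq 0$ for all $n$ and $q$. Since the objective function in problem (4) depends on $\mathbf{Y}$ only through the sample covariance matrix $\hat{\boldsymbol{\Sigma}}$, the approach of estimating the activity and associated data by solving problem (4) is called the covariance based approach. It is worth mentioning that problem (4) reduces to the activity detection problem in [10] if each device has only a single signature sequence (i.e., $J = 0$ and thus $Q = 1$).

Let $f(\boldsymbol{\gamma})$ denote the objective function of problem (4), and write $\boldsymbol{\Sigma} = \mathbf{S} \boldsymbol{\Gamma} \mathbf{S}^H + \sigma_w^2 \mathbf{I}$. Then, for any $q = 1, 2, \ldots, Q$ and $n = 1, 2, \ldots, N$, the gradient of $f(\boldsymbol{\gamma})$ with respect to $\gamma_{n,q}$ is

$$[\nabla f(\boldsymbol{\gamma})]_{n,q} = \mathbf{s}_{n,q}^H \boldsymbol{\Sigma}^{-1} \mathbf{s}_{n,q} - \mathbf{s}_{n,q}^H \boldsymbol{\Sigma}^{-1} \hat{\boldsymbol{\Sigma}} \boldsymbol{\Sigma}^{-1} \mathbf{s}_{n,q}.$$

The first-order (necessary) optimality condition of problem (4) is

$$[\nabla f(\boldsymbol{\gamma})]_{n,q} \begin{cases} = 0, & \text{if } \gamma_{n,q} > 0, \\ \geq 0, & \text{if } \gamma_{n,q} = 0, \end{cases} \quad \forall\, q, n, \tag{5}$$

which is equivalent to

$$[\boldsymbol{\gamma} - \nabla f(\boldsymbol{\gamma})]_+ - \boldsymbol{\gamma} = \mathbf{0},$$

where $[\cdot]_+$ denotes the projection operator onto the nonnegative orthant. It can be checked that computing $\nabla f(\boldsymbol{\gamma})$ has a complexity of $O(NQL^2)$.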
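The objective in (4a) and the gradient expression above can be evaluated with a few lines of linear algebra. The following is a minimal NumPy sketch rather than the authors' implementation; the function names, the flattened column ordering of $\mathbf{S}$ (one column per pair $(n, q)$), and the argument `sigma2` standing for $\sigma_w^2$ are assumptions made for illustration.

```python
import numpy as np

def covariance(S, gamma, sigma2):
    """Sigma = S diag(gamma) S^H + sigma2 * I (gamma scales the columns of S)."""
    return (S * gamma) @ S.conj().T + sigma2 * np.eye(S.shape[0])

def objective(S, gamma, sigma2, Sigma_hat):
    """f(gamma) = log det(Sigma) + Tr(Sigma^{-1} Sigma_hat), as in (4a)."""
    Sigma = covariance(S, gamma, sigma2)
    _, logdet = np.linalg.slogdet(Sigma)
    return float(logdet + np.trace(np.linalg.solve(Sigma, Sigma_hat)).real)

def gradient(S, gamma, sigma2, Sigma_hat, idx=None):
    """Partial derivatives of f for the columns listed in idx (all if None)."""
    Sigma_inv = np.linalg.inv(covariance(S, gamma, sigma2))
    cols = S if idx is None else S[:, idx]
    A = Sigma_inv @ cols                                  # Sigma^{-1} s, column-wise
    term1 = np.einsum("ij,ij->j", cols.conj(), A).real    # s^H Sigma^{-1} s
    term2 = np.einsum("ij,ij->j", A.conj(), Sigma_hat @ A).real
    return term1 - term2                                  # gradient entries
```

Passing `idx` restricts the column-wise products to a subset of sequences, which is where the per-column $L^2$ cost in the complexity statement comes from; the $L \times L$ inverse is shared by all columns.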
3. PROPOSED ACTIVE SET ALGORITHM
The basic idea of the proposed active set algorithm for solving problem (4) is to fully exploit the sparsity of its true solution in the algorithmic design, which is in sharp contrast to all existing algorithms such as EM [15], CD [10, 8], and SPICE [16]. More specifically, at each iteration, the active set algorithm first judiciously selects an active set and then solves the subproblem defined over the variables in the active set, with all the other variables fixed at zero. Since the true solution of problem (4) is sparse, it is expected that the cardinality of the carefully selected active set, i.e., the dimension of the subproblem, will be significantly smaller than the total number of variables of the original problem (4). Therefore, solving the subproblem defined over the variables in the active set will be much more computationally efficient than directly solving the original problem (4) (over all variables).
Selecting the active set. In principle, a desirable active set should contain the indices of the active sequences in order to correctly detect the active users and associated data; on the other hand, its cardinality should be as small as possible in order to avoid unnecessary computation and improve the computational efficiency. Our selection strategy for the active set $\mathcal{A}^k$ at a given feasible point $\boldsymbol{\gamma}^k$ is based on (i) the engineering insight of the joint activity and data detection problem and (ii) the first-order necessary optimality condition (5) of the joint detection problem. In particular, for any given feasible point $\boldsymbol{\gamma}^k$, the selected active set $\mathcal{A}^k$ contains the indices whose corresponding entries of $\boldsymbol{\gamma}^k$ are positive and large (based on (i)) and the indices whose corresponding entries of $\nabla f(\boldsymbol{\gamma}^k)$ are sufficiently negative (due to (ii)). Mathematically, the proposed selection strategy for the active set $\mathcal{A}^k$ is

$$\mathcal{A}^k = \left\{ (i, q) \ \middle|\ \gamma^k_{i,q} > \omega_k \ \text{ or } \ [\nabla f(\boldsymbol{\gamma}^k)]_{i,q} < -\nu_k \right\}, \tag{6}$$

where $\omega_k$ and $\nu_k$ are two positive parameters. The choices of the parameters $\omega_k$ and $\nu_k$ provide a trade-off between reducing the cardinality of the active set and not missing the active sequences. In general, the smaller these two parameters, the larger the cardinality of the selected active set and the lower the probability of missing the active sequences. To make sure that the active sequences are not missed, we let $\omega_k \downarrow 0$ and $\nu_k \downarrow 0$ in (6), which means that $\omega_k$ and $\nu_k$ monotonically decrease and converge to zero.

Solving the subproblem. At the $k$-th iteration, once the active set $\mathcal{A}^k$ is selected, we solve the following subproblem:

$$\min \ \hat{f}(\boldsymbol{\gamma}_{\mathcal{A}^k}) \tag{7a}$$

$$\text{s.t.} \ \boldsymbol{\gamma}_{\mathcal{A}^k} \geq \mathbf{0}, \tag{7b}$$

where $\boldsymbol{\gamma}_{\mathcal{A}^k}$ is the subvector of $\boldsymbol{\gamma}$ indexed by $\mathcal{A}^k$ and $\hat{f}(\boldsymbol{\gamma}_{\mathcal{A}^k})$ is $f(\boldsymbol{\gamma})$ defined over $\boldsymbol{\gamma}_{\mathcal{A}^k}$ with all the other variables fixed at zero. Problem (7) differs from problem (4) in that it is defined over $\boldsymbol{\gamma}_{\mathcal{A}^k}$, whereas problem (4) is defined over $\boldsymbol{\gamma}$. Therefore, the dimension of problem (7) is potentially much smaller than that of problem (4) (if the set $\mathcal{A}^k$ in (7) is properly chosen).

We apply the spectral PG algorithm [17] to solve the subproblem in (7) until a point $\boldsymbol{\gamma}^{k+1}_{\mathcal{A}^k}$ satisfying

$$\left\| \left[ \boldsymbol{\gamma}^{k+1}_{\mathcal{A}^k} - \nabla \hat{f}(\boldsymbol{\gamma}^{k+1}_{\mathcal{A}^k}) \right]_+ - \boldsymbol{\gamma}^{k+1}_{\mathcal{A}^k} \right\| < \varepsilon_k \tag{8}$$

is found, where $\varepsilon_k > 0$ is the solution tolerance at the $k$-th iteration. The spectral PG algorithm [17] is a PG algorithm with the spectral stepsizes, also called the Barzilai-Borwein (BB) stepsizes [18], which approximately solve the quasi-Newton equation. In the PG algorithm, we need to compute the gradient of the objective function $\hat{f}(\boldsymbol{\gamma}_{\mathcal{A}^k})$ (equivalent to computing the partial gradient of the function $f(\boldsymbol{\gamma})$ with respect to the variables in $\mathcal{A}^k$), which has a complexity of $O(|\mathcal{A}^k| L^2)$. Two distinctive advantages of the spectral PG algorithm [17] in the context of solving problem (7) are as follows. First, the projection onto the nonnegativity constraint is easy to compute, and thus the algorithm can be easily implemented to solve problem (7). Second, the algorithm enjoys good numerical performance due to the use of the alternating BB stepsizes [18, 19].

Now, we are ready to present the proposed active set PG algorithm for solving problem (4). The pseudocode of the proposed algorithm is given in Algorithm 1.
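The selection rule (6), the stopping measure in (8), and a safeguarded BB stepsize can each be written in a couple of lines. The sketch below is illustrative only: it assumes the pairs $(i, q)$ are flattened into a single index, and the helper names and the safeguard bounds `step_min` and `step_max` are hypothetical choices rather than values taken from the paper.

```python
import numpy as np

def select_active_set(gamma, grad, omega_k, nu_k):
    """Flattened indices with gamma > omega_k or gradient < -nu_k, as in (6)."""
    return np.flatnonzero((gamma > omega_k) | (grad < -nu_k))

def projected_gradient_residual(gamma, grad):
    """|| [gamma - grad]_+ - gamma ||, the quantity compared with eps_k in (8)."""
    return float(np.linalg.norm(np.maximum(gamma - grad, 0.0) - gamma))

def bb_step(x_new, x_old, g_new, g_old, step_min=1e-10, step_max=1e10):
    """Safeguarded Barzilai-Borwein stepsize s^T s / s^T y (bounds are assumed)."""
    s, y = x_new - x_old, g_new - g_old
    sy = float(s @ y)
    if sy <= 0.0:                      # no positive curvature: use the upper bound
        return step_max
    return float(min(step_max, max(step_min, (s @ s) / sy)))
```

For example, with `gamma = [0, 0.5, 0, 0]`, `grad = [1, 0, -2, 0.1]`, `omega_k = 0.1` and `nu_k = 0.5`, the rule keeps index 1 (large current value) and index 2 (strongly negative gradient) and drops the other two.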
Algorithm 1 Proposed active set PG algorithm for solving problem (4)
Initialize: $\boldsymbol{\gamma}^0 = \mathbf{0}$, $k = 0$, $\{\omega_k, \nu_k, \varepsilon_k\}_{k \geq 0}$, and $\varepsilon > 0$
repeat
    Select the active set $\mathcal{A}^k$ according to (6);
    Apply the spectral PG algorithm [17] to solve the subproblem (7) until (8) is satisfied;
    Set $k \leftarrow k + 1$;
until $\left\| [\boldsymbol{\gamma}^k - \nabla f(\boldsymbol{\gamma}^k)]_+ - \boldsymbol{\gamma}^k \right\| < \varepsilon$
Output: $\boldsymbol{\gamma}^k$

Next, we present some convergence properties of the proposed active set PG Algorithm 1 (without rigorous proofs due to the space limitation). Note that a careless selection of the active set (and of the parameters in it) might cause the corresponding active set algorithm to oscillate among different active sets (and diverge). The following finite termination property is mainly due to the active set selection strategy in (6) (with careful choices of the parameters $\omega_k$ and $\nu_k$) and the convergence property of the spectral PG algorithm.

Theorem 1 For any given tolerance $\varepsilon > 0$, suppose that the parameters $\omega_k$ and $\nu_k$ in (6) satisfy $\omega_k \downarrow 0$ and $\nu_k \downarrow 0$ and that the parameter $\varepsilon_k$ in (8) satisfies $\lim_{k \to \infty} \varepsilon_k < \varepsilon$. Then the active set PG Algorithm 1 will terminate within a finite number of iterations.
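The loop structure of Algorithm 1 can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the inner solver is a monotone projected gradient method with BB stepsizes and Armijo backtracking, used here as a stand-in for the nonmonotone spectral PG method of [17]; the schedules for `omega_k`, `nu_k` and `eps_k` are hypothetical choices that merely satisfy the conditions of Theorem 1; and the stopping test is evaluated at the top of the loop.

```python
import numpy as np

def f_and_grad(S, gamma, sigma2, Sigma_hat):
    """Objective (4a) and its gradient with respect to the columns of S."""
    Sigma = (S * gamma) @ S.conj().T + sigma2 * np.eye(S.shape[0])
    Sigma_inv = np.linalg.inv(Sigma)
    A = Sigma_inv @ S
    f = np.linalg.slogdet(Sigma)[1] + np.trace(Sigma_inv @ Sigma_hat).real
    g = (np.einsum("ij,ij->j", S.conj(), A)
         - np.einsum("ij,ij->j", A.conj(), Sigma_hat @ A)).real
    return float(f), g

def residual(x, g):
    """Norm of the projected-gradient step, as used in (8)."""
    return float(np.linalg.norm(np.maximum(x - g, 0.0) - x))

def solve_subproblem(S_A, x, sigma2, Sigma_hat, eps_k, max_iter=200):
    """Projected gradient with BB steps and Armijo backtracking (stand-in for [17])."""
    f, g = f_and_grad(S_A, x, sigma2, Sigma_hat)
    step = 1.0
    for _ in range(max_iter):
        if residual(x, g) < eps_k:
            break
        d = np.maximum(x - step * g, 0.0) - x          # projected search direction
        t = 1.0
        while True:                                     # backtrack until sufficient decrease
            x_new = np.maximum(x + t * d, 0.0)
            f_new, g_new = f_and_grad(S_A, x_new, sigma2, Sigma_hat)
            if f_new <= f + 1e-4 * t * (g @ d) or t < 1e-12:
                break
            t *= 0.5
        s, y = x_new - x, g_new - g
        sy = s @ y
        step = min(1e10, max(1e-10, (s @ s) / sy)) if sy > 0 else 1e10
        x, f, g = x_new, f_new, g_new
    return x

def active_set_pg(S, sigma2, Sigma_hat, eps=1e-3, max_outer=20):
    """Outer loop of Algorithm 1 with hypothetical parameter schedules."""
    n = S.shape[1]
    gamma = np.zeros(n)
    for k in range(max_outer):
        _, g = f_and_grad(S, gamma, sigma2, Sigma_hat)
        if residual(gamma, g) < eps:                    # stopping test of Algorithm 1
            break
        omega_k = 1e-3 * 0.1 ** k                       # assumed schedule, decreasing to 0
        nu_k = min(0.1 ** k, 0.5 * abs(g.min()))        # assumed schedule, decreasing to 0
        eps_k = max(0.1 ** (k + 1), 0.5 * eps)          # assumed; limit is below eps
        A = np.flatnonzero((gamma > omega_k) | (g < -nu_k))   # selection rule (6)
        x = solve_subproblem(S[:, A], gamma[A], sigma2, Sigma_hat, eps_k)
        gamma = np.zeros(n)                             # variables outside A stay at zero
        gamma[A] = x
    return gamma
```

Each inner call only touches the columns `S[:, A]`, so its cost scales with the size of the active set rather than with the total number of sequences, which is the point of the construction.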
4. SIMULATION RESULTS
In this section, we present some simulation results to show the efficiency of the proposed active set PG algorithm for solving the joint data and activity detection problem (4). We use the same parameter settings as in [8] in our numerical simulations. More specifically, we consider a single cell of radius 1000 m and consider the worst-case scenario where all devices are located at the cell edge, so that the large-scale fading components $g_n$ are the same for all devices. The power spectral density of the background noise is -169 dBm/Hz over 10 MHz and the transmit power of each device is set to 25 dBm. The number of BS antennas, the length of the signature sequence, and the number of data bits are set to $M = 256$, $L = 150$, and $J = 1$ (and thus $Q = 2$), respectively. We generate all signature sequences from the i.i.d. complex Gaussian distribution with zero mean and unit variance. We set $K/N = 0.1$, which means that 10% of the total devices are active, and compare the performance of different algorithms as $N$ increases. The parameters in the proposed Algorithm 1 are

$$\omega_k = 10^{-\ldots-k}, \quad \varepsilon_k = \max\left\{ \ldots^{-k},\ \ldots \right\}, \quad \nu_k = \min\left\{ \ldots^{-k},\ \ldots \left| \min_{n,q} \left\{ [\nabla f(\boldsymbol{\gamma}^k)]_{n,q} \right\} \right| \right\},$$

and $\varepsilon = 10^{-\ldots}$. All simulation results in this section are obtained by averaging over
Monte-Carlo runs.

The upper subfigure in Fig. 1 plots the average ratio of the cardinality of the active sets selected over all iterations of the proposed Algorithm 1 to the number of active sequences (i.e., $K$). In the ideal case where the set of active sequences is always selected as the active set in Algorithm 1, the corresponding ratio is 1; in the worst case where the whole set is always selected as the active set in Algorithm 1, the corresponding ratio is $QN/K = 20$. This ratio measures the efficiency of the active set selection strategy: the smaller the ratio, the better the strategy. We can observe from the upper subfigure in Fig. 1 that the ratio is in the interval [1.…, …] (and in fact very close to 1 for different values of $N$), which clearly shows that the proposed active set selection strategy (6) is very efficient. The lower subfigure in Fig. 1 plots the average number of iterations that the proposed algorithm needs to terminate. This subfigure shows that Algorithm 1 generally terminates within 4–7 iterations, which is consistent with the finite termination result in Theorem 1.

Fig. 1. Performance of the proposed active set PG algorithm. (Upper subfigure: ratio versus total number of users; lower subfigure: number of iterations versus total number of users.)

Fig. 2. Average CPU time comparison of the proposed active set PG algorithm, the random CD algorithm, the ideal CD algorithm, and the ideal PG algorithm with $K/N = 0.1$. (CPU time [sec] versus total number of users; curves: Random CD, Active Set PG, Ideal CD, Ideal PG.)

Next, we compare the proposed active set PG Algorithm 1 with the following three benchmark algorithms:
• Random CD [8, 10]: To the best of our knowledge, random CD is the state-of-the-art algorithm for solving the covariance based detection problem (4), and it is much faster than EM [15] and SPICE [16]. Among its variants, one of the most efficient is the so-called random permuted CD [8], which randomly permutes the indices of all variables at each iteration and then updates the variables one by one, in closed form, according to the order in the permutation (see line 5 of [10, Algorithm 1]).
• Ideal CD: This refers to the algorithm which applies the random permuted CD algorithm to solve problem (4) defined over the variables in the (ideal) set of active sequences. Since this ideal set is not known at the BS in practice, ideal CD is not a practical algorithm. Ideal CD is included here for the purpose of characterizing the best possible performance of CD-type algorithms.
• Ideal PG: This refers to the algorithm which applies the PG algorithm [17] to solve problem (4) defined over the variables in the (ideal) set of active sequences. This algorithm is also not practical and is only of theoretical interest for characterizing the best possible performance of the PG algorithm.

We have observed in the simulations that the random CD and active set PG algorithms always find the same solution of problem (4), and thus below we focus on their CPU time comparison. Fig. 2 plots the CPU time of the proposed algorithm and of the above three benchmark algorithms. It can be clearly observed from Fig. 2 that the proposed active set PG algorithm significantly outperforms the state-of-the-art random CD algorithm [10, 8] in terms of the CPU time. The proposed algorithm even achieves slightly better computational efficiency than the ideal CD algorithm. This shows the importance of exploiting the sparsity of the true solution in order to efficiently solve problem (4). Note that we fix $K/N = 0.1$ in Fig. 2. It is expected that the advantage of the proposed active set PG algorithm over the random CD algorithm will grow as the ratio $K/N$ becomes smaller (i.e., as the solution of problem (4) becomes sparser).

We have also observed that directly applying the PG algorithm [17] to solve problem (4) is much slower than random CD. Fig. 2 shows that ideal PG is more efficient than ideal CD. These observations are consistent with our optimization practice that it is better to update all variables together instead of individually (except for very large-scale optimization problems, where it might be computationally expensive to update all variables together).

In summary, the high computational efficiency of the proposed active set PG algorithm is mainly attributed to the following two factors.
First, the active set selection strategy (6) is efficient and is able to substantially reduce the dimension of the subproblems (compared to the original problem). Second, it is important to choose an appropriate algorithm for solving the subproblems defined over the variables in the active set, and the PG algorithm [17] turns out to be a good option (in our simulations it is much faster than the state-of-the-art CD algorithm [8, 10] on these subproblems).
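For readers unfamiliar with the random permuted CD benchmark described above, the sketch below shows one sweep. It assumes the closed-form coordinate step $d = \max\{(v - u)/u^2, -\gamma_j\}$ with $u = \mathbf{s}_j^H \boldsymbol{\Sigma}^{-1} \mathbf{s}_j$ and $v = \mathbf{s}_j^H \boldsymbol{\Sigma}^{-1} \hat{\boldsymbol{\Sigma}} \boldsymbol{\Sigma}^{-1} \mathbf{s}_j$, which follows from setting the derivative of $f(\boldsymbol{\gamma} + d\,\mathbf{e}_j)$ with respect to $d$ to zero, together with a Sherman-Morrison rank-one update of $\boldsymbol{\Sigma}^{-1}$. The function name and interface are hypothetical and are not taken from [8] or [10].

```python
import numpy as np

def cd_sweep(S, gamma, Sigma_inv, Sigma_hat, rng):
    """One random-permuted coordinate descent sweep over all columns of S.

    Returns updated copies of gamma and Sigma_inv; Sigma_inv must equal
    (S diag(gamma) S^H + sigma2 * I)^{-1} on entry.
    """
    gamma = np.array(gamma, dtype=float)
    Sigma_inv = np.array(Sigma_inv, dtype=complex)
    for j in rng.permutation(S.shape[1]):
        s = S[:, j]
        a = Sigma_inv @ s                        # Sigma^{-1} s
        u = (s.conj() @ a).real                  # s^H Sigma^{-1} s
        v = (a.conj() @ Sigma_hat @ a).real      # s^H Sigma^{-1} Sigma_hat Sigma^{-1} s
        d = max((v - u) / u ** 2, -gamma[j])     # closed-form step, kept feasible
        gamma[j] += d
        # Sherman-Morrison: inverse of (Sigma + d * s s^H)
        Sigma_inv -= (d / (1.0 + d * u)) * np.outer(a, a.conj())
    return gamma, Sigma_inv
```

A run would start from `gamma = 0` and `Sigma_inv = I / sigma2` and repeat sweeps until the first-order condition (5) holds to the desired tolerance. Each coordinate update costs $O(L^2)$, so a full sweep touches every one of the $NQ$ variables regardless of sparsity.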
5. CONCLUSIONS
Scalable and efficient joint data and activity detection is essential for massive random access in mMTC. In this paper, we propose a novel active set PG algorithm that carefully exploits the sporadic nature of the device traffic. The proposed algorithm is much more efficient than the existing state-of-the-art algorithms (in terms of the CPU time). We have observed from simulation results that several first-order algorithms can find the same (global) solution of the nonconvex joint detection problem. It will be interesting to obtain some theoretical guarantees for this observation.

6. REFERENCES

[1] C. Bockelmann, N. Pratas, H. Nikopour, K. Au, T. Svensson, C. Stefanovic, P. Popovski, and A. Dekorsy, “Massive machine-type communications in 5G: Physical and MAC-layer solutions,” IEEE Commun. Mag., vol. 54, no. 9, pp. 59–65, Sept. 2016.
[2] X. Chen, D. W. K. Ng, W. Yu, E. G. Larsson, N. Al-Dhahir, and R. Schober, “Massive access for 5G and beyond,” IEEE J. Sel. Areas Commun. (to appear), 2020.
[3] L. Liu, E. G. Larsson, W. Yu, P. Popovski, Č. Stefanović, and E. de Carvalho, “Sparse signal processing for grant-free massive connectivity: A future paradigm for random access protocols in the internet of things,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 88–99, Sept. 2018.
[4] K. Senel and E. G. Larsson, “Grant-free massive MTC-enabled massive MIMO: A compressive sensing approach,” IEEE Trans. Commun., vol. 66, no. 12, pp. 6164–6175, Dec. 2018.
[5] L. Liu and W. Yu, “Massive connectivity with massive MIMO-Part I: Device activity detection and channel estimation,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2933–2946, June 2018.
[6] Z. Chen, F. Sohrabi, and W. Yu, “Sparse activity detection for massive connectivity,” IEEE Trans. Signal Process., vol. 66, no. 7, pp. 1890–1904, Apr. 2018.
[7] L. Liu and Y.-F. Liu, “An efficient algorithm for device detection and channel estimation in asynchronous IoT systems,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), Toronto, Canada, 2021. [Online]. Available: https://arxiv.org/abs/2010.09979
[8] Z. Chen, F. Sohrabi, Y.-F. Liu, and W. Yu, “Covariance based joint activity and data detection for massive random access with massive MIMO,” in Proc. IEEE Int. Conf. Commun. (ICC), Shanghai, China, May 2019, pp. 1–6.
[9] Z. Chen, F. Sohrabi, Y.-F. Liu, and W. Yu, “Phase transition analysis for covariance based massive random access with massive MIMO,” 2020. [Online]. Available: https://arxiv.org/abs/2003.04175
[10] S. Haghighatshoar, P. Jung, and G. Caire, “Improved scaling law for activity detection in massive MIMO systems,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Vail, CO, USA, June 2018, pp. 381–385.
[11] A. Fengler, G. Caire, P. Jung, and S. Haghighatshoar, “Massive MIMO unsourced random access,” 2019. [Online]. Available: http://arxiv.org/abs/1901.00828
[12] X. Shao, X. Chen, D. W. K. Ng, C. Zhong, and Z. Zhang, “Covariance-based cooperative activity detection for massive grant-free random access,” 2020. [Online]. Available: https://arxiv.org/abs/2008.10155
[13] D. Jiang and Y. Cui, “ML estimation and MAP estimation for device activities in grant-free random access with interference,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), 2020, pp. 1–6.
[14] J. Dong, J. Zhang, Y. Shi, and J. H. Wang, “Faster activity and data detection in massive random access: A multi-armed bandit approach,” 2020. [Online]. Available: https://arxiv.org/abs/2001.10237
[15] D. P. Wipf and B. D. Rao, “An empirical Bayesian strategy for solving the simultaneous sparse approximation problem,” IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3704–3716, July 2007.
[16] Z. Yang, J. Li, P. Stoica, and L. Xie, “Sparse methods for direction-of-arrival estimation,” in Academic Press Library in Signal Processing. Elsevier, 2018, vol. 7, pp. 509–581.
[17] E. G. Birgin, J. M. Martínez, and M. Raydan, “Nonmonotone spectral projected gradient methods on convex sets,” SIAM J. Optim., vol. 10, no. 4, pp. 1196–1211, 2000.
[18] J. Barzilai and J. M. Borwein, “Two-point step size gradient methods,” IMA J. Numer. Anal., vol. 8, no. 1, pp. 141–148, 1988.
[19] Y.-H. Dai and R. Fletcher, “Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming,”