ADMM for Distributed Dynamic Beamforming

Abstract

This paper shows that the alternating direction method of multipliers (ADMM) can track, in a distributed manner, the optimal down-link beam-forming solution in a multiple-input single-output (MISO) multi-cell network given a dynamic channel. Each time the channel changes, ADMM is allowed to perform one algorithm iteration. In order to implement the proposed scheme, the base stations are not required to exchange channel state information (CSI), but are required to exchange interference values once. We show ADMM's tracking ability in terms of the algorithm's Lyapunov function, given that the primal and dual solutions to the convex optimization problem at hand can be understood as a continuous mapping from the problem's parameters. We show that this holds true even though the problem loses strong convexity when it is made distributed. We then show that these requirements hold for the down-link, and consequently up-link, beam-forming case. Numerical examples corroborating the theoretical findings are also provided.


arXiv [math.OC]. SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS


Marie Maros, Student Member, IEEE, and Joakim Jaldén, Senior Member, IEEE


Index Terms: Alternating direction method of multipliers (ADMM), decentralized optimization, dynamic optimization, MIMO, multi-cell wireless networks, second-order cone programming (SOCP)

I. INTRODUCTION

Coordinated transmission in multi-cell communication networks has in recent years drawn great attention due to the promise of significantly higher spectral efficiencies [1], [2]. Such techniques include both multi-point cooperative techniques, where mobile users are simultaneously served by several base stations [2], and inter-cell interference mitigating techniques, where base stations coordinate to limit interference to neighboring cells [3].

Coordinated transmissions place larger requirements on the availability of accurate channel state information (CSI) throughout the network, and these requirements are often the major hurdle for adoption of coordinated transmission techniques. Centralized solutions further require channel knowledge of the entire network to be present at a single node, which can then compute the optimal transmit strategy and distribute it to the base stations that will use the respective beam-formers. Centralized solutions are impractical for all but very small networks, and, as mentioned in [4], the channels might have changed before the central solution has reached the base stations.

This has led many researchers to consider distributed optimization techniques that circumvent the need for network-wide collection of CSI [3], [4], [5]. Still, distributed optimization techniques are iterative in nature, and their convergence rate and need for exchanging intermediate information over back-haul channels must always be compared to the total amount of back-haul transmissions needed by a centralized solution when assessing their relative merits. If the convergence to an optimal solution is slow or requires an excessive amount of intermediate signaling, a centralized solution may still be preferable, at least within a localized cluster of neighboring cells. That said, one clear advantage of a distributed solution is that, once it has converged, it may be able to continuously adapt to small changes in the CSI with limited intermediate signaling. This is typically very hard to achieve with centralized solutions, as the CSI needs to be redistributed in the network on a time-scale dictated by the channel's coherence time.

(Author footnote: M. Maros and J. Jaldén are with the Department of Signal Processing, KTH Royal Institute of Technology, Stockholm, Sweden.)

Motivated by the above, we will, in this paper, study the ability of the popular alternating direction method of multipliers (ADMM) algorithm to dynamically track an optimal down-link beam-forming solution in a multiple-input single-output (MISO) multi-cell network with time-varying channels. We will assume that the base stations are equipped with multiple antennas and that the mobile terminals (users) are equipped with single antennas. The base stations may use channel state information (CSI) to adapt the multi-antenna transmission in order to intelligently mitigate the effect of inter-cell interference. Given the described scenario, several notions of an optimal transmit strategy have been adopted in the literature. The main differences lie in what one wishes to optimize. In opportunistic formulations, the focus is on finding a transmit strategy that maximizes a utility function of the transmission rate given a fixed power budget. However, such formulations may lead to variable rates, which might not be desired for services where a specific quality of service (QoS) needs to be guaranteed.
Additionally, such utility maximization problems have been shown to be NP-hard in general [6], which makes characterization of distributed solutions significantly harder. In contrast, the problem of minimizing the transmit power subject to QoS constraints in terms of the required signal-to-interference-and-noise ratios (SINRs) at each mobile terminal, initially believed to be non-convex, was shown to yield optimal solutions through the use of semi-definite relaxation (SDR) [7], and was shown to be equivalent to a second-order cone program (SOCP) [7], [8]. We will therefore, in this work, consider the QoS constrained beam-forming problem formulation, partially for reasons of tractability.

Algorithms that can solve convex optimization problems in a distributed manner have attracted great interest in recent years. A tutorial on general decomposition techniques can be found in [9]. Primal and dual decomposition are well-known classes of techniques to decompose an optimization problem [10]. Both classes of decompositions rely on having a master problem and slave sub-problems. The sub-problems are then independently solved in the separate nodes while the master problem is solved iteratively using parameters obtained from the individual sub-problems. Primal and dual decomposition have also been previously applied to the QoS constrained power minimization problem considered herein. Examples include [11], where dual decomposition was used, [5], where primal decomposition was used, and also [12], [4], where ADMM was used. There are also problem-specific distributed techniques based on fixed-point iterations that exploit up-link down-link duality [3]. The up-link down-link approach has also been extended to the rate maximization problem [13]. The work in [12] considered a robust ADMM formulation where SDR was used to solve local worst-case robust beam-forming problems. For simplicity, we will not consider the robust ADMM formulation herein and instead apply ADMM as in [4].

ADMM has previously been shown capable of tracking the optimal solution to a dynamically changing optimization problem [14]. Such results also exist for other decomposition techniques [15]. However, most of the available results require strong convexity of the objective function, and deal with either unconstrained minimization problems or static feasible sets [15], [14]. A notable exception is the work in [16], where a time-varying constraint set is used and the requirement for a strongly convex objective is removed in a gradient-type tracking algorithm. While the centralized QoS beam-forming problem considered herein has a strongly convex objective function, this strong convexity is unfortunately broken in the ADMM decomposition. This makes us unable to directly apply the results in [14], as these require strong convexity of the objective function in order to establish linear convergence [17] as part of the proof therein. Furthermore, the QoS constraints are herein channel dependent and thus time-varying.
We will therefore seek to establish a dynamic tracking result through an application of the weaker but more general ADMM convergence results presented in [18].

Another issue to take into consideration is that the QoS constrained beam-forming problem is not generally guaranteed to be feasible over all possible channels for a given user-to-base-station assignment. Clearly, no algorithm will be able to track the optimal solution if it does not exist. We will deal with this issue by limiting the tracking argument to sequences of channels within a compact set of channels for which the problem is guaranteed to be feasible. In practice, a communications system would continuously need to monitor the amount of power used, reject and admit users to the system, and reassign users to base stations. When the channel changes sufficiently, the mechanism in charge of performing the user-to-base-station assignment will naturally introduce a change, leading to an abrupt change of the problem structure and implying a loss of the tracking ability. We will however not explicitly consider such mechanisms further, and only consider tracking for channel sequences where the centralized problem remains feasible.

Finally, ADMM as proposed in [12], [4], and many other distributed algorithms as well, will only provide feasible solutions in the limit. This issue has not been overlooked by the research community. The standard solution is to interrupt the algorithm and perform a projection onto the feasible set in order to achieve feasibility of the solution [4], [11]. However, even when the original problem is assumed to be feasible, there is no guarantee that the projection step is successful. While it can be argued that the likelihood of the projection being feasible increases as the algorithm converges, we propose an alternative way of addressing this issue by allowing the QoS constraints to be violated by some small amount.
As, under stable running conditions, the deviation from the QoS constraints will be limited and controlled, we argue that the introduction of a simple QoS SINR margin would be enough to ensure the applicability of the algorithm in practice, and therefore we will not strictly enforce the QoS constraints.

With the above caveats in mind, we will prove that an ADMM algorithm that is allowed to perform one ADMM iteration per discrete unit of time will be able to yield beam-formers that are arbitrarily close to the globally optimal beam-formers and provide SINRs which are arbitrarily close to or above the target QoS constraints, provided that the channels vary sufficiently little between each time step within a compact set of channels for which the overall beam-forming problem is feasible.

We begin the paper in Section II by introducing the down-link beam-forming problem, its reformulation into a form that is amenable to ADMM, and the notation. We also discuss the requirements and assumptions needed for our main result to hold true in the same section. We then proceed to show in Section III the tracking ability of ADMM in a general setting under certain continuity assumptions on intermediate solutions when viewed as functions of the channels and intermediate iterates. Once the tracking ability has been shown, we proceed to prove in Section IV that the continuity assumptions hold for the considered beam-forming problem. Numerical results that illustrate the theoretical findings are presented in Section V. Finally, concluding remarks are given in Section VI.

II. PROBLEM FORMULATION

Consider a cellular system with $B$ base stations and $K$ users, where each user is served by one base station at a time. Assume that each base station is equipped with $N_T$ transmitting antennas and that each mobile station is equipped with a single antenna.

Each user $k$ has been assigned to a specific base station $b = b(k)$ that will serve it while keeping the interference caused to other users small. Given channels $\mathbf{h}_{mk} \in \mathbb{C}^{N_T \times 1}$ from base station $m$ to user $k$, the received signal at user $k$ can be expressed as [11], [5], [4]

$$
y_k \triangleq \underbrace{\mathbf{h}_{b(k)k}^H \mathbf{w}_k d_k}_{\text{scaled signal of interest}} + \overbrace{\sum_{i \in \mathcal{U}(b(k)) \setminus k} \mathbf{h}_{b(k)k}^H \mathbf{w}_i d_i}^{\text{intracell interference}} + \underbrace{\sum_{m \neq b(k)} \sum_{i \in \mathcal{U}(m)} \mathbf{h}_{mk}^H \mathbf{w}_i d_i}_{\text{intercell interference}} + n_k, \tag{1}
$$

where $\mathcal{U}(m)$ denotes the set of users served by base station $m$, where $\mathbf{w}_k$ denotes the transmit beam-former used by base station $b(k)$ to transmit to user $k$, where $d_k \in \mathbb{C}$ is the signal of interest with $E\{|d_k|^2\} = 1$ and $E\{d_k d_j^*\} = 0$ for $k \neq j$, and where $n_k$ represents circularly symmetric additive white Gaussian noise (AWGN) with variance $\sigma_k^2$. We will assume in the distributed solutions that base station $b$ has knowledge of $\mathbf{h}_{bk}$ for all $k = 1, \dots, K$ and of $\mathbf{w}_k$ for $k \in \mathcal{U}(b)$, but not of $\mathbf{h}_{mk}$ for $m \neq b$ or of $\mathbf{w}_k$ for $k \notin \mathcal{U}(b)$. The signal-to-interference-and-noise ratio (SINR) at user $k$, for a given set of channels and a given transmit strategy, is given by

$$
\mathrm{SINR}_k(\mathbf{W}, \mathbf{H}) \triangleq \frac{|\mathbf{h}_{b(k)k}^H \mathbf{w}_k|^2}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |\mathbf{h}_{b(k)k}^H \mathbf{w}_i|^2 + \sum_{m \neq b(k)} \sum_{i \in \mathcal{U}(m)} |\mathbf{h}_{mk}^H \mathbf{w}_i|^2 + \sigma_k^2}, \tag{2}
$$

where $\mathbf{H}$ and $\mathbf{W}$ are matrices containing the complete set of channels and beam-forming vectors. Since the rate $r_k$ to user $k$ is a monotonically increasing function of $\mathrm{SINR}_k(\mathbf{W}, \mathbf{H})$, requiring a minimum SINR is equivalent to requiring a minimum rate per user. Hence,

\begin{align}
\mathbf{W}^\star(\mathbf{H}) \triangleq \arg\min_{\{\mathbf{w}_k\}} \quad & \sum_{b=1}^{B} \sum_{k \in \mathcal{U}(b)} \|\mathbf{w}_k\|^2 \tag{3a} \\
\text{s.t.} \quad & \mathrm{SINR}_k(\mathbf{W}, \mathbf{H}) \geq \gamma_k > 0, \quad k = 1, \dots, K \tag{3b}
\end{align}

can be seen as a formulation of the minimum power strategy for a specific set of user QoS constraints, where $\mathbf{W}^\star(\mathbf{H})$ provides the optimal set of beam-formers $\mathbf{W}^\star$ given the channels $\mathbf{H}$.

The problem in (3) can be equivalently formulated as a second-order cone program (SOCP) [7], [8]. The extension to a variable number of antennas per base station is straightforward and avoided herein for simplicity. However, the extension to several receive antennas is provably NP-hard for resource allocation with fixed QoS requirements [19], and rate maximization subject to power constraints is NP-hard for two or more transmit antennas per base station [6]. A technical issue with (3) as stated is that the optimal solution is only unique up to a phase ambiguity in the beam-forming vectors, i.e., if $\mathbf{w}_k^\star$ is optimal, so is $\mathbf{w}_k^\star e^{j\phi}$, where $\phi$ is an arbitrary phase. This phase ambiguity is removed when formulating the problem as an SOCP by setting the phase such that the products $\mathbf{h}_{b(k)k}^H \mathbf{w}_k$ are real valued and positive [8], enforcing a unique phase for each beam-former. For this reason we will, without loss of generality and without much further comment, treat $\mathbf{W}^\star(\mathbf{H})$ as a singleton set, i.e., we assume that the optimal solution is unique.

As formulated in (3), the optimization problem would require centralization of the CSI. In order to solve (3) with only local CSI, [4], [12] proposed ADMM-based distributed formulations of problem (3). Using ADMM to solve a problem in a distributed fashion involves creating copies of the variables that are shared by different nodes, or in this case base stations. Hence, the first step is to identify the shared information and define a new set of variables so as to limit the information exchange.

(Footnote 1: The scenario treated in [12] also considers that the obtained CSI is imperfect, which is a generalization we do not consider.)
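As a numerical illustration of the SINR expression in (2), the following minimal sketch evaluates it for every user with NumPy. The array layout (`W`, `H`, `serving`, `sigma2`) and the function name `sinr` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def sinr(W, H, serving, sigma2):
    """Evaluate the SINR in (2) for every user (illustrative layout, not the paper's).

    W:       (K, N_T) complex array, row k is the beam-former w_k.
    H:       (B, K, N_T) complex array, H[m, k] is the channel h_mk
             from base station m to user k.
    serving: length-K integer sequence, serving[k] = b(k).
    sigma2:  length-K sequence of noise variances sigma_k^2.
    """
    K = W.shape[0]
    out = np.empty(K)
    for k in range(K):
        # Power received at user k from beam-former i, which is transmitted
        # by base station b(i) and therefore seen through channel h_{b(i)k}.
        # np.vdot conjugates its first argument, giving h^H w.
        p = np.array([abs(np.vdot(H[serving[i], k], W[i])) ** 2 for i in range(K)])
        # Everything except the user's own beam-former is interference
        # (intra-cell when b(i) = b(k), inter-cell otherwise).
        out[k] = p[k] / (p.sum() - p[k] + sigma2[k])
    return out
```

Summing over all $i \neq k$ covers both interference terms in the denominator of (2), since each beam-former is weighted by the channel from its own base station to user $k$.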
We define, similar to [12] and [11], $t_{mk}^2 \triangleq \sum_{j \in \mathcal{U}(m)} |\mathbf{h}_{mk}^H \mathbf{w}_j|^2$, for $m \neq b(k)$, which is the power of the inter-cell interference caused by base station $m$ on user $k$ served by base station $b(k)$. The problem in (3) can then be equivalently expressed as

\begin{align}
\underset{\{t_{mk}\}, \{\mathbf{w}_k\}}{\text{minimize}} \quad & \sum_{b=1}^{B} \sum_{k \in \mathcal{U}(b)} \|\mathbf{w}_k\|^2 \tag{4a} \\
\text{s.t.} \quad & \frac{|\mathbf{h}_{b(k)k}^H \mathbf{w}_k|^2}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |\mathbf{h}_{b(k)k}^H \mathbf{w}_i|^2 + \sum_{m \neq b(k)} \big(t_{mk}^{(b(k))}\big)^2 + \sigma_k^2} \geq \gamma_k \tag{4b} \\
& \big(t_{mk}^{(m)}\big)^2 - \sum_{i \in \mathcal{U}(m)} |\mathbf{h}_{mk}^H \mathbf{w}_i|^2 \geq 0 \tag{4c} \\
& t_{mk}^{(m)} = t_{mk}^{(b(k))} \tag{4d} \\
& \text{for } k = 1, \dots, K, \text{ for } m \neq b(k), \notag
\end{align}

where $t_{mk}^{(m)}$ is the inter-cell interference copy in base station $m$ and $t_{mk}^{(b(k))}$ is the inter-cell interference copy found in base station $b(k)$. Note that, except for the equality constraints in (4d), i.e., that base stations $m$ and $b(k)$ agree on the amount of interference caused and suffered, the constraints in (4) only involve information of a single base station, and the cost function in (4a) is separable across base stations. It should also be clear from the formulation in (4) that the interference caused by base station $m$ and suffered by a user in base station $b(k)$ will only be relevant, and hence exchanged, among base stations $m$ and $b(k)$. The coupling between base stations is also made explicit by (4d).

Typically, dual decomposition or ADMM are used to decouple problems coupled through a constraint [9]. However, in this case we are in the presence of coupling variables. To be able to use ADMM we introduce a consistency variable $\tau_{mk}$ and force the equalities $t_{mk}^{(m)} = \tau_{mk}$ and $t_{mk}^{(b(k))} = \tau_{mk}$. More compactly, we can define $\mathbf{t}_b \in \mathbb{R}^{K + |\mathcal{U}(b)|(B-2)}$ containing base station $b$'s copies of the interference terms caused by it and suffered by its users, i.e., $t_{bj}^{(b)}$ for $j \notin \mathcal{U}(b)$ and $t_{mk}^{(b)}$, $\forall m \neq b$ and $\forall k \in \mathcal{U}(b)$, respectively. For notational simplicity we additionally introduce $\mathbf{t}^T = (\mathbf{t}_1^T, \dots, \mathbf{t}_B^T) \in \mathbb{R}^{2(B-1)K}$ and $\boldsymbol{\tau} \in \mathbb{R}^{K(B-1)}$ as aggregate vectors that contain all interferences and consistency variables. Then, the equality constraints in (4d) can be compactly expressed using the equality $\mathbf{E}\boldsymbol{\tau} = \mathbf{t}$, where $\mathbf{E} \in \mathbb{R}^{2(B-1)K \times (B-1)K}$ is a matrix with elements in $\{0, 1\}$ that copies the elements of $\boldsymbol{\tau}$ into the positions corresponding to the copies in $\mathbf{t}$. If the equality $\mathbf{E}\boldsymbol{\tau} = \mathbf{t}$, or equivalently (4d), were to be ignored, (4) would become decomposable over the base stations since the feasible set would be the Cartesian product of the independent feasible sets. This allows us to use ADMM [12] (or alternatively dual decomposition [11]) to provide a distributed algorithm. In order to simplify the formulation of the problems solved by each of the base stations we introduce

$$
\mathrm{SINR}_k(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \triangleq \frac{|\mathbf{h}_{b(k)k}^H \mathbf{w}_k|^2}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |\mathbf{h}_{b(k)k}^H \mathbf{w}_i|^2 + \sum_{m \neq b} \big(t_{mk}^{(b)}\big)^2 + \sigma_k^2} \tag{5}
$$

and

$$
\mathrm{INT}_{bj}(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \triangleq \big(t_{bj}^{(b)}\big)^2 - \sum_{i \in \mathcal{U}(b)} |\mathbf{h}_{bj}^H \mathbf{w}_i|^2, \tag{6}
$$

where (5) denotes the SINR of user $k$ as a function of the beam-formers used by base station $b$, $\mathbf{W}_b$, the channels known to base station $b$, $\mathbf{H}_b$, and the estimated caused and suffered interference at base station $b$, $\mathbf{t}_b$. Analogously, (6) represents the constraint on the interference caused by base station $b$ to user $j$, where $j \notin \mathcal{U}(b)$.

Using these quantities, the problem in (4) can now be compactly written as

\begin{align}
\min_{\{\mathbf{w}_k\}, \mathbf{t}, \boldsymbol{\tau}} \quad & \sum_{b=1}^{B} \sum_{k \in \mathcal{U}(b)} \|\mathbf{w}_k\|^2 \tag{7a} \\
\text{s.t.} \quad & \mathrm{SINR}_k(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \geq \gamma_k, \quad \forall k \in \mathcal{U}(b),\ b = 1, \dots, B \tag{7b} \\
& \mathrm{INT}_{bj}(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \geq 0, \quad \forall j \notin \mathcal{U}(b),\ b = 1, \dots, B \tag{7c} \\
& \mathbf{E}\boldsymbol{\tau} = \mathbf{t}. \tag{7d}
\end{align}

Further, $\mathbf{E}$ can be partitioned according to the $\mathbf{t}_b$ in $\mathbf{t}$, leading to $B$ linear equalities of the kind $\mathbf{E}_b \boldsymbol{\tau} = \mathbf{t}_b$, where $\mathbf{E}_b$ denotes the partition of $\mathbf{E}$ corresponding to the interference terms relevant to base station $b$.

Given a static set of channels, the problem in (7), or equivalently (4) or (3), can thus be solved iteratively by Algorithm 1, which represents the ADMM algorithm applied to (7). Convergence to an optimal solution follows from standard convergence proofs such as those presented in [18]. However, given dynamically fading channels, the risk of rendering the CSI obsolete will lead to the necessity of interrupting the algorithm before it has reached an optimal point [4]. In case this happens, the approach proposed in [4] and [11] is to interrupt the algorithm and to project onto the feasible set, by setting the variables $\mathbf{t}_b^{[i]} = \mathbf{E}_b \boldsymbol{\tau}^{[i-1]}$, $\forall b$. However, this projection is not necessarily feasible, in which case more iterations will be required [12]. The approach advocated herein is instead to allow for the SINR constraints in (7b) or (4b) to be violated by a controlled amount.

Algorithm 1: ADMM for distributed beamforming
1: Initialize $\boldsymbol{\tau}^{[0]}$ and $\boldsymbol{\nu}^{[0]}$ such that $\mathbf{E}^T \boldsymbol{\nu}^{[0]} = \mathbf{0}$. Set $i = 1$.
2: Distributedly solve
\begin{align}
\min_{\mathbf{t}_b, \{\mathbf{w}_j\}} \quad & \sum_{j \in \mathcal{U}(b)} \|\mathbf{w}_j\|^2 + \big(\boldsymbol{\nu}_b^{[i-1]}\big)^T \big(\mathbf{t}_b - \mathbf{E}_b \boldsymbol{\tau}^{[i-1]}\big) + \frac{\rho}{2} \big\|\mathbf{t}_b - \mathbf{E}_b \boldsymbol{\tau}^{[i-1]}\big\|^2 \tag{8a} \\
\text{s.t.} \quad & \mathrm{SINR}_k(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \geq \gamma_k, \quad \forall k \in \mathcal{U}(b) \tag{8b} \\
& \mathrm{INT}_{bj}(\mathbf{W}_b, \mathbf{H}_b, \mathbf{t}_b) \geq 0, \quad \forall j \notin \mathcal{U}(b) \tag{8c}
\end{align}
for each $b = 1, \dots, B$ to obtain $\mathbf{w}_k^{[i]}$ for $k = 1, \dots, K$ and $\mathbf{t}_b^{[i]}$ at each base station $b$.
3: Each base station shares the relevant elements within $\mathbf{t}_b^{[i]}$ with other base stations.
4: Compute $\boldsymbol{\tau}^{[i]} = \mathbf{E}^\dagger \mathbf{t}^{[i]}$: $\mathbf{E}_b \boldsymbol{\tau}^{[i]}$ can be computed locally by averaging the terms in $\mathbf{t}_b^{[i]}$ with the received terms.
5: Compute $\boldsymbol{\nu}_b^{[i]} = \boldsymbol{\nu}_b^{[i-1]} + \rho \big(\mathbf{t}_b^{[i]} - \mathbf{E}_b \boldsymbol{\tau}^{[i]}\big)$.
6: Set $i \leftarrow i + 1$, and return to 2.

Assuming block fading, and that the changes in the channels from block to block are bounded, a possible solution is to track the optimal set of beam-formers. A result showing ADMM's tracking capabilities, when the objective function changes from iteration to iteration, is provided in [14]. However, the analysis found in [14] considers unconstrained minimization of a strongly convex function with a Lipschitz continuous gradient. Unfortunately, even though the original problem in (3) can be written with a strongly convex objective function with respect to the beam-formers, the price to pay for decomposability is the loss of strong convexity in (7a) with respect to the variables $\mathbf{t}$, $\boldsymbol{\tau}$. The results provided in [14] heavily rely on ADMM's linear convergence [20], which has the same requirements. Hence, in order to prove that ADMM is capable of tracking the optimal set of beam-formers, a different approach is required.

Our aim in this paper is to prove that, given an initial set of variables $\boldsymbol{\tau}^{[0]}$ and $\boldsymbol{\nu}^{[0]}$ in Algorithm 1 satisfying $\mathbf{E}^T \boldsymbol{\nu}^{[0]} = \mathbf{0}$, ADMM is, with only one ADMM iteration per channel change, capable of providing a set of beam-formers that lie in a bounded neighborhood of the optimal set of beam-formers while the SINR constraints in (7b) are violated by at most a bounded amount. The proposal is hence to simply use Algorithm 1 with the static channels $\mathbf{H}$ replaced by the channels at iteration $i$, denoted by $\mathbf{H}^{[i]}$. In order to prove the tracking capability, we require that the channels lie within a compact set of channels, $\mathcal{H}$, that ensures that (3) is strictly feasible. The compact set of channels fulfilling this condition will be referred to in the sequel as the $\gamma_k$-feasible channels. An essential difference compared with other works [4], [11] is the requirement of strictly feasible channels.
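Steps 4 and 5 of Algorithm 1 can be sketched in a few lines of NumPy. This is only an illustration under a simplifying assumption: the vector of copies is ordered so that the selection matrix is two stacked identity blocks (one copy of each consistency variable per involved base station), whereas the paper groups the copies per base station. The names `selection_matrix` and `consensus_and_dual_step` are hypothetical.

```python
import numpy as np

def selection_matrix(n_tau):
    """Illustrative E: every consistency variable is copied twice, once for
    the interfering base station and once for the serving one.
    (Simplified ordering of t; the paper groups the copies per base station.)"""
    return np.vstack([np.eye(n_tau), np.eye(n_tau)])

def consensus_and_dual_step(E, t, nu, rho):
    """Steps 4 and 5 of Algorithm 1 for given local copies t."""
    # Step 4: tau = E^dagger t, which averages the two copies of each variable.
    tau = np.linalg.pinv(E) @ t
    # Step 5: dual update driven by the remaining disagreement t - E tau.
    nu_new = nu + rho * (t - E @ tau)
    return tau, nu_new
```

With this choice of `E`, the pseudo-inverse reduces to pairwise averaging, and the dual update preserves the property $\mathbf{E}^T \boldsymbol{\nu} = \mathbf{0}$ whenever it holds for the previous multipliers, which is why the initialization condition in step 1 carries over to every iteration.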
Considering strictly feasible channels guarantees that an arbitrarily small change in the channel will not render the problem infeasible. A second difference is that we allow the SINR constraints to be violated by a bounded amount so as to allow for small disagreements in the interference values at different base stations, hence avoiding the need to solve infeasible problems.

The contributions of this paper are particularized for the MISO optimal beam-forming problem. However, the proof found in Section III shows that ADMM is capable of tracking an optimal solution as long as some continuity conditions are met by the problem at hand. In particular, we require that the optimal primal and dual points are continuous functions of the problem's data, which in this case is the channel. Additionally, we also require that the primal parameters, in this case $\mathbf{W}^{[i]}$, $\mathbf{t}^{[i]}$ and $\boldsymbol{\tau}^{[i]}$ obtained by solving (8) in step 2 of Algorithm 1, are continuous functions of the channel and the previous parameters used by ADMM, i.e. $\boldsymbol{\nu}^{[i-1]}$ and $\boldsymbol{\tau}^{[i-1]}$. In order to formalize the paper's main result we introduce Theorem 1, which is proven in the subsequent sections.

(Footnote 2: The condition $\mathbf{E}^T \boldsymbol{\nu}^{[0]} = \mathbf{0}$ is also fulfilled by the optimal set of multipliers $\boldsymbol{\nu}^\star$.)

Theorem 1.

Let $\{\mathbf{H}^{[i]}\}_{i=0}^{\infty}$ be a sequence of channels that lie within a compact set $\mathcal{H}$ of strictly $\gamma_k$-feasible channels. Given arbitrary positive constants $\epsilon_1 > 0$ and $\epsilon_2 > 0$, there is some $\delta > 0$ for which Algorithm 1 generates a sequence of beam-formers $\mathbf{W}^{[i]}$ for which the distance to the $\gamma_k$-optimal beam-formers is guaranteed to fulfill

$$
\limsup_{i \to \infty} \big\|\mathbf{W}^{[i]} - \mathbf{W}^{[i]\star}\big\| \leq \epsilon_1, \tag{9}
$$

where $\mathbf{W}^{[i]\star}$ denotes the optimal beam-formers at time $i$, and the SINRs of all users are guaranteed to fulfill

$$
\liminf_{i \to \infty} \mathrm{SINR}_k\big(\mathbf{W}^{[i]}, \mathbf{H}^{[i]}, \mathbf{t}_b^{[i]}\big) \geq \gamma_k - \epsilon_2 \tag{10}
$$

whenever $\|\mathbf{H}^{[i]} - \mathbf{H}^{[i-1]}\| \leq \delta$ for all $i \geq 1$.

III. TRACKING WITH ADMM

In this section we show, given a set of continuity assumptions, that ADMM is capable of tracking. In order to be able to show tracking without resorting to any proof requiring linear convergence, we use the convergence proof found in [18]. This proof relies on a per-iteration decrease of the algorithm's Lyapunov function, which we define as

$$
V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \triangleq \frac{1}{\rho} \|\boldsymbol{\nu} - \boldsymbol{\nu}^\star(\mathbf{H})\|^2 + \rho \|\mathbf{E}(\boldsymbol{\tau} - \boldsymbol{\tau}^\star(\mathbf{H}))\|^2, \tag{11}
$$

where $\boldsymbol{\tau}^\star(\mathbf{H})$ denotes the optimal consistency variables in (7d) given channels $\mathbf{H}$, and $\boldsymbol{\nu}^\star(\mathbf{H})$ denotes the optimal dual variables associated with the consistency constraints (7d) given the channels $\mathbf{H}$. Note that the dependence on $\mathbf{H}$ has its origin in the fact that the optimal interference values and optimal dual multipliers associated with (7d) depend on the problem's data, i.e. $\mathbf{H}$. In [18], ADMM is shown to converge by using the fact that

$$
V(\boldsymbol{\nu}^{[i]}, \boldsymbol{\tau}^{[i]}, \mathbf{H}) \leq V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}) - \rho \|\mathbf{r}^{[i]}\|^2 - \rho \|\mathbf{E}(\boldsymbol{\tau}^{[i]} - \boldsymbol{\tau}^{[i-1]})\|^2, \tag{12}
$$

where $\mathbf{r}^{[i]} \triangleq \mathbf{t}^{[i]} - \mathbf{E}\boldsymbol{\tau}^{[i-1]}$ is the primal residual at the $i$th iterate. The residual $\mathbf{r}^{[i]}$ represents the disagreement among base stations, or deviation from the mean interference value, during the previous iterate. Equation (12) essentially implies that there exists a non-zero decrease in $V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H})$ at each iteration unless all base stations agree on the amount of interference.
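The per-iteration decrease of the Lyapunov function can be checked numerically on a toy problem. The sketch below is not the beam-forming problem: it assumes the local cost $\frac{1}{2}\|\mathbf{t} - \mathbf{a}\|^2$ as a stand-in for sub-problem (8), so that step 2 has a closed form and the optimal points are known. It also evaluates the residual against the updated consistency variables, $\mathbf{t} - \mathbf{E}\boldsymbol{\tau}_{\text{new}}$, which is the form in which the standard ADMM analysis states the bound; both choices, and all names used, are assumptions of this example.

```python
import numpy as np

def admm_iteration(E, a, tau, nu, rho):
    """One pass of steps 2, 4 and 5 with the toy cost 0.5*||t - a||^2
    standing in for the beam-forming sub-problem (8)."""
    t = (a - nu + rho * E @ tau) / (1.0 + rho)   # closed-form minimiser of step 2
    tau_new = np.linalg.pinv(E) @ t              # step 4: averaging of the copies
    nu_new = nu + rho * (t - E @ tau_new)        # step 5: dual update
    return t, tau_new, nu_new

def lyapunov(E, tau, nu, tau_opt, nu_opt, rho):
    """V as in (11), for given optimal primal and dual points."""
    return np.sum((nu - nu_opt) ** 2) / rho + rho * np.sum((E @ (tau - tau_opt)) ** 2)

rng = np.random.default_rng(0)
E = np.vstack([np.eye(3), np.eye(3)])        # each variable has two copies
a = rng.normal(size=6)                       # toy problem data
rho = 0.7
tau_opt = np.linalg.pinv(E) @ a              # minimiser of 0.5*||E tau - a||^2
nu_opt = a - E @ tau_opt                     # from stationarity of the toy cost
tau, nu = rng.normal(size=3), np.zeros(6)    # E^T nu = 0 holds at the start

history = []                                 # (V before, V after, guaranteed decrease)
for _ in range(20):
    v_old = lyapunov(E, tau, nu, tau_opt, nu_opt, rho)
    t, tau_new, nu = admm_iteration(E, a, tau, nu, rho)
    v_new = lyapunov(E, tau_new, nu, tau_opt, nu_opt, rho)
    bound = rho * np.sum((t - E @ tau_new) ** 2) \
        + rho * np.sum((E @ (tau_new - tau)) ** 2)
    history.append((v_old, v_new, bound))
    tau = tau_new
```

Each entry of `history` can be inspected to confirm that the Lyapunov value after an iteration is no larger than the value before it minus the residual terms, with the data held fixed.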
Note that in [18] the optimization problem is assumed to be static, i.e., in our context, the channels $\mathbf{H}$ in (12) are assumed to be constant from iteration to iteration. However, in the remainder of this section this assumption will be relaxed: the channel will be assumed to change from one iteration to the next and will hence be indexed using the iteration number.

Given that the considered set of $\gamma_k$-feasible channels $\mathcal{H}$ is compact and that the channel variation is such that $\|\mathbf{H}^{[i]} - \mathbf{H}^{[i+1]}\| \leq \delta$ for all $i \geq 0$, we will assume the following:

(A1) The optimal consistency variables $\boldsymbol{\tau}^\star$ for the consistency constraint (7d) are a continuous function of the channel $\mathbf{H}$, i.e. $\boldsymbol{\tau}^\star(\mathbf{H})$ is a continuous function of $\mathbf{H}$ over $\mathcal{H}$.
(A2) The optimal dual multipliers $\boldsymbol{\nu}^\star$ in the consistency constraint (7d) are a continuous function of the channel $\mathbf{H}$, i.e. $\boldsymbol{\nu}^\star(\mathbf{H})$ is a continuous function of $\mathbf{H}$ over $\mathcal{H}$.
(A3) The primal iterates $\mathbf{W}^{[i]}$, $\mathbf{t}^{[i]}$ and $\boldsymbol{\tau}^{[i]}$ at time $i$ are continuous functions of the iterates at time $i-1$, i.e., $\mathbf{W}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$, $\mathbf{t}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$ and $\boldsymbol{\tau}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$, corresponding to the parameters generated by steps 2 and 4 in Algorithm 1, are continuous functions of their respective input parameters.

Note that assumption (A3) also implies continuity of $\boldsymbol{\nu}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$ by the continuity of the dual update in step 5 in Algorithm 1. Additionally, the continuity of $\boldsymbol{\tau}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$ follows by the same principle from the continuity of $\mathbf{t}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$ for the beam-forming problem. However, this might not be the case for other optimization problems, and is thus assumed. Assumptions (A1)-(A3) will be proven to hold in the next section.
However, for the time being they will be assumed to be given.

Conceptually, the proof that follows can be split in two parts. First, we show that given a bound on the Lyapunov function before the ADMM update, i.e. $V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]})$, we are capable of guaranteeing a bound on the distance to the optimal set of beam-formers. Second, we show that there exists a channel variation $\delta$ such that the bound on the Lyapunov function is guaranteed to hold in the limit when $i \to \infty$. Following this approach, we introduce two lemmas and their respective proofs to show that Theorem 1 holds true given assumptions (A1)-(A3).

Lemma 1.

Given that assumption (A3) holds and given a constant $\epsilon_1 > 0$, there exists a constant $c > 0$ such that

$$
\limsup_{i \to \infty} V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]}) \leq c \tag{13}
$$

implies that

$$
\limsup_{i \to \infty} \big\|\mathbf{W}^{[i]} - \mathbf{W}^{[i]\star}\big\| \leq \epsilon_1. \tag{14}
$$

Two alternative proofs of this can be provided. In the general case, the bound in (13) will by (11) imply that $\boldsymbol{\nu}^{[i-1]}$ and $\boldsymbol{\tau}^{[i-1]}$ are close to their respective optimal values. The continuity assumption for $\mathbf{W}^{[i]}(\boldsymbol{\tau}^{[i-1]}, \boldsymbol{\nu}^{[i-1]}, \mathbf{H}^{[i]})$ made in (A3) will then imply that $\mathbf{W}^{[i]}$ is also close to the globally optimal value. However, for the particular problem at hand, an explicit bound that yields insight into the dependency of $c$ on $\epsilon_1$ can also be provided; this is done in Appendix A.

Lemma 2.

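Given that assumptions (A1)-(A3) hold, given a compact set $\mathcal{H}$ of $\gamma_k$-feasible channels, and given a constant $c > 0$, there exists a maximum channel variation $\delta > 0$, where $\mathbf{H}^{[i-1]} \in \mathcal{H}$ and $\|\mathbf{H}^{[i]} - \mathbf{H}^{[i-1]}\| \leq \delta$ for all $i \geq 1$, for which

$$
\limsup_{i \to \infty} V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]}) \leq c. \tag{15}
$$

Proof.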

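ADMM guarantees that $V(\boldsymbol{\nu}^{[i]}, \boldsymbol{\tau}^{[i]}, \mathbf{H}^{[i]}) \leq V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]})$ [cf. (12)]. Let the channel variation be bounded by a constant $\delta > 0$ to be specified later, i.e. $\|\mathbf{H}^{[i+1]} - \mathbf{H}^{[i]}\| \leq \delta$ for all $i \geq 0$. For some given $\mu_l > 0$ and arbitrary $(\boldsymbol{\nu}^{[0]}, \boldsymbol{\tau}^{[0]})$ with $\mathbf{E}^T \boldsymbol{\nu}^{[0]} = \mathbf{0}$ define

$$
\mu \triangleq \max\big\{ V(\boldsymbol{\nu}^{[0]}, \boldsymbol{\tau}^{[0]}, \mathbf{H}^{[0]}),\ \mu_l \big\}. \tag{16}
$$

Then, choose a finite $\Delta\mu > 0$, let $\mu_u \triangleq \mu + \Delta\mu$ and define the set $\mathcal{V} \subset \mathbb{R}^{3(B-1)K}$ as

$$
\mathcal{V} \triangleq \{ (\boldsymbol{\nu}, \boldsymbol{\tau}) \mid \exists\, \mathbf{H} \in \mathcal{H},\ V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \leq \mu_u,\ \mathbf{E}^T \boldsymbol{\nu} = \mathbf{0} \}. \tag{17}
$$

Due to the compactness of $\mathcal{H}$, the continuity of $\boldsymbol{\nu}^\star(\mathbf{H})$ and $\boldsymbol{\tau}^\star(\mathbf{H})$, and the strong convexity of $V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H})$ in $(\boldsymbol{\nu}, \boldsymbol{\tau})$, the set $\mathcal{V}$ is also compact. Next, let

$$
\mathcal{U} \triangleq \{ (\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \mid V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \geq \mu_l,\ \mathbf{E}^T \boldsymbol{\nu} = \mathbf{0} \}. \tag{18}
$$

Note that this set is closed but not bounded. However, the set $\mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$ is compact as it is closed and bounded. The set $\mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$ is simply the set of parameters $(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H})$ for which the Lyapunov function (11) is upper and lower bounded according to

$$
\mu_l \leq V(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \leq \mu_u = \mu + \Delta\mu, \tag{19}
$$

i.e. we are confining the Lyapunov function's value between $\mu_l$ and $\mu_u$ by considering $(\boldsymbol{\nu}, \boldsymbol{\tau}, \mathbf{H}) \in \mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$.

From [18] we have that, for a given triplet $(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]})$ at the start of step 2 in Algorithm 1, the decrease in the Lyapunov function in the $i$th iteration over steps 2 to 4 is lower bounded as [cf. (12)]

$$
D(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]}) \triangleq V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]}) - V(\boldsymbol{\nu}^{[i]}, \boldsymbol{\tau}^{[i]}, \mathbf{H}^{[i]}) \geq \rho \|\mathbf{r}^{[i]}\|^2 + \rho \|\mathbf{E}(\boldsymbol{\tau}^{[i]} - \boldsymbol{\tau}^{[i-1]})\|^2 \geq 0, \tag{20}
$$

where equality holds only at the optimum, when $\boldsymbol{\tau}^{[i-1]} = \boldsymbol{\tau}^\star(\mathbf{H}^{[i]})$, $\boldsymbol{\nu}^{[i-1]} = \boldsymbol{\nu}^\star(\mathbf{H}^{[i]})$ and $V(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]}) = 0$. Given (A3), the lower bound on $D(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]})$ in (20) is continuous in $(\boldsymbol{\nu}^{[i-1]}, \boldsymbol{\tau}^{[i-1]}, \mathbf{H}^{[i]})$.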
Additionally, given the compactness of $\mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$ and that $V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) \ge \mu_l > 0$ whenever $(\tau^{[i-1]}, \nu^{[i-1]}, H^{[i]}) \in \mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$, it follows that there is a constant $d > 0$ for which $D(\tau^{[i-1]}, \nu^{[i-1]}, H^{[i]}) \ge d$ for all $(\tau^{[i-1]}, \nu^{[i-1]}, H^{[i]}) \in \mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$, i.e., there is a minimum guaranteed decrease of the Lyapunov function.

By assumption it holds that $(\nu^{[0]}, \tau^{[0]}, H^{[0]}) \in \mathcal{V} \times \mathcal{H}$ and that $V(\nu^{[0]}, \tau^{[0]}, H^{[0]}) \le \mu$. Assume now that for some arbitrary $i \ge 1$ it holds that $(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i-1]}) \in \mathcal{V} \times \mathcal{H}$ and that $V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i-1]}) \le \mu$. Given a change in the channel from $H^{[i-1]}$ to $H^{[i]}$, we can have that $V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) \le \mu_l$ or that $V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) > \mu_l$. In the former case, it follows immediately by the monotonicity of the Lyapunov function for fixed channels that also $V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le \mu_l$. In the latter case, the Lyapunov function can be bounded, by using the fact that $E^T \nu^\star = 0$ and the triangle inequality applied to (11), according to

$$V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) \le \mu + 2\sqrt{\mu}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta) + \sqrt{\rho}\, \Delta\tau^\star(\delta) \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta))^2 + \rho (\Delta\tau^\star(\delta))^2, \tag{21}$$

where

$$\Delta\tau^\star(\delta) \triangleq \max_{H, H' \in \mathcal{H}} \|\tau^\star(H) - \tau^\star(H')\| \tag{22a}$$
$$\text{s.t. } \|H - H'\| \le \delta \tag{22b}$$

and where $\Delta\nu^\star(\delta)$ is analogously defined. Due to the compactness of $\mathcal{H}$ and the continuity of $\tau^\star(H)$ and $\nu^\star(H)$, the quantities $\Delta\tau^\star(\delta)$ and $\Delta\nu^\star(\delta)$ are bounded and satisfy $\lim_{\delta \to 0} \Delta\tau^\star(\delta) = 0$ and $\lim_{\delta \to 0} \Delta\nu^\star(\delta) = 0$. We need to select $\delta$ so as to guarantee that $(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) \in \mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$. We do this by selecting $\delta$ such that

$$2\sqrt{\mu}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta) + \sqrt{\rho}\, \Delta\tau^\star(\delta) \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta))^2 + \rho (\Delta\tau^\star(\delta))^2 \le \Delta\mu, \tag{23}$$

implying that we have that $V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i]}) - d$.
Thus, if $\delta$ is chosen such that

$$2\sqrt{\mu}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta) + \sqrt{\rho}\, \Delta\tau^\star(\delta) \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta))^2 + \rho (\Delta\tau^\star(\delta))^2 \le d, \tag{24}$$

we have that $V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le \mu$, which implies in turn that $(\nu^{[i+1]}, \tau^{[i+1]}, H^{[i+1]}) \in \mathcal{V} \times \mathcal{H}$. Therefore, $\delta$ can be selected small enough so as to guarantee that the decrease can always compensate for the increase induced by the channel change and, at the same time, guarantee that there exists no channel change that pushes the Lyapunov function to a region in which the guaranteed decrease does not apply. Note that $d$ is not dependent on $\delta$ but on $\mathcal{H}$, while $\mu_l$ and $\Delta\mu$ are arbitrarily selected. Hence it is always possible to find a parameter $\delta > 0$ fulfilling (23) and (24).

Expressions (23) and (24) provide insights on how to select the parameter $\rho$ in case one can obtain the sensitivity of the dual problem or primal problem with respect to the problem's data; in other words, if the dual problem were to be very sensitive to the problem's data while the primal is less so, one would select a large value for $\rho$.

It follows by induction that the bound $V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le \mu$ will hold for all $i$. Additionally, if $\delta'$ is picked so that we have a margin, i.e. such that

$$d \ge 2\sqrt{\mu}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta') + \sqrt{\rho}\, \Delta\tau^\star(\delta') \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta'))^2 + \rho (\Delta\tau^\star(\delta'))^2 + m,$$

where $m > 0$ is a constant, we have at each iteration, as long as we remain within $\mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$, a net decrease, i.e.

$$V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le V(\nu^{[i-1]}, \tau^{[i-1]}, H^{[i-1]}) - m. \tag{25}$$

This implies that there exists an $i_0$ such that

$$V(\nu^{[i_0]}, \tau^{[i_0]}, H^{[i_0]}) \le \mu_l. \tag{26}$$

Note that we are not guaranteed a decrease of at least $d$ now, since $(\nu^{[i_0]}, \tau^{[i_0]}, H^{[i_0]}) \notin \mathcal{U}$. However, after one channel iteration, the Lyapunov function will be upper bounded by

$$V(\nu^{[i_0]}, \tau^{[i_0]}, H^{[i_0+1]}) \le \mu_l + 2\sqrt{\mu_l}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta') + \sqrt{\rho}\, \Delta\tau^\star(\delta') \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta'))^2 + \rho (\Delta\tau^\star(\delta'))^2. \tag{27}$$

In case $V(\nu^{[i_0]}, \tau^{[i_0]}, H^{[i_0+1]}) \ge \mu_l$ we have that $(\nu^{[i_0]}, \tau^{[i_0]}, H^{[i_0+1]}) \in \mathcal{U} \cap (\mathcal{V} \times \mathcal{H})$ and therefore there is a guaranteed decrease $d > 2\sqrt{\mu}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta') + \sqrt{\rho}\, \Delta\tau^\star(\delta') \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta'))^2 + \rho (\Delta\tau^\star(\delta'))^2$ compensating the increase caused in the Lyapunov function and yielding that, for all $i \ge i_0$,

$$V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le \mu_l, \tag{28}$$

or equivalently

$$\limsup_{i \to \infty} V(\nu^{[i]}, \tau^{[i]}, H^{[i]}) \le \mu_l; \tag{29}$$

and thus

$$\limsup_{i \to \infty} V(\nu^{[i]}, \tau^{[i]}, H^{[i+1]}) \le c, \tag{30}$$

where $c \triangleq \mu_l + 2\sqrt{\mu_l}\left( \frac{1}{\sqrt{\rho}} \Delta\nu^\star(\delta') + \sqrt{\rho}\, \Delta\tau^\star(\delta') \right) + \frac{1}{\rho} (\Delta\nu^\star(\delta'))^2 + \rho (\Delta\tau^\star(\delta'))^2$, concluding the proof of Lemma 2.

Given ADMM's nature, primal feasibility cannot be guaranteed until the algorithm has converged completely for a fixed channel. We therefore proceed to show the worst-case deviation from the desired SINRs, $\gamma_k$. In particular, in the worst-case scenario the obtained SINR for user $k$ satisfies, for $i \ge i_0$,

$$\mathrm{SINR}_k(W_b, H_b, t_b) = \gamma_k \left( 1 - \frac{4c}{\rho \sigma_k^2 + 4c} \right) = \gamma_k (1 - \epsilon_k). \tag{31}$$

The proof of (31) can be found in Appendix B. This concludes the argument that ADMM can track the optimal set of beam-formers given the continuity assumptions (A1)-(A3) and given that the channel does not vary too much from one time instance to the next. Note that when considering the minimum achieved SINR due to the disagreement among base stations, the noise variance plays an important role, i.e. the larger the noise variance, the more negligible the disagreement among base stations is. As one might intuitively expect, the parameter $\rho$ is relevant in order to mitigate the disagreement.
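Conditions (23) and (24) bound the same quantity, $2\sqrt{\mu}\left(\tfrac{1}{\sqrt{\rho}}\Delta\nu^\star(\delta) + \sqrt{\rho}\,\Delta\tau^\star(\delta)\right) + \tfrac{1}{\rho}(\Delta\nu^\star(\delta))^2 + \rho(\Delta\tau^\star(\delta))^2$, by $\Delta\mu$ and by $d$ respectively. The following sketch shows how a largest admissible $\delta$ could be computed if the sensitivities were known. It assumes, purely for illustration, linear sensitivity bounds $\Delta\nu^\star(\delta) \le L_\nu \delta$ and $\Delta\tau^\star(\delta) \le L_\tau \delta$; the constants `lip_nu` and `lip_tau` and both function names are hypothetical, since the analysis only requires continuity and does not provide such constants.

```python
import math

def channel_change_increase(delta, mu, rho, lip_nu, lip_tau):
    """Left-hand side of (23)/(24) under the assumed linear sensitivity model."""
    d_nu = lip_nu * delta     # assumed bound on the dual optimum shift
    d_tau = lip_tau * delta   # assumed bound on the consistency optimum shift
    return (2.0 * math.sqrt(mu) * (d_nu / math.sqrt(rho) + math.sqrt(rho) * d_tau)
            + d_nu ** 2 / rho + rho * d_tau ** 2)

def max_channel_variation(mu, rho, lip_nu, lip_tau, delta_mu, d):
    """Largest delta for which the increase stays below min(delta_mu, d)."""
    budget = min(delta_mu, d)
    # The increase is quad * delta^2 + lin * delta; solve it equal to the budget.
    quad = lip_nu ** 2 / rho + rho * lip_tau ** 2
    lin = 2.0 * math.sqrt(mu) * (lip_nu / math.sqrt(rho) + math.sqrt(rho) * lip_tau)
    if quad == 0.0:
        raise ValueError("at least one sensitivity constant must be positive")
    return (-lin + math.sqrt(lin ** 2 + 4.0 * quad * budget)) / (2.0 * quad)

if __name__ == "__main__":
    for rho in (0.1, 1.0, 10.0):
        delta = max_channel_variation(4.0, rho, lip_nu=1.5, lip_tau=0.5,
                                      delta_mu=1.0, d=0.3)
        print(f"rho = {rho:5.1f}  ->  admissible delta = {delta:.4f}")
```

Sweeping `rho` in this model makes the trade-off visible: the dual sensitivity enters divided by $\sqrt{\rho}$ and the primal one multiplied by it, so a dual-dominated sensitivity favours a larger $\rho$.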
The relevance of $\rho$ can be easily seen from the penalty parameter assigning weight to the term $\|E_b t_b - \tau\|$ in (8a). However, from (31) we can see that the effect of $\rho$ is equivalent to that of a "noise enhancer" when it comes to mitigating the effect of the disagreement on the interference values.

IV. CONTINUITY ANALYSIS

In this section we show that assumptions (A1)-(A3), made in order to prove ADMM's ability to track the optimal set of beam-formers, hold. We will first argue that showing continuity of the optimal consistency variables $\tau^\star$ (A1) and of the optimal dual variables associated with (7d), $\nu^\star$ (A2), is equivalent to showing continuity of the primal and dual optimal variables of the centralized problem. When it comes to the optimal consistency variables $\tau^\star$ this is fairly obvious, since the interference constraints (8c), as defined in Algorithm 1, will be fulfilled tightly at the optimal point. In order to show the continuity of the optimal dual multipliers $\nu^\star$ associated with (7d) it suffices to express the Lagrangian of the centralized problem (3) and of the problem in (7) as in [3]. By finding the optimality conditions with respect to the additional variables (i.e., the interference estimates $t$ and the consistency variables $\tau$) one can show that each of the elements in $\nu^\star$ equals the product of the corresponding interference estimate $t^{(m)\star}_{mbk}$ and the multiplier associated with the SINR constraint (8b) that contains it. Further, by taking the gradient with respect to the beam-formers we obtain that the dual multipliers corresponding to the SINR constraints (8b) equal the dual multipliers associated with the SINR constraints (3b) in the centralized problem.

We will first show that the optimal interference estimates $t^\star_{mbk}$ are continuous functions of the channel. In order to do this we show that the optimal beam-formers $W^\star$ are continuous functions of the channels in the centralized problem. For this purpose we require Theorem 2 (a special case of [21, Theorem 2.2, 2.3]). For completeness we include the definitions of closed and open point to set mappings as given in [21].

Definition 1.

A point to set mapping $\mathcal{W}(H)$ is closed at $\bar{H}$ if for any sequence of channels $H_n \in \mathcal{H}$, $H_n \to \bar{H}$, and associated feasible beam-formers $W_n \in \mathcal{W}(H_n)$ such that $W_n \to \bar{W}$ it holds that $\bar{W} \in \mathcal{W}(\bar{H})$.

Definition 2.

A point to set mapping $\mathcal{W}(H)$ is open at $\bar{H}$ if for any sequence of channels $H_n \in \mathcal{H}$ such that $H_n \to \bar{H}$, and any $\bar{W} \in \mathcal{W}(\bar{H})$, there exist an $m$ and a sequence $\{W_n\}$ such that $W_n \in \mathcal{W}(H_n)$ for all $n \ge m$, and $W_n \to \bar{W}$.

We are now ready to introduce the following theorem:

Theorem 2.

Let $\mathcal{W}(H)$ and $W^\star(H)$ be the set of feasible beam-formers and the optimal beam-formers given channel $H$, respectively. For the problem in (3), $W^\star(H)$ is continuous at $H$ if:
1) The objective function in (3a) is continuous on $\mathcal{W}(H)$;
2) The point to set mapping $\mathcal{W}(H)$ is closed and open at $H$;
3) The primal optimal point exists and is unique.

The objective function in (3) is strictly convex and continuous on $\mathbb{C}^{N_T \times K}$. Additionally, when writing the constraints as SOCs the phase ambiguity, otherwise present, vanishes as mentioned in Section II; this implies that not only the first, but also the third condition of the theorem holds. Hence, in order to show continuity of the function $W^\star(H)$ with respect to the channels, we need to show that the point to set mapping $\mathcal{W}(H)$ is closed and open for all $H \in \mathcal{H}$.

Intuitively, showing that the point to set mapping representing the feasible sets is closed implies that an arbitrarily small change in the channel is not capable of violating the constraints by an arbitrarily large amount.

Lemma 3.

Given the set of $\gamma_k$-feasible channels $\mathcal{H}$, for a specific user-base station allocation and SINR requirements, the mapping providing the optimal set of beam-formers is closed and open.

Proof. Note that the SINR constraints of the centralized problem (3b) are continuous functions of the channels and the beam-formers. In particular, let $\mathrm{SINR}_k(W_n, H_n)$ in (2) be the value of the SINR constraint of user $k$ given the channel $H_n$ and the set of beam-formers $W_n$, and let $\bar{H}$ and $\bar{W}$ be given as in Definition 1. Then, due to continuity of the SINR functions in $H$ and $W$, it follows that

$$\lim_{n \to \infty} \mathrm{SINR}_k(W_n, H_n) = \mathrm{SINR}_k(\bar{W}, \bar{H}), \quad \forall k, \tag{32}$$

and given that all pairs $H_n, W_n$ satisfy $\mathrm{SINR}_k(W_n, H_n) \ge \gamma_k$, this will also be satisfied in the limit. This establishes that the mapping is closed.

We now proceed to show that $\mathcal{W}(H)$ is also open. In order to prove this, we assume that we have a set of feasible beam-formers $\bar{W}$ for a given channel $\bar{H}$. Then, we use known results to establish a neighborhood of $\bar{H}$ for which we can find a scaling vector $p = [p_1, \ldots, p_K]^T$ that is a continuous function of the channel, such that a feasible set of beam-formers $W \in \mathcal{W}(H)$ can be found using $\bar{W}$ as $w_k = \sqrt{p_k}\, \bar{w}_k$. For this purpose, assume that for a given channel $\bar{H}$ we have available (without loss of generality) strictly feasible beam-formers $\bar{W} \in \mathcal{W}(\bar{H})$. Define now, as in [22], the power of the interference caused by the transmission to user $j$ over user $k$ as $G_{kj}(\bar{H}, \bar{W}) = \bar{w}_j^H \bar{h}_{b(j)k} \bar{h}_{b(j)k}^H \bar{w}_j$. These terms will be collected in the matrix $\Psi(\bar{H}, \bar{W})$, where

$$[\Psi(\bar{H}, \bar{W})]_{kj} = \begin{cases} G_{kj}(\bar{H}, \bar{W}), & j \ne k \\ 0, & j = k. \end{cases} \tag{33}$$

Let $D(\bar{H}, \bar{W}) \triangleq \mathrm{diag}\left\{ \frac{\gamma_1}{G_{11}(\bar{H}, \bar{W})}, \ldots, \frac{\gamma_K}{G_{KK}(\bar{H}, \bar{W})} \right\}$, which represents the ratio between desired SINR and received power for each of the users, and let $\sigma = [\sigma_1^2, \ldots, \sigma_K^2]^T$. The vector $p$ is defined as the optimal solution to the power allocation problem

$$\min_{p'} \|p'\| \quad \text{s.t.} \quad \mathrm{SINR}'_k(p', \sigma_k, \bar{W}, H) \ge \gamma_k, \quad \forall k, \tag{34}$$

where $\mathrm{SINR}'_k$ denotes the SINR of user $k$ given a set of fixed beam-formers $\bar{W}$, a set of channels $H$ and noise variances $\sigma_k^2$. From [22] we know that the solution to the optimization problem (34), if it exists, is unique and characterized by

$$(I - D(H, \bar{W}) \Psi(H, \bar{W}))\, p = D(H, \bar{W})\, \sigma, \tag{35}$$

which is an equivalent formulation of the SINR constraints (3b) being fulfilled tightly. Additionally, given Theorem 4 in [22], we have that the solution $p > 0$ exists if and only if the matrix $D(\bar{H}, \bar{W}) \Psi(\bar{H}, \bar{W})$ has spectral radius strictly smaller than 1, i.e. $\rho(D(\bar{H}, \bar{W}) \Psi(\bar{H}, \bar{W})) < 1$. In other words, by solving (35) we are capable of finding a feasible set of beam-formers for $H \ne \bar{H}$ in a neighborhood of $\bar{H}$ by scaling the strictly feasible beam-formers for $\bar{H}$ by the powers obtained using $p = (I - D(H, \bar{W}) \Psi(H, \bar{W}))^{-1} D(H, \bar{W})\, \sigma$, i.e. $w_k = \sqrt{p_k^\star}\, \bar{w}_k$, relying on the fact that the spectral radius $\rho(D(\bar{H}, \bar{W}) \Psi(\bar{H}, \bar{W}))$ is strictly smaller than 1. Note that the scaling can be claimed to be continuous in the channels because of the continuity of the matrices $\Psi(H, \bar{W})$ and $D(H, \bar{W})$ in the channels. This argument establishes that for a neighborhood of $\bar{H}$ that fulfills

$$\mathcal{N}(\bar{H}) \triangleq \{ H \mid \rho(D(H, \bar{W}) \Psi(H, \bar{W})) < 1 \}, \tag{36}$$

there exists a continuous scaling $p$ given by (35) providing feasible beam-formers. This implies that given a sequence of channels $H_n \to \bar{H}$ and $\bar{W} \in \mathcal{W}(\bar{H})$, there exist an $m$ and a sequence $\{W_n\}$ such that $W_n \in \mathcal{W}(H_n)$ for all $n \ge m$. In particular, given a feasible beam-former $\bar{W} \in \mathcal{W}(\bar{H})$ we can generate, by using (35), feasible beam-formers in the neighborhood of $\bar{H}$, thus concluding the proof.

By invoking Theorem 2 the continuity of the function $W^\star(H)$ follows. We now introduce and prove the following lemma regarding the continuity of the dual multipliers.

Lemma 4.

Given the set of $\gamma_k$-feasible channels $\mathcal{H}$ for a specific user-base station allocation and SINR requirements, the optimal dual multipliers $\{\lambda_k^\star\}$, $k = 1, \ldots, K$, are continuous functions of the channel $H$.

Proof. As shown in [3] and reviewed in [23], the dual problem of (3) can be expressed as

$$\max_{\{\lambda_k\}} \sum_{b=1}^{B} \sum_{k \in \mathcal{U}(b)} \lambda_k \sigma_k^2 \tag{37a}$$
$$\text{s.t. } I + \sum_{j=1}^{K} \lambda_j h_{b(k)j} h_{b(k)j}^H \succeq \left( 1 + \frac{1}{\gamma_k} \right) \lambda_k h_{b(k)k} h_{b(k)k}^H, \tag{37b}$$
$$k = 1, \ldots, K. \tag{37c}$$

The dual problem in (37) has been shown to be equivalent to solving the following up-link beam-forming problem, yielding the down-link up-link duality result in [3]:

$$\min_{\{\lambda_k\}} \sum_{k=1}^{K} \lambda_k \sigma_k^2 \tag{38a}$$
$$\text{s.t. } \eta_k \ge \gamma_k, \tag{38b}$$

where

$$\eta_k = \max_{w_k^d : \|w_k^d\| = 1} \frac{\lambda_k |w_k^{dH} h_{b(k)k}|^2}{\sum_{j \ne k} \lambda_j |w_k^{dH} h_{b(k)j}|^2 + \sigma_k^2 \|w_k^d\|^2}$$

and where $w_k^d$ denote the up-link beam-formers.

Footnote: Note that [3] contains a technical error in the proof of Theorem 1 between equations (12) and (13). However, this does not compromise the validity of the result. An alternative proof can be provided by using the fact that, given a symmetric positive semidefinite matrix $A \in \mathbb{C}^{n \times n}$ and an $n \times 1$ vector $b$ in the row space of $A$, $A \succeq b b^H$ iff $b^H A^\dagger b \le 1$.

It can be shown [3, Theorem 1] that, given a fixed set of values $\{\lambda_j\}$, the up-link beam-formers maximizing $\eta_k$ follow the expression

$$w_k^d(\{\lambda_j\}) = \frac{\left( \sum_j \lambda_j h_{b(k)j} h_{b(k)j}^H + \sigma_k^2 I \right)^{-1} h_{b(k)k}}{\left\| \left( \sum_j \lambda_j h_{b(k)j} h_{b(k)j}^H + \sigma_k^2 I \right)^{-1} h_{b(k)k} \right\|},$$

and that $w_k^d$ are a scaled version of the optimal down-link beam-formers for the optimal multipliers $\lambda_k^\star$, i.e. $w_k^\star = \sqrt{p_k^\star}\, w_k^d(\{\lambda_j^\star\})$, where $p_k^\star$ is the optimal power allocation to user $k$. This establishes, by Lemma 3, uniqueness and continuity in the channels of the normalized up-link beam-formers, i.e. the mapping

$$\mathcal{W}^{d\star}(H) \triangleq \left\{ w_k^{d\star} \;\middle|\; w_k^{d\star} = \frac{\left( \sum_j \lambda_j^\star h_{b(k)j} h_{b(k)j}^H + \sigma_k^2 I \right)^{-1} h_{b(k)k}}{\left\| \left( \sum_j \lambda_j^\star h_{b(k)j} h_{b(k)j}^H + \sigma_k^2 I \right)^{-1} h_{b(k)k} \right\|} \right\}, \tag{39}$$

where $\{\lambda_k^\star\}$ are the optimal solutions to (37), is closed and open for all $H \in \mathcal{H}$. This establishes the continuity of the function $W^{\star d}(H)$. By uplink-downlink duality [3] the optimal dual multipliers $\{\lambda_j^\star\}$ correspond to the scaled powers of the up-link beam-formers, i.e. $\|w_k^{\star d}\|^2 = \lambda_k \sigma_k^2$, which by the continuity of $w_k^{\star d}$ establishes the continuity of $\lambda_k^\star$ and proves the Lemma.

The continuity of the optimal primal and dual variables of the centralized problem in (3) has now been proven. It is possible to show not only continuity, but differentiability of the function mapping channels to the unique optimal primal-dual solution by writing the problem as a standard SOCP and using the results in [24]. This yields Lipschitz continuity within a compact set of channels, since it would allow bounding the largest eigenvalue of the Jacobian matrix. However, this result is more involved and not required for the proofs at hand.

We now proceed to prove that (A3) holds. For this purpose, we need to prove continuity of the elements $w^{[i]}$, $t_{mk}^{[i]}$ resulting at each iteration $i$ of ADMM with respect to the parameters fed to the algorithm at iteration $i$, i.e. the consistency variables $\tau^{[i-1]}$ and duals $\nu^{[i-1]}$ resulting from iterate $i-1$, and with respect to the channels. This will imply continuity of all the ADMM parameters $\tau$, $\nu$, since the rest of the steps are updated linearly with $t_{mk}^{[i]}$. We proceed, using the same methodology as in the centralized case, to show that the optimization problem in Algorithm 1 yields continuous primal solutions.

Lemma 5.

Given the set of $\gamma_k$-feasible channels $\mathcal{H}$ for a specific user-base station allocation and SINR requirements, and the parameters $\tau^{[i-1]}$ and $\nu^{[i-1]}$ provided for iterate $i$ in Algorithm 1, the ADMM parameters provided for iterate $i$, i.e. $\tau^{[i]}$, $\nu^{[i]}$, and the obtained primal solution $w^{[i]}$, $t^{[i]}$ are continuous functions of $H$, of $\tau^{[i-1]}$ and of $\nu^{[i-1]}$.

Proof. Let us rewrite the objective function in (8a), in the sense that it yields an equivalent problem, as $\sum_{k \in \mathcal{U}(b)} \|w_k\|^2 + \rho \left\| t_b - E_b \tau^{[i-1]} + \frac{\nu_b}{\rho} \right\|^2$. For simplicity define $y_b^{[i-1]} \triangleq E_b \tau^{[i-1]} - \frac{\nu_b^{[i-1]}}{\rho}$. Note now that the interference constraints (8c) might not always hold tightly, as opposed to the SINR constraints (8b). This is due to the fact that $t^{(b)}_{bj}$, appearing in (8c), is selected to fulfill the constraint, but at the same time to be as close as possible to the corresponding value in $y_b$ so as to minimize the objective. Hence, given a set of values $(\{w_k\}, \{t^{(b)}_{mk}\}_{m \ne b, k \in \mathcal{U}(b)})$, corresponding to the beam-formers and suffered interference values, fulfilling the SINR constraints, the problem will always be feasible since the caused interference values $\{t^{(b)}_{bj}\}_{j \notin \mathcal{U}(b)}$ can always be selected accordingly. For this reason, the coming analysis will prioritize the fulfillment of the SINR constraints (8b) and deal with the interference constraints (8c) later on.

The conditions required to establish continuity are uniqueness of the primal solution, and that the feasible set

$$\mathcal{WT}(H) \triangleq \{ (W, t) \mid \text{(8b) and (8c) hold } \forall k, j, b \}, \tag{40}$$

corresponding to the feasible sets of beam-formers and estimated interference values $t_b$, is both closed and open for all $H$ and $y \triangleq E\tau - \frac{\nu}{\rho}$. Note that the feasible set does not explicitly depend on the parameter $y$, since the feasibility of a beam-former will not be affected by $y$.

The proof that $\mathcal{WT}(H)$ is closed is analogous to the centralized case (proof of Lemma 3) and will therefore be omitted.
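The scaling argument from the proof of Lemma 3, which the remainder of this proof reuses, can be sketched numerically. The code below is a minimal illustration assuming the gain matrix $G$ with entries $G_{kj}$ (received power at user $k$ from the beam intended for user $j$) has already been computed for the new channel and the fixed beam-formers. The function names `power_scaling` and `sinr_after_scaling` are hypothetical, and `noise_var` holds the noise variances $\sigma_k^2$. It solves the linear system (35) after checking the spectral radius condition.

```python
import numpy as np

def power_scaling(G, gamma, noise_var):
    """Solve (I - D Psi) p = D sigma for the per-user power scalings p."""
    G = np.asarray(G, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    D = np.diag(gamma / np.diag(G))      # target SINR over own-beam gain
    Psi = G - np.diag(np.diag(G))        # cross gains, zero diagonal as in (33)
    M = D @ Psi
    spectral_radius = np.max(np.abs(np.linalg.eigvals(M)))
    if spectral_radius >= 1.0:
        raise ValueError("no positive power scaling exists for these gains")
    return np.linalg.solve(np.eye(G.shape[0]) - M, D @ noise_var)

def sinr_after_scaling(G, p, noise_var):
    """SINR of every user when beam j is scaled by sqrt(p[j])."""
    G = np.asarray(G, dtype=float)
    signal = p * np.diag(G)
    interference = G @ p - signal
    return signal / (interference + np.asarray(noise_var, dtype=float))

if __name__ == "__main__":
    G = [[4.0, 0.1, 0.2],
         [0.3, 5.0, 0.1],
         [0.2, 0.2, 3.0]]
    p = power_scaling(G, gamma=[2.0, 2.0, 2.0], noise_var=[1.0, 1.0, 1.0])
    print("powers:", p)
    print("SINRs :", sinr_after_scaling(G, p, [1.0, 1.0, 1.0]))
```

Because the solution depends on $G$ only through a matrix inverse that exists whenever the spectral radius is below one, small changes in the gains produce small changes in `p`, which is the continuity property used to show openness.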
Uniqueness of the primal solution follows from the strong convexity of the objective function in (8a). The proof of $\mathcal{WT}(H)$ being open is very similar, but provides some insight into the problem and will therefore be included. Given a set of channels $\bar{H}$ and parameters $\bar{y}$, assume each base station has performed an iteration of the ADMM algorithm and found the corresponding optimal solutions. We will then have that all SINR constraints (8b) hold tightly, while all interference constraints (8c) will be either not active, weakly active, or active depending on the values in $\bar{y}$. Analogously to before, given an optimal point, the SINR constraints, corresponding to all users and therefore all base stations, can be expressed as $(I - D(H, W) \Psi(H, W)) \mathbf{1} = D(H, W)\, \eta$, where in this case

$$\eta \triangleq \left( \sum_{m \ne b} \left( t^{(m)[i]}_{m j_1} \right)^2 + \sigma_{j_1}^2, \ \ldots, \ \sum_{m \ne b} \left( t^{(m)[i]}_{m j_{|\mathcal{U}(b)|}} \right)^2 + \sigma_{j_{|\mathcal{U}(b)|}}^2 \right)^T,$$

where $j_k \in \mathcal{U}(b)$. Given a second set of channels and parameters $H$ and $y$, if the optimal set of beam-formers and interference levels corresponding to $\bar{H}$ and $\bar{y}$ were used, the SINR constraints (8b) may again not be fulfilled. We circumvent this in the same way as before, which implies that there exists a scaling $p$ that is continuous in the channel and allows us to produce a feasible set of beam-formers for $H$ based on the optimal beam-formers for $\bar{H}$. Note, however, that with this new scaling, if the interference values are left untouched, the interference constraints (8c) might not be fulfilled.
A simple way of solving this problem is to define the new interference values as $t^{(b)2}_{bj} = \sqrt{\max_j (p_j)}\; t^{(b)[i]2}_{bj}$. From here, the proof is analogous to that of the centralized case.

In this case, the problem can also be re-written as a standard SOCP and degeneracy conditions can be studied. However, one of the conditions required to show continuity and differentiability of the function that maps the channels to the optimal values is strict complementarity, which is not fulfilled in this case when the interference constraints (8c) are weakly active. In these cases differentiability of the mapping to the primal-dual optimal solution is lost but, as proven, continuity is kept.

We have therefore proven, by showing that all point to set mappings representing the feasible sets are open and closed, using the fact that the optimal solutions are unique and invoking Theorem 2, that all assumptions required for the tracking abilities of ADMM when deprived of strong convexity in (4) hold.

V. NUMERICAL EXPERIMENTS

This section provides numerical experiments to demonstrate the performance of the proposed dynamic beam-forming technique. In the simulations the parameter $\gamma_k$ is set to 10 for all users and the plotted SINR (linear) corresponds to the average SINR achieved by the users using the beam-formers obtained after a single iteration. The initial channel vectors (in $H^{[0]}$) and the innovation channels ($H_{\mathrm{inn}}^{[i]}$) are generated following $h_{bk} \sim \mathcal{CN}(0, I)$, i.e. they are independent circularly symmetric complex Gaussian random vectors with unit variance. A track is then generated as $H^{[i]} = \sqrt{\zeta}\, H_{\mathrm{inn}}^{[i]} + \sqrt{1 - \zeta}\, H^{[i-1]}$; however, this method might lead to channels that do not have a feasible solution for the required SINRs. In order to avoid tracking infeasible solutions, each channel is checked for feasibility prior to feeding it to Algorithm 1. In case the channel does not allow for a feasible solution, it is discarded and replaced by a feasible channel generated following the same innovation equation. Even though this is not an appropriate choice when modeling the dynamics of a wireless channel, it allows us to illustrate the tracking ability of the algorithm for this specific problem while keeping the model simple. Note also that this update rule does not guarantee that a bound $\|H^{[i]} - H^{[i-1]}\| \le \epsilon$ is fulfilled. This is due to the fact that the innovation $H_{\mathrm{inn}}^{[i]}$ can take arbitrarily large values. However, it will hold true that $E\{\|H^{[i]} - H^{[i-1]}\|\} \le \epsilon$.

The considered system consists of 2 base stations equipped with 4 antennas serving 2 users each. In all cases, ADMM is initialized with $\tau^{[0]} = 0$ and $\nu^{[0]} = 0$. If ADMM were to be used with a very large penalty parameter $\rho$, the solution would deviate only very slowly from the zero-forcing solution, since we would be enforcing, initially, that the algorithm does not deviate from it. In figures 1, 2 and 3 the dynamic behaviour of the algorithm is illustrated for penalty parameters $\rho = 1$, $\rho = 50$ and $\rho = 1000$, respectively.
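The channel track described above can be sketched as follows. This is a minimal illustration under stated assumptions: the array layout `(base stations, users, antennas)`, the helper names `complex_gaussian` and `channel_track`, and the example value of $\zeta$ are choices made here, not taken from the experiments. The feasibility check that discards infeasible channels is deliberately omitted, since it requires solving the beam-forming problem itself.

```python
import numpy as np

def complex_gaussian(shape, rng):
    """Draw i.i.d. circularly symmetric complex Gaussian entries, unit variance."""
    real = rng.standard_normal(shape)
    imag = rng.standard_normal(shape)
    return (real + 1j * imag) / np.sqrt(2.0)

def channel_track(num_channels, shape, zeta, rng):
    """Generate H[i] = sqrt(zeta) * H_inn[i] + sqrt(1 - zeta) * H[i-1]."""
    H = complex_gaussian(shape, rng)      # initial channel H[0]
    track = [H]
    for _ in range(num_channels - 1):
        innovation = complex_gaussian(shape, rng)
        H = np.sqrt(zeta) * innovation + np.sqrt(1.0 - zeta) * H
        track.append(H)
    return np.stack(track)                # shape: (num_channels, *shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 2 base stations, 4 users in total, 4 antennas per base station.
    track = channel_track(50, (2, 4, 4), zeta=0.1, rng=rng)
    steps = np.linalg.norm((track[1:] - track[:-1]).reshape(49, -1), axis=1)
    print("mean channel change per step:", steps.mean())
```

The update keeps the per-entry variance at one for any $\zeta \in [0, 1]$, and smaller $\zeta$ yields smaller average steps, although individual steps remain unbounded because the innovation is Gaussian.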
[Fig. 1. Total transmit power (optimal and dynamic) and average user SINR (linear) versus channel index for a single track of 50 channels, $\rho = 1$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$.]

It can be seen that the algorithm is in general capable of providing a set of beam-formers which use a similar total power as in the optimal case. We can also see that, even though the solution is not always feasible, the achieved SINR levels are not far from 10 ($\gamma_k$). Additionally, as intuitively expected, the fulfillment of the SINR constraints is better as $\rho$ increases. This is due to the fact that we are assigning more weight to consensus among interference levels by selecting a larger $\rho$. Note, however, that by selecting $\rho$ we do not only select how important it is for us that the base stations are in agreement, but also the step size of the sub-gradient step in charge of maximizing the dual problem. Hence, a large value of $\rho$ implies a large step size, which might make convergence slow. The same can happen when selecting a very small $\rho$. In order to illustrate this clearly we provide figures 4, 5 and 6, where 10000 independent tracks consisting of 50 channels are generated and averaged, with $\rho = 1$, $\rho = 50$ and $\rho = 1000$, respectively. In this case the tracks are averaged, and for the SINR we also provide a one standard deviation shift from the average SINR in all cases. Observe that in figure 6 the dynamic solution yields, on average, powers superior to optimal even when the problem is not feasible, while in figures 4 and 5 the dynamic solution approaches the optimal from the opposite side; in other words, it is below the optimal solution. This is due to the selected initial values. A relatively small $\rho$ does not penalize the algorithm for deviating from the initial value, 0, and hence ADMM is free to select a power-minimizing solution.
On the other hand, when the parameter $\rho$ is larger, the solution provided by ADMM will be more similar to a zero-forcing solution, providing a feasible solution earlier but approaching the optimal value from above.

As mentioned earlier, given a very small or very large $\rho$ convergence is slow; however, when the step size is large, the SINR values are very close to feasibility. As in the static case, optimal parameter selection for $\rho$ is not known [18] except for specific cases [25]. In [4] it is experimentally shown that penalty parameters related to the channels provide quicker convergence. This could also be done in order to improve convergence of the ADMM algorithm, potentially improving the tracking ability. However, in order to normalize $\rho$ with respect to the problem's data we would need to centralize the CSI, breaking the distributed nature of the algorithm.

VI. CONCLUSIONS

This paper shows that ADMM can be used to dynamically, and in a distributed manner, follow the set of optimal beam-formers given that the channel varies slowly enough. This is done even though the strong convexity assumption is broken when the problem is written in a ready-to-distribute manner. We have presented a novel approach to show the tracking ability of an algorithm that does not rely on an explicit convergence rate and therefore allows us to relax the strong convexity requirement. In particular, the strong convexity requirement is replaced by continuity requirements on the optimal point with respect to the problem's parameters. Additionally, some insights regarding the effect of the step size on the algorithm's tracking ability are provided.

[Fig. 2. Total transmit power (optimal and dynamic) and average user SINR (linear) versus channel index for a single track of 50 channels, $\rho = 50$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$. The plot below illustrates each of the users' perceived SINR.]

[Fig. 3. Total transmit power (optimal and dynamic) and average user SINR (linear) versus channel index for a single track of 50 channels, $\rho = 1000$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$. The plot below illustrates each of the users' perceived SINR.]

APPENDIX A

Proof.

By writing the KKT conditions of the problem in (7) it can be shown that $E^T \nu^\star = 0$. Additionally, provided that $\nu^{[0]}$ is initialized fulfilling $E^T \nu^{[0]} = 0$, we can guarantee that each ADMM iterate, regardless of the channel, will satisfy $E^T \nu^{[i]} = 0$. This is intuitively sound, since $E^T \nu = 0$ implies that there is a pair of (estimated) dual multipliers that have the same absolute value but opposite sign, which will be associated with the copies of the same interference values present at two base stations at a time. This can also be thought of as in [10], where the dual variables associated with a consistency constraint are shown to always be 0 when summed over the network. Given this fact, we have that

$$\left\| E(\tau^{[i]\star} - \tau^{[i]}) + \left( \frac{\nu^{[i]}}{\rho} - \frac{\nu^{[i]\star}}{\rho} \right) \right\|^2 = \| E(\tau^{[i]\star} - \tau^{[i]}) \|^2 + \frac{1}{\rho^2} \| \nu^{[i]} - \nu^{[i]\star} \|^2,$$

since the cross product $\left( \frac{\nu^{[i]}}{\rho} - \frac{\nu^{[i]\star}}{\rho} \right)^T \left( E(\tau^{[i]\star} - \tau^{[i]}) \right) = 0$. Note that this expression is nothing but the Lyapunov function scaled by $1/\rho$. Hence we have that

$$\left\| E\tau^{[i]\star} - \frac{\nu^{[i]\star}}{\rho} - \left( E\tau^{[i]} - \frac{\nu^{[i]}}{\rho} \right) \right\|^2 \le \frac{\mu_l}{\rho}.$$

In the sequel, we equivalently rewrite (8a) as $\sum_{j \in \mathcal{U}(b)} \|w_j\|^2 + \rho \left\| t_b - E_b \tau^{[i-1]} + \frac{\nu_b^{[i-1]}}{\rho} \right\|^2$ in order to use the bound just derived. Note that since the base stations do not share any variable (they share copies), solving each of the problems in (8) locally at each of the base stations is equivalent to solving a problem where the feasible set is the Cartesian product of the feasible sets and the objective function is nothing more than the sum of the objective functions. This leads to the following optimization problem:

$$\min_{w, t} \ \|w\|^2 + \rho \left\| t - E\tau^{[i-1]} + \frac{\nu^{[i-1]}}{\rho} \right\|^2 \tag{41a}$$
$$\text{s.t. } (w, t) \in \mathcal{WT}(H^{[i]}), \tag{41b}$$

where $\mathcal{WT}(H)$, defined in (40), denotes the feasible set for all beam-formers and interference estimates, i.e. the constraints of all base stations in (8b) and (8c).

[Fig. 4. Total average transmit power (optimal and dynamic) and average user SINR (linear, with one standard deviation band) for 10000 tracks of 50 channels, $\rho = 1$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$.]

[Fig. 5. Total average transmit power (optimal and dynamic) and average user SINR (linear, with one standard deviation band) for 10000 tracks of 50 channels, $\rho = 50$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$.]

[Fig. 6. Total average transmit power (optimal and dynamic) and average user SINR (linear, with one standard deviation band) for 10000 tracks of 50 channels, $\rho = 1000$, $\gamma = 10$, $N_t = 4$, $N_u = 4$, $N_b = 2$.]
Define for the sake of simplicity $y^{[i-1]} \triangleq E\tau^{[i-1]} - \frac{\nu^{[i-1]}}{\rho}$. Note that given the optimal set of dual multipliers and consistency variables the problem in (41) yields the optimal solution to (7), and that the feasible set is only dependent on the channel $H^{[i]}$. Define the optimal parameter $y^{[i]\star} \triangleq E\tau^\star(H^{[i]}) - \frac{\nu^{[i]\star}(H^{[i]})}{\rho}$. Then, the objective function can be equivalently replaced by $\|w\|^2 + \rho \|t\|^2 + \rho y^T t = f(w) + g(t) + h(t, y)$, where $f(w) = \|w\|^2$, $g(t) = \rho \|t\|^2$ and finally $h(t, y) = \rho y^T t$. Given the parameters $y^{[i-1]}$, which is a concatenation of the $y_b^{[i-1]}$ defined earlier as $E_b \tau^{[i-1]} - \frac{\nu_b^{[i-1]}}{\rho}$, we have that

$$\nabla_w f(w^{[i]})^T (w^{[i]\star} - w^{[i]}) + \nabla_t g(t^{[i]})^T (t^{[i]\star} - t^{[i]}) + \nabla_t h(t^{[i]}, y^{[i-1]})^T (t^{[i]\star} - t^{[i]}) \ge 0, \tag{42}$$

where $(w^{[i]}, t^{[i]})$ is optimal given $y^{[i-1]}$ and $(w^{[i]\star}, t^{[i]\star})$ is optimal given $y^{[i]\star}$. We also have

$$\nabla_w f(w^{[i]\star})^T (w^{[i]} - w^{[i]\star}) + \nabla_t g(t^{[i]\star})^T (t^{[i]} - t^{[i]\star}) + \nabla_t h(t^{[i]\star}, y^{[i]\star})^T (t^{[i]} - t^{[i]\star}) \ge 0. \tag{43}$$

By adding (42) and (43), and using the strong convexity of $f$ and $g$, we obtain

$$(y^{[i]\star} - y^{[i-1]})^T (t^{[i]\star} - t^{[i]}) \ge \|w^{[i]} - w^{[i]\star}\|^2 + \|t^{[i]} - t^{[i]\star}\|^2. \tag{44}$$

In turn, the first term can be upper bounded by

$$\|t^{[i]\star} - t^{[i]}\| \, \|y^{[i]\star} - y^{[i-1]}\| \le \sqrt{c} \left( \|t^{[i]\star} - E\tau^{[i]\star}\| + \|E(\tau^{[i]\star} - \tau^{[i]})\| + \|t^{[i]} - E\tau^{[i]}\| \right), \tag{45}$$

where $c$ represents the bound on the Lyapunov function as in (30). Note that the first term on the RHS of (45) is 0 since we are dealing with optimal points. Additionally, the second term can again be bounded by $\sqrt{c}$. The third term is a scaled version of the primal residual and can also be bounded by the Lyapunov function, since one cannot perform a decrease larger than its current value.
Hence, we conclude that

$$\|w^{[i]} - w^{[i]\star}\|^2 + \|t^{[i]} - t^{[i]\star}\|^2 \le \left( \frac{\ldots}{\sqrt{\rho}} \right) c, \tag{46}$$

for $i \to \infty$ and hence, given that $w^{[i]}$ is a vectorized version of $W^{[i]}$, we have

$$\limsup_{i \to \infty} \|W^{[i]} - W^{[i]\star}\| \le \left( \frac{\ldots}{\rho} \right) c. \tag{47}$$

APPENDIX B

Proof.

After iteration $i$, using the corresponding channels $H^{[i]}$, each base station has found a set of beam-formers and local copies of interference values $t^{(b)}_{mk}$ that fulfill the SINR constraints tightly. However, since ADMM does not guarantee primal feasibility before convergence, the local interference estimate might not match the perceived interference when the obtained beam-formers $W^{[i]}$ are used. That is, different base stations may disagree on how much they are interfering with each other, and hence the interfering base station will cause more interference than predicted by the base station whose user is suffering the interference. We therefore aim to find the worst-case perceived SINR. The proof will be performed for user $k$ associated with base station $b$. In particular, base station $b$ has performed an ADMM step yielding beam-formers and interference values such that

$$\frac{|h^H_{b(k)k} w_k|^2}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |h^H_{b(k)k} w_i|^2 + \sum_{m \neq b} \left( t^{(b)}_{mk} \right)^2 + \sigma_k^2} = \gamma_k. \tag{48}$$

However, the perceived SINR satisfies

$$\frac{|h^H_{b(k)k} w_k|^2}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |h^H_{b(k)k} w_i|^2 + \sum_{m \neq b} \left( t^{(m)}_{mk} \right)^2 + \sigma_k^2} \ge \gamma_k - \epsilon_k, \tag{49}$$

where $\epsilon_k$ is the loss of SINR at user $k$ and is the quantity we wish to upper bound. For notational simplicity we define $t'_{bk}$ as the vector whose components are the interference estimates of base station $b$ appearing in (48), and analogously $t'_{mk}$ as the vector of interference values appearing in (49), implying that $\|t'_{bk}\|^2 = \sum_{m \neq b} \left( t^{(b)}_{mk} \right)^2$ and $\|t'_{mk}\|^2 = \sum_{m \neq b} \left( t^{(m)}_{mk} \right)^2$.

By writing the difference between (48) and (49) and simplifying we obtain

$$\frac{\gamma_k \left( \|t'_{mk}\|^2 - \|t'_{bk}\|^2 \right)}{\sum_{i \in \mathcal{U}(b(k)) \setminus k} |h^H_{b(k)k} w_i|^2 + \|t'_{mk}\|^2 + \sigma_k^2} \tag{50a}$$

$$\le \frac{\gamma_k \left( \|t'_{mk}\|^2 - \|t'_{bk}\|^2 \right)}{\|t'_{mk}\|^2 + \sigma_k^2}, \tag{50b}$$

where we have used that $\|t'_{mk}\| \ge \|t'_{bk}\|$, since we are interested in bounding the worst case scenario.
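The simplification leading from (48) and (49) to (50a) and (50b) can be illustrated with a small worked example. This is a sketch with made-up numbers, not values from the paper: `intra` stands for the intra-cell interference power, `est` for $\|t'_{bk}\|^2$, `perc` for $\|t'_{mk}\|^2$, and `noise` for $\sigma_k^2$; all of these names and values are hypothetical.

```python
def sinr(signal, intra, inter, noise):
    """SINR with intra-cell interference, inter-cell interference and noise power."""
    return signal / (intra + inter + noise)

gamma = 2.0    # SINR target gamma_k (illustrative value)
intra = 0.3    # sum over i of |h^H w_i|^2 within the cell
noise = 0.1    # sigma_k^2
est = 0.05     # ||t'_bk||^2: inter-cell interference estimated by base station b
perc = 0.20    # ||t'_mk||^2: inter-cell interference actually perceived

# Choose the useful signal power so that (48) holds with equality.
signal = gamma * (intra + est + noise)

perceived = sinr(signal, intra, perc, noise)
loss = gamma - perceived                                      # SINR loss at user k

closed_form = gamma * (perc - est) / (intra + perc + noise)   # expression (50a)
relaxed = gamma * (perc - est) / (perc + noise)               # expression (50b)

print(abs(loss - closed_form) < 1e-12, closed_form <= relaxed)
```

The loss matches (50a), and dropping the nonnegative intra-cell term from the denominator can only increase the value, which gives (50b) whenever the perceived interference is at least the estimated one.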
In particular, the worst perceived interference for a specific user will occur when the user is expected not to be interfered with at all, i.e. to be zero-forced by the other base stations, but is nevertheless interfered with. Recall now that the primal residual, i.e. $\|t - E\tau\|$, acts as a lower bound on the decrease of the Lyapunov function (20). Hence, the primal residual cannot attain a value larger than the Lyapunov function itself. Additionally, it has been shown in Section III that (29) holds. We have also seen that when the channel changes, the Lyapunov function with no update can be upper bounded as follows:

$$V(\nu^{[i]}, \tau^{[i]}, H^{[i+1]}) \le c, \tag{51}$$

where $c$ is defined in (30). Hence, the primal residual cannot attain values larger than $c$. Consequently, $\|t'_{mk} - t'_{bk}\|^2 \le \frac{4c}{\rho}$. We then have that

$$\frac{\gamma_k \left( \|t'_{mk}\|^2 - \|t'_{bk}\|^2 \right)}{\|t'_{mk}\|^2 + \sigma_k^2} \le \frac{\gamma_k \|t'_{mk}\|^2}{\|t'_{mk}\|^2 + \sigma_k^2} \tag{52a}$$

$$\le \frac{4 \gamma_k c}{\rho \sigma_k^2 + 4c}, \tag{52b}$$

thus yielding the upper bound

$$\epsilon_k \le \frac{4 \gamma_k \epsilon}{\rho \sigma_k^2 + 4\epsilon}. \tag{53}$$

ACKNOWLEDGMENT

The authors would like to thank Prof. W. Yu for helpful discussions regarding the proof of Theorem 1 in [3].

REFERENCES

[1] K. Karakayali, G. J. Foschini, and R. A. Valenzuela, “Network Coordination for Spectrally Efficient Communications in Cellular Systems,” IEEE Wireless Commun. Mag., vol. 13, no. 4, pp. 56–61, Aug. 2006.
[2] G. J. Foschini, K. Karakayali, and R. A. Valenzuela, “Coordinating multiple antenna cellular networks to achieve enormous spectral efficiency,” IEEE Proc. Commun., vol. 153, no. 4, pp. 548–555, Aug 2006.
[3] H. Dahrouj and W. Yu, “Coordinated beamforming for the multicell multi-antenna wireless system,” IEEE Trans. on Wireless Comm., vol. 9, no. 5, pp. 1748–1759, May 2010.
[4] S. Joshi, M. Codreanu, and M. Latva-aho, “Distributed resource allocation for MISO downlink systems via the alternating direction method of multipliers,” Conference Record - Asilomar Conference on Signals, Systems and Computers, pp. 488–493, 2012.
[5] H. Pennanen, A. Tölli, and M. Latva-aho, “Decentralized Coordinated Downlink Beamforming via Primal Decomposition,” IEEE Signal Processing Letters, vol. 18, no. 11, pp. 647–650, Nov 2011.
[6] Y. F. Liu, Y. H. Dai, and Z. Q. Luo, “Coordinated beamforming for MISO interference channel: Complexity analysis and efficient algorithms,” IEEE Trans. Signal Process., vol. 59, no. 3, pp. 1142–1157, 2011.
[7] M. Bengtsson and B. Ottersten, “Optimal Downlink Beamforming Using Semidefinite Optimization,” in Proc. of 37th Annual Allerton Conference on Communication, Control and Computing, 1999, pp. 987–996.
[8] A. Wiesel, Y. C. Eldar, and S. Shamai, “Linear precoding via conic optimization for fixed MIMO receivers,” IEEE Trans. Signal Process., vol. 54, no. 1, pp. 161–176, Jan 2006.
[9] D. P. Palomar and M. Chiang, “A tutorial on decomposition methods for network utility maximization,” IEEE J. Sel. Areas in Commun., vol. 24, no. 8, pp. 1439–1451, Aug 2006.
[10] S. Boyd, L. Xiao, A. Mutapcic, and J. Mattingley, “Notes on Decomposition Methods,” Notes for EE364B, Stanford University, pp. 1–36, 2007.
[11] A. Tölli, H. Pennanen, and P. Komulainen, “Distributed coordinated multi-cell transmission based on dual decomposition,” in GLOBECOM - IEEE Global Telecommunications Conference, Nov 2009, pp. 1–6.
[12] C. Shen, T. Chang, K. Wang, Z. Qiu, and C. Chi, “Distributed Robust Multicell Coordinated Beamforming With Imperfect CSI: An ADMM Approach,” IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2988–3003, Jun 2012.
[13] Y. Huang, G. Zheng, M. Bengtsson, K. Wong, L. Yang, and B. Ottersten, “Distributed Multicell Beamforming With Limited Intercell Coordination,” IEEE Trans. Signal Process., vol. 59, no. 2, pp. 728–738, Feb 2011.
[14] Qing Ling and Alejandro Ribeiro, “Decentralized dynamic optimization through the alternating direction method of multipliers,” in . 2013, pp. 170–174, IEEE.
[15] A. Simonetto and G. Leus, “Distributed Asynchronous Time-Varying Constrained Optimization,” in Asilomar Conference on Signals, Systems and Computers, 2014, number 8, pp. 2142–2146.
[16] A. Simonetto and G. Leus, “Double Smoothing for Time-Varying Distributed Multiuser Optimization,” in , 2014, pp. 852–856.
[17] M. Hong and Z. Q. Luo, “On the linear convergence of the alternating direction method of multipliers,” Tech. Rep., Aug 2012.
[18] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, Jan 2011.
[19] E. Björnson and E. Jorswieck, Optimal Resource Allocation in Coordinated Multi-Cell Systems, vol. 9, Now Publishers, Jan 2013.
[20] W. Deng and W. Yin, “On the Global and Linear Convergence of the Generalized Alternating Direction Method of Multipliers,” Journal of Scientific Computing, 2015.
[21] A. V. Fiacco and Y. Ishizuka, “Sensitivity and stability analysis for nonlinear programming,” Annals of Operations Research, vol. 27, pp. 215–236, 1990.
[22] H. Boche and M. Schubert, “A general duality theory for uplink and downlink beamforming,” in Proceedings IEEE 56th Vehicular Technology Conference. 2002, vol. 1, pp. 87–91, IEEE.
[23] E. Björnson, M. Bengtsson, and B. Ottersten, “Optimal Multiuser Transmit Beamforming: A Difficult Problem with a Simple Solution Structure,” IEEE Signal Process. Mag., vol. 31, no. 4, pp. 142–148, 2014.
[24] F. Alizadeh and D. Goldfarb, “Second-order cone programming,” Mathematical Programming, vol. 95, no. 1, pp. 3–51, 2003.
[25] E. Ghadimi and A. Teixeira, “Optimal parameter selection for the alternating direction method of multipliers (ADMM): quadratic problems,”