On the SCALE Algorithm for Multiuser Multicarrier Power Spectrum Management
IEEE TRANS. SIGNAL PROCESSING, VOL. 60, NO. 9, PP. 4992–4998, SEP. 2012.
Tao Wang, Member, IEEE, and Luc Vandendorpe, Fellow, IEEE
Abstract—This paper studies the successive convex approximation for low complexity (SCALE) algorithm, which was proposed to address the weighted sum rate (WSR) maximized dynamic power spectrum management (DSM) problem for multiuser multicarrier systems. To this end, we first revisit the algorithm, and then present geometric interpretation and properties of the algorithm. A geometric programming (GP) implementation approach is proposed and compared with the low-complexity approach proposed previously. In particular, an analytical method is proposed to set up the default lower-bound constraints added by a GP solver. Finally, numerical experiments are used to illustrate the analysis and compare the two implementation approaches.
Index Terms—Dynamic spectrum management, power control, cochannel interference mitigation, convex optimization, geometric programming, orthogonal frequency division modulation, digital subscriber lines.
I. INTRODUCTION
Weighted sum rate (WSR) maximized dynamic power spectrum management (DSM) has lately been attracting much research interest for multiuser multicarrier systems. The optimum spectrum balancing (OSB) algorithm was first proposed based on dual decomposition [1]. When there is a large number of carriers, the global optimality of this algorithm was justified in [2], [3], by showing that the duality gap of the problem approaches zero asymptotically as the number of carriers goes to infinity. A more efficient algorithm, referred to as iterative spectrum balancing (ISB), was proposed in [2], [4]. Recently, a successive convex approximation for low complexity (SCALE) algorithm was proposed based on the idea of solving convex approximations of the original problem successively for increasingly better solutions [5], [6]. This algorithm has also proved to be very useful for addressing various resource allocation problems for interference mitigation [7]. Compared with the above existing works, this paper makes the following contributions:

• Novel geometric interpretation and properties are presented for the SCALE algorithm. Most interestingly, we show that the algorithm is asymptotically optimum, i.e., it produces power allocations with WSRs approaching the maximum value as long as a sufficiently good initialization is used, even though the produced power allocations are entrywise positive while the optimum one might contain zero entries. This property is also illustrated by numerical experiments.

• A geometric programming (GP) approach is developed to implement the SCALE algorithm. This approach reveals the fact that each convex approximate problem for the SCALE algorithm is actually a GP, thus a GP solver can be exploited to pursue its global optimum. A subtlety is that default lower-bound constraints are added in the GP solver to avoid overflow (see Appendix). For the GP based power allocation algorithms reported in the literature, the incurred loss of optimum objective value as well as how to set up these constraints were however not discussed [8], [9]. In view of this context, these aspects are studied for solving the DSM problem with the GP implementation of the SCALE algorithm. The GP approach is also compared with the low-complexity implementation approach previously proposed in [5].

The rest of this paper is organized as follows. First, the system model and DSM problem are described in Section II. In Section III, the SCALE algorithm is revisited. Then, geometric interpretation and properties are presented in Section IV. After that, the two implementation approaches are shown in Section V. In Section VI, numerical experiments are given. Finally, some conclusions and future research directions are summarized in Section VII.

T. Wang and L. Vandendorpe are with the ICTEAM Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium (email: {tao.wang, luc.vandendorpe}@uclouvain.be). The authors would like to thank the French Community for funding the project ARC SCOOP and the Fonds de la Recherche Scientifique (FNRS) in Belgium.
Notations: A vector is denoted by a lower-case bold letter, e.g., x, with its i-th entry denoted by [x]_i. A matrix is denoted by an upper-case bold letter, e.g., X, with [X]_ij denoting the entry at the i-th row and j-th column. X ≻ X′ (respectively, X ⪰ X′) means that X is entrywise strictly greater (respectively, greater) than X′. e^x and e^X represent the vector and matrix which are entrywise mapped from x and X through the exponential function, respectively. ∇_X y(X) stands for a matrix containing the entrywise derivatives of y(X) with respect to X.

II. SYSTEM MODEL AND DSM PROBLEM
Consider the scenario where K transmission links, each using N carriers, communicate simultaneously with cochannel interference. Each of the transmitters and receivers is equipped with a single antenna. Transmitter k (k = 1, ..., K) encodes its data and then emits them over all carriers to receiver k, with p_kn being the transmit power for carrier n (k = 1, ..., K, n = 1, ..., N). The channel power gain at carrier n from transmitter l to receiver k is denoted by G_kln. We assume G_kln > 0, ∀ l, k, n. It is assumed that each receiver decodes its own data by treating interference as noise, and that every coherence period in which all channels remain unchanged is sufficiently long. There exists a spectrum management center (SMC) which first obtains {G_kln | ∀ k, l, n}, then executes a DSM algorithm, and finally assigns the optimized power spectra to the transmitters for data transmission.

To facilitate analysis, we stack all power variables into the matrix P with [P]_kn = p_kn, and its n-th column is denoted by p_n. The signal-to-interference-plus-noise ratio (SINR) at carrier n of receiver k can be expressed as

  γ_kn(p_n) = G_kkn p_kn / I_kn(p_n),

where I_kn(p_n) = σ_kn + Σ_{l=1, l≠k}^{K} G_kln p_ln represents the interference-plus-noise power received at carrier n for receiver k, and σ_kn is the power of the additive white Gaussian noise (AWGN). G_n is a matrix with all diagonal entries equal to zero and [G_n]_kl = G_kln / G_kkn if k ≠ l. We make a mild assumption that ∀ n, G_n is primitive, i.e., there exists a positive integer i such that (G_n)^i ≻ 0 (see [10]).

The WSR maximized DSM problem is

  max g(P) = Σ_{k=1}^{K} Σ_{n=1}^{N} w_k log(1 + γ_kn(p_n)/Γ)    (1)
  s.t. Σ_{n=1}^{N} p_kn ≤ p̄_k, ∀ k;  p_kn ∈ [0, p̄_kn], ∀ k, n,

where Γ > 0, w_k, p̄_k, and p̄_kn represent the SINR gap between the adopted modulation and coding scheme and the one achieving channel capacity, the prescribed positive weight for receiver k's rate, the sum power available to transmitter k, and the power spectrum mask imposed on p_kn, respectively.

III. REVISIT OF THE SCALE ALGORITHM
It is difficult to find a global optimum for (1) since g(P) is neither convex nor concave in P. Initialized by a feasible P^(1), the SCALE algorithm circumvents this difficulty by solving convex approximations of (1) successively. To facilitate description, a superscript (m) put to a variable hereafter indicates that it is associated with the m-th iteration of the algorithm (m ≥ 1). Specifically, P^(m+1) is found as a global optimum to an approximation of (1), i.e.,

  max g^(m)(P) = g(P^(m)) + Σ_{k,n} [w_k γ^(m)_kn / (Γ + γ^(m)_kn)] log(γ_kn(p_n) / γ^(m)_kn)    (2)
  s.t. Σ_{n=1}^{N} p_kn ≤ p̄_k, ∀ k;  p_kn ∈ [0, p̄_kn], ∀ k, n,

where g^(m)(P) is a lower bound approximation of g(P) with tightness at P = P^(m), i.e., ∀ P, g(P) ≥ g^(m)(P) and g(P^(m)) = g^(m)(P^(m)), as proven later. Based on this property, g(P^(m+1)) ≥ g^(m)(P^(m+1)) ≥ g^(m)(P^(m)) = g(P^(m)) follows, ensuring that {g(P^(m)) | ∀ m} is an increasing sequence. Therefore, g(P^(m)) must converge as m increases.

Note that g^(m)(P) is not concave in P. Nevertheless, after replacing P with P = e^Q, where Q contains the log-power variables ([Q]_kn = q_kn), g^(m)(e^Q) is concave in Q, because log(γ_kn(q_n)) is concave in q_n = [q_1n, ..., q_Kn]^T according to Lemma 1 of [7]. This means that after the change of variables, (2) can be solved by state-of-the-art convex optimization methods. The idea behind the SCALE algorithm is illustrated in Figure 1.

Fig. 1. Illustration of the idea behind the SCALE algorithm, where C_P is the feasible set of (1): (a) the lower bound g^(m)(P) tight at P^(m) over C_P; (b) the concave reformulation g^(m)(e^Q) over C_Q.

In [5], [6], some theoretical analysis has been made to show that when g(P^(m+1)) = g(P^(m)), P^(m) satisfies the Karush-Kuhn-Tucker (KKT) conditions for (1).
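To make the surrogate in (2) concrete, here is a small pure-Python sketch for a two-user, single-carrier case; the values of `G`, `sigma`, `w`, and `Gamma` are illustrative assumptions of our own, not parameters from the paper. It evaluates g(P) and g^(m)(P) and checks numerically that the surrogate is tight at P^(m) and a global lower bound elsewhere.

```python
import math

# Illustrative parameters (assumptions, not from the paper): K = 2, N = 1.
G = [[1.0, 0.4], [0.3, 1.0]]   # G[k][l]: gain from transmitter l to receiver k
sigma = [0.1, 0.1]             # AWGN powers
w = [1.0, 1.0]                 # rate weights
Gamma = 1.0                    # SINR gap (0 dB)

def sinr(p, k):
    interf = sigma[k] + sum(G[k][l] * p[l] for l in range(len(p)) if l != k)
    return G[k][k] * p[k] / interf

def g(p):
    """WSR objective of (1), specialized to a single carrier."""
    return sum(w[k] * math.log(1.0 + sinr(p, k) / Gamma) for k in range(len(p)))

def g_m(p, p_m):
    """Lower-bound surrogate g^(m)(p) built around the point p_m, cf. (2)."""
    total = g(p_m)
    for k in range(len(p)):
        gam_m = sinr(p_m, k)
        a = w[k] * gam_m / (Gamma + gam_m)   # the coefficient a^(m)_kn
        total += a * math.log(sinr(p, k) / gam_m)
    return total

p_m = [0.5, 0.7]
assert abs(g_m(p_m, p_m) - g(p_m)) < 1e-12      # tight at p = p^(m)
for p in ([0.2, 0.9], [1.0, 0.1], [0.6, 0.6]):
    assert g_m(p, p_m) <= g(p) + 1e-12          # global lower bound
```

The two assertions are exactly the tightness and lower-bound properties that make the sequence {g(P^(m))} nondecreasing.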
In the next section, we will present geometric interpretation and more properties of the SCALE algorithm. To this end, we first revisit the derivation of g^(m)(P) as a lower bound approximation of g(P) with tightness at P = P^(m) [6]. To facilitate description, the log-SINR variable for carrier n of receiver k is defined as φ_kn, and all φ_kn, ∀ k, n, are stacked into the matrix Φ with [Φ]_kn = φ_kn. The Φ corresponding to Q is denoted by Φ(Q). [Φ(Q)]_kn is a function of q_n, denoted by φ_kn(q_n) hereafter. Note that φ_kn(q_n) = log(γ_kn(q_n)) is a concave function of q_n as said earlier. The feasible sets of Q and Φ are denoted by C_Q and C_Φ = {Φ(Q) | Q ∈ C_Q}, respectively. It is very important to note that an analytical expression for C_Φ was given in [11], with which it can be proven that C_Φ is an unbounded convex set by using the log-convexity of the spectral radius function of a nonnegative matrix.

The WSR, when expressed as f(Φ) = Σ_{k,n} w_k log(1 + e^{φ_kn}/Γ), is strictly convex in Φ. Thanks to this property, the first-order Taylor approximation of f(Φ) around Φ^(m) = Φ(Q^(m)), expressed as

  f^(m)(Φ) = f(Φ^(m)) + Σ_{k,n} [∂f(Φ^(m))/∂φ_kn] (φ_kn − φ^(m)_kn),    (3)

where φ^(m)_kn = [Φ^(m)]_kn, is a lower bound approximation of f(Φ) with tightness at Φ = Φ^(m), i.e., ∀ Φ, f(Φ) ≥ f^(m)(Φ) and f(Φ^(m)) = f^(m)(Φ^(m)). Note that g^(m)(P) = f^(m)(Φ(P)), where Φ(P) denotes the Φ corresponding to P, indicating that g^(m)(P) is indeed a lower bound of g(P) with tightness at P = P^(m).

IV. ANALYSIS OF THE SCALE ALGORITHM
A. Geometric interpretation of the algorithm over C_Φ

The derivation of g^(m)(P) inspires the fundamental insight that the SCALE algorithm actually exploits the convexity of the WSR with respect to Φ and the entrywise concavity of Φ(Q) with respect to Q, to iteratively look for increasingly better Φ ∈ C_Φ. At the m-th iteration, the first-order Taylor approximation of f(Φ) around Φ^(m) is used to construct f^(m)(Φ) as a lower bound approximation of f(Φ), by exploiting the convexity of f(Φ) with respect to Φ. Then, a Φ maximizing f^(m)(Φ) over C_Φ is found by solving (2) and assigned to Φ^(m+1), so as to ensure that f(Φ^(m+1)) ≥ f^(m)(Φ^(m+1)) ≥ f^(m)(Φ^(m)) = f(Φ^(m)). Note that this idea of formulating the WSR as a function of Φ, and then iteratively maximizing its first-order Taylor approximation, has also been used in [11]–[13].

Fig. 2. Illustration of looking for Φ^(m+1) over C_Φ.

When viewed over C_Φ, the SCALE algorithm can be interpreted in a very simple way. To this end, note that the m-th iteration is to find

  Φ^(m+1) = arg max_{Φ ∈ C_Φ} Σ_{k,n} [∂f(Φ^(m))/∂φ_kn] (φ_kn − φ^(m)_kn),    (4)

i.e., Φ^(m+1) is the Φ ∈ C_Φ with the maximum projection of Φ − Φ^(m) in the direction specified by ∇_Φ f(Φ^(m)), as illustrated in Figure 2. It can readily be shown that Φ^(m+1) must belong to C_Φ's Pareto-optimal boundary (POB), denoted by C^B_Φ, due to the fact that ∀ Φ^(m) ∈ C_Φ, ∇_Φ f(Φ^(m)) ≻ 0. Specifically, C^B_Φ consists of all Φ ∈ C_Φ for which there does not exist a Φ̃ ∈ C_Φ satisfying Φ̃ ≻ Φ [14]. Every point belonging to C^B_Φ represents a best tradeoff among φ_kn, ∀ k, n, for maximizing the sum of them weighted by certain coefficients.
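The convexity of f in Φ that this search exploits can be checked for a single (k, n) term; the values of w, Γ, and the expansion point below are illustrative assumptions, not taken from the paper. The sketch verifies that the first-order Taylor approximation in (3) is tight at the expansion point and an underestimator elsewhere.

```python
import math

# Check, for one (k, n) term with illustrative w = 2.0 and Gamma = 1.0,
# that f(phi) = w*log(1 + exp(phi)/Gamma) lies above its tangent at phi0,
# i.e., that the first-order Taylor approximation (3) is a lower bound.
w, Gamma = 2.0, 1.0

def f(phi):
    return w * math.log(1.0 + math.exp(phi) / Gamma)

def f_taylor(phi, phi0):
    # derivative of f at phi0: w * e^{phi0} / (Gamma + e^{phi0})
    slope = w * math.exp(phi0) / (Gamma + math.exp(phi0))
    return f(phi0) + slope * (phi - phi0)

phi0 = 0.3
assert abs(f_taylor(phi0, phi0) - f(phi0)) < 1e-12   # tight at phi0
for phi in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert f(phi) >= f_taylor(phi, phi0) - 1e-12     # convexity: tangent below
```

Since f is strictly convex, the tangent lies strictly below f away from φ0, which is the strict-underestimator property used in Lemma 1 below.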
Moreover, the iterative search has the following property:

Lemma 1: If Φ^(m+1) ≠ Φ^(m), then f(Φ^(m+1)) > f(Φ^(m)).

Proof: From the strict convexity of f(Φ) with respect to Φ, f^(m)(Φ) is a strict underestimator of f(Φ) except at Φ = Φ^(m). Therefore, f(Φ^(m+1)) > f^(m)(Φ^(m+1)) ≥ f^(m)(Φ^(m)) = f(Φ^(m)).

In a word, the SCALE algorithm iteratively searches over C^B_Φ to produce Φ^(m) with strictly increasing WSR as m increases. As proposed in [6], the SCALE algorithm can be generalized to maximize the WSR of rate-adaptive (RA) users penalized by the sum power consumption of fixed-margin (FM) users. In such a case, it can be shown that the SCALE algorithm can be interpreted as iteratively looking for the Φ^(m+1) which maximizes the weighted sum of all φ_kn corresponding to the RA users, penalized by the sum power consumption of the FM users, over a convex subset of C_Φ dependent on Φ^(m).

B. Properties of the algorithm at convergence
Note that a fixed point, denoted by Φ′, of the iterative operation (4) of the SCALE algorithm must satisfy

  Φ′ = arg max_{Φ ∈ C_Φ} Σ_{k,n} [∂f(Φ′)/∂φ_kn] φ_kn.    (5)

Let us collect all fixed points of the SCALE algorithm in C^S_Φ. It can readily be shown that C^S_Φ must be part of the POB of C_Φ, i.e., C^S_Φ ⊂ C^B_Φ. Moreover, every Φ′ ∈ C^S_Φ is a stationary point of f(Φ) over C_Φ, because ∀ Φ ∈ C_Φ,

  Σ_{k,n} [∂f(Φ′)/∂φ_kn] (φ′_kn − φ_kn) ≥ 0    (6)

is satisfied (see the definition of a stationary point in [15]). Since C_Φ is convex, all local maxima of f(Φ) over C_Φ must belong to C^S_Φ according to [15]. Moreover, the following theorem can be proven:

Theorem 1:
The following claims are true:
1) If f(Φ^(m+1)) = f(Φ^(m)), then Φ^(m+1) = Φ^(m) ∈ C^S_Φ holds and P^(m+1) = P^(m) satisfies the KKT conditions of (1).
2) Provided that

  f(Φ^(1)) > max_{Φ ∈ C^S_Φ − C^⋆_Φ} f(Φ),    (7)

where C^⋆_Φ denotes the set of globally optimum Φ over C_Φ, then lim_{m→+∞} f(Φ^(m)) = g^⋆, where g^⋆ is the maximum WSR for (1).

Proof:
To prove the first claim, suppose Φ^(m+1) ≠ Φ^(m). From Lemma 1, f(Φ^(m+1)) > f(Φ^(m)) follows, leading to a contradiction. Thus, Φ^(m+1) = Φ^(m) ∈ C^S_Φ. Since Φ(P) is a one-to-one mapping as shown in [11], [12], P^(m+1) is equal to P^(m), and they satisfy the KKT conditions of (1) as proved in [5].

To prove the second claim, two cases are examined. In the first case, an m satisfying f(Φ^(m)) = f(Φ^(m+1)) exists. Suppose Φ^(m) ∉ C^⋆_Φ; then Φ^(m) ∈ C^S_Φ − C^⋆_Φ follows, meaning that f(Φ^(1)) > f(Φ^(m)). This contradicts the fact that f(Φ^(m)) is strictly increasing with m, thus Φ^(m) ∈ C^⋆_Φ and lim_{m→+∞} f(Φ^(m)) = g^⋆ holds for the first case. In the second case, f(Φ^(m)) strictly increases endlessly, as illustrated by a numerical example in Section VI-A. According to the monotone convergence theorem, f(Φ^(m)) approaches the supremum of {f(Φ) | Φ ∈ C^B_Φ}, i.e., lim_{m→+∞} f(Φ^(m)) = sup_{Φ ∈ C^B_Φ} f(Φ) = g^⋆ [16]. Therefore, the second claim is true.

The first claim indicates that when f(Φ^(m+1)) = f(Φ^(m)), the SCALE algorithm reaches a fixed point in C^S_Φ, and P^(m) (respectively, Φ^(m)) will remain there for all following iterations, i.e., it never happens that f(P^(m)) remains fixed while P^(m) keeps oscillating among different values as m increases. Nevertheless, it may happen that f(P^(m)) increases strictly and endlessly, as illustrated by a numerical example in Section VI-A. Therefore, the SCALE algorithm can be terminated either when the prescribed M iterations have been executed or when P^(m+1) = P^(m) is satisfied.
According to the second claim of Theorem 1, the SCALE algorithm is asymptotically optimum, i.e., it produces P^(m) with WSR approaching g^⋆ as long as f(P^(1)) is above the threshold value on the right-hand side of (7), even though ∀ m, P^(m) is entrywise positive while the optimum P for (1) might contain zero entries. This will be illustrated by a numerical example in Section VI-A. Note that a condition of this kind is in general needed to guarantee global optimality. It was also pointed out in [11]–[13] that for the single-carrier case, the SCALE algorithm always converges to a global optimum when the optimum Φ and {w_k | ∀ k} satisfy some special conditions.

V. APPROACHES TO IMPLEMENT THE SCALE ALGORITHM
The key to implementing the SCALE algorithm is to solve the m-th iteration problem:

  max_Q Σ_{k,n} a^(m)_kn ( q_kn − log( σ_kn + Σ_{l≠k} G_kln e^{q_ln} ) )    (8)
  s.t. Σ_{n=1}^{N} e^{q_kn} ≤ p̄_k, ∀ k;  e^{q_kn} ≤ p̄_kn, ∀ k, n,

where a^(m)_kn = w_k γ^(m)_kn / (Γ + γ^(m)_kn) = w_k G_kkn p^(m)_kn / (G_kkn p^(m)_kn + Γ · I_kn(P^(m))). In the following, we first present the low-complexity approach proposed in [5], and then the GP approach to implement the SCALE algorithm. Finally, the two approaches are compared.

A. The low-complexity approach
Algorithm 1 The low-complexity implementation approach
  m = 0; ∀ k, n: a^(m)_kn = w_k, p_kn = 0;
  repeat
    repeat
      ∀ k: set µ_k = 0 if Σ_n Ω_kn(P, 0) ≤ p̄_k; otherwise search for the b_k > 0 satisfying Σ_n Ω_kn(P, b_k) = p̄_k with the bisection method, and set µ_k = b_k;
      ∀ k, n: update p_kn with Ω_kn(P, µ_k);
    until P converges or L iterations have been executed
    m = m + 1; P^(m) = P; compute a^(m)_kn, ∀ k, n;
  until m = M or P^(m) = P^(m−1)
  output P^(m) as a solution to (1).

A low-complexity approach was proposed in [5] to solve (8), with KKT-conditions-based fixed-point equations and the bisection method for updating P and the Lagrange multipliers, respectively. To facilitate description, the Lagrange multiplier for transmitter k's sum power constraint is denoted as µ_k. The KKT conditions require the P corresponding to the optimum Q and the optimum µ_k, ∀ k, to satisfy:

  ∀ k, n: p_kn = Ω_kn(P, µ_k), and ∀ k: µ_k ( Σ_n p_kn − p̄_k ) = 0,

where Ω_kn(P, µ_k) = [ a^(m)_kn / ( µ_k + Σ_{l≠k} a^(m)_ln G_lkn / I_ln(P) ) ]_0^{p̄_kn} and [x]_y^z = max{y, min{x, z}}. (The second condition holds under certain conditions shown in [17]. However, it still remains an open and challenging problem to decide which users and carriers should be shut down.) The low-complexity approach is summarized in Algorithm 1.
B. The GP approach

In fact, it can be shown that (8) is equivalent to

  min_Q log( e^{ Σ_{k,n} a^(m)_kn (t_kn − q_kn) } )
  s.t. log( Σ_n e^{q_kn} / p̄_k ) ≤ 0, ∀ k,
       log( Σ_{l≠k} G_kln e^{q_ln − t_kn} + σ_kn e^{−t_kn} ) ≤ 0, ∀ k, n,    (9)
       log( e^{q_kn} / p̄_kn ) ≤ 0, ∀ k, n,

where {t_kn | ∀ k, n} is a set of extra variables introduced to guarantee equivalence. The GP approach to implement the SCALE algorithm simply uses a GP solver to solve (9) for Q^(m+1) (see Appendix).

A subtlety deserving special attention is that an extra lower-bound constraint on every q_kn, expressed as q_kn ≥ log(ξ) where ξ can be preassigned, is added by default in the GP solver to avoid overflow. Denote the feasible set of Φ by C_Φ(ξ) after the extra constraints p_kn ≥ ξ, ∀ k, n, are added. It can be seen that after adding the extra constraints, the GP approach iteratively searches over C^B_Φ(ξ), the POB of C_Φ(ξ), to produce P^(m) with strictly increasing WSR.

The GP approach actually implements the SCALE algorithm to solve (1) with the extra constraints. It is interesting to evaluate the incurred loss of the maximum WSR for (1) due to the extra constraints. Obviously, g^⋆(ξ) ≤ g^⋆, where g^⋆(ξ) is the global optimum for (1) with the extra constraints. Note that (1) might have multiple globally optimum solutions, one of which is denoted by P^⋆ hereafter. If there exists a P^⋆ satisfying P^⋆ ⪰ ξ, then g^⋆(ξ) = g^⋆ follows. Otherwise, g^⋆(ξ) < g^⋆, meaning that a loss of the optimum WSR is incurred by the extra lower bounds. In practice, it is rarely known a priori if there exists a P^⋆ satisfying P^⋆ ⪰ ξ. To evaluate the worst-case loss of the optimum WSR, we present the following theorem:

Theorem 2:
Suppose 0 < ξ ≪ p̄_k / (N(N+1)), ∀ k. Then

  g^⋆ ≥ g^⋆(ξ) ≥ g^⋆ − 2ξNK² max_{k,l,n} ( w_l G_lkn / σ_ln ).    (10)

Proof:
To prove the claim, two cases are examined. In the first case, there exists a P^⋆ ⪰ ξ, thus (10) follows since g^⋆(ξ) = g^⋆. In the second case, any P^⋆ has at least one entry smaller than ξ, and we prove the validity of (10) as follows.

Let us first choose a P^⋆, from which we construct a P″ very close to it and feasible for (1) with the extra constraints, in two steps. In the first step, all entries in P^⋆ smaller than ξ are raised to ξ. The resulting power allocation is denoted by P′. Clearly, the total increase of every transmitter's sum power is not higher than Nξ. In the second step, all rows of P′ are examined. For the k-th row, no entry is updated if Σ_n [P′]_kn ≤ p̄_k. Otherwise, only the maximum entry, i.e., [P′]_{kn_k} with n_k = arg max_n [P′]_kn, which must satisfy [P′]_{kn_k} ≥ p̄_k/N ≫ (N+1)ξ, is reduced by Nξ. The finally produced power allocation is taken as P″.

It can be shown that ∀ k, n,

  ∂g(P^⋆)/∂p_kn = ∂C_n(P^⋆)/∂p_kn − ∂D_n(P^⋆)/∂p_kn,

where C_n(P) = Σ_l w_l log(G_lln p_ln + Γ I_ln(P)) and D_n(P) = Σ_l w_l log(Γ I_ln(P)). Therefore,

  ∂g(P^⋆)/∂p_kn ≥ −∂D_n(P^⋆)/∂p_kn = −Σ_{l≠k} w_l G_lkn / I_ln(P^⋆) ≥ −K max_{k,l,n} (w_l G_lkn / σ_ln),
  ∂g(P^⋆)/∂p_kn ≤ ∂C_n(P^⋆)/∂p_kn ≤ K max_{k,l,n} (w_l G_lkn / σ_ln).

Obviously, g^⋆ ≥ g^⋆(ξ) ≥ g(P″). Since ξ is very small, g(P″) can be computed according to the first-order Taylor approximation around P^⋆ as

  g(P″) ≈ g^⋆ + Σ_{k∈S_k} ( −δ ∂g(P^⋆)/∂p_{kn_k} + Σ_{n∈S_kn} ξ ∂g(P^⋆)/∂p_kn ),

where δ is equal to either 0 or Nξ, k belongs to S_k if the entries in P^⋆ corresponding to transmitter k are modified to get P″, and S_kn is the set of carrier indices n satisfying [P^⋆]_kn < ξ. Therefore,

  g(P″) ≥ g^⋆ + Σ_{k∈S_k} ( −δ K max_{k,l,n} (w_l G_lkn / σ_ln) − ξN K max_{k,l,n} (w_l G_lkn / σ_ln) ) ≥ g^⋆ − 2ξNK² max_{k,l,n} (w_l G_lkn / σ_ln),

meaning that the claim holds for the second case as well.

According to the above theorem, ξ should satisfy ξ ≤ ǫ ( 2NK² max_{k,l,n} w_l G_lkn / σ_ln )^{−1} to ensure that g^⋆(ξ) ≥ g^⋆ − ǫ, where ǫ > 0 is prescribed and represents the maximum tolerable loss of the optimum WSR. Algorithm 2 summarizes the GP approach to implement the SCALE algorithm, where a ξ satisfying the above condition is used to configure the GP solver. (As for t_kn, ∀ k, n, the idle constraint t_kn ≥ log(σ_kn), i.e., one which is always relaxed since the second constraint in (9) must be satisfied, can be preassigned to the GP solver.)
Algorithm 2 The GP implementation approach
  m = 0; ∀ k, n: a^(m)_kn = w_k;
  repeat
    solve (9) for Q^(m+1) by a GP solver configured with a ξ ≤ ǫ ( 2NK² max_{k,l,n} w_l G_lkn / σ_ln )^{−1};
    P^(m+1) = e^{Q^(m+1)}; m = m + 1; compute a^(m)_kn, ∀ k, n;
  until m = M or P^(m) = P^(m−1)
  output P^(m) as a solution to (1).

C. Comparison of the two approaches
As said earlier, the GP approach iteratively searches over C^B_Φ(ξ) to produce P^(m) with strictly increasing WSRs, i.e., Lemma 1 still holds. Therefore, the GP approach has guaranteed convergence as m increases. Note that C_Φ(ξ) might be nonconvex, as illustrated later. Theorem 1 also remains true when C^S_Φ and g^⋆ are replaced by C^S_Φ(ξ) and g^⋆(ξ), respectively. This means that as long as a sufficiently good initialization is used, the GP approach still produces P^(m) with WSR approaching g^⋆(ξ) asymptotically.

The low-complexity approach uses a heuristic rule to pursue the combination satisfying the KKT conditions of (8). Once converged, the inner iterations must output a global optimum of (8) as P^(m+1). If the convergence of the inner iterations could always be achieved, the low-complexity approach would be a faithful implementation of the SCALE algorithm. However, it is unclear if the inner iterations always converge as L increases, since a theoretical proof is difficult and has not been available yet. Nevertheless, numerical experiments show that convergence is always observed when L is sufficiently large for practical channel realizations [5]. In practice, it is very attractive to implement the SCALE algorithm with the low-complexity approach using a very small L. In such a case, the inner iterations might often not converge, and the low-complexity approach might exhibit very interesting behavior; e.g., the produced f(P^(m+1)) might be either smaller or even greater than the WSR of the optimum for (8), as will be illustrated later.

Note that the GP approach with a very small ξ can be regarded as a close approximation of the faithful implementation of the SCALE algorithm. Therefore, the GP approach can be used as a benchmark to evaluate the low-complexity approach. In Section VI, we will further use numerical experiments to compare the GP approach and the low-complexity approach, and show the behavior of the low-complexity approach when L varies.

VI. NUMERICAL EXPERIMENTS
A. A simple numerical example to illustrate analysis
We first consider a simple scenario with K = 2 and N = 1, because a faithful implementation of the SCALE algorithm can readily be made in this scenario. The parameters are set as Γ = 0 dB and, ∀ k, n, p̄_k = p̄_kn = G_kkn = σ_kn = 1. Since a single carrier exists, the carrier-number subscript of every variable is omitted for simplicity hereafter. The analytical expressions for C_Φ and C_Φ(ξ) can readily be derived and are thus not shown here due to space limitations.

TABLE I
THREE SETS OF CHANNEL AND WEIGHT PARAMETERS

                  G_11   G_22   G_12   G_21   w_1   w_2
  the first set   1.0    1.0    0.8    0.4    3.0   1.0
  the second set  1.0    1.0    1.8    0.4    1.8   1.0

Figure 3 shows the results when the first set of channel and weight parameters in Table I is used. For the faithful and low-complexity approaches illustrated in Figure 3.a, C^B_Φ is plotted as the line asymptotically extending to (−∞, 0) and (0, −∞), respectively, and C_Φ is the unbounded region below that line. For the GP approach, ξ = 0. is chosen, and the POB and the lower boundary of C_Φ(ξ) are plotted, with C_Φ(ξ) being the region enclosed inside. Obviously, C_Φ(ξ) ⊆ C_Φ and C^B_Φ(ξ) ⊆ C^B_Φ, respectively. C_Φ is convex, which illustrates the validity of the analysis in [11]–[13].

Fig. 3. Visualization of the iterative searches for the approaches to implement the SCALE algorithm, when the second set of channel and weight parameters is used: (a) the iterates over C^B_Φ (and C^B_Φ(ξ)) for the faithful implementation, the GP approach, and the low-complexity approach with L = 1 and L = 8; (b) the WSR with respect to (φ_1, φ_2) ∈ C^B_Φ.

Most interestingly, C_Φ(ξ) is nonconvex for this parameter set. The globally optimum Φ in C_Φ and in C_Φ(ξ) is at (φ_1 = 0, φ_2 = −∞) and (φ_1 = −., φ_2 = −.), and corresponds to g^⋆ = 2. and g^⋆(ξ) = 2., respectively. Obviously, they satisfy (10), thus illustrating the validity of Theorem 2. The faithful implementation and the low-complexity approach with L = 8 produce the same sequence of Φ^(m), approaching the globally optimum Φ asymptotically as m increases (the iterations proceed endlessly, and only the first iterations are shown for clarity). This illustrates the asymptotic optimality of the SCALE algorithm even when the optimum P contains zero entries, as discussed in Section IV. This also suggests that the inner iterations of the low-complexity approach with L = 8 always converge to an optimum solution of (8), ∀ m. The GP approach converges to the globally optimum Φ in C_Φ(ξ) after a number of iterations. On the other hand, the low-complexity approach with L = 1 produces a different sequence of solutions, suggesting its inner iterations do not always converge to the global optimum of (8).
Nevertheless, the produced Φ^(m) still approaches the globally optimum Φ asymptotically.

For the second parameter set in Table I, the results are shown in Figure 4. For this parameter set, the globally optimum Φ in C_Φ and in C_Φ(ξ) is at (φ_1 = 0, φ_2 = −∞) and (φ_1 = −., φ_2 = −.), and corresponds to g^⋆ = 1. and g^⋆(ξ) = 1., respectively. They still satisfy (10) and illustrate the validity of Theorem 2. Similar phenomena can be observed as for the first parameter set, except for those explained as follows.

Fig. 4. Visualization of the iterative searches for the approaches to implement the SCALE algorithm, when the third set of channel and weight parameters is used: (a) the iterates over C^B_Φ (and C^B_Φ(ξ)) for the faithful implementation, the GP approach, and the low-complexity approach with L = 1 and L = 8; (b) the WSR with respect to (φ_1, φ_2) ∈ C^B_Φ.

The faithful, GP, and low-complexity approaches with L = 8 all produce the same sequence of Φ^(m), and converge after a number of iterations to a fixed point (φ_1 = −., φ_2 = −.) which is locally optimum. In particular, the f(Φ^(1)) produced by all the approaches does not satisfy condition (7), indicating that such a convergence to a local optimum is indeed possible. Moreover, it is very interesting to see that for the low-complexity approach with L = 1, the produced Φ^(1) has a higher WSR than that of the other approaches, and the Φ^(m) approaches the globally optimum Φ asymptotically as m increases (only the first iterations are shown here). This suggests that the low-complexity approach using a small L, even without convergence of its inner iterations, might lead to a better solution than the GP approach.
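The K = 2, N = 1 example can be reproduced in spirit with a small pure-Python SCALE loop. The fixed-point inner update below plays the role of the low-complexity approach with a large L, and the cross-gains and weights follow our reading of Table I's first set (direct gains, noise powers, and power limits all equal to 1); the property being illustrated is the monotone increase of the WSR sequence g(P^(m)).

```python
import math

# Toy SCALE outer loop for K = 2, N = 1 (parameters as we read Table I's
# first set: G12 = 0.8, G21 = 0.4, w = (3, 1), all other parameters 1).
G = [[1.0, 0.8], [0.4, 1.0]]  # G[k][l]: gain from transmitter l to receiver k
w = [3.0, 1.0]
sigma = [1.0, 1.0]
Gamma = 1.0
p_max = [1.0, 1.0]            # sum power = mask = 1 for a single carrier

def interf(p, k):
    return sigma[k] + sum(G[k][l] * p[l] for l in range(2) if l != k)

def wsr(p):
    return sum(w[k] * math.log(1.0 + G[k][k] * p[k] / (Gamma * interf(p, k)))
               for k in range(2))

p = [0.5, 0.5]
history = [wsr(p)]
for m in range(50):                       # outer SCALE iterations
    gam = [G[k][k] * p[k] / interf(p, k) for k in range(2)]
    a = [w[k] * gam[k] / (Gamma + gam[k]) for k in range(2)]
    for _ in range(100):                  # inner fixed-point iterations
        p = [min(p_max[k],
                 a[k] / sum(a[l] * G[l][k] / interf(p, l)
                            for l in range(2) if l != k))
             for k in range(2)]
    history.append(wsr(p))

# the WSR sequence is (numerically) nondecreasing across outer iterations
assert all(history[i + 1] >= history[i] - 1e-9
           for i in range(len(history) - 1))
```

With the large weight on user 1, the iterates push p_1 to its limit while p_2 shrinks, mirroring the discussion of the optimum at (φ_1 = 0, φ_2 = −∞) being approached by strictly positive allocations.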
B. Numerical experiments for a realistic scenario

We have also conducted numerical experiments for a realistic scenario with K = 4 and N = 128. The k-th receiver is located at the coordinate (x = k, y = 10), whereas the k-th transmitter is at (x = k, y = 5) and (x = k, y = 0) if k = 1, 2 and k = 3, 4, respectively. These coordinates are in units of meters. The parameters are set as Γ = 0 dB, w_1 = w_2 = 1, w_3 = w_4 = 2, and ∀ k, n, σ_kn = − dBm, p̄_kn = p̄_k. The channel for every link is generated with the channel model explained in [18], [19]. Note that the transmitted power is attenuated by dB on average when received at a distance of meter apart. To evaluate the effectiveness of the implementation approaches for the SCALE algorithm, the ISB algorithm proposed in [2] was also implemented (note that a grid search of points was used to optimize every power variable to ensure good performance). We conducted the following experiments with Matlab v7.1 on a laptop equipped with an Intel Duo CPU of 2.53 GHz and a memory of 3 GBytes.

We have generated random realizations of all channels. When ∀ k, p̄_k = 50 dBm, we have implemented for every realization the ISB algorithm, and the GP and low-complexity approaches with different L. The GP solver gpcvx was used to solve (9) for the GP approach, and ξ = 10^− is assigned [20]. During the simulation, we find that for every channel realization, the maximum G_lkn, ∀ k, l, n, is smaller than −. According to Theorem 2, g^⋆(ξ) ≥ g^⋆ − . × 10^− follows, meaning that the worst-case loss of the optimum WSR is negligible.

The average time spent by the ISB algorithm is around seconds. The average time for the GP approach increases proportionally with M, and it is around seconds when M = 8; the one for the low-complexity approach increases proportionally with L · M, and it is around . seconds when M = 8 and L = 16. The GP approach is much faster than the ISB algorithm, because the interior-point method (IPM) used is more efficient than the dual method and exhaustive search used by the ISB algorithm. The low-complexity approach is much faster than the GP approach, indicating that the heuristic update rule used for the inner iterations of the low-complexity approach leads to a much faster speed than the IPM used by the GP solver.

Figure 5 shows the average WSR of the solutions produced by every implementation approach for the SCALE algorithm when M ≤ 8. The average WSRs corresponding to using the uniform power allocation (UPA) and the ISB algorithm, respectively, are also shown.

Fig. 5. The average WSRs (in nats/symbol, versus M) produced by every implementation approach for the SCALE algorithm (GP, and low-complexity with L = 1, 2, 4, 8, 16), and for using the UPA and ISB algorithm, respectively.
It can be seen that the SCALE algorithm implemented with every approach leads to a much greater average WSR than the UPA when M ≥ . The average WSR for the GP approach increases and becomes close to that for the ISB algorithm as M increases, which illustrates the effectiveness of the SCALE algorithm. For the low-complexity approach with a fixed L, the average WSR increases as M increases. For the low-complexity approach with a fixed M, the average WSR increases as L increases, and is close to that for the GP approach when L ≥ . Most interestingly, the low-complexity approach using L = 1 and M = 8 is a good option for implementing the SCALE algorithm in practice, since it is fast and its average WSR performance is close to that of the GP approach.

VII. CONCLUSION AND FUTURE WORK
We have presented the geometric interpretation and properties of the SCALE algorithm. A GP approach has also been developed to implement the algorithm, along with an analytical method to set up the lower-bound constraints added by a GP solver. Numerical experiments have been presented to illustrate the analysis and to compare the GP approach with the previously proposed low-complexity approach.

In the future, the following aspects can be further investigated. First, a theoretical study can be made of the convergence of the inner iterations of the low-complexity approach, as well as of the behavior of the low-complexity approach when a small inner iteration number is used. Second, we can study how to generalize the SCALE algorithm to multiple-input multiple-output systems, and compare it with other algorithms [21], [22]. Third, it is important to note that the SCALE algorithm iteratively produces increasingly better Φ by solving (2) for the globally optimum Q ∈ C_Q and then transforming it back to Φ. Another possible way is to solve (4) directly for the globally optimum Φ ∈ C_Φ, e.g., by using an analytical formulation of C_Φ. Work along this direction has been done in [11]–[13] for the single-carrier case. It is interesting to investigate how to extend those studies to the multicarrier case.

ACKNOWLEDGEMENT
The authors would like to thank Prof. Min Dong and the anonymous reviewers for their precious comments and suggestions, which significantly improved this work.

APPENDIX
A standard-form GP problem is expressed as

min_x f_0(x)
s.t. f_j(x) ≤ 1, j = 1, ..., J,  (11)
x_i ≥ 0, i = 1, ..., I,

where f_j(x), ∀j = 0, 1, ..., J, is a posynomial of x with [x]_i = x_i. Specifically, an example posynomial of x is expressed as p(x) = Σ_{a=1}^{A} u_a(x) with u_a(x) = β_a ∏_{i=1}^{I} x_i^{α_{ai}}, where β_a > 0 and α_{ai} (i = 1, ..., I) are real constants. It is very important to note that s(y) = log(p(e^y)) is convex in y. Therefore, although the standard-form GP problem in (11) is nonconvex, it can be converted, by making the logarithmic transformation from x to y satisfying x = e^y, into its equivalent convex form

min_y g_0(y) = log(f_0(e^y))  (12)
s.t. g_j(y) = log(f_j(e^y)) ≤ 0, j = 1, ..., J,

which is then solved by a GP solver, e.g., gpcvx or MOSEK, based on a state-of-the-art IPM.

An important subtlety should be noted. When every globally optimum x for (11) contains zero entries, the corresponding optimum y for (12) contains entries equal to −∞. In such a case, the IPM iteratively outputs y with g_0(y) asymptotically approaching the optimum objective value for (12), which leads to an overflow in the processor running the GP solver that implements the IPM. It is rarely known a priori whether the optimum x for (11) contains zero entries. To avoid overflow, the GP solver by default adds entrywise lower-bound constraints on x, or equivalently on y, before solving (12) (see page 3 of [20]). These constraints should be set to ensure that the optimum x for (11) with the extra constraints corresponds to an objective value within a prescribed small tolerance around the original optimum objective of (11).

REFERENCES

[1] R. Cendrillon, W. Yu, M. Moonen, J. Verlinden, and T. Bostoen, "Optimal multiuser spectrum balancing for digital subscriber lines," IEEE Trans. Commun., vol. 54, no. 5, pp. 922–933, May 2006.
[2] W. Yu and R. Lui, "Dual methods for nonconvex spectrum optimization of multicarrier systems," IEEE Trans. Commun., vol. 54, no. 7, pp. 1310–1322, July 2006.
[3] Z.-Q. Luo and S. Zhang, "Dynamic spectrum management: Complexity and duality," IEEE J. Select. Topics Signal Process., vol. 2, no. 1, pp. 57–73, Feb. 2008.
[4] R. Cendrillon and M. Moonen, "Iterative spectrum balancing for digital subscriber lines," in IEEE ICC, vol. 3, May 2005, pp. 1937–1941.
[5] J. Papandriopoulos and J. Evans, "Low-complexity distributed algorithms for spectrum balancing in multi-user DSL networks," in IEEE ICC, vol. 7, June 2006, pp. 3270–3275.
[6] ——, "SCALE: A low-complexity distributed protocol for spectrum balancing in multiuser DSL networks," IEEE Trans. Inf. Theory, vol. 55, no. 8, pp. 3711–3724, Aug. 2009.
[7] T. Wang and L. Vandendorpe, "Iterative resource allocation for maximizing weighted sum min-rate in downlink cellular OFDMA systems," IEEE Trans. Signal Process., vol. 59, no. 1, pp. 223–234, Jan. 2011.
[8] S. Boyd, S.-J. Kim, L. Vandenberghe, and A. Hassibi, "A tutorial on geometric programming," Optimization and Engineering, vol. 8, no. 1, pp. 67–127, Mar. 2007.
[9] M. Chiang, C. W. Tan, D. Palomar, D. O'Neill, and D. Julian, "Power control by geometric programming," IEEE Trans. Wireless Commun., vol. 6, no. 7, pp. 2640–2651, July 2007.
[10] R. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, 1985.
[11] C. W. Tan, S. Friedland, and S. Low, "Spectrum management in multiuser cognitive wireless networks: Optimality and algorithm," IEEE J. Select. Areas Commun., vol. 29, no. 2, pp. 421–430, Feb. 2011.
[12] ——, "Nonnegative matrix inequalities and their application to nonconvex power control optimization," SIAM J. Matrix Analysis and Applications, vol. 32, no. 3, pp. 1030–1055, Sep. 2011.
[13] C. W. Tan, M. Chiang, and R. Srikant, "Maximizing sum rate and minimizing MSE on multiuser downlink: Optimality, fast algorithms and equivalence via max-min SINR," IEEE Trans. Signal Process., vol. 59, no. 12, pp. 6127–6143, Dec. 2011.
[14] M. J. Osborne and A. Rubenstein, A Course in Game Theory. MIT Press, 1994.
[15] D. P. Bertsekas, Nonlinear Programming, 2nd edition. Athena Scientific, 2003.
[16] J. Bibby, "Axiomatisations of the average and a further generalisation of monotonic sequences," Glasgow Mathematical Journal, vol. 15, pp. 63–65, 1974.
[17] S. Hayashi and Z. Luo, "Dynamic spectrum management: when is FDMA sum rate optimal," in Proc. ICASSP, 2007, pp. 609–612.
[18] T. Wang and L. Vandendorpe, "WSR maximized resource allocation in multiple DF relays aided OFDMA downlink transmission," IEEE Trans. Signal Process., vol. 59, no. 8, pp. 3964–3976, Aug. 2011.
[19] ——, "Sum rate maximized resource allocation in multiple DF relays aided OFDM transmission," IEEE J. Sel. Areas Commun.
[20] Documentation of the GP solver gpcvx, available online at ∼boyd/ggplab/gpcvx.pdf.
[21] S. Christensen, R. Agarwal, E. Carvalho, and J. Cioffi, "Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design," IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 4792–4799, Dec. 2008.
[22] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, "An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,"
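As a sketch of the logarithmic change of variables used in (11)–(12) of the Appendix, the following Python snippet numerically checks midpoint convexity of s(y) = log p(e^y) for a small example posynomial; the coefficients β_a and exponents α_ai below are arbitrary illustrative values, not taken from the paper.

```python
import math
import random

# Example posynomial p(x) = sum_a beta_a * prod_i x_i^{alpha_ai}.
# The coefficients below are arbitrary illustrative values.
beta = [2.0, 0.5, 1.3]                          # beta_a > 0
alpha = [[1.0, -2.0], [0.5, 0.3], [-1.0, 2.5]]  # real exponents alpha_ai

def s(y):
    """s(y) = log p(e^y): a log-sum-exp of affine functions of y, hence convex."""
    return math.log(sum(b * math.exp(sum(a_i * y_i for a_i, y_i in zip(a, y)))
                        for b, a in zip(beta, alpha)))

random.seed(0)
for _ in range(1000):
    y1 = [random.uniform(-3.0, 3.0) for _ in range(2)]
    y2 = [random.uniform(-3.0, 3.0) for _ in range(2)]
    mid = [(u + v) / 2.0 for u, v in zip(y1, y2)]
    # Midpoint convexity: s((y1 + y2)/2) <= (s(y1) + s(y2))/2.
    assert s(mid) <= (s(y1) + s(y2)) / 2.0 + 1e-12
```

In contrast, p(x) itself is generally nonconvex in x, which is why GP solvers perform the minimization in the y domain.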