Multi-Group Multicast Beamforming by Superiorized Projections onto Convex Sets
Jochen Fink, Student Member, IEEE, Renato L. G. Cavalcante, Member, IEEE, and Sławomir Stańczak, Senior Member, IEEE
Technische Universität Berlin and Fraunhofer Heinrich Hertz Institute, Berlin, Germany
{jochen.fink, renato.cavalcante, slawomir.stanczak}@hhi.fraunhofer.de
This manuscript has been submitted to IEEE Transactions on Signal Processing for possible publication.
Abstract—In this paper, we propose an iterative algorithm to address the nonconvex multi-group multicast beamforming problem with quality-of-service constraints and per-antenna power constraints. We formulate a convex relaxation of the problem as a semidefinite program in a real Hilbert space, which allows us to approximate a point in the feasible set by iteratively applying a bounded perturbation resilient fixed-point mapping. Inspired by the superiorization methodology, we use this mapping as a basic algorithm, and we add in each iteration a small perturbation with the intent to reduce the objective value and the distance to nonconvex rank-constraint sets. We prove that the sequence of perturbations is bounded, so the algorithm is guaranteed to converge to a feasible point of the relaxed semidefinite program. Simulations show that the proposed approach outperforms existing algorithms in terms of both computation time and approximation gap in many cases.
Index Terms—Multicast beamforming, nonconvex optimization, semidefinite relaxation, projections onto convex sets, superiorization.
I. INTRODUCTION

Many applications in wireless networks involve multicast communication, which can be defined as the transmission of identical information to multiple receivers. One example is connected driving, where applications such as platooning can benefit from transmitting the same status or control information to a group of vehicles [1]. Another example is the transmission of audio signals for live events, where each spectator can select from a variety of audio streams. Both use cases can benefit considerably from physical layer precoders that ensure a given quality-of-service (QoS) level for the requested stream at each receiver while reusing the same time and frequency resources for all receivers. Physical layer multicasting schemes have been extensively investigated in the last two decades. The authors of [2] show that the performance of multicast transmission can be greatly improved by exploiting channel state information (CSI) at the transmitter. They consider two beamforming problems for single-group multicast beamforming: the max-min-fair (MMF) multicast beamforming problem and the QoS constrained multicast beamforming problem. While the MMF formulation aims at maximizing the lowest signal-to-noise ratio (SNR) among a group of users subject to a unit power constraint on
the beamforming vector, the objective of the QoS constrained formulation is to minimize the transmit power subject to SNR constraints for the individual users. Moreover, the authors of [2] show that the solutions to both problems are equivalent up to a scaling factor. The more general case with multiple cochannel multicast groups is considered in [3]. Unlike the single-group case, the QoS constrained and MMF versions of the multi-group multicast beamforming problem are different in the sense that a solution to one version cannot generally be obtained by scaling a solution to the other. However, algorithms for the QoS constrained formulation can be straightforwardly extended to approximate the MMF version by performing a bisection search over the target signal-to-interference-plus-noise ratio (SINR) values. In this paper, we will therefore restrict our attention to the QoS constrained formulation. The QoS-constrained multi-group multicast beamforming problem is a well-studied nonconvex quadratically constrained quadratic programming (QCQP) problem, for which various algorithmic approximations have been proposed. Existing approaches such as semidefinite relaxation with Gaussian randomization and successive convex approximation (SCA) algorithms, also known as convex-concave procedures (CCP), involve solving a sequence of convex subproblems. Solutions to these subproblems can be approximated either using off-the-shelf interior-point methods or using first-order algorithms such as the alternating direction method of multipliers (ADMM).

[Funding note: This work is supported by the Federal Ministry of Education and Research of the Federal Republic of Germany (BMBF) in the framework of the project AI4Mobile with funding number 16KIS1170K, and by the Federal Ministry for Economic Affairs and Energy of the Federal Republic of Germany (BMWi) in the framework of the project LIPS with funding number 01MD18010E. The authors alone are responsible for the content of the paper.]
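The bisection idea mentioned above can be sketched as follows; `qos_feasible` is a hypothetical oracle standing in for a solver of the QoS constrained problem at a common target SINR (illustration only, not part of the paper):

```python
def max_min_fair_sinr(qos_feasible, lo=0.0, hi=100.0, tol=1e-6):
    """Approximate the MMF value by bisection over a common target SINR.

    qos_feasible(gamma): hypothetical oracle that returns True if the
    QoS problem with target SINR gamma for every user is feasible
    within the power budget.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if qos_feasible(mid):
            lo = mid  # target achievable: try a larger common SINR
        else:
            hi = mid  # infeasible: lower the target
    return lo

# Toy oracle whose feasibility threshold is 3.7 (for illustration only).
print(max_min_fair_sinr(lambda gamma: gamma <= 3.7))
```

The invariant is that `lo` always corresponds to a feasible target and `hi` to an infeasible one, so the returned value approaches the max-min-fair SINR from below.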
While the use of interior-point methods typically results in a high computational complexity, the ADMM can require a large number of iterations to achieve a certain accuracy. Regardless of the algorithm used to approximate each subproblem, the CCP results in nested approximation loops. Terminating the inner iteration after a finite number of steps can hinder the feasibility of estimates, which is required to ensure that the CCP converges. By contrast, if we assume the singular value decomposition of a matrix to be computable, the algorithm proposed in this paper is free of nested optimization loops. (The convergence of algorithms for computing the singular value decomposition is well-studied; see, e.g., [4].) In cases where constrained minimization becomes too costly, the superiorization methodology (see, e.g., [5], [6]) constitutes a promising alternative. Whereas the goal of constrained minimization is to find a feasible point (i.e., a point satisfying all constraints) for which the objective value is minimal, superiorization typically builds upon a simple fixed-point algorithm that produces a sequence of points which provably converges to a feasible point. This fixed-point algorithm serves as the so-called basic algorithm, which is then modified by adding small perturbations in each iteration with the intent to find a feasible point with reduced (not necessarily minimal) objective value. By showing that the basic algorithm is bounded perturbation resilient, its convergence guarantee towards a feasible point can be extended to this modified algorithm, called a superiorized version of the basic algorithm. In this paper, we consider the QoS-constrained multi-group multicast beamforming problem in [3] with optional per-antenna power constraints as introduced in [7]. We propose an algorithmic approximation based on superiorization of a bounded perturbation resilient fixed point mapping.
To do so, we formulate the problem in a product Hilbert space composed of subspaces of Hermitian matrices. This allows us to approximate a feasible point of the relaxed problem with the well-known projections onto convex sets (POCS) algorithm [8], which iteratively applies a fixed-point mapping comprised of the (relaxed) projections onto each constraint set. We show that this operator is bounded perturbation resilient, which allows us to add small perturbations in each iteration with the intent to reduce the objective value and the distance to the nonconvex rank-one constraints. Simulations show that, compared to existing methods, the proposed approach can provide better approximations at a lower computational cost in many cases.

A. Preliminaries and Notation
Unless specified otherwise, lowercase letters denote scalars, lowercase letters in bold typeface denote vectors, uppercase letters in bold typeface denote matrices, and letters in calligraphic font denote sets. The sets of nonnegative integers, nonnegative real numbers, real numbers, and complex numbers are denoted by $\mathbb{N}$, $\mathbb{R}_+$, $\mathbb{R}$, and $\mathbb{C}$, respectively. The real part, imaginary part, and complex conjugate of a complex number $x \in \mathbb{C}$ are denoted by $\mathrm{Re}\{x\}$, $\mathrm{Im}\{x\}$, and $x^*$, respectively. The nonnegative part of a real number $x \in \mathbb{R}$ is denoted by $(x)_+ := \max(x, 0)$. We denote by $\mathrm{Id}$ the identity operator and by $\mathbf{I}_N$ the $N \times N$ identity matrix. The all-zero vector or matrix is denoted by $\mathbf{0}$, and the $i$th Cartesian unit vector is denoted by $\mathbf{e}_i$, where the dimension of the space will be clear from the context. The Euclidean norm of a real or complex column vector $\mathbf{x}$ is denoted by $\|\mathbf{x}\| = \sqrt{\mathbf{x}^H \mathbf{x}}$. The $i$th singular value of a matrix $\mathbf{A} \in \mathbb{C}^{N \times N}$ is denoted by $\sigma_i(\mathbf{A})$, where the singular values are ordered such that $\sigma_1(\mathbf{A}) \ge \dots \ge \sigma_N(\mathbf{A})$. For square matrices $\mathbf{A}$ we define $\mathrm{diag}(\mathbf{A})$ to be the column vector composed of the diagonal of $\mathbf{A}$, and for row or column vectors $\mathbf{a}$ we define $\mathrm{diag}(\mathbf{a})$ to be a square diagonal matrix having $\mathbf{a}$ as its diagonal. We write $\mathbf{A} \succeq \mathbf{0}$ for positive semidefinite (PSD) matrices $\mathbf{A}$. The distance between two points $\mathbf{x}, \mathbf{y} \in \mathcal{H}$ in a real Hilbert space $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ is $d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|$, where $\|\cdot\|$ is the norm induced by the inner product $\langle \cdot, \cdot \rangle$. The distance between a point $\mathbf{x} \in \mathcal{H}$ and a nonempty set $\mathcal{C} \subset \mathcal{H}$ is defined as $d(\mathbf{x}, \mathcal{C}) = \inf_{\mathbf{y} \in \mathcal{C}} \|\mathbf{x} - \mathbf{y}\|$. Following [9], we define the projection of a point $\mathbf{x} \in \mathcal{H}$ onto a nonempty subset $\mathcal{C} \subset \mathcal{H}$ as the set $\Pi_{\mathcal{C}}(\mathbf{x}) = \{\mathbf{y} \in \mathcal{C} \mid d(\mathbf{x}, \mathbf{y}) = d(\mathbf{x}, \mathcal{C})\}$, and denote by $P_{\mathcal{C}} : \mathcal{H} \to \mathcal{C}$ an arbitrary but fixed selection of $\Pi_{\mathcal{C}}$, i.e., $(\forall \mathbf{x} \in \mathcal{H})\ P_{\mathcal{C}}(\mathbf{x}) \in \Pi_{\mathcal{C}}(\mathbf{x})$.
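As a toy illustration of the selection $P_{\mathcal{C}}$ of the set-valued projection $\Pi_{\mathcal{C}}$ (Euclidean $\mathbb{R}^2$ example, assumed here for illustration only, not from the paper):

```python
import numpy as np

def proj_ball(x, r=1.0):
    """Projector onto the closed ball {y : ||y|| <= r} (convex: unique)."""
    n = np.linalg.norm(x)
    return x if n <= r else (r / n) * x

def proj_sphere(x, r=1.0):
    """A selection P_C of the projection onto the sphere {y : ||y|| = r}
    (closed but nonconvex): at x = 0 every point of the sphere is a
    nearest point, so we return an arbitrary but fixed selection."""
    n = np.linalg.norm(x)
    if n == 0.0:
        return np.array([r, 0.0])  # arbitrary but fixed choice
    return (r / n) * x

assert np.allclose(proj_ball(np.array([3.0, 4.0])), [0.6, 0.8])
assert np.allclose(proj_sphere(np.zeros(2)), [1.0, 0.0])
```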
If $\mathcal{C}$ is nonempty, closed, and convex, the set $\Pi_{\mathcal{C}}(\mathbf{x})$ is a singleton for all $\mathbf{x} \in \mathcal{H}$, so $\Pi_{\mathcal{C}}$ has a unique selection $P_{\mathcal{C}}$, which itself is called a projector. For closed nonconvex sets $\mathcal{C} \neq \emptyset$ in finite-dimensional Hilbert spaces, $\Pi_{\mathcal{C}}(\mathbf{x})$ is nonempty for all $\mathbf{x} \in \mathcal{H}$, although it is not generally a singleton. Nevertheless, we will refer to the selection $P_{\mathcal{C}}$ as the projector, as the distinction from the set-valued operator $\Pi_{\mathcal{C}}$ will always be clear. A fixed point of a mapping $T : \mathcal{H} \to \mathcal{H}$ is a point $\mathbf{x} \in \mathcal{H}$ satisfying $T(\mathbf{x}) = \mathbf{x}$. The set $\mathrm{Fix}(T) = \{\mathbf{x} \in \mathcal{H} \mid T(\mathbf{x}) = \mathbf{x}\}$ is called the fixed point set of $T$ [10]. Given two mappings $T_1, T_2 : \mathcal{H} \to \mathcal{H}$, we use the shorthand $T_1 T_2 := T_1 \circ T_2$ to denote their concatenation, which is defined by the composition $(\forall \mathbf{x} \in \mathcal{H})\ T_1 T_2(\mathbf{x}) := (T_1 \circ T_2)(\mathbf{x}) = T_1(T_2(\mathbf{x}))$. For the following statements, let $(\mathcal{H}, \langle \cdot, \cdot \rangle)$ be a real Hilbert space with induced norm $\|\cdot\|$.

Definition 1.
A mapping $T : \mathcal{H} \to \mathcal{H}$ is called nonexpansive if $(\forall \mathbf{x}, \mathbf{y} \in \mathcal{H})\ \|T(\mathbf{x}) - T(\mathbf{y})\| \le \|\mathbf{x} - \mathbf{y}\|$ [10].

Definition 2.
A mapping $T : \mathcal{H} \to \mathcal{H}$ is $\alpha$-averaged nonexpansive if there exist $\alpha \in (0, 1)$ and a nonexpansive operator $R : \mathcal{H} \to \mathcal{H}$ such that $T = (1 - \alpha)\mathrm{Id} + \alpha R$ [11, Definition 4.33].

Fact 1.
Let $T_1, \dots, T_L : \mathcal{H} \to \mathcal{H}$ be (averaged) nonexpansive mappings with at least one common fixed point. Then the composition $T_1 \cdots T_L$ is also (averaged) nonexpansive and $\mathrm{Fix}(T_1 \cdots T_L) = \bigcap_{l \in \{1, \dots, L\}} \mathrm{Fix}(T_l)$ [10, Fact 1], [12, Proposition 2.3].

Fact 2.
Let $T : \mathcal{H} \to \mathcal{H}$ be a nonexpansive mapping with
$\mathrm{Fix}(T) \neq \emptyset$. Then for any initial point $\mathbf{x}^{(0)} \in \mathcal{H}$ and $\alpha \in (0, 1)$, the sequence $(\mathbf{x}^{(n)})_{n \in \mathbb{N}} \subset \mathcal{H}$ generated by $\mathbf{x}^{(n+1)} = (1 - \alpha)\mathbf{x}^{(n)} + \alpha T(\mathbf{x}^{(n)})$ converges weakly to an unspecified point in $\mathrm{Fix}(T)$. This fact is a special case of [13, Proposition 17.10b].

II. PROBLEM STATEMENT
In Section II-A, we define the system model and state the multi-group multicast beamforming problem with QoS and per-antenna power constraints, and we reformulate it in terms of a nonconvex semidefinite program (SDP). A well-known approach to approximating solutions to such problems resorts to solving a convex relaxation: First, the original problem is relaxed and solved using, e.g., interior point methods. Subsequently, randomization techniques are applied to obtain candidate solutions to the original problem [3], [14]. However, in real-time applications, the complexity of interior point solvers becomes prohibitive, as it grows very fast with the system size (i.e., the number of users and the number of antennas). (Footnote to Fact 2: in finite-dimensional Hilbert spaces, weak convergence implies strong convergence [13].)
Therefore, in Section II-B, we formulate the problem in a real product Hilbert space composed of complex (Hermitian) matrices. This formulation makes the problem accessible by a variety of first-order algorithms with low complexity and provable convergence properties.
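As a toy illustration of the kind of first-order fixed-point iteration meant here (the Krasnosel'skii-Mann scheme of Fact 2, applied in $\mathbb{R}^2$ to the projector onto a half-space; illustrative only, not the algorithm proposed in this paper):

```python
import numpy as np

def km_iterate(T, x0, alpha=0.5, n_iter=200):
    """Krasnosel'skii-Mann iteration x_{n+1} = (1 - alpha) x_n + alpha T(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = (1.0 - alpha) * x + alpha * T(x)
    return x

# Nonexpansive map: projector onto the half-space {x : a^T x <= b}.
a, b = np.array([1.0, 1.0]), 1.0

def proj_halfspace(x):
    excess = a @ x - b
    return x if excess <= 0.0 else x - (excess / (a @ a)) * a

x_star = km_iterate(proj_halfspace, [3.0, 2.0])
assert a @ x_star <= b + 1e-9  # the limit lies in Fix(T), i.e., in the half-space
```

The half-space projection used here has the same closed form as the projections onto the SINR constraint sets derived later in the paper.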
A. System Model and Original Problem
Following the system model in [3], we consider the downlink in a network with a transmitter equipped with $N$ antenna elements, each of them represented by an element of the set $\mathcal{N} := \{1, \dots, N\}$. Each user $k \in \mathcal{K} := \{1, \dots, K\}$ is equipped with a single receive antenna. The users are grouped into $M$ disjoint multicast groups $\mathcal{G}_m \subseteq \mathcal{K}$ indexed by $m \in \mathcal{M} := \{1, \dots, M\}$, such that $\bigcup_{m=1}^{M} \mathcal{G}_m = \mathcal{K}$. Each member of a multicast group $\mathcal{G}_m$ is intended to receive the same information-bearing symbol $x_m \in \mathbb{C}$. The receive signal for the $k$th user can be written as $y_k = \sum_{m=1}^{M} \mathbf{w}_m^H \mathbf{h}_k x_m + n_k$, where $\mathbf{w}_m \in \mathbb{C}^N$ is the beamforming vector for the $m$th multicast group, $\mathbf{h}_k \in \mathbb{C}^N$ is the instantaneous channel to user $k$, and $n_k \in \mathbb{C}$, drawn independently from the distribution $\mathcal{CN}(0, \sigma_k^2)$, is the noise sample at the receiver. Consequently, the transmit power for group $\mathcal{G}_m$ is proportional to $\|\mathbf{w}_m\|^2$. In this paper, we consider the multi-group multicast beamforming problem with QoS constraints [3], which has the objective to minimize the total transmit power subject to constraints on the QoS expressed in terms of SINR requirements. We use the following problem formulation from [7], with an individual power constraint for each transmit antenna:

$$\underset{\{\mathbf{w}_m \in \mathbb{C}^N\}_{m=1}^{M}}{\text{minimize}} \quad \sum_{m=1}^{M} \|\mathbf{w}_m\|^2 \tag{1a}$$

$$\text{s.t.} \quad (\forall m \in \mathcal{M})(\forall k \in \mathcal{G}_m) \quad \frac{|\mathbf{w}_m^H \mathbf{h}_k|^2}{\sum_{l \neq m} |\mathbf{w}_l^H \mathbf{h}_k|^2 + \sigma_k^2} \ge \gamma_k \tag{1b}$$

$$(\forall i \in \mathcal{N}) \quad \sum_{m=1}^{M} \mathbf{w}_m^H \mathbf{e}_i \mathbf{e}_i^T \mathbf{w}_m \le p_i \tag{1c}$$

The objective function in (1a) corresponds to the total transmit power. The inequalities in (1b) constitute the SINR constraints, where $\gamma_k$ is the SINR required by user $k$. The inequalities in (1c) correspond to the per-antenna power constraints, where $\mathbf{e}_i \in \mathbb{R}^N$ is the $i$th Cartesian unit vector. The problem in (1) is a nonconvex QCQP, which is known to be NP-hard [2]. A well-known strategy for approximating solutions to such problems is the semidefinite relaxation technique [3], [14].
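The quantities in (1b) can be checked numerically; the following sketch (toy sizes and random data, assumed for illustration only) also verifies the trace identity $|\mathbf{w}^H\mathbf{h}|^2 = \mathrm{tr}(\mathbf{w}\mathbf{w}^H\mathbf{h}\mathbf{h}^H)$ that underlies the semidefinite relaxation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 2  # toy numbers of antennas and groups (illustration only)

h = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # channel of one user
W = [rng.standard_normal(N) + 1j * rng.standard_normal(N)  # beamformers w_m
     for _ in range(M)]
sigma2 = 1.0  # noise power sigma_k^2

# SINR of a user served by group m = 0, as in (1b)
signal = abs(np.vdot(W[0], h)) ** 2  # |w_0^H h|^2 (vdot conjugates its first arg)
interference = sum(abs(np.vdot(W[l], h)) ** 2 for l in range(1, M))
sinr = signal / (interference + sigma2)
assert sinr > 0.0

# Lifting identities used by the semidefinite relaxation:
Q = np.outer(h, h.conj())           # Q_k = h_k h_k^H
X0 = np.outer(W[0], W[0].conj())    # X_0 = w_0 w_0^H (rank one, PSD)
assert np.isclose(signal, np.trace(X0 @ Q).real)
assert np.isclose(np.linalg.norm(W[0]) ** 2, np.trace(X0).real)
```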
By this technique, we obtain a convex relaxation of the original problem by reformulating it as a nonconvex semidefinite program and by dropping the nonconvex rank constraints. More precisely, using the trace identity $\mathrm{tr}(\mathbf{AB}) = \mathrm{tr}(\mathbf{BA})$ for matrices $\mathbf{A}$, $\mathbf{B}$ of compatible dimensions, we can write $\|\mathbf{w}_m\|^2 = \mathbf{w}_m^H \mathbf{w}_m = \mathrm{tr}(\mathbf{w}_m^H \mathbf{w}_m) = \mathrm{tr}(\mathbf{w}_m \mathbf{w}_m^H)$ and $|\mathbf{w}_m^H \mathbf{h}_k|^2 = \mathbf{w}_m^H \mathbf{h}_k (\mathbf{w}_m^H \mathbf{h}_k)^* = \mathrm{tr}(\mathbf{w}_m^H \mathbf{h}_k \mathbf{h}_k^H \mathbf{w}_m) = \mathrm{tr}(\mathbf{w}_m \mathbf{w}_m^H \mathbf{h}_k \mathbf{h}_k^H)$. By defining $(\forall k \in \mathcal{K})\ \mathbf{Q}_k = \mathbf{h}_k \mathbf{h}_k^H$, and replacing the expression $\mathbf{w}_m \mathbf{w}_m^H$ by a positive semidefinite rank-one matrix $\mathbf{X}_m \in \mathbb{C}^{N \times N}$ for all $m \in \mathcal{M}$, we obtain the nonconvex semidefinite program

$$\underset{\{\mathbf{X}_m \in \mathbb{C}^{N \times N}\}_{m=1}^{M}}{\text{minimize}} \quad \sum_{m=1}^{M} \mathrm{tr}(\mathbf{X}_m) \tag{2a}$$

$$\text{s.t.} \quad (\forall m \in \mathcal{M})(\forall k \in \mathcal{G}_m) \quad \mathrm{tr}(\mathbf{Q}_k \mathbf{X}_m) \ge \gamma_k \sum_{l \neq m} \mathrm{tr}(\mathbf{Q}_k \mathbf{X}_l) + \gamma_k \sigma_k^2 \tag{2b}$$

$$(\forall i \in \mathcal{N}) \quad \sum_{m=1}^{M} \mathrm{tr}(\mathbf{e}_i \mathbf{e}_i^T \mathbf{X}_m) \le p_i \tag{2c}$$

$$(\forall m \in \mathcal{M}) \quad \mathbf{X}_m \succeq \mathbf{0} \tag{2d}$$

$$(\forall m \in \mathcal{M}) \quad \mathrm{rank}(\mathbf{X}_m) \le 1. \tag{2e}$$

This formulation is equivalent to (1) in the sense that $\{\mathbf{X}_m = \mathbf{w}_m \mathbf{w}_m^H\}_{m=1}^{M}$ solves (2) if and only if $\{\mathbf{w}_m\}_{m=1}^{M}$ solves (1). A convex relaxation of Problem (2) can be obtained by simply dropping the rank constraints in (2e). The approach in [2], [3] solves this relaxed problem and, subsequently, generates candidate approximations for Problem (2) (and hence (1)) using randomization techniques. A solution to the relaxed problem is typically found using general-purpose interior point solvers, which results in high computational cost for large-scale problems. In the multi-group setting [3], each randomization step involves solving an additional power control problem, which further increases the computational burden.

B. Problem Formulation in a Real Hilbert Space
The objective of this section is to show that Problem (2) can be formulated in a real Hilbert space, which enables us to approach the problem by means of efficient projection-based methods. To this end, we consider the real vector space $\mathcal{V} := \mathbb{C}^{N \times N}$ of complex $N \times N$ matrices. More precisely, we define vector addition in the usual way, and we restrict scalar multiplication to real scalars $a \in \mathbb{R}$, where each coefficient of a vector $\mathbf{X} \in \mathcal{V}$ is multiplied by $a$ to obtain the vector $a\mathbf{X} \in \mathcal{V}$. In this way, $\mathcal{V}$ is a real vector space, i.e., a vector space over the field $\mathbb{R}$. If we equip the space $\mathcal{V}$ with a real inner product

$$(\forall \mathbf{X}, \mathbf{Y} \in \mathcal{V}) \quad \langle \mathbf{X}, \mathbf{Y} \rangle := \mathrm{Re}\big\{\mathrm{tr}\big(\mathbf{X}^H \mathbf{Y}\big)\big\}, \tag{3}$$

which induces the standard Frobenius norm $\|\mathbf{X}\| = \sqrt{\langle \mathbf{X}, \mathbf{X} \rangle} = \sqrt{\mathrm{tr}(\mathbf{X}^H \mathbf{X})}$, we obtain a real Hilbert space $(\mathcal{V}, \langle \cdot, \cdot \rangle)$. (A proof that this function is in fact a real inner product can be found in Remark 5 in the Appendix.) In the remainder of this paper, we restrict our attention to the subspace $\mathcal{H} := \{\mathbf{X} \in \mathcal{V} \mid \mathbf{X} = \mathbf{X}^H\}$ of Hermitian matrices. Following the notation in [8], we define a product space $\mathcal{H}^M$ as the $M$-fold Cartesian product $\mathcal{H}^M := \mathcal{H} \times \dots \times \mathcal{H}$ of $\mathcal{H}$. In this vector space, the sum of two vectors $\mathbf{X} = (\mathbf{X}_1, \dots, \mathbf{X}_M)$ and $\mathbf{Y} = (\mathbf{Y}_1, \dots, \mathbf{Y}_M) \in \mathcal{H}^M$ is given by $\mathbf{X} + \mathbf{Y} := (\mathbf{X}_1 + \mathbf{Y}_1, \dots, \mathbf{X}_M + \mathbf{Y}_M)$, and scalar multiplication is restricted to real scalars $a \in \mathbb{R}$, where $a(\mathbf{X}_1, \dots, \mathbf{X}_M) := (a\mathbf{X}_1, \dots, a\mathbf{X}_M)$. We equip the space $\mathcal{H}^M$ with the inner product

$$\langle\langle \mathbf{X}, \mathbf{Y} \rangle\rangle := \sum_{m=1}^{M} \langle \mathbf{X}_m, \mathbf{Y}_m \rangle, \tag{4}$$

which induces the norm $|||\mathbf{X}|||^2 = \langle\langle \mathbf{X}, \mathbf{X} \rangle\rangle = \sum_{m=1}^{M} \|\mathbf{X}_m\|^2$, where $(\forall m \in \mathcal{M})\ \mathbf{X}_m \in \mathcal{H}$ and $\mathbf{Y}_m \in \mathcal{H}$.
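A minimal numerical check of the inner products (3) and (4) (NumPy sketch, illustration only):

```python
import numpy as np

def inner(X, Y):
    """Real inner product <X, Y> = Re{tr(X^H Y)} from (3)."""
    return np.trace(X.conj().T @ Y).real

def inner_tuple(Xs, Ys):
    """Product-space inner product <<X, Y>> = sum_m <X_m, Y_m> from (4)."""
    return sum(inner(Xm, Ym) for Xm, Ym in zip(Xs, Ys))

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
X1 = (A + A.conj().T) / 2.0  # a Hermitian matrix in the subspace H

# the induced norm is the Frobenius norm
assert np.isclose(inner(X1, X1), np.linalg.norm(X1, 'fro') ** 2)
# the trace of a Hermitian matrix is real, so <I_N, X> = tr(X)
assert np.isclose(inner(np.eye(3), X1), np.trace(X1).real)
# product-space norm: |||X|||^2 = sum_m ||X_m||^2
assert np.isclose(inner_tuple([X1, X1], [X1, X1]),
                  2.0 * np.linalg.norm(X1, 'fro') ** 2)
```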
Consequently, $(\mathcal{H}^M, \langle\langle \cdot, \cdot \rangle\rangle)$ is also a real Hilbert space. In order to pose Problem (2) in this Hilbert space, we express the objective function in (2a) and the constraints in (2b)-(2e) in terms of a convex function and closed sets in $(\mathcal{H}^M, \langle\langle \cdot, \cdot \rangle\rangle)$, as shown below:

1) The objective function in (2a) can be written as the following inner product:

$$\sum_{m=1}^{M} \mathrm{tr}(\mathbf{X}_m) = \langle\langle \mathbf{J}, \mathbf{X} \rangle\rangle, \tag{5}$$

where $\mathbf{J} = (\mathbf{I}_N, \dots, \mathbf{I}_N)$. This follows from (3), (4), and the fact that $(\forall \mathbf{W} \in \mathcal{H})\ \mathrm{Im}\{\mathrm{tr}(\mathbf{W})\} = 0$.

2) The SINR constraint for user $k \in \mathcal{K}$ in (2b) corresponds to the closed half-space

$$\mathcal{Q}_k = \big\{\mathbf{X} \in \mathcal{H}^M \mid \langle\langle \mathbf{X}, \mathbf{Z}_k \rangle\rangle \ge \sigma_k^2\big\}, \tag{6}$$

where $(\forall k \in \mathcal{K})\ \mathbf{Z}_k \in \mathcal{H}^M$ is given by

$$\mathbf{Z}_k = \big(\underbrace{-\mathbf{Q}_k, \dots, -\mathbf{Q}_k}_{1, \dots, g_k - 1},\ \underbrace{\gamma_k^{-1} \mathbf{Q}_k}_{g_k},\ \underbrace{-\mathbf{Q}_k, \dots, -\mathbf{Q}_k}_{g_k + 1, \dots, M}\big).$$

Here, we introduced indices $\{g_k\}_{k \in \mathcal{K}}$ that assign to each receiver $k \in \mathcal{K}$ the multicast group $\mathcal{G}_m$ to which it belongs (i.e., $g_k = m$ if $k \in \mathcal{G}_m$). In order to verify that the set $\mathcal{Q}_k$ in (6) indeed represents the SINR constraint for user $k$ in (2b), we rearrange

$$\langle\langle \mathbf{X}, \mathbf{Z}_k \rangle\rangle = \frac{1}{\gamma_k} \langle \mathbf{X}_{g_k}, \mathbf{Q}_k \rangle - \sum_{\substack{l \in \mathcal{M} \\ l \neq g_k}} \langle \mathbf{X}_l, \mathbf{Q}_k \rangle.$$
Using the definition of the inner product in (3), and the fact that $(\forall \mathbf{W} \in \mathcal{H})\ \mathbf{W}^H = \mathbf{W}$ and $\mathrm{Im}\{\mathrm{tr}(\mathbf{W})\} = 0$, we can rewrite the constraint $\mathcal{Q}_k$ as

$$\mathrm{tr}(\mathbf{X}_{g_k} \mathbf{Q}_k) - \gamma_k \sum_{\substack{l \in \mathcal{M} \\ l \neq g_k}} \mathrm{tr}(\mathbf{X}_l \mathbf{Q}_k) \ge \gamma_k \sigma_k^2,$$

which corresponds to the $k$th SINR constraint in (2b).

3) The per-antenna power constraints in (2c) are expressed by the closed convex set

$$\mathcal{P} = \big\{\mathbf{X} \in \mathcal{H}^M \mid (\forall i \in \mathcal{N})\ \langle\langle \mathbf{D}_i, \mathbf{X} \rangle\rangle \le p_i\big\},$$

where

$$(\forall i \in \mathcal{N}) \quad \mathbf{D}_i := (\mathbf{e}_i \mathbf{e}_i^T, \dots, \mathbf{e}_i \mathbf{e}_i^T) \in \mathcal{H}^M. \tag{7}$$

This follows immediately from (3) and (4). (In the remainder of this paper, we use the convention that $\mathbf{X}_m \in \mathcal{H}$ denotes the $m$th component matrix of an $M$-tuple $\mathbf{X} \in \mathcal{H}^M$.)

4) The PSD constraints in (2d) correspond to the closed convex cone $\mathcal{C}_+$ given by

$$\mathcal{C}_+ = \big\{(\mathbf{X}_1, \dots, \mathbf{X}_M) \in \mathcal{H}^M \mid (\forall m \in \mathcal{M})\ \mathbf{X}_m \succeq \mathbf{0}\big\}.$$
5) The rank constraints in (2e) can be represented by the nonconvex set

$$\mathcal{R} = \big\{\mathbf{X} \in \mathcal{H}^M \mid (\forall m \in \mathcal{M})\ \mathrm{rank}(\mathbf{X}_m) \le 1\big\}. \tag{8}$$

Consequently, we can pose Problem (2) as

$$\underset{\mathbf{X} \in \mathcal{H}^M}{\text{minimize}} \quad \langle\langle \mathbf{J}, \mathbf{X} \rangle\rangle \tag{9}$$
$$\text{s.t.} \quad (\forall k \in \mathcal{K})\ \mathbf{X} \in \mathcal{Q}_k, \quad \mathbf{X} \in \mathcal{P}, \quad \mathbf{X} \in \mathcal{C}_+, \quad \mathbf{X} \in \mathcal{R}.$$

The problems in (2) and (9) are equivalent in the sense that $\{\mathbf{X}_m \in \mathcal{V}\}_{m \in \mathcal{M}}$ solves Problem (2) if and only if $(\mathbf{X}_1, \dots, \mathbf{X}_M) \in \mathcal{H}^M$ solves Problem (9). The advantage of the formulation in (9) is that it enables us to (i) streamline notation, (ii) express the updates of the algorithm proposed later in Section III in terms of well-known projections, and (iii) simplify proofs by using results in operator theory in Hilbert spaces, as we show in the following. It is worth noting that all constraint sets described above are closed, so a projection onto each of the sets exists for any point $\mathbf{X} \in \mathcal{H}^M$. This property is crucial to derive projection-based algorithms, such as the proposed algorithm. In particular, note that we cannot replace the inequality in (2e) with an equality, as commonly done in the literature. The reason is that, with an equality, the corresponding set is not closed, as shown in Remarks 1 and 2, and the practical implication is that the projection may not exist everywhere. Specifically, this happens whenever $\mathbf{X} = (\mathbf{X}_1, \dots, \mathbf{X}_M)$ satisfies $\mathbf{X}_m = \mathbf{0}$ for some $m \in \mathcal{M}$, which would leave the update rule at such points undefined in projection-based methods. This is illustrated for the case $\mathbf{X} = \mathbf{0} \in \mathcal{H}^M$ in Example 1 below.

Remark 1.
The rank constraint set $\mathcal{R}$ in (8) is closed.

Proof:
Let $(\mathbf{X}^{(n)})_{n \in \mathbb{N}}$ be a sequence of points in $\mathcal{R}$ converging to a point $\mathbf{X}^\star = (\mathbf{X}_1^\star, \dots, \mathbf{X}_M^\star) \in \mathcal{H}^M$, and denote by $(\forall m \in \mathcal{M})(\forall n \in \mathbb{N})\ \mathbf{X}_m^{(n)} = \mathbf{U}_m^{(n)} \mathbf{S}_m^{(n)} (\mathbf{V}_m^{(n)})^H$ the singular value decomposition of the $m$th component matrix of $\mathbf{X}^{(n)}$. It follows from $\mathbf{X}^{(n)} \in \mathcal{R}$ that $(\forall m \in \mathcal{M})\ \mathbf{S}_m^{(n)} = \mathrm{diag}([s_{m,1}^{(n)}, 0, \dots, 0])$. Since a sequence of zeros can only converge to zero, the singular value decomposition $\mathbf{X}_m^\star = \mathbf{U}_m^\star \mathbf{S}_m^\star (\mathbf{V}_m^\star)^H$ of the $m$th component matrix of $\mathbf{X}^\star$ satisfies $\mathbf{S}_m^\star = \mathrm{diag}([s_{m,1}^\star, 0, \dots, 0])$ for some $s_{m,1}^\star \in \mathbb{R}_+$. Therefore $(\forall m \in \mathcal{M})\ \mathrm{rank}(\mathbf{X}_m^\star) \le 1$, so $\mathbf{X}^\star \in \mathcal{R}$. The above shows that $\mathcal{R}$ contains all its limit points, so it is closed.

Remark 2.
By contrast,

$$\mathcal{R}' = \big\{\mathbf{X} \in \mathcal{H}^M \mid (\forall m \in \mathcal{M})\ \mathrm{rank}(\mathbf{X}_m) = 1\big\}$$

is not a closed set, since for all $\mathbf{X} \in \mathcal{R}'$ and $\alpha \in (0, 1)$, the sequence $(\alpha^n \mathbf{X})_{n \in \mathbb{N}}$ in $\mathcal{R}'$ converges to $\mathbf{0} \notin \mathcal{R}'$.

Example 1.
The set-valued projection of $\mathbf{0} \in \mathcal{H}^M$ onto the set $\mathcal{R}'$ in Remark 2 is empty.

Proof:
Suppose that $\Pi_{\mathcal{R}'}(\mathbf{0}) \neq \emptyset$ and let $\mathbf{Z} \in \Pi_{\mathcal{R}'}(\mathbf{0})$, i.e., $\mathbf{Z}$ is any of the closest points of the set $\mathcal{R}'$ to the zero vector $\mathbf{0}$. Since $\Pi_{\mathcal{R}'}(\mathbf{0}) \subset \mathcal{R}'$, $(\forall m \in \mathcal{M})\ \mathrm{rank}(\mathbf{Z}_m) = 1$, i.e., $(\forall m \in \mathcal{M})\ \sigma_1(\mathbf{Z}_m) > 0$. Therefore, for any $\alpha \in (0, 1)$, $\alpha\mathbf{Z} \in \mathcal{R}'$ and $d(\mathbf{0}, \alpha\mathbf{Z}) < d(\mathbf{0}, \mathbf{Z})$, i.e., $\alpha\mathbf{Z} \in \mathcal{R}'$ is closer to the zero vector than $\mathbf{Z} \in \mathcal{R}'$, thus contradicting our assumption that $\mathbf{Z}$ is one of the closest points in $\mathcal{R}'$ to the vector $\mathbf{0}$.

III. ALGORITHMIC SOLUTION
The main difficulty in solving (9) is the presence of the nonconvex rank constraint. A well-known technique for approximating rank-constrained semidefinite programs using convex optimization methods is the semidefinite relaxation approach [2], [3], [14]. This approach first solves (9) without the rank constraint, and then it applies heuristics to obtain rank-one approximations based on the solution to this relaxed problem. Similarly, we can obtain a convex relaxation

$$\underset{\mathbf{X} \in \mathcal{H}^M}{\text{minimize}} \quad \langle\langle \mathbf{J}, \mathbf{X} \rangle\rangle \tag{10}$$
$$\text{s.t.} \quad (\forall k \in \mathcal{K})\ \mathbf{X} \in \mathcal{Q}_k, \quad \mathbf{X} \in \mathcal{P}, \quad \mathbf{X} \in \mathcal{C}_+$$

of Problem (9) by dropping the nonconvex constraint set $\mathcal{R}$. In principle, we could solve this relaxed problem using first-order techniques for constrained convex minimization. For instance, we could apply a projected (sub-)gradient method (see, e.g., [15, Section 3.2.3]), which interleaves (sub-)gradient steps for the objective function with projections onto the feasible set of Problem (10). However, computing the projection onto the intersection of all constraint sets in Problem (10) typically requires an inner optimization loop, because no simple expression for this projection is known. As was shown in [16], superiorization can significantly reduce the computation time compared to the projected gradient method in some applications if the projection onto the feasible set is difficult to compute. The superiorization methodology typically relies on an iterative process that solves a convex feasibility problem (i.e., that produces a sequence of points converging to a point within the intersection of all constraint sets) by repeatedly applying a computationally simple mapping. This iterative algorithm is called the Basic Algorithm. Based on this Basic Algorithm, the superiorization methodology automatically produces a
Superiorized Version of the Basic Algorithm, by adding bounded perturbations to the iterates of the Basic Algorithm in every iteration.
Definition 3.
Let $(\mathbf{Y}^{(n)})_{n \in \mathbb{N}}$ be a bounded sequence in a real Hilbert space and let $(\beta^{(n)})_{n \in \mathbb{N}}$ be a sequence in $\mathbb{R}_+$ such that $\sum_{n \in \mathbb{N}} \beta^{(n)} < \infty$. Then $(\forall n \in \mathbb{N})\ \beta^{(n)} \mathbf{Y}^{(n)}$ are bounded perturbations [6].

The perturbations are typically generated based on subgradient steps for a given objective function, in a way that ensures the sequence of perturbations to be bounded. By showing that the Basic Algorithm is bounded perturbation resilient (i.e., that the resulting sequence is guaranteed to converge to a feasible point, even when bounded perturbations are added in each iteration), one can ensure that the sequence produced by the Superiorized Version of the Basic Algorithm also converges to a feasible point. In contrast to constrained minimization, superiorization does not guarantee that the objective value of the resulting approximation is minimal. However, the limit point of the superiorized algorithm typically has a lower objective value than the limit point of the unperturbed Basic Algorithm [6]. To apply the superiorization methodology to Problem (9), we proceed as follows. In Section III-A, we propose a Basic Algorithm by defining a mapping $T^\star : \mathcal{H}^M \to \mathcal{H}^M$. Given any point $\mathbf{X}^{(0)} \in \mathcal{H}^M$, this mapping generates a sequence of points converging to a feasible point of Problem (10) by

$$(\forall n \in \mathbb{N}) \quad \mathbf{X}^{(n+1)} = T^\star\big(\mathbf{X}^{(n)}\big). \tag{11}$$

In Section III-B, we define a sequence $(\beta^{(n)} \mathbf{Y}^{(n)})_{n \in \mathbb{N}}$ of bounded perturbations, with the intent to reduce slightly (i) the objective value of Problem (9) and (ii) the distance to the nonconvex rank constraint set $\mathcal{R}$ in every iteration. As we show in Proposition 2 below, the proposed perturbations can achieve both goals simultaneously. The sequence of perturbations yields a Superiorized Version of the Basic Algorithm given by $\mathbf{X}^{(0)} \in \mathcal{H}^M$,

$$(\forall n \in \mathbb{N}) \quad \mathbf{X}^{(n+1)} = T^\star\big(\mathbf{X}^{(n)} + \beta^{(n)} \mathbf{Y}^{(n)}\big).$$
(12)

In Section III-C, we prove that the algorithm in (12) converges to a feasible point of Problem (10) by showing that the mapping $T^\star$ is bounded perturbation resilient, and that $(\beta^{(n)} \mathbf{Y}^{(n)})_{n \in \mathbb{N}}$ is a sequence of bounded perturbations. The relation between the proposed method and the superiorization methodology is discussed in detail in Section III-D. Finally, the proposed algorithm is summarized in Section III-E.

A. Feasibility-Seeking Basic Algorithm
A feasible point for the relaxed SDP in (10) can be found by solving the convex feasibility problem

$$\text{Find } \mathbf{X} \in \mathcal{H}^M \text{ such that } \mathbf{X} \in \mathcal{C}^\star := \bigcap_{k=1}^{K} \mathcal{Q}_k \cap \mathcal{P} \cap \mathcal{C}_+. \tag{13}$$

According to Fact 2 and Definition 2, given any $\mathbf{X}^{(0)} \in \mathcal{H}^M$, the iteration in (11) generates a sequence of points converging to a point in $\mathcal{C}^\star$ if $T^\star$ is $\alpha$-averaged nonexpansive with $\mathrm{Fix}(T^\star) = \mathcal{C}^\star$. A particular case of such a mapping, which is used in the well-known projections onto convex sets (POCS) algorithm [8], is given by (see also Fact 1)

$$T^\star := T_{\mathcal{C}_+}^{\mu_{K+2}}\, T_{\mathcal{P}}^{\mu_{K+1}}\, T_{\mathcal{Q}_K}^{\mu_K} \cdots T_{\mathcal{Q}_1}^{\mu_1}, \tag{14}$$

where for a nonempty closed convex set $\mathcal{C} \subset \mathcal{H}^M$, $T_{\mathcal{C}}^{\mu} = \mathrm{Id} + \mu(P_{\mathcal{C}} - \mathrm{Id})$ denotes the relaxed projector onto $\mathcal{C}$ with relaxation parameter $\mu \in (0, 2)$. The formal expressions for the projections of $\mathbf{X} \in \mathcal{H}^M$ onto each of the sets in (13) are given below.

1) The SINR constraint sets $\mathcal{Q}_k \subset \mathcal{H}^M$ are half-spaces, the projections onto which are given by [11, Example 29.20]

$$(\forall k \in \mathcal{K})(\forall \mathbf{X} \in \mathcal{H}^M) \quad P_{\mathcal{Q}_k}(\mathbf{X}) = \begin{cases} \mathbf{X}, & \text{if } \mathbf{X} \in \mathcal{Q}_k \\ \mathbf{X} + \dfrac{\sigma_k^2 - \langle\langle \mathbf{X}, \mathbf{Z}_k \rangle\rangle}{|||\mathbf{Z}_k|||^2}\, \mathbf{Z}_k, & \text{otherwise.} \end{cases}$$

2) The per-antenna power constraint set $\mathcal{P}$ is an intersection of the $N$ half-spaces defined by the normal vectors $\mathbf{D}_i$ in (7) for $i \in \mathcal{N}$. Since these vectors are mutually orthogonal, i.e., $(\forall i \in \mathcal{N})(\forall j \in \mathcal{N} \setminus \{i\})\ \langle\langle \mathbf{D}_i, \mathbf{D}_j \rangle\rangle = 0$, the projection onto $\mathcal{P}$ can be written in closed form as

$$P_{\mathcal{P}}(\mathbf{X}) = \mathbf{X} + \sum_{i\,:\, p_i < \langle\langle \mathbf{X}, \mathbf{D}_i \rangle\rangle} \frac{p_i - \langle\langle \mathbf{X}, \mathbf{D}_i \rangle\rangle}{|||\mathbf{D}_i|||^2}\, \mathbf{D}_i.$$

This follows from [8, Thm 4.3-1] and Halperin's Theorem (see [17], [18, Thm. 4.2]).

3) The set $\mathcal{C}_+$ is the intersection of PSD cones in orthogonal subspaces of $\mathcal{H}^M$. The projection of $\mathbf{X} \in \mathcal{H}^M$ onto $\mathcal{C}_+$ is therefore given component-wise by

$$P_{\mathcal{C}_+}(\mathbf{X}) = \big(P_{\mathcal{H}_+}(\mathbf{X}_1), \dots, P_{\mathcal{H}_+}(\mathbf{X}_M)\big),$$

where $\mathcal{H}_+ = \{\mathbf{X} \in \mathcal{H} \mid \mathbf{X} \succeq \mathbf{0}\}$ is the cone of PSD matrices in $\mathcal{H}$.
We use the eigendecomposition $\mathbf{X}_m = \mathbf{V}_m \boldsymbol{\Lambda}_m \mathbf{V}_m^H$ with (real) eigenvalues $\boldsymbol{\Lambda}_m = \mathrm{diag}(\lambda_1(\mathbf{X}_m), \dots, \lambda_N(\mathbf{X}_m))$ to define the projection of $\mathbf{X}_m \in \mathcal{H}$ onto $\mathcal{H}_+$ as $P_{\mathcal{H}_+}(\mathbf{X}_m) = \mathbf{V}_m \boldsymbol{\Lambda}_m^+ \mathbf{V}_m^H$, where $\boldsymbol{\Lambda}_m^+ := \mathrm{diag}\big((\lambda_1(\mathbf{X}_m))_+, \dots, (\lambda_N(\mathbf{X}_m))_+\big)$. According to the fundamental theorem of POCS [8, Thm 2.5-1], the sequence $(\mathbf{X}^{(n)})_{n \in \mathbb{N}}$ of vectors $\mathbf{X}^{(n)} \in \mathcal{H}^M$ produced by the update rule in (14) is guaranteed to converge to a solution of the feasibility problem in (13) for any $\mathbf{X}^{(0)} \in \mathcal{H}^M$, if a solution exists (i.e., if $\mathcal{C}^\star \neq \emptyset$). Note that this is the case if the relaxed semidefinite program in (10) is feasible. Alternatively, we can derive this convergence guarantee immediately from Remark 4 and Fact 2.

B. Proposed Perturbations
In the following, we devise perturbations that steer the iterates of the fixed point algorithm in (12) towards a solution to the nonconvex problem in (2) and (9). To do so, we introduce a mapping that reduces the objective value and a mapping that reduces the distance to the rank constraint sets. Then we define the proposed perturbations based on the composition of these two mappings. As proven in Proposition 2 below, the resulting perturbations can achieve both goals simultaneously.
1) Power Reduction by Bounded Perturbations:
In theliterature on superiorization, the perturbations are typicallydefined based on subgradient steps of the objective function(see, e.g., [6]). For the linear objective function in (10), thiswould result in perturbations of the form − α ( I N , . . . , I N ) for some α > . These perturbations are problematic for theproblem considered here because we are interested in solutionscomprised of positive semidefinite rank-one matrices, andadding these perturbations to an iterate X = ( X , . . . , X M ) may result in indefinite full-rank component matrices X m − α I N . To avoid this problem, we introduce the function f : H M → R + given by f ( X ) := M (cid:88) m =1 (cid:107) X m (cid:107) ∗ , (15) For the case of real symmetric matrices, see, e.g., [19, Lemma 2.1]. Theresult in [19] is based on [20, Corollary 7.4.9.3], which assumes complexHermitian matrices. The generalization of [19, Lemma 2.1] to complexHermitian matrices is straightforward. where (cid:107) · (cid:107) ∗ is the nuclear norm. Since C (cid:63) ⊂ C + by (13), wehave ( ∀ X ∈ C (cid:63) )( ∀ m ∈ M )( ∀ i ∈ N ) σ i ( X m ) = λ i ( X m ) ,where λ i ( X m ) and σ i ( X m ) denote the i th eigenvalue andsingular value of the m th component matrix of X , respectively.Hence we can write f ( X ) = M (cid:88) m =1 N (cid:88) i =1 σ i ( X m ) (16) = M (cid:88) m =1 N (cid:88) i =1 λ i ( X m ) = M (cid:88) m =1 tr( X m ) . Therefore, by (5), minimizing f over C (cid:63) is equivalent tominimizing the linear objective function in (9) (or (10)) over C (cid:63) , in the sense that the solution sets to both formulations arethe same. As we will show below, this surrogate objectivefunction gives rise to power-reducing perturbations, whichare guaranteed not to increase the rank of their arguments’component matrices (see Remark 3).The power-reducing perturbations are designed accordingto two criteria. Firstly, they should decrease the value ofthe surrogate function f . 
Secondly, they should not be toolarge in order to avoid slowing down convergence of theBasic Algorithm. For a given point X ∈ H M we derive aperturbation Y (cid:63)τ satisfying these two criteria by solving theproblem Y (cid:63)τ := Y (cid:63)τ ( X ) ∈ arg min Y ∈H M (cid:18) τ f ( X + Y ) + 12 ||| Y ||| (cid:19) . (17)Here, ||| Y ||| acts as a regularization on the perturbations’magnitude, and the parameter τ ≥ balances the two designcriteria. The next proposition shows that Y (cid:63)τ can be easilycomputed. Proposition 1.
The unique solution to (17) is given by

  (∀m ∈ M)  Y⋆_τ|_m = D_τ(X_m) − X_m,   (18)

where D_τ : H → H is the singular value shrinkage operator [21]

  D_τ(X_m) := U_m D_τ(Σ_m) V_m^H,  with  D_τ(Σ_m) = diag({(σ_i(X_m) − τ)_+}_{i∈N}),   (19)

and (∀m ∈ M) X_m = U_m Σ_m V_m^H is the singular value decomposition of X_m such that Σ_m = diag({σ_i(X_m)}_{i∈N}).

Proof: Denote the perturbed point for a given choice of τ by Z⋆_τ := X + Y⋆_τ. By substituting Y = Z − X in (17), we can identify this point as Z⋆_τ = prox_{τf₁}(X), where the proximal mapping is given by

  prox_{τf₁}(X) ∈ arg min_{Z ∈ H^M} ( τ f₁(Z) + (1/2) |||X − Z|||² ).   (20)

Note that the function

  τ f₁(Z) + (1/2) |||X − Z|||² = τ Σ_{m=1}^{M} ‖Z_m‖_* + (1/2) Σ_{m=1}^{M} ‖X_m − Z_m‖²

is separable over m. Consequently, we can compute the proximal mapping in (20) by solving

  (∀m ∈ M)  Z⋆_τ|_m ∈ arg min_{Z ∈ H} τ ‖Z‖_* + (1/2) ‖X_m − Z‖².   (21)

According to [21, Thm. 2.1], the unique solution to (21) is given by Z⋆_τ|_m = D_τ(X_m). Substituting Y⋆_τ = Z⋆_τ − X yields (18), which is the desired result. ∎

By defining

  (∀X ∈ H^M)  σ_max(X) := max_{m∈M, i∈N} σ_i(X_m),   (22)

we can express the power-reducing perturbation for a point X ∈ H^M as Y = T_P^α(X) − X, where the mapping T_P^α := prox_{ασ_max(X) f₁} is given component-wise by

  (∀m ∈ M)  T_P^α(X)|_m = D_τ(X_m)  with  τ = ασ_max(X).   (23)

Note that T_P^0(X) = X and (∀α ≥ 1) T_P^α(X) = 0. Therefore, the magnitude of the power-reducing perturbations can be controlled by choosing the parameter α ∈ [0, 1]. Moreover, in contrast to performing subgradient steps for the original cost function in (9), applying the perturbations in (23) cannot increase the rank:

Remark 3.
For all α ≥ 0, T_P^α maps any point X = (X_m)_{m∈M} ∈ C_+ to a point Z = (Z_m)_{m∈M} = T_P^α(X) ∈ C_+ satisfying (∀m ∈ M) rank(Z_m) ≤ rank(X_m). This follows immediately from (19).

2) Incorporating the Rank Constraints by Bounded Perturbations: Next, we define perturbations that steer the iterate towards the rank constraint set R in (8). While objective functions used for superiorization are usually convex, the function f₂ : H^M → R_+,

  f₂(X) := d(X, R),   (24)

i.e., the distance to the set R, constitutes a nonconvex superiorization objective, so our approach does not follow exactly the superiorization methodology in [6] (but we can still prove convergence).

As the perturbations may steer the iterates away from the feasible set, their magnitude should not be unnecessarily large. Therefore, we choose the rank-reducing perturbations as P_R(X) − X, where P_R(X) ∈ Π_R(X) denotes a (generalized) projection of a given point X ∈ H^M onto the closed nonconvex set R. Since R is a closed set, the set-valued projection Π_R(X) is nonempty for all X ∈ H^M. A projection onto R can be computed by truncating all but the largest singular value of each component matrix to zero. We formally state this fact below.

Fact 3. Let X_m = U_m Σ_m V_m^H ∈ H be the singular value decomposition of the m-th component matrix of X with Σ_m = diag(σ_1(X_m), ..., σ_N(X_m)). Then, (∀X ∈ H^M) the m-th component matrix of a point P_R(X) ∈ Π_R(X) is given by [22, Lemma 3.2]

  P_R(X)|_m = U_m diag(σ_1(X_m), 0, ..., 0) V_m^H.   (25)
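In numpy, the projection in Fact 3 is a one-liner on the SVD. The sketch below (our code; `proj_rank_one` is our own helper name) also checks that the squared distance to R equals the sum of the squared truncated singular values, the identity used later in the proof of Proposition 2:

```python
import numpy as np

# Our numpy sketch of the projection onto the rank-one set R in Fact 3:
# keep the largest singular value of the component matrix, zero the rest.
def proj_rank_one(Xm):
    U, s, Vh = np.linalg.svd(Xm)
    return s[0] * np.outer(U[:, 0], Vh[0, :])

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4))
P = proj_rank_one(X)
s = np.linalg.svd(X, compute_uv=False)

# Eckart-Young: the squared distance to the rank-one set equals the
# sum of the squared truncated singular values.
print(np.linalg.matrix_rank(P),
      np.isclose(np.linalg.norm(X - P) ** 2, np.sum(s[1:] ** 2)))
```

For a tuple X ∈ H^M, the same operation would simply be applied to each component matrix.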
3) Combining Power- and Rank-Reducing Perturbations: Since both T_P^α in (23) and P_R in (25) operate on the singular values of the component matrices, their composition is given by

  (∀m ∈ M)  P_R T_P^α(X)|_m = (σ_1(X_m) − ασ_max(X))_+ u_{m1} v_{m1}^H ∈ H,

where (∀m ∈ M) U_m = [u_{m1}, ..., u_{mN}] and V_m = [v_{m1}, ..., v_{mN}]. (Footnote: The proof in [21] is for real matrices. However, the generalization to complex matrices is straightforward.) Moreover, it is easy to verify that (∀X ∈ H^M)(∀α ≥ 0) T_P^α P_R(X) = P_R T_P^α(X). We will now use the composition of T_P^α and P_R to define a mapping Y^α : H^M → H^M by Y^α := P_R T_P^α − Id, i.e.,

  (∀X = (X_m)_{m∈M} ∈ H^M)(∀m ∈ M)  Y^α(X)|_m = (σ_1(X_m) − ασ_max(X))_+ u_{m1} v_{m1}^H − X_m.   (26)

Finally, we define the sequence (β^(n) Y^(n))_{n∈N} of perturbations in (12) by

  (∀n ∈ N)  Y^(n) := Y^{α^(n)}(X^(n)),   (27)

where (α^(n))_{n∈N} is a sequence in [0, 1] and (β^(n))_{n∈N} is a summable sequence in [0, 1]. The following proposition shows that the perturbations in (27) can simultaneously reduce the objective value and the distance to the rank constraint set.

Proposition 2.
Let α ∈ R_+ and λ ∈ [0, 1]. Then each of the following holds for Y^α : H^M → H^M in (26):
1) The perturbations cannot increase the distance to the set C_+, i.e., (∀X ∈ H^M) d(X + λY^α(X), C_+) ≤ d(X, C_+). In particular, (∀X ∈ H^M) X ∈ C_+ ⇒ X + λY^α(X) ∈ C_+.
2) If α > 0, the perturbations decrease the value of the function f₁ in (16): (∀X ∈ H^M) f₁(X + λY^α(X)) < f₁(X) whenever f₁(X) > 0.
3) If α > 0, the perturbations decrease the objective value in (10) for points in C_+: (∀X ∈ C_+) ⟨⟨J, X + λY^α(X)⟩⟩ < ⟨⟨J, X⟩⟩ whenever ⟨⟨J, X⟩⟩ > 0.
4) The perturbations cannot increase the distance to the rank constraint set: (∀X ∈ H^M) f₂(X + λY^α(X)) ≤ f₂(X), with strict inequality whenever λ ∈ (0, 1] and f₂(X) > 0.

Proof:
1) This is an immediate consequence of (26).
2) It follows from (19) and (23) that (∀X ∈ H^M)(∀α > 0) f₁(X) > 0 ⇒ f₁(T_P^α(X)) < f₁(X). Moreover, by (25), we have that (∀λ ∈ [0, 1]) f₁((1 − λ)X + λP_R(X)) ≤ f₁(X). This implies f₁(X + λY^α(X)) = f₁((1 − λ)X + λP_R T_P^α(X)) ≤ f₁(T_P^α(X)) < f₁(X) whenever f₁(X) > 0.
3) This result follows from 1) and 2), since (∀X ∈ C_+) ⟨⟨J, X⟩⟩ = f₁(X) according to (16).
4) Since R is closed, we can write f₂(X) = d(X, R) = |||X − P_R(X)||| = sqrt( Σ_{m∈M} Σ_{i=2}^{N} σ_i(X_m)² ). Therefore, it follows from (19) that (∀X ∈ H^M)(∀α ∈ R_+) f₂(T_P^α(X)) ≤ f₂(X). Moreover, by (25), (∀λ ∈ (0, 1]) f₂(X) > 0 implies that f₂((1 − λ)X + λP_R(X)) < f₂(X), which yields the claim. ∎
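The statements of Proposition 2 are easy to check numerically. The following sketch (ours, assuming numpy; the helper names `Y_alpha`, `f1`, and `f2` are our own) applies the perturbation (26) to random PSD component matrices and verifies the predicted decrease of f₁ and f₂:

```python
import numpy as np

# Numerical sanity check of Proposition 2 (our sketch; the helper names
# Y_alpha, f1, f2 are ours): apply the combined perturbation (26) to
# random PSD component matrices and verify the predicted decreases.
def Y_alpha(Xs, alpha):
    sig_max = max(np.linalg.svd(X, compute_uv=False)[0] for X in Xs)
    out = []
    for X in Xs:
        U, s, Vh = np.linalg.svd(X)
        top = max(s[0] - alpha * sig_max, 0.0)
        out.append(top * np.outer(U[:, 0], Vh[0, :]) - X)
    return out

def f1(Xs):  # nuclear norm surrogate; equals the sum of traces on C_+
    return sum(np.linalg.svd(X, compute_uv=False).sum() for X in Xs)

def f2(Xs):  # distance to the rank-one constraint set R
    return np.sqrt(sum((np.linalg.svd(X, compute_uv=False)[1:] ** 2).sum()
                       for X in Xs))

rng = np.random.default_rng(2)
Xs = [(lambda A: A @ A.T)(rng.standard_normal((3, 3))) for _ in range(2)]
lam, alpha = 0.7, 0.3
Zs = [X + lam * Y for X, Y in zip(Xs, Y_alpha(Xs, alpha))]

print(f1(Zs) < f1(Xs), f2(Zs) < f2(Xs))  # both decrease, as claimed
```

The same random trial also respects the bound of Lemma 2 below, since the perturbation never exceeds the iterate in norm.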
We will now examine the convergence of the proposed algorithm in (28). For this purpose, let (β^(n))_{n∈N} be a summable sequence in [0, 1], let (α^(n))_{n∈N} be a sequence of nonnegative numbers, and denote by (Y^(n))_{n∈N} the sequence of perturbations according to (27). Then the sequence (X^(n))_{n∈N} produced by the algorithm in (28) converges to a feasible point of Problem (10) for all X^(0) ∈ H^M. To show this, we prove the following facts:
1) The mapping T⋆ in (14) is bounded perturbation resilient.
2) The sequence (Y^(n))_{n∈N} is bounded, so that (β^(n) Y^(n))_{n∈N} is a sequence of bounded perturbations.
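Before turning to the proofs, the overall iteration can be demonstrated on a toy problem. In the sketch below (our illustration), the paper's QoS and per-antenna power sets are replaced by two simple convex sets, the PSD cone and a fixed-trace hyperplane, so that the example stays self-contained; the perturbation is the single-matrix version of (26):

```python
import numpy as np

# Toy instance of the superiorized iteration X <- T*(X + b^n * Y^(n))
# (our illustration): the paper's QoS and per-antenna power sets are
# replaced by the PSD cone and a fixed-trace hyperplane, while the
# perturbation follows (26) for a single component matrix.
def proj_psd(X):                       # projection onto the PSD cone
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * np.maximum(w, 0.0)) @ V.T

def proj_trace(X, c=5.0):              # projection onto {X : tr(X) = c}
    return X + (c - np.trace(X)) / X.shape[0] * np.eye(X.shape[0])

def pert(X, alpha):                    # Y^alpha of (26) for one matrix
    U, s, Vh = np.linalg.svd(X)
    top = max(s[0] - alpha * s[0], 0.0)
    return top * np.outer(U[:, 0], Vh[0, :]) - X

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
X = A @ A.T                            # arbitrary PSD starting point
a, b = 0.9, 0.9                        # decay of alpha^(n) and beta^(n)
for n in range(300):
    X = proj_psd(proj_trace(X + b ** n * pert(X, a ** n)))

# The limit is feasible for the relaxed (convex) problem:
print(abs(np.trace(X) - 5.0) < 1e-6, np.linalg.eigvalsh(X).min() >= -1e-9)
```

Consistent with Proposition 3 below, the iterate ends up feasible for the convex relaxation; the rank reduction achieved by the perturbations is a heuristic and is not guaranteed by the convergence result.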
1) Bounded Perturbation Resilience of the Basic Algorithm:
The operator T⋆ in (14) is known to be α-averaged (see, e.g., [13, Example 17.12(a)]). We include this fact here for completeness:

Remark 4.
The operator T⋆ in (14) is α-averaged nonexpansive.

Proof:
Note that, for every nonempty subset
C ⊂ H^M, the reflector R_C = Id + 2(P_C − Id) is nonexpansive [11, Corollary 4.18]. Therefore, according to Definition 2, (∀µ ∈ (0, 2)) the relaxed projector

  T_C^µ = Id + µ(P_C − Id) = Id + (µ/2)(R_C − Id)

is (µ/2)-averaged. Further (see Fact 1), the composition of finitely many averaged mappings is α-averaged for some α ∈ (0, 1). ∎

Consequently, the bounded perturbation resilience of T⋆ follows directly from [12, Thm. 3.1]. We summarize this fact in the following lemma.

Lemma 1 ([12]). The algorithm in (12) is guaranteed to converge to a point in the solution set C⋆ of the feasibility problem in (13) if C⋆ ≠ ∅ and (β^(n) Y^(n))_{n∈N} is a sequence of bounded perturbations.

Proof: The authors of [12] have proved the bounded perturbation resilience of α-averaged nonexpansive mappings with a nonempty fixed point set in a real Hilbert space. Consequently, this lemma follows from Remark 4 and [12, Thm. 3.1]. ∎

2) Boundedness of the Perturbations: It remains to show that the sequence (Y^(n))_{n∈N} is bounded for all sequences (α^(n))_{n∈N} of nonnegative numbers and (β^(n))_{n∈N} in [0, 1] such that Σ_{n∈N} β^(n) < ∞, regardless of the choice of X^(0) ∈ H^M. To this end, we note that (∀n ∈ N) |||Y^(n)||| ≤ |||X^(n)||| for any sequence (α^(n))_{n∈N} of nonnegative numbers:

Lemma 2.
The mapping Y^α in (26) satisfies

  (∀X ∈ H^M)(∀α ∈ R_+)  |||Y^α(X)||| ≤ |||X|||.   (29)

Proof: Let (∀m ∈ M) X_m = U_m diag({σ_i(X_m)}_{i∈N}) V_m^H denote the singular value decomposition of the m-th component matrix of X. According to (26), the m-th component matrix of Y^α(X) is given by Y^α(X)|_m = −U_m S_m V_m^H, where (∀m ∈ M) S_m = diag(min(σ_1(X_m), τ), σ_2(X_m), ..., σ_N(X_m)) with τ = ασ_max(X). Since (∀W ∈ H) ‖W‖² = Σ_{i∈N} σ_i(W)², we can write

  |||Y^α(X)|||² = Σ_{m=1}^{M} ‖S_m‖² ≤ Σ_{m=1}^{M} ‖X_m‖² = |||X|||²,

which concludes the proof. ∎

The following known result, which is a special case of [11, Lemma 5.31], will be used in Lemma 3 to prove that the proposed perturbations are bounded:
Fact 4.
Let (a^(n))_{n∈N}, (β^(n))_{n∈N}, and (γ^(n))_{n∈N} be sequences in R_+ such that Σ_{n∈N} β^(n) < ∞, Σ_{n∈N} γ^(n) < ∞, and (∀n ∈ N) a^(n+1) ≤ (1 + β^(n)) a^(n) + γ^(n). Then the sequence (a^(n))_{n∈N} converges.

Lemma 3.
Suppose that (β^(n))_{n∈N} is a summable sequence in [0, 1] and that (∀n ∈ N) α^(n) ≥ 0. Then the sequence of perturbations (β^(n) Y^(n))_{n∈N} with Y^(n) defined by (27) is bounded.

Proof:
We need to show that (∃R ∈ R)(∀n ∈ N) |||Y^(n)||| ≤ R. To this end, observe that (∀X^(n) ∈ H^M)(∀Z ∈ Fix(T⋆)) it holds that

  |||X^(n+1) − Z||| = |||T⋆(X^(n) + β^(n) Y^(n)) − Z|||
    (a)≤ |||X^(n) + β^(n) Y^(n) − Z|||
    (b)≤ |||X^(n) − Z||| + β^(n) |||Y^(n)|||,

where (a) follows from the nonexpansivity of T⋆, and (b) is a consequence of the triangle inequality. By Lemma 2, the perturbations defined in (27) satisfy (∀n ∈ N) |||Y^(n)||| ≤ |||X^(n)|||. Consequently, applying the triangle inequality again yields

  |||X^(n+1) − Z||| ≤ |||X^(n) − Z||| + β^(n) ( |||X^(n) − Z||| + |||Z||| ).

By defining (∀n ∈ N) a^(n) = |||X^(n) − Z||| and γ^(n) = β^(n) |||Z|||, we can deduce from Fact 4 that the sequence (a^(n))_{n∈N} converges. This implies that there exists r ∈ R such that (∀n ∈ N) |||X^(n) − Z||| ≤ r. Consequently, we have

  |||Y^(n)||| (a)≤ |||X^(n)||| (b)≤ |||X^(n) − Z||| + |||Z||| (c)≤ r + |||Z||| =: R,

where (a) follows from Lemma 2, (b) follows from the triangle inequality, and (c) follows from Fact 4. ∎

Combining Lemmas 1 and 3 shows that the proposed algorithm converges to a feasible point of the relaxed semidefinite program in (10). This is summarized in the following proposition.
Proposition 3.
The sequence produced by the algorithm in (12) with perturbations given by (27) is guaranteed to converge to a feasible point of Problem (10) for all X^(0) ∈ H^M if (β^(n))_{n∈N} is a summable sequence in [0, 1] and (α^(n))_{n∈N} is a sequence in R_+.

Proof:
Follows immediately from Lemma 1 and Lemma 3. ∎

D. Relation to the Superiorization Methodology
The authors of [6] define superiorization as follows: "The superiorization methodology works by taking an iterative algorithm, investigating its perturbation resilience, and then, using proactively such permitted perturbations, forcing the perturbed algorithm to do something useful in addition to what it is originally designed to do."
Although our proposed algorithm matches this informal definition, there are some slight differences to the formal definition in [6], where the perturbations are required to be nonascending vectors for a convex superiorization objective function.
Definition 4 (Nonascending Vectors [6]). Given a function φ : R^J → R and a point y ∈ R^J, a vector d ∈ R^J is said to be nonascending for φ at y iff ‖d‖ ≤ 1 and there is a δ > 0 such that for all λ ∈ [0, δ] we have φ(y + λd) ≤ φ(y).

In our case, the goal of superiorization is two-fold, in the sense that it is expressed by two separate functions f₁ : H^M → R and f₂ : H^M → R. While the function f₁ in (15) is convex, the function f₂ in (24) (i.e., the distance to the nonconvex rank constraint set R in (8)) is a nonconvex function. Moreover, we use perturbations that are not restricted to a unit ball, and therefore they are not necessarily nonascending vectors. However, as we have shown in Proposition 2, the proposed perturbations simultaneously reduce the values of f₁ and f₂. Keeping these slight distinctions in mind, we will refer to the proposed algorithm in (12) as Superiorized Projections onto Convex Sets.

E. Summary of the Proposed Algorithm
The proposed multi-group multicast beamforming algorithm is summarized in Algorithm 1. It is defined by the relaxation parameters µ₁, ..., µ_{K+2} of the operator T⋆ in (14), a scalar a ∈ (0, 1) controlling the decay of the power-reducing perturbations, and a scalar b ∈ (0, 1) controlling the decay of the sequence of perturbation scaling factors, i.e., (∀n ∈ N) α^(n) = aⁿ and β^(n) = bⁿ. The stopping criterion is based on a tolerance value ε > 0 and a maximum number n_max of iterations.

The arguments of the algorithm are the indices g₁, ..., g_K assigning a multicast group to each user, the channel vectors h₁, ..., h_K ∈ C^N, the SINR requirements γ₁, ..., γ_K, and the noise powers σ₁², ..., σ_K² of all users, as well as the per-antenna power constraints p₁, ..., p_N. At each step, the algorithm computes a perturbation according to (26) and applies the feasibility-seeking operator T⋆ in (14). It terminates when the relative variation of the estimate falls within the tolerance ε, or when the maximum number n_max of iterations is reached. Finally, the beamforming vectors w = {w_m}_{m∈M} are computed by extracting the strongest principal component

  (∀m ∈ M)  w_m = ψ(X_m) := sqrt(σ_1(X_m)) u_{m1},   (30)

where (∀m ∈ M) X_m = U_m Σ_m V_m^H, U_m = [u_{m1}, ..., u_{mN}], and Σ_m = diag(σ_1(X_m), ..., σ_N(X_m)).

Algorithm 1 Superiorized Projections onto Convex Sets
  Parameters: {µ_k}_{k=1}^{K+2}, a, b ∈ (0, 1), ε > 0, n_max ∈ N
  Input: {g_k}_{k∈K}, {h_k}_{k∈K}, {γ_k}_{k∈K}, {σ_k²}_{k∈K}, {p_i}_{i∈N}
  Output: {w_m ∈ C^N}_{m∈M}
  Initialization: choose arbitrary X^(0) ∈ H^M
  for n = 0, ..., n_max − 1 do
      Y^(n) ← Y^{aⁿ}(X^(n))                          ▷ Eq. (26)
      X^(n+1) ← T⋆(X^(n) + bⁿ Y^(n))                 ▷ Eq. (14)
      if |||X^(n+1) − X^(n)||| < ε |||X^(n+1)||| then
          break
      end if
  end for
  return w = {ψ(X_m^(n+1))}_{m∈M}                    ▷ Eq. (30)

IV. NUMERICAL RESULTS
In this section, we compare Algorithm 1 (S-POCS) to several other methods from the literature. We choose identical noise levels and target SINRs for all users, i.e., (∀k ∈ K) σ_k² = σ² and γ_k = γ. For each problem instance, we generate K i.i.d. Rayleigh-fading channels (∀k ∈ K) h_k ∼ CN(0, σ² I_N). In the first simulation, we drop the per-antenna power constraints, i.e., we set (∀i ∈ N) p_i = ∞, and we consider the following algorithms:
• The proposed method summarized in Algorithm 1 (S-POCS)
• Semidefinite relaxation with Gaussian randomization [3] (SDR-GauRan)
• The successive convex approximation algorithm from [23], [24] (FPP-SCA)
• The ADMM-based convex-concave procedure from [7] (CCP-ADMM)

The S-POCS algorithm is as described in Algorithm 1, with parameters a = 0. , b = 0. , ε = 10^− , and n_max = 10 . For the QoS constraint sets, we use relaxation parameters (∀k ∈ K) µ_k = 1. , and for the per-antenna power constraint set P and the PSD constraint set C_+, we use unrelaxed projections, i.e., µ_{K+1} = µ_{K+2} = 1. We initialize the S-POCS algorithm with X^(0) = 0. The convex optimization problems in the SDR-GauRan and FPP-SCA algorithms are solved with the interior point solver SDPT3 [25]. The parameters of the CCP-ADMM algorithm are as specified in [7]. Achieving a fair comparison between these methods is difficult because the structures of the respective algorithms are quite different.

The SDR-GauRan algorithm begins by solving the relaxed problem in (10) and subsequently generates random candidate beamforming vectors using the RandA method [2], [3]. In the multi-group setting, where M > 1, an additional convex optimization problem (multi-group multicast power control (MMPC) [3]) needs to be solved for each candidate vector. If no feasible MMPC problem is found during the RandA procedure, we define the output of the SDR-GauRan algorithm to be {ψ(X⋆_m)}_{m∈M}, where X⋆ ∈ H^M is a solution to the relaxed SDP in (10).

The FPP-SCA algorithm from [23] works by solving a sequence of convex subproblems. By introducing slack variables, the feasibility of each subproblem is ensured. This obviates the need for a feasible initialization point, which is typically required to ensure convergence of CCP/SCA algorithms.

The CCP-ADMM algorithm uses an ADMM algorithm to find a feasible starting point for the CCP. Subsequently, a similar ADMM algorithm is used to approximate each subproblem of the CCP. Because the ADMM is a first-order method, the performance of CCP-ADMM depends heavily on the stopping criteria of the inner ADMM algorithm.

By contrast, the S-POCS algorithm does not require an initialization phase, and it works by iteratively applying a sequence of operators that can be computed in a fixed number of steps. Therefore, we compare the performance based on computation time. Although we exclude the time required for evaluating the performance, we note that the computation time required by each of the methods depends severely on the particular implementation.

The authors of [7] assess the performance of the considered algorithms by comparing the transmit power achieved by the resulting beamformers. However, none of the methods considered here can guarantee feasibility of the beamforming vectors when the algorithms are terminated after a finite number of iterations. Furthermore, in the multi-group case, it may not be possible to scale an arbitrary candidate beamformer w = {w_m ∈ C^N}_{m∈M} such that it satisfies all constraints in Problem (1). In principle, we could evaluate the performance by observing both the objective value (i.e., the transmit power of the beamformers) and a measure of constraint violation such as the normalized proximity function used in [26]. However, defining this measure of constraint violation is not straightforward, as the considered methods approach the problem in different spaces. Moreover, we are interested in expressing the quality of a beamforming vector by a single value to simplify the presentation. Therefore, we compare the performance based on the minimal SINR achieved by the scaled beamformer sqrt(ρ(w))·w with

  ρ(w) = min( P⋆_SDR / Σ_{m=1}^{M} w_m^H w_m , min_{i∈N} p_i / Σ_{m=1}^{M} |w_{im}|² ).

The scaled vector sqrt(ρ(w))·w satisfies all power constraints, and its total power is bounded by the optimal objective value P⋆_SDR of the relaxed SDP in (10). More compactly, given a candidate beamformer w = {w_m ∈ C^N}_{m∈M} for Problem (1), we assess its performance based on the function

  SINR_min^ρ(w) = min_{k∈K} |w_{g_k}^H h_k|² / ( Σ_{l≠g_k} |w_l^H h_k|² + σ_k²/ρ(w) ).   (31)

Since P⋆_SDR is a lower bound on the objective value of the original problem in (1), it holds (∀{w_m ∈ C^N}_{m∈M}) that SINR_min^ρ(w) ≤ γ, where equality can only be achieved if the relaxed problem in (10) has a solution composed of rank-one matrices. For the sake of simplicity, we will refer to the minimal SINR achieved by the scaled beamformer sqrt(ρ(w))·w in (31) simply as SINR in the following.
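For concreteness, the scaling ρ(w) and the metric (31) can be computed as follows (our sketch with toy data; the function and variable names are ours):

```python
import numpy as np

# Our sketch of the evaluation metric in (31) with toy data: scale a
# candidate beamformer by rho(w) so it meets the SDR power bound and
# the per-antenna budgets, then report the minimum SINR over all users.
def sinr_min_rho(W, g, H, sigma2, P_sdr, p_ant):
    # W: N x M beamformers, H: N x K channels, g[k]: group of user k
    total = np.sum(np.abs(W) ** 2)                 # sum_m w_m^H w_m
    per_ant = np.sum(np.abs(W) ** 2, axis=1)       # sum_m |w_im|^2
    rho = min(P_sdr / total, np.min(p_ant / per_ant))
    sinr = np.inf
    for k in range(H.shape[1]):
        gains = np.abs(W.conj().T @ H[:, k]) ** 2  # |w_m^H h_k|^2
        interf = gains.sum() - gains[g[k]]
        sinr = min(sinr, gains[g[k]] / (interf + sigma2 / rho))
    return sinr

rng = np.random.default_rng(4)
N, K, M = 4, 6, 2
H = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
W = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
g = np.arange(K) % M

val = sinr_min_rho(W, g, H, sigma2=1.0, P_sdr=1.0, p_ant=np.ones(N))
# The metric is invariant to rescaling w, since rho(w) absorbs the scale:
print(val > 0, np.isclose(sinr_min_rho(2 * W, g, H, 1.0, 1.0, np.ones(N)), val))
```

The invariance to rescaling is what makes the metric a fair single-number summary of a candidate beamformer.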
A. Performance vs. Computation Time
We will now examine how the performance metric in (31) evolves over time for beamforming vectors produced by the respective algorithms. Figure 1 shows the performance comparison for an exemplary scenario with N = 20 antennas and K = 20 users split evenly into M = 2 groups, where σ² = 1, γ = 1, and (∀i ∈ N) p_i = ∞.

Fig. 1. SINR_min^ρ(w^(t)) over time in a system with N = 20 antennas and K = 20 users split evenly into M = 2 multicast groups.

It can be seen that the
S-POCS algorithm quickly converges to a point achieving an SINR close to the specified target value γ. The discontinuities in the SINR curve for the CCP-ADMM algorithm are due to the inner and outer optimization loops. For the SDR-GauRan algorithm, the SINR increases whenever the randomization produces a beamformer with better performance than the previous one. The SINR of the FPP-SCA algorithm improves continuously, albeit more slowly than that of the S-POCS and CCP-ADMM algorithms.
Fig. 2. SINR_min^ρ(w^(t)) over time in a system with N = 20 antennas and K = 20 users split evenly into M = 2 multicast groups. The shaded regions include the outcomes for decreasing numbers of the 100 problem instances, and the bold line represents the median.

Next, we evaluate the performance over 100 randomly generated problems. Since the SINR does not increase monotonically for all of the methods considered, we assume that each algorithm can keep track of the best beamformer produced so far. In this way, the oscillations in the SINR metric for the CCP-ADMM algorithm do not have a negative impact on its average performance. Figure 2 shows the performance of the beamforming vectors computed with the respective algorithms over time for a system with N = 20 transmit antennas and K = 20 users split evenly into M = 2 multicast groups. The shaded regions correspond to quantile ranges over all randomly generated problems. More precisely, the margins of the shaded regions correspond to the 1st, 13th, 26th, 38th, 63rd, 75th, 88th, and 100th out of 100 sorted y-axis values. For each algorithm, the median is represented by a bold line. The S-POCS algorithm achieves the highest median SINR while requiring the lowest computation time among all methods considered. Moreover, the variation around this median value is less severe compared to the remaining approaches. Put differently, the time required for reaching a certain SINR varies much less for the S-POCS algorithm than for the remaining methods. This can be of particular interest in delay-sensitive applications, where a beamforming vector for a given channel realization must be computed within a fixed time period.
B. Varying number of antennas
In this subsection, we investigate the impact of the transmit antenna array size N on the performance of the respective beamforming algorithms. To do so, we generate 100 random problem instances for each array size N with K = 20 users split evenly into M = 2 multicast groups. We choose unit target SINR, unit noise power for all users, and unit per-antenna power constraints, i.e., γ = 1, σ² = 1, and (∀i ∈ N) p_i = 1.

Fig. 3. SINR_min^ρ(w) for K = 20 users split evenly into M = 2 groups for varying antenna array sizes N.

For the SDR-GauRan algorithm, we generate candidate beamforming vectors for each problem instance. We use the CCP-ADMM algorithm with parameters as specified in [7]. Since the inner ADMM iteration converges slowly for some problem instances, we set the maximal number of steps of the ADMM to j_max = 300. For the outer CCP loop, we use the stopping criteria specified in [7], i.e., we stop the algorithm once the relative decrease of the objective value falls below the specified tolerance or t_max = 30 outer iterations are exceeded. For the FPP-SCA algorithm, we use a fixed number of successive convex approximation steps.
Fig. 4. Computation time for K = 20 users split evenly into M = 2 groups for varying antenna array sizes N.

Figure 3 shows the performance metric in (31) for different numbers N of transmit antennas, averaged over 100 random problem instances each. For all N, S-POCS achieves the highest value of SINR_min^ρ(·), followed by the FPP-SCA, CCP-ADMM, and SDR-GauRan algorithms. For N above a certain size, the S-POCS algorithm achieves an SINR of SINR_min^ρ(w_S-POCS) ≥ −.05 dB. By contrast, the remaining methods do not exceed SINR_min^ρ(w_FPP-SCA) = −.12 dB, SINR_min^ρ(w_CCP-ADMM) ≥ −.15 dB, and SINR_min^ρ(w_SDR-GauRan) ≥ −.18 dB, respectively. The corresponding average computation times are shown in Figure 4. The S-POCS algorithm requires .26 %–.38 % of the computation time required by SDR-GauRan, .95 %–.64 % of the computation time required by FPP-SCA, and .49 %– of the computation time required by CCP-ADMM. For N above a certain size, the computation time of S-POCS exceeds that of CCP-ADMM.

C. Varying number of users
In the following simulation, we fix an array size of N = 50 antenna elements, and we evaluate the performance of each method for several numbers K of users split evenly into M = 4 multicast groups. Figure 5 shows the performance metric in (31) averaged over the random problem instances generated for each K. As before, we choose γ = 1, σ² = 1, and (∀i ∈ N) p_i = 1. While all algorithms achieve close to optimal performance for small numbers of users, the SINR in (31) decreases considerably faster for SDR-GauRan than for the remaining methods. For all values of K, S-POCS achieves the highest value of SINR_min^ρ(·) among all methods.

Fig. 5. SINR_min^ρ(w) for a system with N = 50 transmit antennas and a varying number of users split evenly into M = 4 multicast groups.

The corresponding average computation times are shown in Figure 6.
S-POCS requires .76 %–.12 % of the computation time required by SDR-GauRan, .75 %–.41 % of the computation time required by FPP-SCA, and .18 %– of the computation time required by CCP-ADMM. While the CCP-ADMM takes only a fraction of the time required by S-POCS for small K, it slows down considerably as K increases. For moderate and large numbers of users, S-POCS outperforms the remaining methods in terms of both approximation gap and computation time.

Fig. 6. Computation time for a system with N = 50 transmit antennas and a varying number of users split evenly into M = 4 multicast groups.

D. Varying Target SINR
In the following simulation, we evaluate the impact of the target SINR on the respective algorithms in a system with N = 30 antenna elements, K = 20 users split evenly into M = 2 multicast groups, and unit noise power σ² = 1. Since the target SINR has a strong impact on the transmit power, we set (∀i ∈ N) p_i = ∞ to avoid generating infeasible instances of Problem (1).

Fig. 7. SINR_min^ρ(w) for a system with N = 30 transmit antennas and K = 20 users split evenly into M = 2 multicast groups.
SDR-GauRan algorithm, which exhibits a gap ofabout to the target SINR, all methods achieve close tooptimal performance for each target SINR. Figure 8 showsthe computation time required by each algorithm for varyingtarget SINR γ . The average computation time of FPP-SCA is almost constant. For
SDR-GauRan and
CCP-ADMM , thecomputation decreases slightly with an increasing target SINR.While the proposed
S-POCS algorithm converges quickly forlow target SINR levels, its computation time exceeds that ofthe
CCP-ADMM for target SINRs above . This indicatesthat the best choice of first-order algorithms for multicastbeamforming depends on the regime in which the system isoperated.
Target SINR [dB] c o m pu t a t i on t i m e [ s ] SDR-GauRanFPP-SCACCP-ADMMS-POCS
Fig. 8. Computation time for a system with N = 30 transmit antennas and K = 20 users split evenly into M = 2 multicast groups. V. C
ONCLUSION
In this paper, we proposed an algorithm for multi-group multicast beamforming with per-antenna power constraints. We showed that the sequence produced by this algorithm is guaranteed to converge to a feasible point of the relaxed semidefinite program, while the perturbations added in each iteration reduce the objective value and the distance to the nonconvex rank constraints. Numerical comparisons show that the proposed method outperforms state-of-the-art algorithms in terms of both approximation gap and computation time in many cases. Its advantage over existing algorithms is particularly pronounced in the low target SINR regime, as well as for large numbers of receivers. This makes the proposed method particularly relevant for low-energy or massive access applications.

In comparison to other techniques, the computation time of the proposed method varies less severely across different problem instances of the same dimension. In communication systems, which are typically subject to strict latency constraints, the iteration can be terminated after a fixed number of steps without suffering a severe performance loss. Moreover, the simple structure of the proposed method allows for a straightforward implementation in real-world systems.

The applicability of the proposed algorithm is not restricted to the multicast beamforming problem considered here. A slight modification of the rank constraint naturally leads to an algorithm for the general-rank multicast beamforming problem considered in [27]. Future research could apply superiorized projections onto convex sets to other nonconvex QCQP problems, such as MIMO detection or sensor network localization [14].

VI. APPENDIX
Remark 5. The function ⟨·, ·⟩ defined in (3) is a real inner product.

Proof:
Given a real vector space V, a real inner product is a function ⟨·, ·⟩ : V × V → ℝ satisfying [28]

1) (∀x ∈ V) ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 ⟺ x = 0,
2) (∀x, y ∈ V) ⟨x, y⟩ = ⟨y, x⟩,
3) (∀x, y ∈ V)(∀α ∈ ℝ) ⟨αx, y⟩ = α⟨x, y⟩,
4) (∀x, y, z ∈ V) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩.

Note that (∀X ∈ V) Re{tr(X^H X)} = tr(X^H X) = ‖X‖_F², where ‖·‖_F is the standard Frobenius norm. Consequently, 1) follows from the nonnegativity and positive definiteness of a norm. The symmetry in 2) follows from the fact that tr(AB) = tr(BA) for matrices A, B with compatible dimensions, and Re{tr(X)} = Re{tr(X^H)} for X ∈ V. Moreover, 3) and 4) follow from the linearity of Re{·} and tr(·).

REFERENCES

[1] Yang Zheng, Shengbo Eben Li, Jianqiang Wang, Dongpu Cao, and Keqiang Li, "Stability and scalability of homogeneous vehicular platoon: Study on the influence of information flow topologies," IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 1, pp. 14–26, 2015.
[2] Nikos D. Sidiropoulos, Timothy N. Davidson, and Zhi-Quan Luo, "Transmit beamforming for physical-layer multicasting," IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 2239–2251, 2006.
[3] Eleftherios Karipidis, Nicholas D. Sidiropoulos, and Zhi-Quan Luo, "Quality of service and max-min fair transmit beamforming to multiple cochannel multicast groups," IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1268–1279, 2008.
[4] Charles F. Van Loan and Gene H. Golub, Matrix Computations, Johns Hopkins University Press, Baltimore, 1983.
[5] Gabor T. Herman, Edgar Garduño, Ran Davidi, and Yair Censor, "Superiorization: An optimization heuristic for medical physics," Medical Physics, vol. 39, no. 9, pp. 5532–5546, 2012.
[6] Yair Censor, "Weak and strong superiorization: Between feasibility-seeking and minimization," Analele Universitatii "Ovidius" Constanta - Seria Matematica, vol. 23, no. 3, pp. 41–54, 2015.
[7] Erkai Chen and Meixia Tao, "ADMM-based fast algorithm for multi-group multicast beamforming in large-scale wireless systems," IEEE Transactions on Communications, vol. 65, no. 6, pp. 2685–2698, 2017.
[8] Henry Stark and Yongyi Yang, Vector Space Projections: A Numerical Approach to Signal and Image Processing, Neural Nets, and Optics, John Wiley & Sons, Inc., 1998.
[9] Heinz H. Bauschke, Patrick L. Combettes, and D. Russell Luke, "Phase retrieval, error reduction algorithm, and Fienup variants: A view from convex optimization," JOSA A, vol. 19, no. 7, pp. 1334–1345, 2002.
[10] Isao Yamada and Nobuhiko Ogura, "Adaptive projected subgradient method for asymptotic minimization of sequence of nonnegative convex functions," Numerical Functional Analysis and Optimization, vol. 25, no. 7-8, pp. 593–617, 2005.
[11] Heinz H. Bauschke and Patrick L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2nd edition, 2011.
[12] Hongjin He and Hong-Kun Xu, "Perturbation resilience and superiorization methodology of averaged mappings," Inverse Problems, vol. 33, no. 4, p. 044007, 2017.
[13] Isao Yamada, Masahiro Yukawa, and Masao Yamagishi, "Minimizing the Moreau envelope of nonsmooth convex functions over the fixed point set of certain quasi-nonexpansive mappings," in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp. 345–390, Springer, 2011.
[14] Zhi-Quan Luo, Wing-Kin Ma, Anthony Man-Cho So, Yinyu Ye, and Shuzhong Zhang, "Semidefinite relaxation of quadratic optimization problems," IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 20–34, 2010.
[15] Yurii Nesterov, Lectures on Convex Optimization, vol. 137, Springer, 2018.
[16] Yair Censor, Ran Davidi, Gabor T. Herman, Reinhard W. Schulte, and Luba Tetruashvili, "Projected subgradient minimization versus superiorization," Journal of Optimization Theory and Applications, vol. 160, no. 3, pp. 730–747, 2014.
[17] Israel Halperin, "The product of projection operators," Acta Sci. Math. (Szeged), vol. 23, no. 1, pp. 96–99, 1962.
[18] Omer Ginat, "The method of alternating projections," arXiv preprint arXiv:1809.05858, 2018.
[19] Paul J. Goulart, Yuji Nakatsukasa, and Nikitas Rontsis, "Accuracy of approximate projection to the semidefinite cone," Linear Algebra and its Applications, vol. 594, pp. 177–192, 2020.
[20] Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press, 2nd edition, 2013.
[21] Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
[22] D. Russell Luke, "Prox-regularity of rank constraint sets and implications for algorithms," Journal of Mathematical Imaging and Vision, vol. 47, no. 3, pp. 231–238, 2013.
[23] Omar Mehanna, Kejun Huang, Balasubramanian Gopalakrishnan, Aritra Konar, and Nicholas D. Sidiropoulos, "Feasible point pursuit and successive approximation of non-convex QCQPs," IEEE Signal Processing Letters, vol. 22, no. 7, pp. 804–808, 2014.
[24] Dimitrios Christopoulos, Symeon Chatzinotas, and Björn Ottersten, "Multicast multigroup beamforming for per-antenna power constrained large-scale arrays," IEEE, 2015, pp. 271–275.
[25] Kim-Chuan Toh, Michael J. Todd, and Reha H. Tütüncü, "SDPT3—a MATLAB software package for semidefinite programming, version 1.3," Optimization Methods and Software, vol. 11, no. 1-4, pp. 545–581, 1999.
[26] Yair Censor, Wei Chen, Patrick L. Combettes, Ran Davidi, and Gabor T. Herman, "On the effectiveness of projection methods for convex feasibility problems with linear inequality constraints," Computational Optimization and Applications, vol. 51, no. 3, pp. 1065–1088, 2012.
[27] Dima Taleb and Marius Pesavento, "General rank beamforming using full rate real-value OSTBC for multicasting networks," in WSA 2020; 24th International ITG Workshop on Smart Antennas, VDE, 2020, pp. 1–5.
[28] P. K. Jain, O. P. Ahuja, and Khalil Ahmad,