Tightness and Equivalence of Semidefinite Relaxations for MIMO Detection
RUICHEN JIANG∗, YA-FENG LIU†, CHENGLONG BAO‡, AND BO JIANG§

Abstract.
The multiple-input multiple-output (MIMO) detection problem, a fundamental problem in modern digital communications, is to detect a vector of transmitted symbols from the noisy outputs of a fading MIMO channel. The maximum likelihood detector can be formulated as a complex least-squares problem with discrete variables, which is NP-hard in general. Various semidefinite relaxation (SDR) methods have been proposed in the literature to solve the problem due to their polynomial-time worst-case complexity and good detection error rate performance. In this paper, we consider two popular classes of SDR-based detectors and study the conditions under which the SDRs are tight and the relationship between different SDR models. For the enhanced complex and real SDRs proposed recently by Lu et al., we refine their analysis and derive the necessary and sufficient condition for the complex SDR to be tight, as well as a necessary condition for the real SDR to be tight. In contrast, we also show that another SDR proposed by Mobasher et al. is not tight with high probability under mild conditions. Moreover, we establish a general theorem that shows the equivalence between two subsets of positive semidefinite matrices in different dimensions by exploiting a special "separable" structure in the constraints. Our theorem recovers two existing equivalence results of SDRs defined in different settings and has the potential to find other applications due to its generality.
Key words.
MIMO detection, semidefinite relaxation, tight relaxation, equivalent relaxation
AMS subject classifications.
1. Introduction.
Multiple-input multiple-output (MIMO) detection is a fundamental problem in modern digital communications [33, 36]. The MIMO channel can be modeled as
\[ \mathbf{r} = \mathbf{H}\mathbf{x}^* + \mathbf{v}, \tag{1.1} \]
where $\mathbf{r} \in \mathbb{C}^m$ is the vector of received signals, $\mathbf{H} \in \mathbb{C}^{m \times n}$ is a complex channel matrix, $\mathbf{x}^*$ is the vector of transmitted symbols, and $\mathbf{v}$ is the vector of additive Gaussian noises. Moreover, each entry of $\mathbf{x}^*$ is drawn from a discrete symbol set $\mathcal{S}$ determined by the modulation scheme.

The MIMO detection problem is to recover the transmitted symbol vector $\mathbf{x}^*$ from the noisy channel output $\mathbf{r}$, given the symbol set $\mathcal{S}$ and the channel matrix $\mathbf{H}$. Under the assumption that each entry of $\mathbf{x}^*$ is drawn uniformly and independently from the symbol set $\mathcal{S}$, it is known that the maximum likelihood detector achieves the optimal detection error rate performance. Mathematically, it can be formulated as a discrete least-squares problem:
\[
\begin{aligned}
\min_{\mathbf{x} \in \mathbb{C}^n} \quad & \|\mathbf{H}\mathbf{x} - \mathbf{r}\|_2^2 \\
\text{s.t.} \quad & x_i \in \mathcal{S}, \quad i = 1, 2, \dots, n,
\end{aligned}
\tag{1.2}
\]
where $x_i$ denotes the $i$-th entry of the vector $\mathbf{x}$ and $\|\cdot\|_2$ denotes the Euclidean norm.

∗ Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.
† State Key Laboratory of Scientific and Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.
‡ Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China.
§ School of Mathematical Sciences, Key Laboratory for NSLSCS of Jiangsu Province, Nanjing Normal University, Nanjing 210023, China.
arXiv [math.OC], Feb.
In this paper, unless otherwise specified, we will focus on the $M$-ary phase shift keying ($M$-PSK) modulation, whose symbol set is given by
\[ \mathcal{S}_M := \big\{ z \in \mathbb{C} : |z| = 1, \ \arg(z) \in \{ 2j\pi/M, \ j = 0, 1, \dots, M-1 \} \big\}, \tag{1.3} \]
where $|z|$ and $\arg(z)$ denote the modulus and argument of a complex number, respectively. As in most practical digital communication systems, throughout the paper we require $M = 2^b$ where $b \ge 1$.

Many detection algorithms have been proposed to solve problem (1.2) either exactly or approximately. However, for general $\mathbf{H}$ and $\mathbf{r}$, problem (1.2) has been proved to be NP-hard [32]. Hence, no polynomial-time algorithm can find the exact solution (unless P = NP). Sphere decoding [4], a classical combinatorial algorithm based on the branch-and-bound paradigm, offers an efficient way to solve problem (1.2) exactly when the problem size is small, but its expected complexity is still exponential [11]. On the other hand, some suboptimal algorithms such as linear detectors [23, 6] and decision-feedback detectors [35, 5] enjoy low complexity but at the expense of substantial performance loss; see [36] for an excellent review.

Over the past two decades, semidefinite relaxation (SDR) has gained increasing attention in non-convex optimization [7, 18, 34]. It is a celebrated technique to tackle quadratic optimization problems arising from various signal processing and wireless communication applications, such as beamforming design [24, 15], sensor network localization [3, 2, 27], and angular synchronization [25, 1, 37]. Such SDR-based approaches can usually offer superior performance in both theory and practice while maintaining polynomial-time worst-case complexity.

For MIMO detection problem (1.2), the first SDR detector [30, 20] was designed for the real MIMO channel and the binary symbol set $\mathcal{S} = \{+1, -1\}$.
Notably, it is proved that this detector can achieve the maximal possible diversity order [12], meaning that it achieves an asymptotically optimal detection error rate when the signal-to-noise ratio (SNR) is high. It was later extended to the more general setting with a complex channel and an $M$-PSK symbol set in [28, 19], which we refer to as the conventional SDR or (CSDR). However, this conventional approach fails to fully utilize the structure in the symbol set $\mathcal{S}$. To overcome this issue, researchers have developed various improved SDRs, and we consider the two most popular classes below. The first class, proposed in [22], is based on an equivalent zero-one integer programming formulation of problem (1.2). Four SDR models were introduced, and two of them will be discussed in detail later (see (ESDR1-T) and (ESDR2-T) further ahead). The second class, proposed in [17], further enhances (CSDR) by adding valid cuts, resulting in a complex SDR and a real SDR (see (ESDR-X) and (ESDR-Y) later on).

In this paper, we focus on two key problems in SDR-based MIMO detection: the tightness of SDRs and the relationship between different SDR models. Firstly, note that SDR detectors are suboptimal algorithms as they replace the original discrete optimization problem (1.2) with tractable semidefinite programs (SDPs). Hence, after solving an SDP, we need some rounding procedure to make final symbol decisions. However, under some favorable conditions on $\mathbf{H}$ and $\mathbf{v}$, an SDR can be tight, i.e., it has an optimal rank-one solution corresponding to the true vector of transmitted symbols. Such tightness conditions are of great interest since they give theoretical guarantees on the optimality of SDR detectors.
While tightness has been well studied for the simple case where $\mathbf{H} \in \mathbb{R}^{m \times n}$, $\mathbf{v} \in \mathbb{R}^m$, and $\mathcal{S} = \{+1, -1\}$, much less is known for the more general case where $\mathbf{H} \in \mathbb{C}^{m \times n}$, $\mathbf{v} \in \mathbb{C}^m$, and $\mathcal{S} = \mathcal{S}_M$ ($M \ge 4$). For this case, the authors in [17] proved that their enhanced complex SDR is tight if the following condition is satisfied:
\[ \lambda_{\min}(\mathbf{H}^\dagger \mathbf{H}) \sin\left(\frac{\pi}{M}\right) > \|\mathbf{H}^\dagger \mathbf{v}\|_\infty, \tag{1.4} \]
where $\lambda_{\min}(\cdot)$ denotes the smallest eigenvalue of a matrix, $(\cdot)^\dagger$ denotes the conjugate transpose, and $\|\cdot\|_\infty$ denotes the $L_\infty$-norm. To the best of our knowledge, this is the best condition that guarantees a certain SDR to be tight for problem (1.2) in the $M$-PSK settings.

Footnote (on the requirement $M = 2^b$ stated after (1.3)): Our results in section 3 also hold for the more general case where $M$ is a multiple of four.
Footnote (on the tightness result of [17]): The definition of tightness in [17] is slightly different from ours since they also require the optimal solution of the SDR to be unique.

Secondly, researchers have noticed some rather unexpected equivalence between different SDR models independently developed in the literature. The earliest such result is reported in [21], where three different SDRs for the high-order quadrature amplitude modulation (QAM) symbol sets are proved to be equivalent. Very recently, the authors in [16] showed that the enhanced real SDR proposed in [17] is equivalent to one SDR model in [22]. It is worth noting that while these two papers are of the same nature, the proof techniques are quite different and it is unclear how to generalize their results at present.

In this paper, we make contributions to both problems. For the tightness of SDRs, we sharpen the analysis in [17] to give the necessary and sufficient condition for the complex enhanced SDR to be tight, and a necessary condition for the real enhanced SDR to be tight. Specifically, for the case where $M \ge 4$, we show that the enhanced complex SDR (ESDR-X) is tight if and only if
\[ \mathbf{H}^\dagger \mathbf{H} + \mathrm{Diag}\big(\mathrm{Re}(\mathrm{Diag}(\mathbf{x}^*)^{-1} \mathbf{H}^\dagger \mathbf{v})\big) - \cot\left(\frac{\pi}{M}\right) \mathrm{Diag}\big(|\mathrm{Im}(\mathrm{Diag}(\mathbf{x}^*)^{-1} \mathbf{H}^\dagger \mathbf{v})|\big) \succeq \mathbf{0}, \tag{1.5} \]
while the enhanced real SDR (ESDR-Y) is tight only if
\[ \mathbf{H}^\dagger \mathbf{H} + \mathrm{Diag}\big(\mathrm{Re}(\mathrm{Diag}(\mathbf{x}^*)^{-1} \mathbf{H}^\dagger \mathbf{v})\big) - \cot\left(\frac{2\pi}{M}\right) \mathrm{Diag}\big(|\mathrm{Im}(\mathrm{Diag}(\mathbf{x}^*)^{-1} \mathbf{H}^\dagger \mathbf{v})|\big) \succeq \mathbf{0}, \tag{1.6} \]
where $\mathbf{A} \succeq \mathbf{0}$ means that the matrix $\mathbf{A}$ is positive semidefinite (PSD), $\mathrm{Diag}(\mathbf{x})$ denotes a diagonal matrix whose diagonals are the vector $\mathbf{x}$, and $\mathrm{Re}(\cdot)$, $\mathrm{Im}(\cdot)$, and $|\cdot|$ denote the entrywise real part, imaginary part, and absolute value of a number/vector/matrix, respectively. Moreover, we prove that one of the SDR models proposed in [22] is generally not tight: under some mild assumptions, its probability of being tight decays exponentially with respect to the number of transmitted symbols $n$.

For the relationship between different SDR models, we propose a general theorem showing the equivalence between two subsets of PSD cones. Specifically, we prove the correspondence between a subset of a high-dimensional PSD cone with a special "separable" structure and the one in a lower dimension. Our theorem covers both equivalence results in [21] and [16] as special cases, and has the potential to find other applications due to its generality.

The paper is organized as follows. We introduce the existing SDRs for (1.2) in section 2 and analyze their tightness in section 3. In section 4, we propose a general theorem that establishes the equivalence between two subsets of PSD cones in different dimensions, and discuss how our theorem implies previous results. Section 5 provides some numerical results to validate our analysis. Finally, section 6 concludes the paper.

We summarize some standard notations used in this paper. We use $x_i$ to denote the $i$-th entry of a vector $\mathbf{x}$ and $X_{i,j}$ to denote the $(i,j)$-th entry of a matrix $\mathbf{X}$. We use $|\cdot|$, $\|\cdot\|_2$, and $\|\cdot\|_\infty$ to denote the entrywise absolute value, the Euclidean norm, and the $L_\infty$ norm of a vector, respectively. For a given number/vector/matrix, we use $(\cdot)^\dagger$ to denote the conjugate transpose, $(\cdot)^T$ to denote the transpose, and $\mathrm{Re}(\cdot)$/$\mathrm{Im}(\cdot)$ to denote the entrywise real/imaginary part. We use $\mathrm{Diag}(\mathbf{x})$ to denote the diagonal matrix whose diagonals are the vector $\mathbf{x}$, and $\mathrm{diag}(\mathbf{X})$ to denote the vector whose entries are the diagonals of the matrix $\mathbf{X}$. Given an $m \times n$ matrix $\mathbf{A}$ and the index sets $\alpha \subset \{1, 2, \dots, m\}$ and $\beta \subset \{1, 2, \dots, n\}$, we use $\mathbf{A}[\alpha, \beta]$ to denote the submatrix with entries in the rows of $\mathbf{A}$ indexed by $\alpha$ and the columns indexed by $\beta$. Moreover, we denote the principal submatrix $\mathbf{A}[\alpha, \alpha]$ by $\mathbf{A}[\alpha]$ in short. For two matrices $\mathbf{A}$ and $\mathbf{B}$ of appropriate size, $\langle \mathbf{A}, \mathbf{B} \rangle := \mathrm{Re}(\mathrm{Tr}(\mathbf{A}^\dagger \mathbf{B}))$ denotes the inner product, $\mathbf{A} \otimes \mathbf{B}$ denotes the Kronecker product, and $\mathbf{A} \succeq \mathbf{B}$ means $\mathbf{A} - \mathbf{B}$ is PSD. For a set $\mathcal{A}$ in a vector space, we use $\mathrm{conv}(\mathcal{A})$ to denote its convex hull. For a random variable $X$ and measurable sets $\mathcal{B}$ and $\mathcal{C}$, $\mathrm{Prob}(X \in \mathcal{B})$ denotes the probability of the event $\{X \in \mathcal{B}\}$, $\mathrm{Prob}(X \in \mathcal{B} \mid \mathcal{C})$ denotes the conditional probability given $\mathcal{C}$, and $\mathbb{E}[X]$ denotes the expectation of $X$. Finally, the symbols $\mathrm{i}$, $\mathbf{1}_n$, $\mathbf{I}_n$, and $\mathbb{S}^n_+$ represent the imaginary unit, the $n \times 1$ all-one vector, the $n \times n$ identity matrix, and the $n$-dimensional PSD cone, respectively.
2. Review of semidefinite relaxations.
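In this paper, we focus on the $M$-PSK setting with the symbol set $\mathcal{S}_M$ given in (1.3). To simplify the notations, we let $\mathbf{s} \in \mathbb{C}^M$ be the vector of all symbols, where
\[ s_j = e^{\mathrm{i}\theta_j} \quad \text{and} \quad \theta_j = \frac{(j-1)2\pi}{M}, \quad j = 1, 2, \dots, M, \]
and further we let $\mathbf{s}_R = \mathrm{Re}(\mathbf{s})$ and $\mathbf{s}_I = \mathrm{Im}(\mathbf{s})$.

The objective in (1.2) can be written as
\[ \|\mathbf{H}\mathbf{x} - \mathbf{r}\|_2^2 = \mathbf{x}^\dagger \mathbf{Q} \mathbf{x} + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) + \mathbf{r}^\dagger \mathbf{r} = \langle \mathbf{Q}, \mathbf{x}\mathbf{x}^\dagger \rangle + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) + \mathbf{r}^\dagger \mathbf{r}, \]
where we define
\[ \mathbf{Q} = \mathbf{H}^\dagger \mathbf{H} \quad \text{and} \quad \mathbf{c} = -\mathbf{H}^\dagger \mathbf{r}. \tag{2.1} \]
By introducing $\mathbf{X} = \mathbf{x}\mathbf{x}^\dagger$ and discarding the constant $\mathbf{r}^\dagger \mathbf{r}$, we can reformulate (1.2) as
\[
\begin{aligned}
\min_{\mathbf{x},\, \mathbf{X}} \quad & \langle \mathbf{Q}, \mathbf{X} \rangle + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) \\
\text{s.t.} \quad & X_{i,i} = 1, \quad i = 1, 2, \dots, n, \\
& x_i \in \mathcal{S}_M, \quad i = 1, 2, \dots, n, \\
& \mathbf{X} = \mathbf{x}\mathbf{x}^\dagger,
\end{aligned}
\tag{2.2}
\]
where the constraint $X_{i,i} = 1$ comes from $X_{i,i} = |x_i|^2 = 1$. The conventional SDR (CSDR) in [28, 19] simply drops the discrete symbol constraints $x_i \in \mathcal{S}_M$ and relaxes the rank-one constraint $\mathbf{X} = \mathbf{x}\mathbf{x}^\dagger$ to $\mathbf{X} \succeq \mathbf{x}\mathbf{x}^\dagger$, resulting in the following relaxation:
\[
\begin{aligned}
\min_{\mathbf{x},\, \mathbf{X}} \quad & \langle \mathbf{Q}, \mathbf{X} \rangle + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) \\
\text{s.t.} \quad & X_{i,i} = 1, \quad i = 1, 2, \dots, n, \\
& \mathbf{X} \succeq \mathbf{x}\mathbf{x}^\dagger,
\end{aligned}
\tag{CSDR}
\]
where $\mathbf{x} \in \mathbb{C}^n$ and $\mathbf{X} \in \mathbb{C}^{n \times n}$. Since $\mathbf{X} \succeq \mathbf{x}\mathbf{x}^\dagger$ is equivalent to
\[ \begin{bmatrix} 1 & \mathbf{x}^\dagger \\ \mathbf{x} & \mathbf{X} \end{bmatrix} \succeq \mathbf{0}, \]
the above (CSDR) is an SDP on the complex domain. Moreover, for the simple case where $\mathbf{H} \in \mathbb{R}^{m \times n}$, $\mathbf{v} \in \mathbb{R}^m$, and $M = 2$, a real SDR similar to (CSDR) has the form:
\[
\begin{aligned}
\min_{\mathbf{x},\, \mathbf{X}} \quad & \langle \mathbf{Q}, \mathbf{X} \rangle + 2\mathbf{c}^T \mathbf{x} \\
\text{s.t.} \quad & X_{i,i} = 1, \quad i = 1, 2, \dots, n, \\
& \mathbf{X} \succeq \mathbf{x}\mathbf{x}^T,
\end{aligned}
\tag{2.3}
\]
where $\mathbf{x} \in \mathbb{R}^n$, $\mathbf{X} \in \mathbb{R}^{n \times n}$, and we redefine $\mathbf{Q} = \mathbf{H}^T \mathbf{H}$ and $\mathbf{c} = -\mathbf{H}^T \mathbf{r}$ (cf. (2.1)). The problem (2.3) has also been extensively studied in the literature [30, 20, 9, 10, 14, 26]. It is proved in [9, 10] that (2.3) is tight if and only if
\[ \mathbf{H}^T \mathbf{H} + [\mathrm{Diag}(\mathbf{x}^*)]^{-1} \mathrm{Diag}(\mathbf{H}^T \mathbf{v}) \succeq \mathbf{0}, \tag{2.4} \]
while (CSDR) is not tight for $M \ge 4$; more precisely, it is tight with probability zero [17].

Instead of simply dropping the constraints $x_i \in \mathcal{S}_M$ as in (CSDR), the authors in [17] replaced the discrete symbol set $\mathcal{S}_M$ by its convex hull to get a continuous relaxation:
\[
\begin{aligned}
\min_{\mathbf{t},\, \mathbf{x},\, \mathbf{X}} \quad & \langle \mathbf{Q}, \mathbf{X} \rangle + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) \\
\text{s.t.} \quad & X_{i,i} = 1, \quad i = 1, 2, \dots, n, \\
& x_i = \sum_{j=1}^M t_{i,j} s_j, \quad \sum_{j=1}^M t_{i,j} = 1, \quad i = 1, 2, \dots, n, \\
& t_{i,j} \ge 0, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \\
& \mathbf{X} \succeq \mathbf{x}\mathbf{x}^\dagger,
\end{aligned}
\tag{ESDR-X}
\]
where $\mathbf{x} \in \mathbb{C}^n$, $\mathbf{X} \in \mathbb{C}^{n \times n}$, and $\mathbf{t} \in \mathbb{R}^{Mn}$ is the concatenation of the $M$-dimensional vectors $\mathbf{t}_1, \mathbf{t}_2, \dots, \mathbf{t}_n$ with $\mathbf{t}_i = [t_{i,1}, t_{i,2}, \dots, t_{i,M}]^T$. The authors in [17] further proved that (ESDR-X) is tight if condition (1.4) holds. We term the above SDP "ESDR-X", where "E" stands for "enhanced" and "X" refers to the matrix variable. The same naming convention is adopted for all the SDRs below.

We can also formulate (2.2) in the real domain and then use the same technique to get a real counterpart of (ESDR-X). Let
\[ \mathbf{y} = \begin{bmatrix} \mathrm{Re}(\mathbf{x}) \\ \mathrm{Im}(\mathbf{x}) \end{bmatrix}, \quad \hat{\mathbf{Q}} = \begin{bmatrix} \mathrm{Re}(\mathbf{Q}) & -\mathrm{Im}(\mathbf{Q}) \\ \mathrm{Im}(\mathbf{Q}) & \mathrm{Re}(\mathbf{Q}) \end{bmatrix}, \quad \text{and} \quad \hat{\mathbf{c}} = \begin{bmatrix} \mathrm{Re}(\mathbf{c}) \\ \mathrm{Im}(\mathbf{c}) \end{bmatrix}, \tag{2.5} \]
then the real enhanced SDR (ESDR-Y) is given by
\[
\begin{aligned}
\min_{\mathbf{t},\, \mathbf{y},\, \mathbf{Y}} \quad & \langle \hat{\mathbf{Q}}, \mathbf{Y} \rangle + 2\hat{\mathbf{c}}^T \mathbf{y} \\
\text{s.t.} \quad & \mathcal{Y}(i) = \sum_{j=1}^M t_{i,j} \mathbf{K}_j, \quad \sum_{j=1}^M t_{i,j} = 1, \quad i = 1, 2, \dots, n, \\
& t_{i,j} \ge 0, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \\
& \mathbf{Y} \succeq \mathbf{y}\mathbf{y}^T,
\end{aligned}
\tag{ESDR-Y}
\]
where $\mathbf{t} \in \mathbb{R}^{Mn}$, $\mathbf{y} \in \mathbb{R}^{2n}$, $\mathbf{Y} \in \mathbb{R}^{2n \times 2n}$, and we define
\[ \mathcal{Y}(i) := \begin{bmatrix} 1 & y_i & y_{n+i} \\ y_i & Y_{i,i} & Y_{i,n+i} \\ y_{n+i} & Y_{n+i,i} & Y_{n+i,n+i} \end{bmatrix}, \quad i = 1, 2, \dots, n. \]
In (ESDR-Y), these $3 \times 3$ matrices are constrained to lie in the convex hull of the matrices
\[ \mathbf{K}_j = \begin{bmatrix} 1 \\ s_{R,j} \\ s_{I,j} \end{bmatrix} \begin{bmatrix} 1 \\ s_{R,j} \\ s_{I,j} \end{bmatrix}^T, \quad j = 1, 2, \dots, M, \tag{2.6} \]
where $s_{R,j} = \mathrm{Re}(s_j)$ and $s_{I,j} = \mathrm{Im}(s_j)$. It has been shown that (ESDR-Y) is tighter than (ESDR-X) [17, Theorem 4.1], and hence (ESDR-Y) is tight whenever (ESDR-X) is tight.

Now we turn to another class of SDRs developed from a different perspective in [22], which is applicable to a general symbol set. The idea is to introduce binary variables to express $x_i \in \mathcal{S}_M$ by
\[ x_i = \mathbf{t}_i^T \mathbf{s}, \quad i = 1, 2, \dots, n, \tag{2.7} \]
where $\mathbf{t}_i = [t_{i,1}, t_{i,2}, \dots, t_{i,M}]^T$, $\sum_{j=1}^M t_{i,j} = 1$, and $t_{i,j} \in \{0, 1\}$.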
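The above constraints (2.7) can be rewritten in a compact form as $\mathbf{x} = \mathbf{S}\mathbf{t}$, where $\mathbf{S} = \mathbf{I}_n \otimes \mathbf{s}^T$ and we concatenate all vectors $\mathbf{t}_i$ to get $\mathbf{t} = [\mathbf{t}_1^T, \dots, \mathbf{t}_n^T]^T \in \mathbb{R}^{Mn}$. Similarly, we can also formulate (2.7) in the real domain as $\mathbf{y} = \hat{\mathbf{S}}\mathbf{t}$, where
\[ \mathbf{y} = \begin{bmatrix} \mathrm{Re}(\mathbf{x}) \\ \mathrm{Im}(\mathbf{x}) \end{bmatrix} \quad \text{and} \quad \hat{\mathbf{S}} = \begin{bmatrix} \mathrm{Re}(\mathbf{S}) \\ \mathrm{Im}(\mathbf{S}) \end{bmatrix} = \begin{bmatrix} \mathbf{I}_n \otimes \mathbf{s}_R^T \\ \mathbf{I}_n \otimes \mathbf{s}_I^T \end{bmatrix}. \tag{2.8} \]
By introducing $\mathbf{T} = \mathbf{t}\mathbf{t}^T \in \mathbb{R}^{Mn \times Mn}$, the problem (1.2) is equivalent to
\[
\begin{aligned}
\min_{\mathbf{t},\, \mathbf{T}} \quad & \langle \bar{\mathbf{Q}}, \mathbf{T} \rangle + 2\bar{\mathbf{c}}^T \mathbf{t} \\
\text{s.t.} \quad & \sum_{j=1}^M t_{i,j} = 1, \quad i = 1, 2, \dots, n, \\
& t_{i,j} \in \{0, 1\}, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \\
& \mathbf{T} = \mathbf{t}\mathbf{t}^T,
\end{aligned}
\tag{2.9}
\]
where
\[ \bar{\mathbf{Q}} = \hat{\mathbf{S}}^T \hat{\mathbf{Q}} \hat{\mathbf{S}} \quad \text{and} \quad \bar{\mathbf{c}} = \hat{\mathbf{S}}^T \hat{\mathbf{c}}. \tag{2.10} \]
To relax (2.9), the binary constraints are first relaxed to allow $t_{i,j}$ to take any value between 0 and 1. For the rank-one constraint $\mathbf{T} = \mathbf{t}\mathbf{t}^T$, the authors in [22] proposed four ways of relaxation and we will introduce two of them in the following. We first partition $\mathbf{T}$ as an $n \times n$ block matrix
\[ \mathbf{T} = \begin{bmatrix} \mathbf{T}_{1,1} & \mathbf{T}_{1,2} & \dots & \mathbf{T}_{1,n} \\ \mathbf{T}_{2,1} & \mathbf{T}_{2,2} & \dots & \mathbf{T}_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{T}_{n,1} & \mathbf{T}_{n,2} & \dots & \mathbf{T}_{n,n} \end{bmatrix}, \]
where $\mathbf{T}_{i,j} \in \mathbb{R}^{M \times M}$ for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, n$. In the first model, we relax $\mathbf{T} = \mathbf{t}\mathbf{t}^T$ to $\mathbf{T} \succeq \mathbf{t}\mathbf{t}^T$ and impose constraints on the diagonal elements:
\[
\begin{aligned}
\min_{\mathbf{t},\, \mathbf{T}} \quad & \langle \bar{\mathbf{Q}}, \mathbf{T} \rangle + 2\bar{\mathbf{c}}^T \mathbf{t} \\
\text{s.t.} \quad & t_{i,j} \ge 0, \quad \sum_{j=1}^M t_{i,j} = 1, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \\
& \mathrm{diag}(\mathbf{T}_{i,i}) = \mathbf{t}_i, \quad i = 1, 2, \dots, n, \\
& \mathbf{T} \succeq \mathbf{t}\mathbf{t}^T,
\end{aligned}
\tag{ESDR1-T}
\]
where $\mathbf{t} \in \mathbb{R}^{Mn}$ and $\mathbf{T} \in \mathbb{R}^{Mn \times Mn}$. The second model further requires $\mathbf{T}_{i,i}$ to be a diagonal matrix, leading to the following SDR:
\[
\begin{aligned}
\min_{\mathbf{t},\, \mathbf{T}} \quad & \langle \bar{\mathbf{Q}}, \mathbf{T} \rangle + 2\bar{\mathbf{c}}^T \mathbf{t} \\
\text{s.t.} \quad & t_{i,j} \ge 0, \quad \sum_{j=1}^M t_{i,j} = 1, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \\
& \mathbf{T}_{i,i} = \mathrm{Diag}(\mathbf{t}_i), \quad i = 1, 2, \dots, n, \\
& \mathbf{T} \succeq \mathbf{t}\mathbf{t}^T.
\end{aligned}
\tag{ESDR2-T}
\]
Since (ESDR2-T) puts more constraints on the variables $\mathbf{t}$ and $\mathbf{T}$, (ESDR2-T) is tighter than (ESDR1-T).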
Notably, it is shown in [16] that (ESDR2-T) is equivalent to (ESDR-Y), and hence (1.4) is also a sufficient condition for (ESDR2-T) to be tight.

Table 1 summarizes all SDR models discussed in this paper, where we highlight our contributions on the tightness of different SDRs in bold.
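To make the notation of this section concrete, the following is a minimal numerical sketch, not part of the original analysis, that builds the $M$-PSK symbol vector $\mathbf{s}$ and checks the identity $\|\mathbf{H}\mathbf{x} - \mathbf{r}\|_2^2 = \mathbf{x}^\dagger \mathbf{Q} \mathbf{x} + 2\mathrm{Re}(\mathbf{c}^\dagger \mathbf{x}) + \mathbf{r}^\dagger \mathbf{r}$ with $\mathbf{Q}$ and $\mathbf{c}$ as in (2.1). The problem sizes, the random seed, the Gaussian test data, and all helper names (`psk_symbols`, `matvec`, `dagger`, `vdot`, `cgauss`) are illustrative assumptions.

```python
import cmath
import math
import random

def psk_symbols(M):
    """Symbols s_j = exp(i*theta_j) with theta_j = (j-1)*2*pi/M, j = 1..M."""
    return [cmath.exp(1j * 2 * math.pi * k / M) for k in range(M)]

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dagger(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def vdot(u, v):
    """Inner product u^dagger v of two complex vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def cgauss(rng):
    """One complex Gaussian sample (illustrative test data only)."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

rng = random.Random(0)          # fixed seed, chosen arbitrarily
m, n, M = 4, 3, 8               # hypothetical sizes for the check
S = psk_symbols(M)
H = [[cgauss(rng) for _ in range(n)] for _ in range(m)]
r = [cgauss(rng) for _ in range(m)]
x = [rng.choice(S) for _ in range(n)]   # any feasible point of (1.2)

# Left-hand side: ||Hx - r||_2^2.
lhs = sum(abs(a - b) ** 2 for a, b in zip(matvec(H, x), r))

# Right-hand side: x^dagger Q x + 2 Re(c^dagger x) + r^dagger r,
# with Q = H^dagger H and c = -H^dagger r as in (2.1).
Hd = dagger(H)
Q = [[sum(Hd[i][k] * H[k][j] for k in range(m)) for j in range(n)]
     for i in range(n)]
c = [-val for val in matvec(Hd, r)]
rhs = vdot(x, matvec(Q, x)).real + 2 * vdot(c, x).real + vdot(r, r).real

print(abs(lhs - rhs))  # difference is at the level of rounding error
```

Since every feasible $\mathbf{x}$ has unit-modulus entries, the same data also satisfy $X_{i,i} = |x_i|^2 = 1$ for $\mathbf{X} = \mathbf{x}\mathbf{x}^\dagger$, which is the diagonal constraint kept by all the relaxations above.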
3. Tightness of semidefinite relaxations.

3.1. Tightness of (ESDR-X).
Let $\mathbf{X}^* = \mathbf{x}^* (\mathbf{x}^*)^\dagger$. The key idea of showing the tightness of (ESDR-X) is to certify $(\mathbf{x}^*, \mathbf{X}^*)$ as the optimal solution by considering the Karush-Kuhn-Tucker (KKT) conditions of (ESDR-X). Our derivation is based on [17, Theorem 4.2] and we provide a simplified version for completeness.

Theorem 3.1. Suppose that $M \ge 4$. Then $(\mathbf{x}^*, \mathbf{X}^*)$ is the optimal solution of (ESDR-X) if and only if there exist
\[ \lambda_i \in \mathbb{R}, \quad \mu_{i,-1} \ge 0, \quad \text{and} \quad \mu_{i,1} \ge 0, \quad i = 1, 2, \dots, n, \]
such that $\mathbf{H}$ and $\mathbf{v}$ in (1.1) satisfy
\[ (x_i^*)^{-1} (\mathbf{H}^\dagger \mathbf{v})_i = \lambda_i + \mu_{i,-1} e^{-\mathrm{i}\pi/M} + \mu_{i,1} e^{\mathrm{i}\pi/M}, \quad i = 1, 2, \dots, n, \]
and
\[ \mathbf{Q} + \mathrm{Diag}(\boldsymbol{\lambda}) \succeq \mathbf{0}. \]

Footnote (on the relaxations of [22] introduced in section 2): Our formulations are slightly different from the original ones in [22] since they used the equality constraints to eliminate one variable for each $\mathbf{t}_i$ before relaxing the PSD constraint. However, in numerical tests we found that this variation only causes a negligible difference in the optimal solutions of the SDRs.

Table 1
Summary of SDR models in this paper.

| SDR model | Origin | Domain | Dimension of PSD cone | Comments |
|-----------|--------|--------|-----------------------|----------|
| CSDR | Ma et al. [19] | $\mathbb{C}$ | $n + 1$ | tight with probability 0 [17] |
| ESDR-X | CSDP2 in Lu et al. [17] | $\mathbb{C}$ | $n + 1$ | **tight if and only if (1.5) holds** |
| ESDR-Y | ERSDP in Lu et al. [17] | $\mathbb{R}$ | $2n + 1$ | **tight only if (1.6) holds** |
| ESDR1-T | Model II in Mobasher et al. [22] | $\mathbb{R}$ | $Mn + 1$ | **tight with probability no greater than $(2/M)^n$** |
| ESDR2-T | Model III in Mobasher et al. [22] | $\mathbb{R}$ | $Mn + 1$ | equivalent to ESDR-Y [16] |

The authors in [17] further derived the sufficient condition (1.4), under which the conditions in Theorem 3.1 are met by choosing $\lambda_i = -\lambda_{\min}(\mathbf{Q})$ for $i = 1, 2, \dots, n$. To strengthen their analysis, we view the conditions in Theorem 3.1 as a semidefinite feasibility problem. To be specific, if we define
\[ z_i = (x_i^*)^{-1} (\mathbf{H}^\dagger \mathbf{v})_i, \quad i = 1, 2, \dots, n, \tag{3.1} \]
and
\[ \mathcal{C}(\lambda) = \big\{ z \in \mathbb{C} : \exists\, \mu_{-1}, \mu_1 \ge 0 \ \text{s.t.} \ z = \lambda + \mu_{-1} e^{-\mathrm{i}\pi/M} + \mu_1 e^{\mathrm{i}\pi/M} \big\}, \]
then Theorem 3.1 states that (ESDR-X) is tight if and only if the following problem is feasible:
\[
\begin{aligned}
\text{find} \quad & \boldsymbol{\lambda} \in \mathbb{R}^n \\
\text{s.t.} \quad & \mathbf{Q} + \mathrm{Diag}(\boldsymbol{\lambda}) \succeq \mathbf{0}, \\
& z_i \in \mathcal{C}(\lambda_i), \quad i = 1, 2, \dots, n.
\end{aligned}
\tag{3.2}
\]
Each constraint $z_i \in \mathcal{C}(\lambda_i)$ turns out to be a simple inequality on $\lambda_i$. To see this, we plot $\mathcal{C}(\lambda_i)$ as the shaded area in Figure 1. It is clear from the figure that
\[ z_i \in \mathcal{C}(\lambda_i) \iff |\mathrm{Im}(z_i)| \le \big( -\lambda_i + \mathrm{Re}(z_i) \big) \tan\left(\frac{\pi}{M}\right), \]
which leads to
\[ z_i \in \mathcal{C}(\lambda_i) \iff \lambda_i \le \mathrm{Re}(z_i) - |\mathrm{Im}(z_i)| \cot\left(\frac{\pi}{M}\right). \]
This, together with (3.2), gives the necessary and sufficient condition for (ESDR-X) to be tight and we formally state it in Theorem 3.2.
Fig. 1. Illustration of $\mathcal{C}(\lambda_i)$ in the complex plane.

Theorem 3.2. Suppose that $M \ge 4$. Then (ESDR-X) is tight if and only if
\[ \mathbf{Q} + \mathrm{Diag}\big(\mathrm{Re}(\mathbf{z})\big) - \cot\left(\frac{\pi}{M}\right) \mathrm{Diag}\big(|\mathrm{Im}(\mathbf{z})|\big) \succeq \mathbf{0}, \tag{3.3} \]
where $\mathbf{z} = [z_1, z_2, \dots, z_n]^T \in \mathbb{C}^n$.

Note that (3.3) is exactly the same as (1.5) if we recall the definitions of $\mathbf{Q}$ in (2.1) and $z_i$ in (3.1). Furthermore, if we set $M = 2$ and $\mathbf{H}$, $\mathbf{v}$ to be real in (1.5), it becomes the same as the previous result (2.4). Hence, our result extends (2.4) to the more general case where $M \ge 4$ and $\mathbf{H}$, $\mathbf{v}$ are complex. Finally, the sufficient condition (1.4) in [17] can be derived from our result. Since
\[
\begin{aligned}
\mathrm{Re}(z_i) - |\mathrm{Im}(z_i)| \cot\left(\frac{\pi}{M}\right)
&= \frac{1}{\sin\left(\frac{\pi}{M}\right)} \left( \mathrm{Re}(z_i) \sin\left(\frac{\pi}{M}\right) - |\mathrm{Im}(z_i)| \cos\left(\frac{\pi}{M}\right) \right) \\
&\ge -\frac{1}{\sin\left(\frac{\pi}{M}\right)} |z_i| \ge -\frac{1}{\sin\left(\frac{\pi}{M}\right)} \|\mathbf{H}^\dagger \mathbf{v}\|_\infty,
\end{aligned}
\]
where the first inequality follows from the Cauchy-Schwarz inequality and the second from $|z_i| = |(\mathbf{H}^\dagger \mathbf{v})_i|$ (recall that $|x_i^*| = 1$), we have
\[ \mathrm{Diag}\big(\mathrm{Re}(\mathbf{z})\big) - \cot\left(\frac{\pi}{M}\right) \mathrm{Diag}\big(|\mathrm{Im}(\mathbf{z})|\big) \succeq -\frac{1}{\sin\left(\frac{\pi}{M}\right)} \|\mathbf{H}^\dagger \mathbf{v}\|_\infty \mathbf{I}_n. \]
Combining this with $\mathbf{Q} \succeq \lambda_{\min}(\mathbf{Q}) \mathbf{I}_n$, we can see that (1.4) is a stronger condition on $\mathbf{H}$ and $\mathbf{v}$ than (1.5).

3.2. Tightness of (ESDR-Y).

Similar to Theorem 3.1, we have the following characterization for (ESDR-Y) to be tight. Since the proof technique is essentially the same as that in [17], we put the proof in a separate technical report [13].

Theorem 3.3.
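Suppose that $M \ge 4$. Let the transmitted symbol vector $\mathbf{x}^*$ be
\[ x_i^* = s_{u_i}, \quad u_i \in \{1, 2, \dots, M\}, \quad i = 1, 2, \dots, n, \]
and define
\[
\hat{\mathbf{v}} = \begin{bmatrix} \mathrm{Re}(\mathbf{v}) \\ \mathrm{Im}(\mathbf{v}) \end{bmatrix} \in \mathbb{R}^{2m}, \quad
\hat{\mathbf{H}} = \begin{bmatrix} \mathrm{Re}(\mathbf{H}) & -\mathrm{Im}(\mathbf{H}) \\ \mathrm{Im}(\mathbf{H}) & \mathrm{Re}(\mathbf{H}) \end{bmatrix} \in \mathbb{R}^{2m \times 2n}, \quad
\mathbf{y}^* = \begin{bmatrix} \mathrm{Re}(\mathbf{x}^*) \\ \mathrm{Im}(\mathbf{x}^*) \end{bmatrix} \in \mathbb{R}^{2n}, \quad
\mathbf{Y}^* = \mathbf{y}^* (\mathbf{y}^*)^T \in \mathbb{R}^{2n \times 2n}.
\tag{3.4}
\]
Then $(\mathbf{y}^*, \mathbf{Y}^*)$ is the optimal solution of (ESDR-Y) if and only if there exist $\boldsymbol{\lambda} \in \mathbb{R}^{2n}$, $\boldsymbol{\mu} \in \mathbb{R}^n$, and $\mathbf{g} \in \mathbb{R}^{2n}$ that satisfy
\[ \hat{\mathbf{H}}^T \hat{\mathbf{v}} = \mathbf{g} + (\boldsymbol{\Lambda} + \mathbf{M}) \mathbf{y}^*, \tag{3.5} \]
\[ \langle \boldsymbol{\Gamma}_i, \mathbf{K}_{u_i} \rangle \ge \langle \boldsymbol{\Gamma}_i, \mathbf{K}_j \rangle, \quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n, \tag{3.6} \]
and
\[ \hat{\mathbf{Q}} + \boldsymbol{\Lambda} + \mathbf{M} \succeq \mathbf{0}, \]
where $\mathbf{K}_j$ is defined in (2.6), $\hat{\mathbf{Q}}$ is defined in (2.5), and
\[
\boldsymbol{\Lambda} = \mathrm{Diag}(\boldsymbol{\lambda}), \quad
\mathbf{M} = \begin{bmatrix} \mathbf{0} & \mathrm{Diag}(\boldsymbol{\mu}) \\ \mathrm{Diag}(\boldsymbol{\mu}) & \mathbf{0} \end{bmatrix}, \quad
\boldsymbol{\Gamma}_i = \begin{bmatrix} 0 & g_i & g_{n+i} \\ g_i & \lambda_i & \mu_i \\ g_{n+i} & \mu_i & \lambda_{n+i} \end{bmatrix}.
\tag{3.7}
\]

Furthermore, (3.5) and (3.6) in Theorem 3.3 can be simplified to the following inequalities on $(\boldsymbol{\lambda}, \boldsymbol{\mu})$ (see Appendix A):
\[
\begin{aligned}
& \sin^2\left(\theta_{u_i} + \frac{\Delta\theta_j}{2}\right) \lambda_i + \cos^2\left(\theta_{u_i} + \frac{\Delta\theta_j}{2}\right) \lambda_{n+i} - \sin\left(2\theta_{u_i} + \Delta\theta_j\right) \mu_i \\
& \qquad \le \mathrm{Re}(z_i) - \cot\left(\frac{\Delta\theta_j}{2}\right) \mathrm{Im}(z_i), \quad j \in \{1, 2, \dots, M\} \setminus \{u_i\}, \ i = 1, 2, \dots, n.
\end{aligned}
\tag{3.8}
\]
Here $\Delta\theta_j = \theta_j - \theta_{u_i}$, $\theta_{u_i}$ is the phase of the $i$-th transmitted symbol $x_i^*$, and $z_i$ is defined in (3.1). Similar to (3.2), we formulate the conditions in Theorem 3.3 as a semidefinite feasibility problem as follows:
\[
\begin{aligned}
\text{find} \quad & \boldsymbol{\lambda} \in \mathbb{R}^{2n} \ \text{and} \ \boldsymbol{\mu} \in \mathbb{R}^n \\
\text{s.t.} \quad & \hat{\mathbf{Q}} + \boldsymbol{\Lambda} + \mathbf{M} \succeq \mathbf{0}, \\
& \text{(3.8) is satisfied},
\end{aligned}
\tag{3.9}
\]
where $\boldsymbol{\Lambda}$ and $\mathbf{M}$ are defined in (3.7). However, unlike problem (3.2) where every inequality only involves one dual variable, problem (3.9) has inequalities with three variables coupled together and it is unclear how to choose the "optimal" $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$. In the following, we give a simple necessary condition for (ESDR-Y) being tight based on (3.9).

Theorem 3.4.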
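Suppose that $M \ge 4$. If (ESDR-Y) is tight, then
\[ \mathbf{Q} + \mathrm{Diag}\big(\mathrm{Re}(\mathbf{z})\big) - \cot\left(\frac{2\pi}{M}\right) \mathrm{Diag}\big(|\mathrm{Im}(\mathbf{z})|\big) \succeq \mathbf{0}, \tag{3.10} \]
where $\mathbf{Q} = \mathbf{H}^\dagger \mathbf{H}$ and $\mathbf{z}$ is defined in (3.1).

Before proving Theorem 3.4, we first introduce the following lemma.

Lemma 3.5. Suppose that $\mathbf{V}$ is a PSD matrix in $\mathbb{R}^{2n \times 2n}$ and is partitioned as
\[ \mathbf{V} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^T & \mathbf{C} \end{bmatrix}, \]
where $\mathbf{A} = \mathbf{A}^T$, $\mathbf{C} = \mathbf{C}^T$, and $\mathbf{A}, \mathbf{B}, \mathbf{C} \in \mathbb{R}^{n \times n}$. Then
\[ \mathbf{U} = \frac{1}{2}(\mathbf{A} + \mathbf{C}) + \frac{\mathrm{i}}{2}(\mathbf{B}^T - \mathbf{B}) \]
is a PSD matrix in $\mathbb{C}^{n \times n}$.

Proof. We observe that
\[ \mathbf{U} = \frac{1}{2} \begin{bmatrix} \mathbf{I}_n & \mathrm{i}\mathbf{I}_n \end{bmatrix} \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^T & \mathbf{C} \end{bmatrix} \begin{bmatrix} \mathbf{I}_n \\ -\mathrm{i}\mathbf{I}_n \end{bmatrix}. \]
Since the right factor is the conjugate transpose of the left factor, the result follows immediately.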
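The factorization used in the proof of Lemma 3.5 can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it draws a random real PSD matrix $\mathbf{V} = \mathbf{G}^T \mathbf{G}$, forms $\mathbf{U}$ as in the lemma, and verifies that $\mathbf{w}^\dagger \mathbf{U} \mathbf{w} = \frac{1}{2} \mathbf{p}^\dagger \mathbf{V} \mathbf{p} \ge 0$ with $\mathbf{p} = [\mathbf{w}^T, -\mathrm{i}\mathbf{w}^T]^T$ for random test vectors $\mathbf{w}$. The size $n = 3$, the seed, and the helper names `lemma_u` and `quad_form` are hypothetical.

```python
import random

def lemma_u(V, n):
    """Form U = (A + C)/2 + (i/2)(B^T - B) from V = [[A, B], [B^T, C]]."""
    A = [[V[i][j] for j in range(n)] for i in range(n)]
    B = [[V[i][n + j] for j in range(n)] for i in range(n)]
    C = [[V[n + i][n + j] for j in range(n)] for i in range(n)]
    return [[0.5 * (A[i][j] + C[i][j]) + 0.5j * (B[j][i] - B[i][j])
             for j in range(n)] for i in range(n)]

def quad_form(A, w):
    """Quadratic form w^dagger A w."""
    size = len(w)
    return sum(w[i].conjugate() * A[i][j] * w[j]
               for i in range(size) for j in range(size))

rng = random.Random(1)          # fixed seed, chosen arbitrarily
n = 3                           # hypothetical block size
# Random real PSD matrix V = G^T G of size 2n x 2n.
G = [[rng.gauss(0, 1) for _ in range(2 * n)] for _ in range(2 * n)]
V = [[sum(G[k][i] * G[k][j] for k in range(2 * n)) for j in range(2 * n)]
     for i in range(2 * n)]
U = lemma_u(V, n)

max_gap = 0.0      # largest deviation from w^dagger U w = (1/2) p^dagger V p
min_value = None   # smallest observed value of Re(w^dagger U w)
for _ in range(50):
    w = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    p = w + [-1j * wi for wi in w]      # p = [I_n; -i I_n] w
    qu = quad_form(U, w)
    qv = 0.5 * quad_form(V, p)
    max_gap = max(max_gap, abs(qu - qv))
    min_value = qu.real if min_value is None else min(min_value, qu.real)

print(max_gap, min_value)
```

Random test vectors cannot prove positive semidefiniteness, so the check is only a sanity test of the identity that the proof relies on.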
Proof of Theorem 3.4. If (ESDR-Y) is tight, we can find $\boldsymbol{\lambda} \in \mathbb{R}^{2n}$ and $\boldsymbol{\mu} \in \mathbb{R}^n$ that satisfy the constraints in (3.9). By Lemma 3.5, the constraint $\hat{\mathbf{Q}} + \boldsymbol{\Lambda} + \mathbf{M} \succeq \mathbf{0}$ implies
\[ \mathbf{Q} + \mathrm{Diag}(\bar{\boldsymbol{\lambda}}) \succeq \mathbf{0}, \tag{3.11} \]
where $\bar{\boldsymbol{\lambda}} \in \mathbb{R}^n$ is given by
\[ \bar{\lambda}_i = \frac{1}{2}(\lambda_i + \lambda_{n+i}), \quad i = 1, 2, \dots, n. \]
Fix $i \in \{1, 2, \dots, n\}$ and let $\hat{z}_i$ and $\hat{z}_{n+i}$ denote $\mathrm{Re}(z_i)$ and $\mathrm{Im}(z_i)$, respectively. If $\hat{z}_{n+i} \ge 0$, we set $\Delta\theta_j = \frac{2\pi}{M}$ in (3.8) to get
\[ \sin^2\left(\theta_{u_i} + \frac{\pi}{M}\right) \lambda_i + \cos^2\left(\theta_{u_i} + \frac{\pi}{M}\right) \lambda_{n+i} - \sin\left(2\theta_{u_i} + \frac{2\pi}{M}\right) \mu_i \le \hat{z}_i - \cot\left(\frac{\pi}{M}\right) |\hat{z}_{n+i}|. \]
Since $M \ge 4$, we can also set $\Delta\theta_j = \frac{2\pi}{M} + \pi$ to get
\[ \cos^2\left(\theta_{u_i} + \frac{\pi}{M}\right) \lambda_i + \sin^2\left(\theta_{u_i} + \frac{\pi}{M}\right) \lambda_{n+i} + \sin\left(2\theta_{u_i} + \frac{2\pi}{M}\right) \mu_i \le \hat{z}_i + \tan\left(\frac{\pi}{M}\right) |\hat{z}_{n+i}|. \]
Adding the above two inequalities, dividing both sides by two, and using $\cot(\phi) - \tan(\phi) = 2\cot(2\phi)$, we have
\[ \bar{\lambda}_i = \frac{1}{2}(\lambda_i + \lambda_{n+i}) \le \hat{z}_i - \cot\left(\frac{2\pi}{M}\right) |\hat{z}_{n+i}|. \tag{3.12} \]
If $\hat{z}_{n+i} < 0$, we can also arrive at (3.12) by setting $\Delta\theta_j$ to be $-\frac{2\pi}{M}$ and $-\frac{2\pi}{M} + \pi$, respectively. Finally, Theorem 3.4 follows from (3.11) and (3.12).

Note that (3.10) is the same as (1.6) if we recall the definitions of $\mathbf{Q}$ in (2.1) and $z_i$ in (3.1). Moreover, since (ESDR-Y) is tighter than (ESDR-X), (ESDR-Y) will also be tight if (1.5) holds. Therefore, we have both a necessary condition (1.6) and a sufficient condition (1.5) for (ESDR-Y) to be tight.

3.3. Tightness of (ESDR1-T).

In the same spirit, we first give a necessary and sufficient condition for (ESDR1-T) to be tight. Since the technique is essentially the same as that used in Theorem 3.3, we omit the proof details due to the space limitation.

Theorem 3.6.
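Suppose that $M \ge 4$. Let the transmitted symbol vector $\mathbf{x}^*$ be
\[ x_i^* = s_{u_i}, \quad u_i \in \{1, 2, \dots, M\}, \quad i = 1, 2, \dots, n, \]
and define
\[ t^*_{i,u_i} = 1, \quad t^*_{i,j} = 0, \quad j \ne u_i, \quad i = 1, 2, \dots, n, \quad \mathbf{T}^* = \mathbf{t}^* (\mathbf{t}^*)^T. \]
Then $(\mathbf{t}^*, \mathbf{T}^*)$ is the optimal solution of (ESDR1-T) if and only if there exist $\boldsymbol{\alpha} \in \mathbb{R}^n$ and $\boldsymbol{\gamma} \in \mathbb{R}^{Mn}$ such that
\[ \mathrm{Diag}(\mathbf{1}_{Mn} - 2\mathbf{t}^*) \boldsymbol{\gamma} = -2\hat{\mathbf{S}}^T \hat{\mathbf{H}}^T \hat{\mathbf{v}} + \boldsymbol{\alpha} \otimes \mathbf{1}_M, \tag{3.13} \]
and
\[ \bar{\mathbf{Q}} + \mathrm{Diag}(\boldsymbol{\gamma}) \succeq \mathbf{0}, \tag{3.14} \]
where $\hat{\mathbf{S}}$ is defined in (2.8), $\hat{\mathbf{H}}$, $\hat{\mathbf{v}}$ are defined in (3.4), and $\bar{\mathbf{Q}}$ is defined in (2.10).

Now we provide a corollary that will serve as our basis for further derivation.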
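Corollary 3.7. If (ESDR1-T) is tight, then there exist $\boldsymbol{\alpha} \in \mathbb{R}^n$ and $\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \dots, \boldsymbol{\gamma}_n \in \mathbb{R}^M$ that satisfy
\[
\gamma_{i,j} =
\begin{cases}
-2\mathrm{Re}[s_j^\dagger (\mathbf{H}^\dagger \mathbf{v})_i] + \alpha_i, & \text{if } j \ne u_i, \\
2\mathrm{Re}[s_j^\dagger (\mathbf{H}^\dagger \mathbf{v})_i] - \alpha_i, & \text{if } j = u_i,
\end{cases}
\quad j = 1, 2, \dots, M, \ i = 1, 2, \dots, n,
\tag{3.15}
\]
and
\[ \mathbf{w}^T \mathrm{Diag}(\boldsymbol{\gamma}_i) \mathbf{w} \ge 0, \quad i = 1, 2, \dots, n, \]
for any $\mathbf{w} \in \mathbb{R}^M$ such that $\mathbf{w}^T \mathbf{s}_R = \mathbf{w}^T \mathbf{s}_I = 0$.

Proof. By Theorem 3.6, if (ESDR1-T) is tight, we can find $\boldsymbol{\alpha} \in \mathbb{R}^n$ and $\boldsymbol{\gamma} \in \mathbb{R}^{Mn}$ that satisfy (3.13) and (3.14). Let $\boldsymbol{\gamma}$ be partitioned as $\boldsymbol{\gamma} = [\boldsymbol{\gamma}_1^T, \boldsymbol{\gamma}_2^T, \dots, \boldsymbol{\gamma}_n^T]^T$, where $\boldsymbol{\gamma}_j \in \mathbb{R}^M$ is the $j$-th block of $\boldsymbol{\gamma}$. It is straightforward to verify that (3.13) is equivalent to (3.15).

Moreover, for any $i \in \{1, 2, \dots, n\}$ and any $\mathbf{w} \in \mathbb{R}^M$ that satisfies $\mathbf{w}^T \mathbf{s}_R = \mathbf{w}^T \mathbf{s}_I = 0$, we set $\bar{\mathbf{w}} = [\bar{\mathbf{w}}_1^T, \bar{\mathbf{w}}_2^T, \dots, \bar{\mathbf{w}}_n^T]^T \in \mathbb{R}^{Mn}$ to be
\[
\bar{\mathbf{w}}_j =
\begin{cases}
\mathbf{0}, & \text{if } j \ne i, \\
\mathbf{w}, & \text{if } j = i.
\end{cases}
\]
It is simple to check that $\hat{\mathbf{S}} \bar{\mathbf{w}} = \mathbf{0}$. Therefore, recalling that $\bar{\mathbf{Q}} = \hat{\mathbf{S}}^T \hat{\mathbf{Q}} \hat{\mathbf{S}}$, by (3.14) we have
\[ \bar{\mathbf{w}}^T (\bar{\mathbf{Q}} + \mathrm{Diag}(\boldsymbol{\gamma})) \bar{\mathbf{w}} = \bar{\mathbf{w}}^T \mathrm{Diag}(\boldsymbol{\gamma}) \bar{\mathbf{w}} = \mathbf{w}^T \mathrm{Diag}(\boldsymbol{\gamma}_i) \mathbf{w} \ge 0. \]
The proof is complete.

In practice, the symbol set $\mathcal{S}$, such as the one in (1.3) considered in this paper, is symmetric with respect to the origin. Therefore, we can find $u_i' \in \{1, 2, \dots, M\}$ that satisfies $s_{u_i'} = -s_{u_i}$. Now let $\mathbf{w} \in \mathbb{R}^M$ be
\[
w_j =
\begin{cases}
0, & \text{if } j \notin \{u_i, u_i'\}, \\
1, & \text{if } j \in \{u_i, u_i'\},
\end{cases}
\quad j = 1, 2, \dots, M.
\]
We have $\mathbf{w}^T \mathbf{s}_R = s_{R,u_i} + s_{R,u_i'} = 0$ and $\mathbf{w}^T \mathbf{s}_I = s_{I,u_i} + s_{I,u_i'} = 0$. Hence, when (ESDR1-T) is tight, Corollary 3.7 and (3.15) imply that
\[ \mathbf{w}^T \mathrm{Diag}(\boldsymbol{\gamma}_i) \mathbf{w} = \gamma_{i,u_i} + \gamma_{i,u_i'} = 4\mathrm{Re}[s_{u_i}^\dagger (\mathbf{H}^\dagger \mathbf{v})_i] = 4\mathrm{Re}[(x_i^*)^\dagger (\mathbf{H}^\dagger \mathbf{v})_i] \ge 0. \]
This necessary condition leads to the following upper bound on the tightness probability of (ESDR1-T).

Corollary 3.8.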
Suppose that the symbol set $\mathcal{S}$ is symmetric with respect to the origin and $0 \notin \mathcal{S}$. We further assume that
(a) the entries of $\mathbf{x}^*$ are drawn from $\mathcal{S}$ uniformly and independently;
(b) $\mathbf{x}^*$, $\mathbf{H}$, and $\mathbf{v}$ are mutually independent; and
(c) the distributions of $\mathbf{H}$ and $\mathbf{v}$ are continuous.
Then we have
\[ \mathrm{Prob}\big(\text{(ESDR1-T) is tight}\big) \le \left(\frac{1}{2}\right)^n. \]

Proof. Let $z_i = (x_i^*)^\dagger (\mathbf{H}^\dagger \mathbf{v})_i$, $i = 1, 2, \dots, n$. Since $\mathbf{H}$ and $\mathbf{v}$ are independent continuous random variables, the event
\[ \bigcup_{i=1}^n \bigcup_{s \in \mathcal{S}} \big\{ \mathrm{Re}(s^\dagger (\mathbf{H}^\dagger \mathbf{v})_i) = 0 \big\} \]
happens with probability zero. Hence, because of the symmetry of $\mathcal{S}$, with probability one exactly half of the symbols $s \in \mathcal{S}$ satisfy $\mathrm{Re}(s^\dagger (\mathbf{H}^\dagger \mathbf{v})_i) > 0$ for every $i \in \{1, 2, \dots, n\}$ when $\mathbf{H}$ and $\mathbf{v}$ are given. By the assumption that $x_i^*$ is uniformly distributed over $\mathcal{S}$, we obtain
\[ \mathrm{Prob}\big(\mathrm{Re}(z_i) \ge 0 \mid \mathbf{H}, \mathbf{v}\big) = \frac{1}{2} \quad \text{almost surely}. \]
Moreover, $\{z_i\}_{i=1}^n$ are mutually independent conditioned on $\mathbf{H}$ and $\mathbf{v}$. This leads to
\[
\begin{aligned}
\mathrm{Prob}\big(\mathrm{Re}(z_i) \ge 0, \ i = 1, 2, \dots, n\big)
&= \mathbb{E}_{\mathbf{H}, \mathbf{v}}\Big[\mathrm{Prob}\big(\mathrm{Re}(z_i) \ge 0, \ i = 1, 2, \dots, n \mid \mathbf{H}, \mathbf{v}\big)\Big] \\
&= \mathbb{E}_{\mathbf{H}, \mathbf{v}}\Big[\prod_{i=1}^n \mathrm{Prob}\big(\mathrm{Re}(z_i) \ge 0 \mid \mathbf{H}, \mathbf{v}\big)\Big] = \left(\frac{1}{2}\right)^n.
\end{aligned}
\]
Finally, Corollary 3.8 follows from the fact that the tightness of (ESDR1-T) implies $\mathrm{Re}(z_i) \ge 0$ for all $i = 1, 2, \dots, n$.

It is worth noting that all the assumptions in Corollary 3.8 are mild: they are satisfied if we use the $M$-PSK or QAM modulation scheme and the entries of $\mathbf{H}$ and $\mathbf{v}$ follow the complex Gaussian distribution.

Intuitively, we will expect that (ESDR1-T) is less likely to recover the transmitted symbols with an increasing symbol set size $M$. In the following, we present a more refined upper bound on the tightness probability specific to the $M$-PSK setting and the proof can be found in [13].

Theorem 3.9.
Suppose that $M$-PSK is used with $M \ge 4$ and the same assumptions in Corollary 3.8 hold. Then we have
\[ \mathrm{Prob}\big(\text{(ESDR1-T) is tight}\big) \le \left(\frac{2}{M}\right)^n. \]
From Corollary 3.8 and Theorem 3.9, we can see that the tightness probability of (ESDR1-T) is bounded away from one regardless of the noise level, and it tends to zero exponentially fast when the number of transmitted symbols $n$ increases. This is in sharp contrast to (ESDR-X) and (ESDR-Y), whose tightness probabilities will approach one if the noise level is sufficiently small and the number of received signals $m$ is sufficiently large compared to $n$ [17, Theorem 4.5].
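The counting argument behind Corollary 3.8 can be illustrated with a small Monte Carlo experiment. The sketch below does not solve any SDP and does not estimate the tightness probability itself; it only estimates the probability of the necessary event $\mathrm{Re}(z_i) \ge 0$ for all $i$, with $z_i = (x_i^*)^\dagger (\mathbf{H}^\dagger \mathbf{v})_i$, which the proof shows to be $(1/2)^n$. The function name `necessary_event_frequency`, the Gaussian model for $\mathbf{H}$ and $\mathbf{v}$, and the values $n = 3$, $m = 4$, $M = 4$, the trial count, and the seed are all assumptions made for illustration.

```python
import cmath
import math
import random

def necessary_event_frequency(n, m, M, trials, seed=0):
    """Empirical frequency of the event Re(z_i) >= 0 for all i = 1..n."""
    rng = random.Random(seed)
    symbols = [cmath.exp(2j * math.pi * k / M) for k in range(M)]
    hits = 0
    for _ in range(trials):
        # Complex Gaussian channel and noise (continuous distributions).
        H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
             for _ in range(m)]
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
        # Transmitted symbols drawn uniformly and independently from S_M.
        x = [rng.choice(symbols) for _ in range(n)]
        ok = True
        for i in range(n):
            # (H^dagger v)_i = sum_k conj(H[k][i]) * v[k]
            hv_i = sum(H[k][i].conjugate() * v[k] for k in range(m))
            z_i = x[i].conjugate() * hv_i
            if z_i.real < 0:
                ok = False
                break
        hits += ok
    return hits / trials

freq = necessary_event_frequency(n=3, m=4, M=4, trials=20000)
bound = 0.5 ** 3   # the value (1/2)^n from Corollary 3.8 with n = 3
print(freq, bound)
```

Because the noise scale does not enter the sign of $\mathrm{Re}(z_i)$, rescaling $\mathbf{v}$ in this experiment leaves the estimated frequency unchanged, which matches the observation that the bound holds regardless of the noise level.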
4. Equivalence between different SDRs.
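In this section, we focus on the relationship between different SDR models of (1.2). Related to the SDRs discussed so far, a recent paper [16] proved that (ESDR2-T) is equivalent to (ESDR-Y) for the MIMO detection problem with a general symbol set. An earlier paper [21] compared three different SDRs in the QAM setting and showed their equivalence. Compared with those in section 2, the SDRs considered in [21] differ greatly in their motivations and structures, and the two equivalence results are proved using different techniques. In this section, we provide a more general equivalence theorem from which both results follow as special cases. This not only reveals the underlying connection between these two works, but also may potentially lead to new equivalence between SDRs.

In [16], the authors established the following correspondence between a pair of feasible points of (ESDR2-T) and (ESDR-Y):
\[ \mathbf{Y} = \hat{\mathbf{S}} \mathbf{T} \hat{\mathbf{S}}^T \quad \text{and} \quad \mathbf{y} = \hat{\mathbf{S}} \mathbf{t}, \tag{4.1} \]
where $\hat{\mathbf{S}} \in \mathbb{R}^{2n \times Mn}$ is defined in (2.8). In [21], the authors considered the feasible set of a virtually-antipodal SDR (VA-SDR):
\[
\begin{aligned}
& \begin{bmatrix} 1 & \mathbf{b}^T \\ \mathbf{b} & \mathbf{B} \end{bmatrix} \in \mathbb{S}^{qn+1}_+ \\
\text{s.t.} \quad & B_{i,i} = 1, \quad i = 1, 2, \dots, qn,
\end{aligned}
\tag{VA-SDR}
\]
and that of a bounded-constrained SDR (BC-SDR):
\[
\begin{aligned}
& \begin{bmatrix} 1 & \mathbf{x}^T \\ \mathbf{x} & \mathbf{X} \end{bmatrix} \in \mathbb{S}^{n+1}_+ \\
\text{s.t.} \quad & 1 \le X_{i,i} \le (2^q - 1)^2, \quad i = 1, 2, \dots, n,
\end{aligned}
\tag{BC-SDR}
\]
where $q \ge 1$. The two feasible sets are related by
\[ \mathbf{X} = \mathbf{W} \mathbf{B} \mathbf{W}^T \quad \text{and} \quad \mathbf{x} = \mathbf{W} \mathbf{b}, \tag{4.2} \]
where
\[ \mathbf{W} = \begin{bmatrix} \mathbf{I}_n & 2\mathbf{I}_n & 4\mathbf{I}_n & \dots & 2^{q-1}\mathbf{I}_n \end{bmatrix} \in \mathbb{R}^{n \times qn}. \]
Note that both equivalence results in (4.1) and (4.2) fall into the following form:
\[
\left\{ \begin{bmatrix} 1 & \mathbf{y}^T \\ \mathbf{y} & \mathbf{Y} \end{bmatrix} \in \mathcal{F}_1 \right\}
=
\left\{ \begin{bmatrix} 1 & \mathbf{y}^T \\ \mathbf{y} & \mathbf{Y} \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{P} \end{bmatrix} \begin{bmatrix} 1 & \mathbf{t}^T \\ \mathbf{t} & \mathbf{T} \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0} \\ \mathbf{0} & \mathbf{P}^T \end{bmatrix} : \begin{bmatrix} 1 & \mathbf{t}^T \\ \mathbf{t} & \mathbf{T} \end{bmatrix} \in \mathcal{F}_2 \right\},
\]
where $\mathcal{F}_1$ is a subset of $\mathbb{S}^{k+1}_+$, $\mathcal{F}_2$ is a subset of $\mathbb{S}^{d+1}_+$, and we call $\mathbf{P} \in \mathbb{R}^{k \times d}$ the transformation matrix. Moreover, both the transformation matrices $\hat{\mathbf{S}}$ in (4.1) and $\mathbf{W}$ in (4.2) have a special "separable" property that we now define for ease of presentation.

Definition 4.1.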
A matrix P ∈ R k × d is called separable if there exist a partitionof rows α , α , . . . , α l and a partition of columns β , β , . . . , β l such that P [ α i , β j ] = , ∀ i (cid:54) = j. In other words, a matrix is separable if, after possibly rearranging rows and columns,it has a block diagonal structure. In particular, for the transformation matrix ˆ S in(4.1), the corresponding row and column partitions are given by(4.3) α i = { i, n + i } , β i = { ( i − M + 1 , ( i − M + 2 , . . . , iM } , i = 1 , , . . . , n ;for the transformation matrix W in (4.2), they are given by(4.4) α i = { i } , β i = { i, i + n, i + 2 n, . . . , i + ( q − n } , i = 1 , , . . . , n. Now we are ready to present our mainequivalence result.
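As a small illustration of Definition 4.1, the following Python sketch checks numerically that the matrix $W$ in (4.2) is separable with respect to the partitions (4.4). It is not part of the original analysis: the helper names `is_separable`, `build_W`, and `partitions_W` are hypothetical, the index sets are shifted to 0-based indexing, and the sizes $n = 3$, $q = 4$ are arbitrary.

```python
import numpy as np

def is_separable(P, row_parts, col_parts, tol=0.0):
    """Check Definition 4.1: P[alpha_i, beta_j] = 0 for all i != j."""
    for i, alpha in enumerate(row_parts):
        for j, beta in enumerate(col_parts):
            if i != j and np.any(np.abs(P[np.ix_(alpha, beta)]) > tol):
                return False
    return True

def build_W(n, q):
    """W = [I_n, 2 I_n, 4 I_n, ..., 2^(q-1) I_n] as in (4.2)."""
    return np.hstack([2.0 ** j * np.eye(n) for j in range(q)])

def partitions_W(n, q):
    """Row and column partitions (4.4), converted to 0-based indices."""
    rows = [[i] for i in range(n)]
    cols = [[i + j * n for j in range(q)] for i in range(n)]
    return rows, cols

# Illustrative sizes only (assumption, not taken from the paper).
n, q = 3, 4
W = build_W(n, q)
rows, cols = partitions_W(n, q)
separable = is_separable(W, rows, cols)
# Diagonal block W[alpha_1, beta_1]; it equals the row vector w^T in (4.18).
diag_block = W[np.ix_(rows[0], cols[0])]
```

Each diagonal block `W[alpha_i, beta_i]` is the same row vector of powers of two, which is the vector $w^T$ used later in (4.18).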
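Theorem 4.2. Suppose that the matrix $P \in \mathbb{R}^{k \times d}$ is separable with row partition $\alpha_1, \alpha_2, \ldots, \alpha_l$ and column partition $\beta_1, \beta_2, \ldots, \beta_l$. Moreover, define
\[ k_i := |\alpha_i|, \quad d_i := |\beta_i|, \quad \text{and} \quad P_i := P[\alpha_i, \beta_i] \in \mathbb{R}^{k_i \times d_i}, \quad i = 1, 2, \ldots, l, \]
where we use $|\cdot|$ to denote the cardinality of a set. Then given arbitrary constraint sets $\mathcal{A}_i \subset \mathbb{R}^{(d_i+1) \times (d_i+1)}$ for $i = 1, 2, \ldots, l$, the following set
\[ \begin{bmatrix} 1 & y^T \\ y & Y \end{bmatrix} \in \mathbb{S}_+^{k+1} \quad \text{s.t.} \quad Y = P T P^T, \ y = P t, \ \begin{bmatrix} 1 & t^T \\ t & T \end{bmatrix} \in \mathbb{S}_+^{d+1}, \ \begin{bmatrix} 1 & t[\beta_i]^T \\ t[\beta_i] & T[\beta_i] \end{bmatrix} \in \mathcal{A}_i, \ i = 1, 2, \ldots, l, \tag{4.5} \]
where the variables are $y \in \mathbb{R}^k$, $Y \in \mathbb{R}^{k \times k}$, $t \in \mathbb{R}^d$, and $T \in \mathbb{R}^{d \times d}$, is the same as
\[ \begin{bmatrix} 1 & y^T \\ y & Y \end{bmatrix} \in \mathbb{S}_+^{k+1} \quad \text{s.t.} \quad Y[\alpha_i] = P_i T^{(i)} P_i^T, \ y[\alpha_i] = P_i t^{(i)}, \ \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix} \in \mathbb{S}_+^{d_i+1}, \ \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix} \in \mathcal{A}_i, \ i = 1, 2, \ldots, l, \tag{4.6} \]
where the variables are $y \in \mathbb{R}^k$, $Y \in \mathbb{R}^{k \times k}$, $t^{(i)} \in \mathbb{R}^{d_i}$, and $T^{(i)} \in \mathbb{R}^{d_i \times d_i}$ with $i = 1, 2, \ldots, l$.

The following lemma will be useful in our proof.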
Lemma 4.3. Let $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{q \times n}$ where $p \leq q$. Then $A^T A = B^T B$ if and only if there exists a matrix $U \in \mathbb{R}^{q \times p}$ with $U^T U = I_p$ such that $B = U A$.
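Lemma 4.3 can be illustrated numerically. The sketch below is only a sanity check under an extra assumption that the lemma itself does not need, namely that $A$ has full row rank; in that case $U = B A^{+}$, with $A^{+}$ the Moore-Penrose pseudoinverse, is one valid choice. The function names and the sizes are hypothetical.

```python
import numpy as np

def make_pair(p, q, n, seed=0):
    """Build A (p x n) and B = Q A (q x n), where Q has orthonormal columns."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((p, n))                    # full row rank almost surely if p <= n
    Q, _ = np.linalg.qr(rng.standard_normal((q, p)))   # reduced QR: Q is q x p with Q^T Q = I_p
    return A, Q @ A

def recover_U(A, B):
    """Assuming A has full row rank and A^T A = B^T B, return U with B = U A."""
    return B @ np.linalg.pinv(A)

p, q, n = 3, 5, 6
A, B = make_pair(p, q, n)
gram_gap = np.linalg.norm(A.T @ A - B.T @ B)   # "if" direction: the Gram matrices coincide
U = recover_U(A, B)                            # "only if" direction in the full-row-rank case
```

In the proof of Theorem 4.2, the lemma plays the role of aligning two different factorizations of the same PSD submatrix.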
Proof of Theorem 4.2. Without loss of generality, we assume the transformation matrix $P \in \mathbb{R}^{k \times d}$ is in the block diagonal form
\[ P = \begin{bmatrix} P_1 & & & \\ & P_2 & & \\ & & \ddots & \\ & & & P_l \end{bmatrix}, \]
where $P_i \in \mathbb{R}^{k_i \times d_i}$, $\sum_{i=1}^{l} k_i = k$, and $\sum_{i=1}^{l} d_i = d$.

For one direction, suppose that $(y, Y, t, T)$ satisfies the constraints in (4.5). Then it is straightforward to see that $(y, Y)$ also satisfies the constraints in (4.6) together with
\[ t^{(i)} = t[\beta_i], \quad T^{(i)} = T[\beta_i], \quad i = 1, 2, \ldots, l. \]

The other direction of the proof is more involved. Given $(y, Y)$ and the variables $\{ t^{(i)}, T^{(i)} \}_{i=1}^{l}$ in (4.6), our goal is to construct $(t, T)$ satisfying the conditions in (4.5). To simplify the notations, we define
\[ \tilde{Y} := \begin{bmatrix} 1 & y^T \\ y & Y \end{bmatrix}, \quad \tilde{T}^{(i)} := \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix}, \quad \text{and} \quad \tilde{P}_i := \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P_i \end{bmatrix}. \tag{4.7} \]
Let $r = \max\{k, d\}$. Since $\tilde{Y} \in \mathbb{S}_+^{k+1}$, it can be factorized as
\[ \tilde{Y} = \tilde{V}^T \tilde{V}, \tag{4.8} \]
where $\tilde{V} \in \mathbb{R}^{(r+1) \times (k+1)}$. The above factorization can be done because $r \geq k$. Further, we partition $\tilde{V}$ as
\[ \tilde{V} = \begin{bmatrix} v & V_1 & V_2 & \cdots & V_l \end{bmatrix}, \]
where $v \in \mathbb{R}^{r+1}$ and $V_i \in \mathbb{R}^{(r+1) \times k_i}$ contains the columns of $\tilde{V}$ indexed by $\alpha_i$ for $i = 1, 2, \ldots, l$. Moreover, we have $v^T v = \tilde{Y}_{1,1} = 1$. Similarly, $\tilde{T}^{(i)}$ can be factorized as
\[ \tilde{T}^{(i)} = (\tilde{Z}^{(i)})^T \tilde{Z}^{(i)}, \quad i = 1, 2, \ldots, l, \tag{4.9} \]
where $\tilde{Z}^{(i)} \in \mathbb{R}^{(d_i+1) \times (d_i+1)}$ and is partitioned as
\[ \tilde{Z}^{(i)} = \begin{bmatrix} z^{(i)} & Z^{(i)} \end{bmatrix}. \tag{4.10} \]
Combining (4.9) with the equality constraints in (4.6), we get
\[ \begin{bmatrix} 1 & y[\alpha_i]^T \\ y[\alpha_i] & Y[\alpha_i] \end{bmatrix} = \tilde{P}_i \tilde{T}^{(i)} \tilde{P}_i^T = \left( \tilde{Z}^{(i)} \tilde{P}_i^T \right)^T \left( \tilde{Z}^{(i)} \tilde{P}_i^T \right), \quad i = 1, 2, \ldots, l, \]
where $\tilde{Z}^{(i)} \tilde{P}_i^T \in \mathbb{R}^{(d_i+1) \times (k_i+1)}$. On the other hand, the factorization in (4.8) implies
\[ \begin{bmatrix} 1 & y[\alpha_i]^T \\ y[\alpha_i] & Y[\alpha_i] \end{bmatrix} = \begin{bmatrix} v & V_i \end{bmatrix}^T \begin{bmatrix} v & V_i \end{bmatrix}, \]
where $\begin{bmatrix} v & V_i \end{bmatrix} \in \mathbb{R}^{(r+1) \times (k_i+1)}$. By Lemma 4.3 (which applies because $d_i + 1 \leq r + 1$), we can find $U_i \in \mathbb{R}^{(r+1) \times (d_i+1)}$ with $U_i^T U_i = I_{d_i+1}$ such that
\[ \begin{bmatrix} v & V_i \end{bmatrix} = U_i \tilde{Z}^{(i)} \tilde{P}_i^T. \tag{4.11} \]
By the partition (4.10) and the definition of $\tilde{P}_i$ in (4.7), (4.11) can be written as
\[ v = U_i z^{(i)} \quad \text{and} \quad V_i = U_i Z^{(i)} P_i^T. \tag{4.12} \]
Finally, we define
\[ R = \begin{bmatrix} v & U_1 Z^{(1)} & U_2 Z^{(2)} & \cdots & U_l Z^{(l)} \end{bmatrix} \in \mathbb{R}^{(r+1) \times (d+1)}, \]
whose columns indexed by $\beta_i$ are given by $U_i Z^{(i)}$, and construct $(t, T)$ by
\[ \begin{bmatrix} 1 & t^T \\ t & T \end{bmatrix} = R^T R. \tag{4.13} \]

Next we verify that $(t, T)$ in (4.13) indeed satisfies all the constraints in (4.5). The positive semidefiniteness is evident by our construction. For the equality constraints, by using (4.12) we have
\[ R \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P^T \end{bmatrix} = \begin{bmatrix} v & U_1 Z^{(1)} P_1^T & U_2 Z^{(2)} P_2^T & \cdots & U_l Z^{(l)} P_l^T \end{bmatrix} = \begin{bmatrix} v & V_1 & V_2 & \cdots & V_l \end{bmatrix} = \tilde{V}. \]
Hence, we get
\[ \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P \end{bmatrix} \begin{bmatrix} 1 & t^T \\ t & T \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P^T \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P \end{bmatrix} R^T R \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & P^T \end{bmatrix} = \tilde{V}^T \tilde{V} = \begin{bmatrix} 1 & y^T \\ y & Y \end{bmatrix}, \]
which is equivalent to $Y = P T P^T$ and $y = P t$. Lastly, note that
\begin{align}
\begin{bmatrix} 1 & t[\beta_i]^T \\ t[\beta_i] & T[\beta_i] \end{bmatrix} &= \begin{bmatrix} v & U_i Z^{(i)} \end{bmatrix}^T \begin{bmatrix} v & U_i Z^{(i)} \end{bmatrix} \tag{4.14} \\
&= \begin{bmatrix} U_i z^{(i)} & U_i Z^{(i)} \end{bmatrix}^T \begin{bmatrix} U_i z^{(i)} & U_i Z^{(i)} \end{bmatrix} \tag{4.15} \\
&= (\tilde{Z}^{(i)})^T U_i^T U_i \tilde{Z}^{(i)} = (\tilde{Z}^{(i)})^T \tilde{Z}^{(i)} \tag{4.16} \\
&= \tilde{T}^{(i)} = \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix}, \notag
\end{align}
where we used (4.13) in (4.14), $v = U_i z^{(i)}$ (cf. (4.12)) in (4.15), and $U_i^T U_i = I_{d_i+1}$ in (4.16). Hence, the remaining constraints in (4.5) are also satisfied because of the conditions on $\tilde{T}^{(i)}$ in (4.6). The proof of Theorem 4.2 is complete.

Two remarks are in order. Firstly, the variables in (4.5) are in a high-dimensional PSD cone $\mathbb{S}_+^{d+1}$, while those in (4.6) are in the Cartesian product of smaller PSD cones $\mathbb{S}_+^{k+1} \times \mathbb{S}_+^{d_1+1} \times \mathbb{S}_+^{d_2+1} \times \cdots \times \mathbb{S}_+^{d_l+1}$. When $k, d_1, d_2, \ldots, d_l$ are much smaller than $d$, using (4.6) instead of (4.5) can achieve dimension reduction without any additional cost. This can bring substantially higher computational efficiency for solving the corresponding SDP in practice (see section 5). Secondly, Theorem 4.2 is very general and thus could be applicable to a potentially wide range of problems. It is worth highlighting that we require no assumptions on the sets $\mathcal{A}_i$ that constrain the submatrices, as well as the row and column partitions of the separable matrix $P$. This enables us to accommodate both the equivalence results (4.1) and (4.2), as we will show next.

As we noted before, the transformation matrix $\hat{S}$ in (4.1) is separable with the row and column partitions given in (4.3) and we have
\[ \hat{S}[\alpha_i, \beta_i] = \begin{bmatrix} s_R^T \\ s_I^T \end{bmatrix} \in \mathbb{R}^{2 \times M}, \quad i = 1, 2, \ldots, n. \]
Moreover, we can see that the feasible set of (ESDR2-T) is in the form of (4.5) with the set $\mathcal{A}_i$ given by
\begin{align*}
\mathcal{A}_i &= \left\{ \begin{bmatrix} 1 & t^T \\ t & \mathrm{Diag}(t) \end{bmatrix} : t \in \mathbb{R}^M, \ \sum_{j=1}^{M} t_j = 1, \ t_j \geq 0, \ j = 1, 2, \ldots, M \right\} \\
&= \left\{ \sum_{j=1}^{M} t_j E_j : \sum_{j=1}^{M} t_j = 1, \ t_j \geq 0, \ j = 1, 2, \ldots, M \right\} = \mathrm{conv}\{ E_1, E_2, \ldots, E_M \},
\end{align*}
where
\[ E_j = \begin{bmatrix} 1 \\ e_j \end{bmatrix} \begin{bmatrix} 1 \\ e_j \end{bmatrix}^T, \quad j = 1, 2, \ldots, M, \]
and $e_j \in \mathbb{R}^M$ is the $j$-th unit vector. Applying Theorem 4.2 to (ESDR2-T) gives the following equivalent formulation:
\begin{align}
\begin{bmatrix} 1 & y^T \\ y & Y \end{bmatrix} \in \mathbb{S}_+^{2n+1} \quad \text{s.t.} \quad & \begin{bmatrix} 1 & y_i & y_{n+i} \\ y_i & Y_{i,i} & Y_{i,n+i} \\ y_{n+i} & Y_{n+i,i} & Y_{n+i,n+i} \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0}^T \\ 0 & s_R^T \\ 0 & s_I^T \end{bmatrix} \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ \mathbf{0} & s_R & s_I \end{bmatrix}, \notag \\
& \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix} \in \mathbb{S}_+^{M+1}, \quad \begin{bmatrix} 1 & (t^{(i)})^T \\ t^{(i)} & T^{(i)} \end{bmatrix} \in \mathrm{conv}\{ E_1, E_2, \ldots, E_M \}, \quad i = 1, 2, \ldots, n. \tag{4.17}
\end{align}
Since each matrix $E_j$ is PSD, their convex hull is a subset of $\mathbb{S}_+^{M+1}$ and hence the PSD constraints on the $(M+1) \times (M+1)$ matrices in (4.17) are redundant. Furthermore, note that
\[ \begin{bmatrix} 1 & \mathbf{0}^T \\ 0 & s_R^T \\ 0 & s_I^T \end{bmatrix} E_j \begin{bmatrix} 1 & 0 & 0 \\ \mathbf{0} & s_R & s_I \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0}^T \\ 0 & s_R^T \\ 0 & s_I^T \end{bmatrix} \begin{bmatrix} 1 \\ e_j \end{bmatrix} \begin{bmatrix} 1 \\ e_j \end{bmatrix}^T \begin{bmatrix} 1 & 0 & 0 \\ \mathbf{0} & s_R & s_I \end{bmatrix} = \begin{bmatrix} 1 \\ s_{R,j} \\ s_{I,j} \end{bmatrix} \begin{bmatrix} 1 & s_{R,j} & s_{I,j} \end{bmatrix}, \]
which is the matrix $K_j$ defined in (2.6). Therefore, we can conclude that (4.17) is the same as (ESDR-Y), and hence (ESDR2-T) and (ESDR-Y) are equivalent.

Similarly, we observe that the transformation matrix $W$ in (4.2) is separable with row and column partitions given in (4.4), and let
\[ w^T := W[\alpha_i, \beta_i] = \begin{bmatrix} 1 & 2 & 4 & \cdots & 2^{q-1} \end{bmatrix}. \tag{4.18} \]
The feasible set in (VA-SDR) conforms to (4.5) with the set $\mathcal{A}_i$ given by
\[ \mathcal{A}_i = \left\{ \begin{bmatrix} 1 & b^T \\ b & B \end{bmatrix} : b \in \mathbb{R}^q, \ B \in \mathbb{R}^{q \times q}, \ \mathrm{diag}(B) = \mathbf{1}_q \right\}. \]
Hence, by applying Theorem 4.2 to (VA-SDR), we get the following equivalent formulation:
\[ \begin{bmatrix} 1 & x^T \\ x & X \end{bmatrix} \in \mathbb{S}_+^{n+1} \quad \text{s.t.} \quad X_{i,i} = w^T B^{(i)} w, \ x_i = w^T b^{(i)}, \ \begin{bmatrix} 1 & (b^{(i)})^T \\ b^{(i)} & B^{(i)} \end{bmatrix} \in \mathbb{S}_+^{q+1}, \ \mathrm{diag}(B^{(i)}) = \mathbf{1}_q, \ i = 1, 2, \ldots, n. \tag{4.19} \]
Next we argue that all the constraints $x_i = w^T b^{(i)}$ are redundant, i.e., the set in (4.19) is equivalent to
\[ \left\{ \begin{bmatrix} 1 & x^T \\ x & X \end{bmatrix} \in \mathbb{S}_+^{n+1} : X_{i,i} = w^T B^{(i)} w, \ B^{(i)} \in \mathbb{S}_+^{q}, \ \mathrm{diag}(B^{(i)}) = \mathbf{1}_q, \ i = 1, 2, \ldots, n \right\}. \tag{4.20} \]
To show this, we need to prove that, for any $x, X, B^{(1)}, \ldots, B^{(n)}$ satisfying the constraints in (4.20), there must exist $b^{(i)} \in \mathbb{R}^q$ such that
\[ x_i = w^T b^{(i)} \quad \text{and} \quad \begin{bmatrix} 1 & (b^{(i)})^T \\ b^{(i)} & B^{(i)} \end{bmatrix} \succeq 0, \quad i = 1, 2, \ldots, n. \tag{4.21} \]
Fix $i \in \{1, 2, \ldots, n\}$. Note that the PSD constraint in (4.20) implies
\[ \begin{bmatrix} 1 & x_i \\ x_i & X_{i,i} \end{bmatrix} \succeq 0 \ \Leftrightarrow \ x_i^2 \leq X_{i,i}. \tag{4.22} \]
When $X_{i,i} = 0$, we must have $x_i = 0$ and we can achieve (4.21) by simply letting $b^{(i)} = \mathbf{0}$. Otherwise, we have $X_{i,i} > 0$ and we let
\[ b^{(i)} = \frac{x_i}{X_{i,i}} B^{(i)} w. \tag{4.23} \]
Since $X_{i,i} = w^T B^{(i)} w$, we can see that $w^T b^{(i)} = (w^T B^{(i)} w)\, x_i / X_{i,i} = x_i$. To verify the PSD constraint in (4.21), it suffices to show that $B^{(i)} \succeq b^{(i)} (b^{(i)})^T$. Note that
\[ \begin{bmatrix} X_{i,i} & w^T B^{(i)} \\ B^{(i)} w & B^{(i)} \end{bmatrix} = \begin{bmatrix} w^T B^{(i)} w & w^T B^{(i)} \\ B^{(i)} w & B^{(i)} \end{bmatrix} = \begin{bmatrix} w^T \\ I_q \end{bmatrix} B^{(i)} \begin{bmatrix} w & I_q \end{bmatrix} \succeq 0, \]
which implies the Schur complement is also PSD, i.e.,
\[ B^{(i)} - \frac{1}{X_{i,i}} B^{(i)} w w^T B^{(i)} \succeq 0. \]
This, together with (4.22) and (4.23), shows
\[ b^{(i)} (b^{(i)})^T = \frac{x_i^2}{X_{i,i}^2} B^{(i)} w w^T B^{(i)} \preceq \frac{X_{i,i}}{X_{i,i}^2} B^{(i)} w w^T B^{(i)} \preceq B^{(i)}. \]
Hence both conditions in (4.21) are satisfied.

Finally, to show that (4.20) is the same as (BC-SDR), we need the following lemma.
Lemma 4.4. Let $w \in \mathbb{R}^q$ be the vector defined in (4.18). It holds that
\[ \left\{ x : \exists\, B \in \mathbb{S}_+^{q} \ \text{s.t.} \ x = w^T B w, \ \mathrm{diag}(B) = \mathbf{1}_q \right\} = \left\{ x : 1 \leq x \leq (2^q - 1)^2 \right\}. \tag{4.24} \]
Proof. See Appendix B.

Putting all pieces together, we have proved that (VA-SDR) is equivalent to (BC-SDR) by showing the correspondence (4.2).
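Two ingredients of the (VA-SDR)/(BC-SDR) argument above lend themselves to a quick numerical sanity check: the range in Lemma 4.4 and the construction (4.23). The following Python sketch is an illustration only, not part of the proof. The function names are hypothetical, random unit-diagonal PSD matrices are sampled in one particular way (Gram matrices of unit-norm Gaussian vectors), and $q = 4$ in the usage below is arbitrary.

```python
import numpy as np

def random_unit_diag_psd(q, rng):
    """Sample B = V^T V with unit-norm columns of V, so B is PSD and diag(B) = 1."""
    V = rng.standard_normal((q, q))
    V /= np.linalg.norm(V, axis=0, keepdims=True)
    return V.T @ V

def quadratic_range(q, trials=2000, seed=0):
    """Smallest and largest sampled values of w^T B w, with w = [1, 2, ..., 2^(q-1)]."""
    rng = np.random.default_rng(seed)
    w = 2.0 ** np.arange(q)
    vals = [w @ random_unit_diag_psd(q, rng) @ w for _ in range(trials)]
    return min(vals), max(vals)

def endpoint_values(q):
    """Values of w^T B w at the two rank-one matrices used in Appendix B."""
    w = 2.0 ** np.arange(q)
    s = np.append(-np.ones(q - 1), 1.0)          # the vector [-1, ..., -1, 1]
    return (w @ s) ** 2, (w @ np.ones(q)) ** 2

def build_b(B, w, x_i):
    """Construction (4.23): b = (x_i / X_ii) B w, where X_ii = w^T B w > 0."""
    X_ii = w @ B @ w
    return (x_i / X_ii) * (B @ w)

def bordered_min_eig(B, b):
    """Smallest eigenvalue of [[1, b^T], [b, B]], the matrix appearing in (4.21)."""
    M = np.block([[np.ones((1, 1)), b[None, :]], [b[:, None], B]])
    return np.linalg.eigvalsh(M).min()
```

For instance, with `q = 4` the sampled values returned by `quadratic_range(4)` should stay inside $[1, 225]$ up to rounding, and for any `x_i` with `x_i**2 <= w @ B @ w` the vector returned by `build_b` should satisfy both conditions in (4.21) up to rounding.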
5. Numerical results.
In this section, we present some numerical results. Following standard assumptions in the wireless communication literature (see, e.g., [31, Chapter 7]), we assume that all entries of the channel matrix $H$ are independent and identically distributed (i.i.d.) following a complex circular Gaussian distribution with zero mean and unit variance, and all entries of the additive noise $v$ are i.i.d. following a complex circular Gaussian distribution with zero mean and variance $\sigma^2$. Further, we choose the transmitted symbols $x_1^*, x_2^*, \ldots, x_n^*$ from the symbol set $\mathcal{S}_M$ in (1.3) independently and uniformly. We define the SNR as the received SNR per symbol:
\[ \mathrm{SNR} := \frac{\mathbb{E}[\| H x^* \|_2^2]}{n\, \mathbb{E}[\| v \|_2^2]} = \frac{mn}{n \cdot m \sigma^2} = \frac{1}{\sigma^2}. \]

We first consider a MIMO system where $(m, n) = (16, 10)$ and $M = 8$. To evaluate the empirical probabilities of SDRs not being tight, we compute the optimal solutions of (ESDR-X), (ESDR-Y), and (ESDR1-T) by the general-purpose SDP solver SeDuMi [29] with the desired accuracy set to $10^{-\ldots}$. The SDR is decided to be tight if the output $\hat{x}$ returned by the SDP solver satisfies $\| \hat{x} - x^* \|_\infty \leq 10^{-\ldots}$. (The output $\hat{x}$ is directly given by the optimal solution in (ESDR-X), while it is obtained from the relation (2.5) between $x$ and $y$ in (ESDR-Y) and the relation (2.7) between $x$ and $t$ in (ESDR1-T).) We also evaluate the empirical probabilities of conditions (1.4)–(1.6) not being satisfied. We run the simulations at 8 SNR values in total ranging from 3 dB to 24 dB. For each SNR value, 10,000 random instances are generated and the averaged results are plotted in Figure 2.

We can see from Figure 2 that our results (1.5) and (1.6) provide better characterizations than the previous tightness condition (1.4) in [17]. The empirical probability of (ESDR-X) not being tight matches perfectly with our analysis given by the necessary and sufficient condition (1.5). The probability of (1.6) not being satisfied is also a good approximation to the probability of (ESDR-Y) not being tight, underestimating the latter roughly by a factor of 9. Moreover, the numerical results also validate our analysis that (ESDR1-T) is not tight with high probability. In fact, (ESDR1-T) fails to recover the vector of transmitted symbols in all 80,000 instances.

Fig. 2. Error probabilities versus the SNR in a 16 × 10 MIMO system with 8-PSK. Curves: (ESDR1-T) not tight, (ESDR-X) not tight, (ESDR-Y) not tight, (1.4) not satisfied, (1.5) not satisfied, (1.6) not satisfied.

Next, we compare the optimal values as well as the CPU time of solving (ESDR-Y) and (ESDR2-T). Table 2 shows the relative difference between the optimal values of (ESDR-Y) (denoted as $\mathrm{opt}_{\text{ESDR-Y}}$) and (ESDR2-T) (denoted as $\mathrm{opt}_{\text{ESDR2-T}}$) averaged over 300 simulations, which is defined as $|\mathrm{opt}_{\text{ESDR-Y}} - \mathrm{opt}_{\text{ESDR2-T}}| / |\mathrm{opt}_{\text{ESDR2-T}}|$. We can see from Table 2 that the difference is consistently on the order of 1e−…, in line with the equivalence between (ESDR-Y) and (ESDR2-T). In Figure 3, we plot the average CPU time consumed by solving (ESDR-Y) and (ESDR2-T) in an 8-PSK system with increasing problem size $n$. For fair comparison, both SDRs are implemented and solved by SeDuMi and we repeat the simulations 300 times. With the same error performance, we can see that (ESDR-Y) indeed solves the MIMO detection problem (1.2) more efficiently and saves roughly 90% of the computational time in our experiment.

Table 2. Average relative difference between (ESDR-Y) and (ESDR2-T) in optimal objective values. Columns: SNR; relative difference in optimal objective values for $(m, n) = (4, 4)$, $(6, 4)$, $(10, 10)$, and $(15, \ldots)$.
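The simulation setup of this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used for the reported experiments (which rely on SeDuMi): the constellation is taken to be $\{e^{2\pi \mathrm{i} j / M}\}_{j=0}^{M-1}$, which may differ from (1.3) by indexing or a phase offset, and all function names are hypothetical. It generates instances of the model (1.1), estimates the per-symbol received SNR by Monte Carlo for comparison with $1/\sigma^2$, and implements the relative difference reported in Table 2.

```python
import numpy as np

def simulate_instance(m, n, M, sigma2, rng):
    """Draw one instance r = H x* + v of the channel model (1.1)."""
    # i.i.d. circular complex Gaussian channel entries: zero mean, unit variance.
    H = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    # ASSUMPTION: M-PSK points exp(2*pi*i*j/M); (1.3) may use a different indexing or offset.
    x = np.exp(2j * np.pi * rng.integers(0, M, size=n) / M)
    # i.i.d. circular complex Gaussian noise: zero mean, variance sigma2.
    v = np.sqrt(sigma2 / 2) * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
    return H, x, v, H @ x + v

def empirical_snr(m, n, M, sigma2, trials=2000, seed=0):
    """Monte Carlo estimate of E||H x*||^2 / (n E||v||^2); should be close to 1 / sigma2."""
    rng = np.random.default_rng(seed)
    signal = noise = 0.0
    for _ in range(trials):
        H, x, v, _ = simulate_instance(m, n, M, sigma2, rng)
        signal += np.linalg.norm(H @ x) ** 2
        noise += np.linalg.norm(v) ** 2
    return signal / (n * noise)

def sigma2_from_snr_db(snr_db):
    """Noise variance that gives the requested SNR in dB, using SNR = 1 / sigma2."""
    return 10.0 ** (-snr_db / 10.0)

def relative_difference(opt_y, opt_t):
    """|opt_Y - opt_T| / |opt_T|, the quantity averaged in Table 2."""
    return abs(opt_y - opt_t) / abs(opt_t)
```

For example, `empirical_snr(16, 10, 8, sigma2_from_snr_db(10.0))` should return a value near 10, up to Monte Carlo error.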
Fig. 3. Average CPU time of solving (ESDR-Y) and (ESDR2-T) when $M = 8$. Curves: ESDR2-T, ESDR-Y.
6. Conclusions.
In this paper, we studied the tightness and equivalence of various existing SDR models for the MIMO detection problem (1.2). For the two SDRs (ESDR-X) and (ESDR-Y) proposed in [17], we improved their sufficient tightness condition and showed that the former is tight if and only if (1.5) holds while the latter is tight only if (1.6) holds. On the other hand, for the SDR (ESDR1-T) proposed in [22], we proved that its tightness probability decays to zero exponentially fast with an increasing problem size under some mild assumptions. Together with known results, our analysis provides a more complete understanding of the tightness conditions for existing SDRs. Moreover, we proposed a general theorem that unifies previous results on the equivalence of SDRs [21, 16]. For a subset of PSD matrices with a special "separable" structure, we showed its equivalence to another subset of PSD matrices in a potentially much smaller dimension. Our numerical results demonstrated that we could significantly improve the computational efficiency by using such equivalence.

Due to its generality, we believe that our equivalence theorem can be applied to SDPs in other domains beyond MIMO detection, and we leave this as future work. Additionally, we noticed that the SDRs for problem (1.2) combined with some simple rounding procedure can detect the transmitted symbols successfully even when the optimal solution has rank more than one. Similar observations have also been made in [12]. It would be interesting to extend our analysis to take the postprocessing procedure into account.

Appendix A. Simplification of (3.5) and (3.6).
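Fix $i \in \{1, 2, \ldots, n\}$. From (3.5), we have
\[ (\hat{H}^T \hat{v})_i = g_i + \lambda_i y_i^* + \mu_i y_{n+i}^* \quad \text{and} \quad (\hat{H}^T \hat{v})_{n+i} = g_{n+i} + \mu_i y_i^* + \lambda_{n+i} y_{n+i}^*, \]
which can be written in a matrix form:
\[ \begin{bmatrix} (\hat{H}^T \hat{v})_i \\ (\hat{H}^T \hat{v})_{n+i} \end{bmatrix} = \begin{bmatrix} g_i \\ g_{n+i} \end{bmatrix} + \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix}. \tag{A.1} \]
Recall the definitions of $\Gamma_i$ in (3.7) and $K_j$ in (2.6). Then
\[ \langle \Gamma_i, K_j \rangle = 2 \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} g_i \\ g_{n+i} \end{bmatrix} + \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}. \tag{A.2} \]
Using (A.1), we have
\begin{align}
\begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} g_i \\ g_{n+i} \end{bmatrix} &= \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} (\hat{H}^T \hat{v})_i \\ (\hat{H}^T \hat{v})_{n+i} \end{bmatrix} - \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix} \notag \\
&= \begin{bmatrix} \cos(\Delta\theta_j) \\ \sin(\Delta\theta_j) \end{bmatrix}^T \begin{bmatrix} \hat{z}_i \\ \hat{z}_{n+i} \end{bmatrix} - \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix}, \tag{A.3}
\end{align}
where $\Delta\theta_j = \theta_j - \theta_{u_i}$, $\hat{z}_i = \mathrm{Re}(z_i)$, and $\hat{z}_{n+i} = \mathrm{Im}(z_i)$ (cf. (3.1)). Combining (A.2) with (A.3), we get
\begin{align*}
\langle \Gamma_i, K_j \rangle = {} & 2 \begin{bmatrix} \cos(\Delta\theta_j) \\ \sin(\Delta\theta_j) \end{bmatrix}^T \begin{bmatrix} \hat{z}_i \\ \hat{z}_{n+i} \end{bmatrix} - 2 \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix} \\
& + \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} \cos(\theta_j) \\ \sin(\theta_j) \end{bmatrix}.
\end{align*}
In particular, when $j = u_i$, the above becomes
\[ \langle \Gamma_i, K_{u_i} \rangle = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix}^T \begin{bmatrix} \hat{z}_i \\ \hat{z}_{n+i} \end{bmatrix} - \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* \\ y_{n+i}^* \end{bmatrix}. \]
Hence, when $j \neq u_i$, (3.6) is equivalent to
\begin{align*}
& 2 \begin{bmatrix} 1 - \cos(\Delta\theta_j) \\ -\sin(\Delta\theta_j) \end{bmatrix}^T \begin{bmatrix} \hat{z}_i \\ \hat{z}_{n+i} \end{bmatrix} \geq \begin{bmatrix} y_i^* - \cos(\theta_j) \\ y_{n+i}^* - \sin(\theta_j) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} y_i^* - \cos(\theta_j) \\ y_{n+i}^* - \sin(\theta_j) \end{bmatrix} \\
\Leftrightarrow \ & \begin{bmatrix} 1 \\ -\cot\left( \frac{\Delta\theta_j}{2} \right) \end{bmatrix}^T \begin{bmatrix} \hat{z}_i \\ \hat{z}_{n+i} \end{bmatrix} \geq \begin{bmatrix} \sin\left( \theta_{u_i} + \frac{\Delta\theta_j}{2} \right) \\ -\cos\left( \theta_{u_i} + \frac{\Delta\theta_j}{2} \right) \end{bmatrix}^T \begin{bmatrix} \lambda_i & \mu_i \\ \mu_i & \lambda_{n+i} \end{bmatrix} \begin{bmatrix} \sin\left( \theta_{u_i} + \frac{\Delta\theta_j}{2} \right) \\ -\cos\left( \theta_{u_i} + \frac{\Delta\theta_j}{2} \right) \end{bmatrix},
\end{align*}
which is exactly the same as (3.8).

Appendix B. Proof of Lemma 4.4.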
To simplify the notations, we use $\mathcal{A}$ and $\mathcal{B}$ to denote the left-hand side and the right-hand side in (4.24), respectively.

We first prove that $\mathcal{A} \supset \mathcal{B}$. Let $\mathcal{C} = \{ B \in \mathbb{S}_+^{q} : \mathrm{diag}(B) = \mathbf{1}_q \}$, and we can view $\mathcal{A}$ as the image of the convex set $\mathcal{C}$ under the affine mapping $B \mapsto \langle B, w w^T \rangle$. Therefore, the set $\mathcal{A}$ is also convex. Moreover, note that both the rank-one matrices $\mathbf{1}_q \mathbf{1}_q^T$ and $\begin{bmatrix} -\mathbf{1}_{q-1} \\ 1 \end{bmatrix} \begin{bmatrix} -\mathbf{1}_{q-1} \\ 1 \end{bmatrix}^T$ belong to $\mathcal{C}$. Direct computations show that
\[ w^T \mathbf{1}_q \mathbf{1}_q^T w = \left( \sum_{i=1}^{q} 2^{i-1} \right)^2 = (2^q - 1)^2, \qquad w^T \begin{bmatrix} -\mathbf{1}_{q-1} \\ 1 \end{bmatrix} \begin{bmatrix} -\mathbf{1}_{q-1} \\ 1 \end{bmatrix}^T w = \left( 2^{q-1} - \sum_{i=1}^{q-1} 2^{i-1} \right)^2 = 1, \]
and hence both $1$ and $(2^q - 1)^2$ belong to $\mathcal{A}$. Finally, the convexity of $\mathcal{A}$ implies $\mathcal{B} \subset \mathcal{A}$.

Now we prove the other direction, i.e., $\mathcal{A} \subset \mathcal{B}$. This is equivalent to showing
\[ 1 \leq w^T B w \leq (2^q - 1)^2, \quad \forall\, B \in \mathcal{C}. \]
For the upper bound, we first note that $B \in \mathbb{S}_+^{q}$ implies
\[ |B_{i,j}| \leq \sqrt{B_{i,i} B_{j,j}} = 1, \quad i \neq j, \ 1 \leq i, j \leq q. \tag{B.1} \]
Since every entry of the matrix $w w^T$ is positive, we have
\[ w^T B w = \langle B, w w^T \rangle \leq \langle \mathbf{1}_q \mathbf{1}_q^T, w w^T \rangle = (2^q - 1)^2 \]
for any $B \in \mathcal{C}$, and hence the upper bound holds.

For the lower bound, it clearly holds when $q = 1$. When $q > 1$, for any matrix $B \in \mathcal{C}$ we partition it as
\[ B = \begin{bmatrix} B' & b' \\ (b')^T & 1 \end{bmatrix}, \]
where $B' \in \mathbb{R}^{(q-1) \times (q-1)}$ and $b' \in \mathbb{R}^{q-1}$. Note that we have $|b'_i| \leq 1$ for $1 \leq i \leq q-1$ from (B.1), and $B \in \mathbb{S}_+^{q}$ implies $B' \succeq b' (b')^T$. Further, we let $w' = \begin{bmatrix} 1 & 2 & \cdots & 2^{q-2} \end{bmatrix}^T \in \mathbb{R}^{q-1}$ such that $w = \begin{bmatrix} (w')^T & 2^{q-1} \end{bmatrix}^T$ (cf. (4.18)). We have
\begin{align*}
w^T B w &= (w')^T B' w' + 2^q (w')^T b' + (2^{q-1})^2 \\
&\geq (w')^T b' (b')^T w' + 2^q (w')^T b' + (2^{q-1})^2 = \left( (w')^T b' + 2^{q-1} \right)^2.
\end{align*}
Since
\[ (w')^T b' = \sum_{i=1}^{q-1} 2^{i-1} b'_i \geq - \sum_{i=1}^{q-1} 2^{i-1} = -2^{q-1} + 1, \]
we immediately get $w^T B w \geq (-2^{q-1} + 1 + 2^{q-1})^2 = 1$, and hence the lower bound also holds. The proof is now complete.

Acknowledgments.
The authors would like to thank Professors Zi Xu and Cheng Lu for their useful discussions on an earlier version of this paper.
REFERENCES
[1] A. S. Bandeira, N. Boumal, and A. Singer, Tightness of the maximum likelihood semidefinite relaxation for angular synchronization, Math. Program., 163 (2017), pp. 145–167.
[2] P. Biswas, T.-C. Lian, T.-C. Wang, and Y. Ye, Semidefinite programming based algorithms for sensor network localization, ACM Trans. Sen. Netw., 2 (2006), pp. 188–220.
[3] P. Biswas and Y. Ye, Semidefinite programming for ad hoc wireless sensor network localization, in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN'04), New York, NY, 2004, ACM, pp. 46–54.
[4] O. Damen, A. Chkeif, and J.-C. Belfiore, Lattice code decoder for space-time codes, IEEE Commun. Lett., 4 (2000), pp. 161–163.
[5] A. Duel-Hallen, Decorrelating decision-feedback multiuser detector for synchronous code-division multiple-access channel, IEEE Trans. Commun., 41 (1993), pp. 285–290.
[6] G. J. Foschini, Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas, Bell Labs Tech. J., 1 (1996), pp. 41–59.
[7] M. X. Goemans and D. P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, J. ACM, 42 (1995), pp. 1115–1145.
[8] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, New York, USA, 2nd ed., 2013.
[9] J. Jaldén, Detection for Multiple Input Multiple Output Channels, PhD thesis, School of Electrical Engineering, KTH, Stockholm, Sweden, 2006.
[10] J. Jaldén, C. Martin, and B. Ottersten, Semidefinite programming for detection in linear systems - Optimality conditions and space-time decoding, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), Piscataway, NJ, 2003, IEEE Press, pp. 9–12.
[11] J. Jaldén and B. Ottersten, On the complexity of sphere decoding in digital communications, IEEE Trans. Signal Process., 53 (2005), pp. 1474–1484.
[12] J. Jaldén and B. Ottersten, The diversity order of the semidefinite relaxation detector, IEEE Trans. Inf. Theory, 54 (2008), pp. 1406–1422.
[13] R. Jiang, Y.-F. Liu, C. Bao, and B. Jiang, A companion technical report of “tightness and equivalence of semidefinite relaxations for MIMO detection”, tech. report, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 2020, http://lsec.cc.ac.cn/~yafliu/Technical Report MIMO.pdf.
[14] M. Kisialiou and Z.-Q. Luo, Performance analysis of quasi-maximum-likelihood detector based on semi-definite programming, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'05), Piscataway, NJ, 2005, IEEE Press, pp. 433–436.
[15] Y.-F. Liu, M. Hong, and Y.-H. Dai, Max-Min fairness linear transceiver design problem for a multi-user SIMO interference channel is polynomial time solvable, IEEE Signal Process. Lett., 20 (2013), pp. 27–30.
[16] Y.-F. Liu, Z. Xu, and C. Lu, On the equivalence of semidefinite relaxations for MIMO detection with general constellations, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'19), Piscataway, NJ, 2019, IEEE Press, pp. 4549–4553.
[17] C. Lu, Y.-F. Liu, W.-Q. Zhang, and S. Zhang, Tightness of a new and enhanced semidefinite relaxation for MIMO detection, SIAM J. Optim., 29 (2019), pp. 719–742.
[18] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, Semidefinite relaxation of quadratic optimization problems, IEEE Signal Process. Mag., 27 (2010), pp. 20–34.
[19] W.-K. Ma, P.-C. Ching, and Z. Ding, Semidefinite relaxation based multiuser detection for M-ary PSK multiuser systems, IEEE Trans. Signal Process., 52 (2004), pp. 2862–2872.
[20] W.-K. Ma, T. N. Davidson, K. M. Wong, Z.-Q. Luo, and P.-C. Ching, Quasi-maximum-likelihood multiuser detection using semi-definite relaxation with application to synchronous CDMA, IEEE Trans. Signal Process., 50 (2002), pp. 912–922.
[21] W.-K. Ma, C.-C. Su, J. Jaldén, T.-H. Chang, and C.-Y. Chi, The equivalence of semidefinite relaxation MIMO detectors for higher-order QAM, IEEE J. Sel. Top. Signal Process., 3 (2009), pp. 1038–1052.
[22] A. Mobasher, M. Taherzadeh, R. Sotirov, and A. K. Khandani, A near-maximum-likelihood decoding algorithm for MIMO systems based on semi-definite programming, IEEE Trans. Inf. Theory, 53 (2007), pp. 3869–3886.
[23] K. S. Schneider, Optimum detection of code division multiplexed signals, IEEE Trans. Aerosp. Electron. Syst., 15 (1979), pp. 181–185.
[24] N. D. Sidiropoulos, T. N. Davidson, and Z.-Q. Luo, Transmit beamforming for physical-layer multicasting, IEEE Trans. Signal Process., 54 (2006), pp. 2239–2251.
[25] A. Singer, Angular synchronization by eigenvectors and semidefinite programming, Appl. Comput. Harmon. Anal., 30 (2011), pp. 20–36.
[26] A. M.-C. So, Probabilistic analysis of the semidefinite relaxation detector in digital communications, in Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'10), Philadelphia, PA, 2011, SIAM, pp. 698–711.
[27] A. M.-C. So and Y. Ye, Theory of semidefinite programming for sensor network localization, Math. Program., 109 (2007), pp. 367–384.
[28] B. Steingrimsson, Z.-Q. Luo, and K. M. Wong, Soft quasi-maximum-likelihood detection for multiple-antenna wireless channels, IEEE Trans. Signal Process., 51 (2003), pp. 2710–2719.
[29] J. F. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11 (1999), pp. 625–653.
[30] P. H. Tan and L. K. Rasmussen, The application of semidefinite programming for detection in CDMA, IEEE J. Sel. Areas Commun., 19 (2001), pp. 1442–1449.
[31] D. Tse and P. Viswanath, Fundamentals of Wireless Communication, Cambridge University Press, New York, 2005.
[32] S. Verdú, Computational complexity of optimum multiuser detection, Algorithmica, 4 (1989), pp. 303–312.
[33] S. Verdú, Multiuser Detection, Cambridge University Press, New York, 1998.
[34] I. Waldspurger, A. D'Aspremont, and S. Mallat, Phase recovery, MaxCut and complex semidefinite programming, Math. Program., 149 (2015), pp. 47–81.
[35] Z. Xie, R. T. Short, and C. K. Rushforth, A family of suboptimum detectors for coherent multiuser communications, IEEE J. Sel. Areas Commun., 8 (1990), pp. 683–690.
[36] S. Yang and L. Hanzo, Fifty years of MIMO detection: The road to large-scale MIMOs, IEEE Commun. Surveys Tuts., 17 (2015), pp. 1941–1988.
[37]