Block- and Rank-Sparse Recovery for Direction Finding in Partly Calibrated Arrays
Christian Steffens and Marius Pesavento
Communication Systems Group, Darmstadt University of Technology, Germany
e-mail: {steffens, pesavento}@nt.tu-darmstadt.de
February 2017

Abstract
A sparse recovery approach for direction finding in partly calibrated arrays composed of subarrays with unknown displacements is introduced. The proposed method is based on mixed nuclear norm and ℓ_1 norm minimization and exploits block-sparsity and low-rank structure in the signal model. For efficient implementation a compact equivalent problem reformulation is presented. The new technique is applicable to subarrays of arbitrary topologies and grid-based sampling of the subarray manifolds. In the special case of subarrays with a common baseline our new technique admits extension to a gridless implementation. As shown by simulations, our new block- and rank-sparse direction finding technique for partly calibrated arrays outperforms the state-of-the-art method RARE in difficult scenarios of low sample numbers, low signal-to-noise ratio or correlated signals.

Direction finding with sensor arrays has applications in various fields of signal processing such as wireless communications, radar, sonar or astronomy. In these types of applications it is desired to achieve a high angular resolution and to identify a large number of sources. This can be achieved by sensor arrays with a large aperture and a large number of sensors [1]. However, a large aperture size makes it difficult to achieve and maintain precise array calibration. Possible reasons for imperfect calibration are inaccuracies in the sensor positions, timing synchronization errors, or other unknown gain and phase offsets among sensors [2-8]. Standard approaches to this problem usually rely on either offline or online calibration. Offline calibration of the overall array is performed using reference sources at known positions and can easily become a challenging and time consuming task [2].
Alternatively, several online calibration techniques have been proposed which use calibration sources at unknown positions [3-7], but the complexity of these techniques is prohibitively high, and performance can be severely limited in the case of large sensor position errors [7]. Moreover, these techniques cannot be employed in scenarios with imperfect time synchronization of sensors or other unknown sensor gain and phase offsets. One way to overcome the calibration problem is to partition the overall array into smaller subarrays which are themselves comparably easy to calibrate. This type of array is referred to as a partly calibrated array (PCA) and has received considerable interest in recent years. Generally, direction finding approaches for this type of array can be classified into non-coherent and coherent methods. In the non-coherent case the subarrays independently perform estimation of the directions of arrival (DoAs) or the signal covariance matrix and communicate these estimates to a central processor, where further processing is performed to achieve an improved joint estimate [9-13]. In the coherent approach parameter estimation is performed based on joint coherent processing of all available sensor measurements, e.g., by computing a global sample covariance matrix, and imperfect calibration among the different subarrays is taken account of in the estimation process [8, 14-21]. In this work we consider the latter of the two approaches.

A prominent class of DoA estimation methods is based on subspace separation. In [14-18] the authors consider PCAs composed of multiple identical subarrays. Such types of array exhibit multiple shift invariances, and methods such as the multiple invariance MUSIC and MODE [14] or multiple invariance ESPRIT [16-18] can be used to provide DoA estimates in a search-free fashion. In [19] it is assumed that the PCA is composed of identically oriented linear arrays that can be transformed to a uniform linear array by linear translations of the subarrays.
The authors present the root-RARE algorithm which admits search-free DoA estimation. The root-RARE method in [19] was modified to the spectral RARE in [8] which admits application to arbitrary array topologies at the cost of increased computational complexity. Subspace-based methods are well investigated and are shown to asymptotically achieve an estimation performance close to the Cramér-Rao bound at low computational complexity. However, these subspace-based methods often have difficulties in certain practical scenarios. First, correlated source signals, e.g., in multipath environments, can significantly reduce the estimation performance. Second, subspace-based methods yield poor performance in the case of a low number of snapshots and low signal-to-noise ratio, e.g., in fast changing environments.

Recently, sparse recovery (SR) methods came into the focus of DoA estimation studies. As reported in [22, 23], SR methods provide high-resolution parameter estimation performance without the aforementioned shortcomings of subspace-based methods. Moreover, SR methods are computationally tractable since they can be formulated as convex optimization problems. While classical SR methods aim at recovering sparse signal vectors and admit grid-based parameter estimation [24, 25], the special case of fully calibrated arrays (FCAs) of uniform linear topology with possibly missing sensors allows for gridless SR methods as proposed in [26-28]. In [12, 13] the authors propose grid-based and gridless SR methods applicable for non-coherent processing in PCAs, where joint sparsity in the subarray signal representations is exploited. SR methods for coherent processing in PCAs have been presented in [20, 21]. The method in [20] is based on the recently proposed SPARROW formulation [23] and exploits multiple shift invariances in PCAs composed of identical subarrays to provide gridless parameter estimation.
In [21] the well-known ℓ_{2,1} mixed-norm minimization approach [22] for FCAs is generalized to grid-based SR in PCAs of arbitrary topology by means of a mixed nuclear norm [29, 30] and ℓ_1 norm, termed here the ℓ_{*,1} mixed-norm. As shown by numerical experiments [21], ℓ_{*,1} mixed-norm minimization clearly outperforms the spectral RARE [8] in frequency resolution performance for low signal-to-noise ratio and a low number of snapshots.

In this paper we consider the ℓ_{*,1} mixed-norm minimization problem proposed in [21] and derive an equivalent compact reformulation, termed COmpact Block- and RAnk-Sparse recovery (COBRAS). The COBRAS formulation has a reduced number of optimization parameters as compared to the original ℓ_{*,1} mixed-norm minimization problem, and we provide efficient implementations of the COBRAS formulation by means of semidefinite programming (SDP). While the SDP implementation is based on grid-based sampling of the subarray manifolds and applicable to arbitrary array topologies, we furthermore present a search-free implementation of our COBRAS formulation for the special case of linear subarrays with a common baseline. We show by extensive numerical experiments that the COBRAS approach outperforms the state-of-the-art methods in difficult scenarios. In summary, our main contributions are given as:

• We introduce a sparse recovery approach for coherent processing in PCAs using ℓ_{*,1} mixed-norm minimization.
• We derive a compact reformulation of the ℓ_{*,1} mixed-norm minimization problem, termed COBRAS.
• We develop a computationally efficient grid-based SDP implementation of the COBRAS formulation for arbitrary array topologies, and
• an efficient gridless SDP implementation of the COBRAS formulation for PCAs composed of subarrays with a common baseline.

The paper is organized as follows: Section 2 introduces the PCA signal model. The ℓ_{2,1} and ℓ_{*,1} mixed-norm minimization problems for FCAs and PCAs are discussed in Section 3.
The COBRAS formulation is derived in Section 4 while grid-based and gridless SDP implementations are provided in Sections 5 and 6. Numerical results are presented in Section 7 before the paper is concluded in Section 8.

Notation:
Boldface uppercase letters X denote matrices, boldface lowercase letters x denote column vectors, and regular letters x, N denote scalars, with j denoting the imaginary unit. Superscripts X^T and X^H denote transpose and conjugate transpose of a matrix X, respectively. The term B_+^{P,K} denotes the set of positive semidefinite block-diagonal matrices composed of K blocks of size P × P on the main diagonal. We write [X]_{m,n} to indicate the element in the m-th row and n-th column of matrix X. The statistical expectation of a random variable x is denoted as E{x}, and the trace of a matrix X is referred to as Tr(X). The Frobenius norm and the ℓ_{p,q} mixed-norm of a matrix X are referred to as ‖X‖_F and ‖X‖_{p,q}, respectively, while the ℓ_p norm of a vector x is denoted as ‖x‖_p. The term diag(x_1, ..., x_K) denotes a diagonal matrix with the elements x_1, ..., x_K on its main diagonal, while blkdiag(X_1, ..., X_K) denotes a block-diagonal matrix composed of the submatrices X_1, ..., X_K on its main block-diagonal.

Consider a linear array of arbitrary topology, composed of M omnidirectional sensors, as depicted in Figure 1. Assume the overall array is partitioned into P subarrays with M_p sensors in subarray p, for p = 1, ..., P, such that M = Σ_{p=1}^P M_p. We define η = [η^(2), ..., η^(P)]^T as the vector containing the P − 1 unknown subarray displacements η^(2), ..., η^(P), expressed in half signal wavelength and relative to the first subarray, i.e., η^(1) = 0. Furthermore, let ρ_m^(p), for m = 1, ..., M_p, p = 1, ..., P, denote the perfectly known intra-subarray position of the m-th sensor of subarray p relative to the first sensor in the subarray, hence ρ_1^(p) = 0, likewise expressed in half signal wavelength. Consequently, the position of sensor m in subarray p, relative to the first sensor in the first subarray, can be expressed as

r_m^(p) = ρ_m^(p) + η^(p),   (1)

for m = 1, ..., M_p and p = 1, ..., P.
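As a minimal numerical sketch of the position model (1) — the geometry values below are illustrative, not taken from the paper:

```python
import numpy as np

# Hypothetical PCA geometry: intra-subarray positions rho[p][m] (known) and
# inter-subarray displacements eta[p] (unknown in practice; fixed here only
# to generate ground-truth sensor positions via Eq. (1)).
rho = [np.array([0.0, 1.0, 2.0]),        # subarray 1
       np.array([0.0, 1.0]),             # subarray 2
       np.array([0.0, 2.0, 3.0])]        # subarray 3
eta = np.array([0.0, 12.5, 21.0])        # eta^(1) = 0 by definition

# Eq. (1): r_m^(p) = rho_m^(p) + eta^(p), all in half signal wavelengths
r = [rho_p + eta_p for rho_p, eta_p in zip(rho, eta)]
print(r[1])  # positions of subarray 2 relative to the overall first sensor
```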
Figure 1: Partly calibrated array composed of M = 9 sensors partitioned in P = 3 subarrays, and L = 2 source signals.

Moreover, assume a number of L narrowband and far-field sources are illuminating the sensor array from angular directions θ_1, ..., θ_L, as illustrated in Figure 1. The corresponding spatial frequencies are defined as µ_l = cos θ_l ∈ [−1, 1), for l = 1, ..., L, and comprise the vector µ = [µ_1, ..., µ_L]^T. A total of N signal snapshots are obtained at the output of each subarray p and collected in the M_p × N subarray measurement matrix Y^(p), for p = 1, ..., P, where [Y^(p)]_{m,n} denotes the output of the m-th sensor in the p-th subarray at time instant n. The subarray measurement matrices are collected in the M × N array measurement matrix Y = [Y^(1)T, ..., Y^(P)T]^T, which is modeled as

Y = A(µ, η)Ψ + N,   (2)

where Ψ ∈ C^{L×N} is the source signal matrix and N ∈ C^{M×N} denotes a spatio-temporal white Gaussian sensor noise matrix. The M × L array steering matrix A(µ, η) in (2) is given by

A(µ, η) = [a(µ_1, η), ..., a(µ_L, η)],   (3)

and represents the response of the entire array, where a(µ, η) denotes the steering vector for spatial frequency µ and subarray displacements η. Based on the sensor position definition in (1), the array steering vectors can be factorized as

a(µ, η) = B(µ)ϕ(µ, η),   (4)

where the M × P block-diagonal matrix

B(µ) = blkdiag(b^(1)(µ), ..., b^(P)(µ))   (5)

contains the perfectly known subarray steering vectors

b^(p)(µ) = [1, e^{jπµρ_2^(p)}, ..., e^{jπµρ_{M_p}^(p)}]^T,   (6)

for p = 1, ..., P, on its diagonal, and the P × 1 vector

ϕ(µ, η) = [1, α^(2)e^{jπµη^(2)}, ..., α^(P)e^{jπµη^(P)}]^T   (7)

takes account of the subarray displacement shifts e^{jπµη^(p)}, for p = 2, ..., P, depending on the spatial frequencies in µ and the subarray displacements in η, and further unknown shifts α^(p), e.g., gain/phase or timing offsets among the subarrays [8]. In relation to (3), let us define the M × PL matrix

B(µ) = [B(µ_1), ..., B(µ_L)]   (8)

containing all subarray responses for the spatial frequencies in µ, and the PL × L block-diagonal matrix

Φ(µ, η) = blkdiag(ϕ(µ_1, η), ..., ϕ(µ_L, η)),   (9)

composed of the subarray shift vectors in (7). Using (8) and (9), the overall array steering matrix (3) can be factorized as

A(µ, η) = B(µ)Φ(µ, η),   (10)

such that the overall array measurement matrix in (2) is equivalently modeled as

Y = B(µ)Φ(µ, η)Ψ + N,   (11)

which forms the basis for the ℓ_{*,1} mixed-norm minimization problem discussed in the following section.

In this section we will shortly review the ℓ_{2,1} mixed-norm minimization approach for FCAs before turning to the ℓ_{*,1} mixed-norm minimization approach for PCAs.

We first consider the case of an FCA where the subarray displacements in η are perfectly known. Based on the signal model in (2) we introduce a sparse representation of the measurement matrix as

Y = A(ν, η)X + N.   (12)

The M × K overcomplete dictionary matrix A(ν, η) is obtained by sampling the field-of-view in K ≫ L spatial frequencies ν = [ν_1, ..., ν_K]^T. For ease of presentation we assume that the frequency grid is sufficiently fine, such that the true frequencies in µ are contained in the frequency grid ν, i.e., {µ_l}_{l=1}^L ⊂ {ν_k}_{k=1}^K. In Section 6 we present an extension of our proposed formulation for subarrays with a common baseline which does not rely on the on-grid assumption. The K × N sparse signal matrix X in (12) contains the elements

[X]_{k,n} = [Ψ]_{l,n} if ν_k = µ_l, and [X]_{k,n} = 0 otherwise,   (13)

for k = 1, ..., K and l = 1, ..., L. Thus X exhibits a row-sparse structure, i.e., the elements in a row of X are either jointly zero or primarily non-zero.
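The factorization (4) of the array steering vector into the known block-diagonal subarray response B(µ) and the displacement shift vector ϕ(µ, η) can be verified numerically. The sketch below uses an illustrative two-subarray geometry and assumes no additional offsets, i.e., α^(p) = 1:

```python
import numpy as np
from scipy.linalg import block_diag

def subarray_steering(mu, rho_p):
    # Eq. (6): b^(p)(mu) = [1, e^{j*pi*mu*rho_2}, ..., e^{j*pi*mu*rho_Mp}]^T
    return np.exp(1j * np.pi * mu * rho_p)

# illustrative geometry (half-wavelength units); alpha^(p) = 1 assumed
rho = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0])]
eta = np.array([0.0, 12.5])
mu = 0.3

# Eq. (5): B(mu) = blkdiag(b^(1), b^(2)); vectors reshaped to columns
B = block_diag(*[subarray_steering(mu, r).reshape(-1, 1) for r in rho])
phi = np.exp(1j * np.pi * mu * eta)         # Eq. (7) with alpha = 1
a_factored = B @ phi                        # Eq. (4)

# direct steering vector from the global positions r = rho + eta, Eq. (1)
r_global = np.concatenate([r + e for r, e in zip(rho, eta)])
a_direct = np.exp(1j * np.pi * mu * r_global)

print(np.allclose(a_factored, a_direct))    # True
```

Note that `block_diag` treats 1-D inputs as rows, hence the explicit reshape to column vectors.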
Based on the sparse representation (12), the frequency estimation problem can be formulated as the mixed-norm minimization problem

min_X  (1/2)‖A(ν, η)X − Y‖_F² + λ√N ‖X‖_{2,0},   (14)

where λ > 0 is a regularization parameter determining the sparsity in X. Row-sparsity is enforced by minimizing the ℓ_{2,0} mixed-norm in (14), which is defined as the number of non-zero rows x_k of the matrix X = [x_1, ..., x_K]^T, i.e., the cardinality of the union support set according to

‖X‖_{2,0} = |{k : ‖x_k‖_2 > 0}|.   (15)

Since the problem in (14) is NP-hard, several approximation methods have been proposed in the literature, including convex relaxation to the well-known ℓ_{2,1} mixed-norm minimization problem [22, 23, 31, 32]

min_X  (1/2)‖A(ν, η)X − Y‖_F² + λ√N ‖X‖_{2,1}.   (16)

The ℓ_{2,1} mixed-norm in (16) is defined as

‖X‖_{2,1} = Σ_{k=1}^K ‖x_k‖_2   (17)

and induces a non-linear coupling among the elements in each row x_k, k = 1, ..., K, of the matrix X, such that the ℓ_1 norm, i.e., the nonnegative summation, is performed on the ℓ_2 norms of the rows of X. Given a minimizer X̂ = [x̂_1, ..., x̂_K]^T of (16), the frequency estimation problem reduces to finding the local maxima in the vector of the signal ℓ_2 row-norms x̂_{ℓ2} = [‖x̂_1‖_2, ..., ‖x̂_K‖_2]^T and assigning the corresponding frequency grid points to the set {µ̂} of estimated frequencies.

For the PCA case with uncertain array response A(ν, η) due to the unknown displacements in η, the ℓ_{2,1} mixed-norm minimization approach in (16) cannot be applied and a more sophisticated approach has to be devised, as discussed in the following subsection.

Analogous to the FCA case in (12), we introduce a sparse representation of the signal model in (11) for the PCA case as

Y = B(ν)Φ(ν, η)X + N,   (18)

where the row-sparse matrix X is defined similarly as for the FCA case in (13).
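The recovery of the frequency support from the ℓ_2 row norms of a minimizer of (16) can be illustrated on a toy row-sparse matrix (values are hypothetical):

```python
import numpy as np

def l21_norm(X):
    # Eq. (17): ||X||_{2,1} = sum_k ||x_k||_2, the l1 norm of the row l2 norms
    return np.sum(np.linalg.norm(X, axis=1))

# toy row-sparse matrix: K = 5 grid points, N = 3 snapshots,
# signals present only on grid rows 1 and 3 (hypothetical values)
X = np.zeros((5, 3), dtype=complex)
X[1] = [1 + 1j, -1, 2j]
X[3] = [0.5, 0.5j, -0.5]

row_norms = np.linalg.norm(X, axis=1)      # the l2 row-norm "spectrum"
support = np.nonzero(row_norms > 1e-8)[0]  # grid indices -> frequency estimates
print(support)                             # [1 3]
```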
Furthermore, the M × PK overcomplete subarray dictionary matrix B(ν) and the PK × K overcomplete subarray shift matrix Φ(ν, η) are defined in correspondence to (8) and (9), respectively.

In the PCA case, the inter-subarray displacements in η are unknown and thus represent additional estimation variables, hence the subarray shifts in Φ(ν, η), which depend on the spatial frequencies in ν and the subarray displacements η, have to be appropriately included in the sparse estimation problem. To this end we introduce a model that couples the variables x_k in the rows of X = [x_1, ..., x_K]^T and the subarray shifts in ϕ(ν_k, η), for k = 1, ..., K. We define the KP × N extended signal matrix Q as

Q = Φ(ν, η)X   (19)

containing the products of the subarray shifts and the signal waveforms. Note that in this formulation the number of unknown complex-valued signal variables is increased to KPN elements in the matrix Q, as compared to the total K(N + P − 1) complex-valued unknowns in both X and Φ(ν, η). On the other hand, due to the block structure of the subarray shift matrix Φ(ν, η) as defined in (9), the matrix Q = [Q_1^T, ..., Q_K^T]^T in (19) enjoys a special structure as it is composed of K stacked rank-one matrices

Q_k = ϕ(ν_k, η)x_k^T,  for k = 1, ..., K.   (20)

Using the formulation in (19), the sparse representation for the PCA case in (18) is equivalently described by

Y = BQ + N,   (21)

where for ease of presentation we use B = B(ν) to denote the dictionary matrix in (21) and throughout the paper. An SR approach to take account of the special structure of the signal matrix Q in (19) is given as

min_Q  (1/2)‖BQ − Y‖_F² + λ√N Σ_{k=1}^K rank(Q_k).   (22)

The formulation in (22) takes twofold advantage of the sparsity assumption. First, minimization of the rank terms encourages low-rank blocks Q̂_1, ..., Q̂_K in the minimizer Q̂. Second, minimizing the sum-of-ranks provides a block-sparse structure in Q̂, i.e., the elements in each block Q̂_k, for k = 1, ..., K, are either jointly zero or primarily non-zero. However, the problem in (22) is NP-hard and computationally intractable.

The nuclear norm represents a tight convex approximation of the rank function and it has been successfully applied in a variety of rank minimization problems [29, 30]. The definition of the nuclear norm is given as

‖Q_k‖_* = Tr((Q_k^H Q_k)^{1/2}) = Σ_{i=1}^r σ_{k,i},   (23)

where r = min(P, N) and σ_{k,i} is the i-th singular value of Q_k. Along these lines it has been proposed in [21] to approximate the sparse estimation problem (22) by the following convex minimization problem

min_Q  (1/2)‖BQ − Y‖_F² + λ√N ‖Q‖_{*,1},   (24)

where ‖Q‖_{*,1} denotes the ℓ_{*,1} mixed-norm, computed as

‖Q‖_{*,1} = Σ_{k=1}^K ‖Q_k‖_*.   (25)

Similar to (22), the problem in (24) motivates low-rank blocks Q̂_1, ..., Q̂_K and a block-sparse structure in the minimizer Q̂ = [Q̂_1^T, ..., Q̂_K^T]^T.
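For the rank-one blocks in (20), the nuclear norm (23) reduces to the product ‖ϕ(ν_k, η)‖_2 ‖x_k‖_2, which is the coupling exploited by the penalty in (24). A quick numerical check with random illustrative data:

```python
import numpy as np

def nuclear_norm(Q):
    # Eq. (23): ||Q||_* = sum of the singular values of Q
    return np.sum(np.linalg.svd(Q, compute_uv=False))

rng = np.random.default_rng(0)
P, N = 3, 8
phi = rng.standard_normal(P) + 1j * rng.standard_normal(P)  # subarray shifts
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)    # signal row

Q_k = np.outer(phi, x)   # rank-one block as in Eq. (20): Q_k = phi x^T

# for a rank-one matrix the nuclear norm collapses to ||phi||_2 * ||x||_2
print(np.isclose(nuclear_norm(Q_k),
                 np.linalg.norm(phi) * np.linalg.norm(x)))  # True
```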
Note that the PCA formulations in (22) and (24) reduce to the FCA formulations in (14) and (16), respectively, in the case of a single subarray, i.e., P = 1.

Performing a singular value decomposition on the matrix blocks in Q̂, i.e.,

Q̂_k = Û_k Σ̂_k V̂_k^T,  for k = 1, ..., K,   (26)

the signal waveform x̂_k and subarray shifts ϕ̂(ν_k, η) corresponding to the spatial frequency ν_k can be recovered according to

x̂_k = σ̂_{k,1} [û_{k,1}]_1 v̂_{k,1}  and  ϕ̂(ν_k, η) = û_{k,1} / [û_{k,1}]_1.   (27)

The left and right singular vectors û_{k,1} and v̂_{k,1} in (27) correspond to the largest singular value σ̂_{k,1} of Q̂_k, and the normalization by the first element [û_{k,1}]_1 of û_{k,1} in (27) is performed to take account of the structure of the subarray shift vectors ϕ̂(ν_k, η) according to (7).

While the ℓ_{*,1} mixed-norm minimization problem in (24) provides a tractable approach for SR in PCAs, it suffers from high computational complexity in the case of a large number of snapshots N and grid points K. To overcome this difficulty we provide in the following section a compact reformulation of problem (24).

One of the main results of this paper is formulated in the following theorem:
Theorem 1 (Problem Equivalence). The block- and rank-sparsity inducing ℓ_{*,1} mixed-norm minimization problem

min_Q  (1/2)‖BQ − Y‖_F² + λ√N ‖Q‖_{*,1}   (28)

is equivalent to the convex problem

min_{S ∈ B_+^{P,K}}  Tr((BSB^H + λI)^{−1} R̂) + Tr(S),   (29)

with R̂ = YY^H/N and B_+^{P,K} denoting the sample covariance matrix and the set of positive semidefinite block-diagonal matrices composed of K blocks of size P × P, respectively. The equivalence holds in the sense that a minimizer Q̂ of problem (28) can be factorized as

Q̂ = ŜB^H(BŜB^H + λI)^{−1}Y,   (30)

where Ŝ is a minimizer of problem (29).

A proof of the equivalence is provided in Appendix A, while a proof of the convexity of (29) is provided in Section 5.2 by establishing equivalence to a semidefinite program. In addition to relation (30), it can be shown (see Appendix A) that a minimizer Ŝ = blkdiag(Ŝ_1, ..., Ŝ_K) of (29) relates to the signal matrix Q̂ = [Q̂_1^T, ..., Q̂_K^T]^T according to

Ŝ_k = (1/√N)(Q̂_k Q̂_k^H)^{1/2},   (31)

for k = 1, ..., K, such that the block-support of Q̂ is equivalently represented by the block-support of the matrix [Ŝ_1^T, ..., Ŝ_K^T]^T. Similarly, the rank of the matrix blocks Q̂_k is equivalently represented by the matrix blocks Ŝ_k, i.e., rank(Q̂_k) = rank(Ŝ_k), for k = 1, ..., K.

We observe that the problem in (29) only relies on the measurement matrix Y through the sample covariance matrix R̂, leading to a significantly reduced problem size, especially in the case of a large number of snapshots N. In this context we term the formulation in (29) COmpact Block- and RAnk-Sparse recovery (COBRAS). The compact formulation (29) contains KP² real-valued optimization parameters in the positive semidefinite matrix S, as opposed to the 2KPN real-valued optimization parameters in Q in problem (28). Consequently, in the case of a large number of snapshots N > P/2, the reformulation (29) has reduced computational complexity as compared to (28).
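The relation (31) between the COBRAS blocks Ŝ_k and the signal blocks Q̂_k can be sanity-checked numerically: the trace of Ŝ_k reproduces ‖Q̂_k‖_*/√N, which is why Tr(S) in (29) can play the role of the ℓ_{*,1} penalty in (28). A sketch with random illustrative data:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
P, N = 3, 10
Q_k = rng.standard_normal((P, N)) + 1j * rng.standard_normal((P, N))

# Eq. (31): S_k = (Q_k Q_k^H)^{1/2} / sqrt(N)
S_k = sqrtm(Q_k @ Q_k.conj().T) / np.sqrt(N)

# Tr(S_k) equals the scaled nuclear norm of Q_k, since the eigenvalues of
# (Q_k Q_k^H)^{1/2} are exactly the singular values of Q_k
nuc = np.sum(np.linalg.svd(Q_k, compute_uv=False))
print(np.isclose(np.trace(S_k).real, nuc / np.sqrt(N)))  # True
```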
The ℓ_{*,1} mixed-norm minimization problem (28) has been well investigated in the literature, and implementations based on the coordinate descent method [21], the STELA algorithm [33] and semidefinite programming (SDP) [34] have been proposed. Here we will shortly review the SDP implementation of the ℓ_{*,1} mixed-norm minimization problem (28) to highlight the reduction in computational complexity obtained by employing the COBRAS formulation in (29).

The ℓ_{*,1} Mixed-Norm Minimization Problem
As discussed in [29], minimization of the nuclear norm

min_{Q_k ∈ C}  ‖Q_k‖_*,   (32)

for some convex set C, can be expressed as the SDP

min_{Q_k ∈ C, P_{k,1}, P_{k,2}}  (1/2)(Tr(P_{k,1}) + Tr(P_{k,2}))   (33a)
s.t.  [ P_{k,1}  Q_k ;  Q_k^H  P_{k,2} ] ⪰ 0,   (33b)

where P_{k,1} = P_{k,1}^H and P_{k,2} = P_{k,2}^H are auxiliary variables of size P × P and N × N, respectively. The SDP formulation (33) admits simple implementation of the nuclear norm minimization problem using standard convex solvers, such as SeDuMi [35].

Based on the equivalence of (32) and (33), the ℓ_{*,1} mixed-norm minimization problem (28) can be equivalently formulated as

min_{{Q_k, P_{k,1}, P_{k,2}}}  (1/2)‖BQ − Y‖_F² + (λ√N/2) Σ_{k=1}^K (Tr(P_{k,1}) + Tr(P_{k,2}))   (34a)
s.t.  [ P_{k,1}  Q_k ;  Q_k^H  P_{k,2} ] ⪰ 0,  for k = 1, ..., K.   (34b)

Note that with the auxiliary variables in P_{k,1} and P_{k,2}, for k = 1, ..., K, the problem (34) has K(P + N)² real-valued optimization variables, as opposed to the problem formulation in (28) which has 2KPN real-valued optimization variables. For a large number of grid points K or snapshots N the SDP formulation (34) becomes intractable and alternative implementations are required, as presented in the next subsection. We remark that the problem of large snapshot numbers has been addressed in previous literature by matching the signal subspace of the measurements Y instead of the measurements itself, see [22, 34], leading to a reduced number of effective signal snapshots, however at the expense of potential performance degradation, e.g., in the case of correlated source signals.

                               ℓ_{*,1} mixed-norm (34)   COBRAS (35)                       COBRAS (38)
Number of real parameters      K(P + N)²                 KP² + N²                          KP² + M²
Number × size of
SDP constraints                K × {(P+N) × (P+N)}       K × {P × P} and                   K × {P × P} and
                                                         1 × {(M+N) × (M+N)}               1 × {(2M) × (2M)}

Table 1: Comparison of Equivalent SDP Implementations
In order to solve the COBRAS formulation in (29) by means of a tractable SDP which can be treated by standard convex solvers, consider the following corollaries [36]:
Corollary 1.
The COBRAS formulation in (29) is equivalent to the convex semidefinite program

min_{S, Z_N}  (1/N)Tr(Z_N) + Tr(S)   (35a)
s.t.  [ Z_N  Y^H ;  Y  BSB^H + λI ] ⪰ 0,   (35b)
      S ∈ B_+^{P,K},   (35c)

where Z_N is a Hermitian matrix of size N × N.

To see the equivalence between the two problems we note that
BSB^H + λI ≻ 0 is positive definite for any λ > 0, such that by the Schur complement the constraint (35b) is equivalent to

Z_N ⪰ Y^H(BSB^H + λI)^{−1}Y,   (36)

which implies

(1/N)Tr(Z_N) ≥ (1/N)Tr(Y^H(BSB^H + λI)^{−1}Y) = Tr((BSB^H + λI_M)^{−1} R̂).   (37)

Since in problem (35) Tr(Z_N) is minimized, it can be proved by contradiction that the relation in (37) must hold with equality, proving the equivalence of (29) and (35).

Corollary 2.
The COBRAS formulation in (29) admits the equivalent problem formulation

min_{S, Z_M}  Tr(Z_M R̂) + Tr(S)   (38a)
s.t.  [ Z_M  I_M ;  I_M  BSB^H + λI_M ] ⪰ 0,   (38b)
      S ∈ B_+^{P,K},   (38c)

where Z_M is a Hermitian matrix of size M × M. The proof follows in analogy to that of Corollary 1, using R̂ = YY^H/N. It follows that either problem formulation (35) or (38) can be selected to solve (29), depending on the number of snapshots N and the resulting size of the semidefinite constraint. The problems (35) and (38) have KP² real-valued optimization variables in S and additional N² or M² real-valued parameters in Z_N and Z_M, respectively. Thus, in the undersampled case N < M it is preferable to use the SDP formulation in (35), while in the oversampled case N ≥ M it is preferable to apply the SDP formulation in (38). We remark that the subspace matching approach discussed in [22, 34] can be applied to formulation (35) as well. A further investigation of the subspace matching approach is, however, beyond the scope of this paper. The various equivalent SDP implementations and the corresponding numbers of variables and constraints are listed in Table 1.

While the SDP formulations in Section 5 are applicable to arbitrary array topologies, we consider in the following the special case of linear subarrays with a common baseline, where the sensors within each subarray are located at integer multiples of a baseline δ, i.e., ρ_m^(p) = δ d_m^(p) with d_m^(p) ∈ Z for m = 1, ..., M_p and p = 1, ..., P. This type of array topology admits the extension of the COBRAS formulation to gridless frequency estimation.

We start by noting that strong duality holds for problem (38) and consider the Lagrange dual problem, which is given as

max_{Υ_1, Υ_2}  −Re{Tr(Υ_2)} − λTr(Υ_1)   (39a)
s.t.  [ R̂  Υ_2 ;  Υ_2^H  Υ_1 ] ⪰ 0,   (39b)
      I_P − B^H(ν_k)Υ_1B(ν_k) ⪰ 0,  k = 1, ..., K,   (39c)

where Υ_1 is an M × M positive semidefinite matrix and Υ_2 is of size M × M and does not exhibit specific structure.
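The block PSD constraints in (35b), (38b) and (39b) all rest on the same Schur complement fact used in the proof of Corollary 1: for C ≻ 0, [Z, Y^H; Y, C] ⪰ 0 holds if and only if Z ⪰ Y^H C^{−1} Y. This can be sanity-checked numerically (random illustrative data, with C standing in for BSB^H + λI):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 4, 6
Y = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
y = Y[:, :1]
C = 0.5 * np.eye(M) + y @ y.conj().T   # a positive definite stand-in matrix

# choose Z as the tight value Y^H C^{-1} Y, i.e., equality in Eq. (36);
# the assembled block matrix must then be positive semidefinite
Z = Y.conj().T @ np.linalg.inv(C) @ Y
block = np.block([[Z, Y.conj().T], [Y, C]])
eigvals = np.linalg.eigvalsh(block)
print(eigvals.min() >= -1e-10)  # True: block matrix is PSD
```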
Complementary slackness requires that

Tr(S_k(I_P − B^H(ν_k)Υ_1B(ν_k))) = 0,   (40)

for k = 1, ..., K, i.e., if S_k ≠ 0 then I_P − B^H(ν_k)Υ_1B(ν_k) must be singular, such that

det(I_P − B^H(ν_k)Υ_1B(ν_k)) = 0 if S_k ≠ 0, and ≥ 0 if S_k = 0.   (41)

Condition (41) indicates that instead of solving the primal problem (29) and identifying the block-support from S, we can equivalently solve the dual problem (39) and identify the block-support from the roots of (41).

Let us consider the limiting case of an infinitesimal frequency grid spacing, i.e., lim_{K→∞} ν_k − ν_{k−1} = 0 for k = 2, ..., K, such that the frequency becomes a continuous parameter ν. By introducing the variable

z = e^{jπνδ},   (42)

the subarray steering matrices in (5) can be equivalently described as

B(z) = blkdiag(b^(1)(z), ..., b^(P)(z)),   (43)

where the subarray steering vectors are given as

b^(p)(z) = [1, z^{d_2^(p)}, ..., z^{d_{M_p}^(p)}]^T,   (44)

with d_m^(p) ∈ Z for m = 1, ..., M_p and p = 1, ..., P. By the definition in (43), the matrix product B^H(z)Υ_1B(z) in constraint (39c) constitutes a trigonometric matrix polynomial of degree D = max_{p,m} d_m^(p), according to

M(z) = B^H(z)Υ_1B(z) = Σ_{i=−D}^{D} K_i z^i,   (45)

with matrix coefficients K_i of size P × P [37]. In the continuous case the constraint (39c) is replaced by the constraint

I_P − B^H(z)Υ_1B(z) ⪰ 0,   (46)

which provides an upper bound on the matrix polynomial (45) and can be implemented by semidefinite programming, e.g., by the problem formulation (84) derived in Appendix B or by other techniques discussed in [37]. Once the continuous implementation of problem (39) is solved, the spatial frequencies can be recovered by finding the roots for which the left-hand side of (46) becomes singular, e.g., by rooting the continuous counterpart of (41) using the techniques discussed in [38].
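Constraint (46) can be checked pointwise on the unit circle by evaluating the trigonometric matrix polynomial (45). The sketch below uses an illustrative integer geometry and a random PSD matrix standing in for the dual variable Υ_1, scaled small enough that the constraint holds:

```python
import numpy as np
from scipy.linalg import block_diag

# illustrative integer sensor positions d_m^(p) on a common baseline
d = [np.array([0, 1, 3]), np.array([0, 2])]
P = len(d)
M = sum(len(dp) for dp in d)

def B_of_z(z):
    # Eqs. (43)/(44): blkdiag of b^(p)(z) = [z^{d_1}, ..., z^{d_Mp}]^T
    return block_diag(*[(z ** dp).reshape(-1, 1) for dp in d])

rng = np.random.default_rng(3)
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Ups1 = A @ A.conj().T / (100 * M)    # small PSD stand-in for Upsilon_1

# check I_P - B^H(z) Ups1 B(z) >= 0, Eq. (46), on a grid of the unit circle
ok = True
for nu in np.linspace(-1, 1, 201):
    z = np.exp(1j * np.pi * nu)      # z = e^{j*pi*nu*delta} with delta = 1
    Bz = B_of_z(z)
    Mz = Bz.conj().T @ Ups1 @ Bz     # matrix polynomial M(z), Eq. (45)
    ok &= np.linalg.eigvalsh(np.eye(P) - Mz).min() >= -1e-9
print(ok)
```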
In a recent work [20] it has been shown that PCAs composed of identical subarrays admit gridless compressed sensing by means of the SPARROW formulation [23]. Interestingly, for the special case considered in this section of PCAs composed of linear subarrays with a common baseline (and possibly missing sensors in particular subarrays), the dual problem formulation (39) can equivalently be derived by means of the SPARROW formulation.

Let us consider the ℓ_{2,1} mixed-norm minimization problem for FCAs given in (16), which can equivalently be formulated as the SPARROW problem [23]

min_S  Tr((A(ν, η)SA^H(ν, η) + λ̃I)^{−1} R̂) + Tr(S),   (47)

where λ̃ > 0 and the diagonal matrix S = diag(s_1, ..., s_K) ⪰ 0 is of size K × K. Problem (47) can be formulated as the semidefinite program

min_{S, Z_M}  Tr(Z_M R̂) + Tr(S)   (48a)
s.t.  [ Z_M  I ;  I  A(ν, η)SA^H(ν, η) + λ̃I ] ⪰ 0,   (48b)
      S = diag(s_1, ..., s_K) ⪰ 0.   (48c)

The corresponding Lagrange dual problem is given as

max_{Υ̃_1, Υ̃_2}  −Re{Tr(Υ̃_2)} − λ̃Tr(Υ̃_1)   (49a)
s.t.  [ R̂  Υ̃_2 ;  Υ̃_2^H  Υ̃_1 ] ⪰ 0,   (49b)
      a^H(ν_k, η)Υ̃_1a(ν_k, η) ≤ 1,  k = 1, ..., K,   (49c)

and with strong duality it follows from complementary slackness that

1 − a^H(ν_k, η)Υ̃_1a(ν_k, η) = 0 if s_k > 0, and ≥ 0 if s_k = 0.   (50)

As previously discussed in the context of condition (41), the support of the vector s = [s_1, ..., s_K]^T, i.e., the spatial frequency estimates, can equivalently be identified by rooting the function in (50).

Making use of the notation a(ν_k, η) = B(ν_k)ϕ(ν_k, η), as introduced in (4), condition (50) can be rewritten as

1 − a^H(ν_k, η)Υ̃_1a(ν_k, η)
 = 1 − ϕ^H(ν_k, η)B^H(ν_k)Υ̃_1B(ν_k)ϕ(ν_k, η)
 = 1 − ϕ̃^H(ν_k, η)B^H(ν_k)Υ_1B(ν_k)ϕ̃(ν_k, η)
 = ϕ̃^H(ν_k, η)(I_P − B^H(ν_k)Υ_1B(ν_k))ϕ̃(ν_k, η) ≥ 0,   (51)

where

ϕ̃(ν_k, η) = ϕ(ν_k, η)/‖ϕ(ν_k, η)‖_2   (52)

and

Υ_1 = ‖ϕ(ν_k, η)‖_2² Υ̃_1.   (53)

Condition (51) is fulfilled if

I_P − B^H(ν_k)Υ_1B(ν_k) ⪰ 0,   (54)

which is identical to the constraint (39c) in problem (39). Replacing the constraint (49c) in problem (49) by the condition (54), and further using (53) and the substitutions λ = λ̃/‖ϕ(ν_k, η)‖_2² and Υ_2 = Υ̃_2, shows that for the PCA case the dual problem (49) can be reformulated as the dual problem (39). As demonstrated in the previous section, condition (54) can be extended to an infinitesimal grid spacing, resulting in a matrix polynomial constraint, such that the resulting gridless estimation problem can be implemented by semidefinite programming (see Appendix B).

For experimental performance evaluation of our proposed COBRAS method we compare its estimation performance to the state-of-the-art methods spectral RARE [8] and root-RARE [19] as well as the Cramér-Rao bound (CRB) [8]. For all simulations we use circular complex Gaussian source signals Ψ with covariance matrix E(ΨΨ^H) = N I, if not specified otherwise. We further consider spatio-temporal white circular complex Gaussian sensor noise N with covariance matrix E(NN^H) = σ²N I and define the signal-to-noise ratio (SNR) as SNR = 1/σ². The vector r^(p) = [r_1^(p), ..., r_{M_p}^(p)]^T contains the global sensor positions r_m^(p) of subarray p, for m = 1, ..., M_p, p = 1, ..., P, expressed in half-wavelength, as defined in (1). If not stated otherwise, we perform T = 1000 Monte Carlo trials for each experimental setup and compute the statistical error.

The estimation performance of the COBRAS method strongly depends on proper selection of the regularization parameter λ.
While regularization parameter selection is a research field of its own, in this paper we follow a heuristic approach and select the regularization parameter as

$$\lambda = \max_p \; \sigma \sqrt{M_p \log(M)}, \qquad (55)$$

which has shown good estimation performance in all investigated scenarios.

We remark that the RARE and COBRAS methods make different assumptions on the availability of a-priori knowledge. While the RARE method requires knowledge of the number of source signals, the regularization parameter selection for the COBRAS method according to (55) requires knowledge of the noise power. However, since estimation of these parameters might itself affect the frequency estimation performance of the RARE and COBRAS methods, we apply the standard assumption of a perfectly known number of source signals and noise power, and investigate the achievable performance under these idealized assumptions.

In the first scenario we consider a PCA with a large aperture, composed of M = 11 sensors which are partitioned into P = 4 linear subarrays with 3, 2, 3, and 3 sensors, respectively. The sensor positions for each subarray are r⁽¹⁾ = [0.…, …, …]ᵀ, r⁽²⁾ = [12.…, …]ᵀ, r⁽³⁾ = [21.…, …, …]ᵀ, and r⁽⁴⁾ = [37.…, …, …]ᵀ, and we assume no additional gain/phase offsets among the subarrays, i.e., α = [1, 1, 1, 1]ᵀ in (7). We further consider L = 3 uncorrelated Gaussian source signals with spatial frequencies µ = [0.…, 0.…, −0.…]ᵀ.

The array topology does not admit a direct implementation of the gridless COBRAS and root-RARE methods, such that we limit the experiments in this subsection to the investigation of the grid-based COBRAS method and the spectral RARE method. For both grid-based methods we use a grid of K = 400 grid points according to ν = [−1, −0.995, −0.99, …, 0.995]ᵀ.

To investigate the frequency estimation performance, we compute the root-mean-square error of the frequency estimates in µ̂ as

$$\mathrm{RMSE}(\hat\mu) = \sqrt{ \frac{1}{LT} \sum_{t=1}^{T} \sum_{l=1}^{L} \big| \mu_l - \hat\mu_l(t) \big|_{\mathrm{wa}}^2 }, \qquad (56)$$

where $\hat\mu_l(t)$ denotes the frequency estimate of signal $l$ in trial $t$, and $|\mu_1 - \mu_2|_{\mathrm{wa}} = \min_{i \in \mathbb{Z}} |\mu_1 - \mu_2 + 2i|$ denotes the wrap-around distance of two spatial frequencies $\mu_1, \mu_2 \in [-1, 1)$. For the number of frequency estimates $\hat L$ to be equal to the true number of source signals $L$, we have to consider two special cases: in the case of overestimation of the model order, $\hat L > L$, we select the $L$ frequency estimates with the largest corresponding magnitudes, whereas we select $L - \hat L$ additional random spatial frequencies in the case of underestimation, $\hat L < L$.
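The wrap-around distance and the RMSE criterion in (56) are straightforward to implement. The following Python sketch is purely illustrative (the paper's experiments were run in Matlab); the function names and the T × L layout of the estimates are our own assumptions:

```python
import numpy as np

def wraparound_dist(mu1, mu2):
    """Wrap-around distance min_i |mu1 - mu2 + 2i| for frequencies in [-1, 1)."""
    # Map the raw difference into [-1, 1); its magnitude is the wrapped distance.
    d = np.mod(np.asarray(mu1) - np.asarray(mu2) + 1.0, 2.0) - 1.0
    return np.abs(d)

def rmse_wa(mu_true, mu_hat):
    """RMSE over T trials and L signals, cf. (56); mu_hat has shape (T, L)."""
    mu_true = np.asarray(mu_true)        # shape (L,): true spatial frequencies
    mu_hat = np.atleast_2d(mu_hat)       # shape (T, L): estimates per trial
    err = wraparound_dist(mu_true[None, :], mu_hat)
    return np.sqrt(np.mean(err ** 2))
```

For example, the frequencies 0.9 and −0.9 are only 0.2 apart under the wrap-around metric, which is exactly why the plain absolute difference must not be used near the boundaries of the frequency interval.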
Figure 2: Frequency estimation performance for a PCA of M = 11 sensors in P = 4 subarrays, with SNR = 6 dB and varying number of snapshots N.
Figure 3: Frequency estimation performance for a PCA of M = 11 sensors in P = 4 subarrays, with N = 20 snapshots and varying SNR.

In the first experiment the signal-to-noise ratio (SNR) is fixed to SNR = 6 dB, while the number of snapshots N is varied. Figure 2 clearly demonstrates that our proposed grid-based COBRAS technique outperforms the spectral RARE for low numbers of signal snapshots N. While the spectral RARE method is not able to always resolve the two closely spaced signals with spatial frequencies µ₁ = 0.… and µ₂ = 0.… for N ≤ 500 signal snapshots, our proposed COBRAS method resolves the signals for any N ≥ 30 snapshots.

In a second experiment we fix the number of snapshots as N = 20 and vary the SNR. As can be observed from Figure 3, the grid-based COBRAS method shows superior threshold performance as compared to the spectral RARE. While the spectral RARE can reliably resolve the two closely spaced sources only for SNR ≥ 22 dB, our proposed COBRAS can do so for SNR ≥ … dB.

Resolution Performance and Estimation Bias

For further investigation of the spatial frequency estimation bias, we consider a uniform linear array of M = 9 sensors, partitioned into P = 3 identical, uniform linear subarrays of 3 sensors each, without additional gain/phase offsets, i.e., α = [1, 1, 1]ᵀ in (7). For the experiment we consider L = 2 uncorrelated signals, fix the spatial frequency of the first signal as µ₁ = 0.…, and vary the second frequency as µ₂ = µ₁ − ∆µ with 10⁻… ≤ ∆µ ≤ 1. For all grid-based estimation methods we make use of a uniform grid of K = 200 points according to ν = [−1, −0.99, −0.98, …, 0.99]ᵀ. The SNR and number of snapshots are fixed as SNR = 0 dB and N = 20.

First, we observe from Figure 4 that the spectral RARE performs significantly worse than root-RARE in terms of threshold performance, i.e., the spectral RARE cannot always resolve the two signals for a frequency separation of ∆µ ≲ 0.12. The reason for this difference in resolution performance is that the root-RARE method locates the roots of the corresponding matrix polynomial in the entire complex plane, while the spectral RARE only searches for minima on the unit circle (see also [39, 40]). In contrast to that, the grid-based and the gridless COBRAS methods both show rather similar estimation performance, comparable to that of the root-RARE method, and reach the CRB for sufficiently large frequency separation. This observation can be explained by the fact that both dual COBRAS optimization problems provide matrix polynomials with the roots of interest constrained to the unit circle, as discussed in Section 6. The only difference between the grid-based and gridless COBRAS methods is that in the first case the roots are generated on a grid of candidate frequencies on the unit circle, while in the latter case the roots are continuously located on the unit circle.

In a slightly modified experiment we fix the SNR and number of snapshots to SNR = 20 dB and N = 50, respectively. While the root-RARE method performs close to the CRB in the region of interest, the spectral RARE cannot always resolve the signals for ∆µ ≲ 0.06 and reaches an estimation bias for large source separation, which is caused by the finite frequency grid. Furthermore, it can be observed that the estimation performance of the COBRAS methods deviates from that of the root-RARE method: for large frequency separation, the COBRAS methods exhibit an estimation bias. To further investigate this effect, we compute the estimation bias as

$$\mathrm{Bias}(\hat\mu) = \sqrt{ \frac{1}{L} \sum_{l=1}^{L} \big( \mu_l - \mathrm{Mean}(\hat\mu_l) \big)^2 }, \qquad (57)$$

where the mean estimate for spatial frequency $\mu_l$ is computed as $\mathrm{Mean}(\hat\mu_l) = \frac{1}{T} \sum_{t=1}^{T} \hat\mu_l(t)$.

For the given scenario, the estimation bias is displayed in Figure 6. In the case of low frequency separation ∆µ ≲ 0.…, the estimation bias is Bias(µ̂) ≈ 0.01. For larger frequency separation ∆µ ≳ 0.2, the bias of the grid-based COBRAS method is mainly determined by the finite grid, while the bias of the gridless COBRAS method shows to be periodic in ∆µ. In difficult scenarios, with low SNR and low number of snapshots as in the previous setup, the estimation bias is below the CRB, such that it is negligible in the RMSE performance. The frequency estimation bias is a well-known phenomenon in SR research [22, 23], and bias mitigation techniques have been discussed, e.g., in [41].

Figure 4: Frequency estimation performance for a uniform linear PCA of M = 9 sensors in P = 3 linear subarrays, for N = 20 snapshots and SNR = 0 dB.

Figure 5: Frequency estimation performance for a uniform linear PCA of M = 9 sensors in P = 3 linear subarrays, for N = 50 snapshots and SNR = 20 dB.

Figure 6: Frequency estimation bias for a uniform linear PCA of M = 9 sensors in P = 3 linear subarrays, for N = 50 snapshots and SNR = 20 dB.

As discussed in the previous subsection, gridless COBRAS and root-RARE show approximately equal resolution performance in difficult scenarios with low SNR, low number of snapshots, and uncorrelated signals. This situation changes in the case of correlated signals, where preprocessing in the form of subspace separation, as required for the RARE method, becomes difficult. For further investigation of this aspect we consider a PCA of M = 9 sensors partitioned into P = 3 subarrays of 3, 4 and 2 sensors with positions r⁽¹⁾ = [0, …, …]ᵀ, r⁽²⁾ = [17.…, …, …, …]ᵀ and r⁽³⁾ = [24.…, …]ᵀ. Furthermore, we consider gain/phase offsets among the subarrays according to α = [1, 0.… · e^{jπ…}, 0.… · e^{jπ…}]ᵀ in (7). The SNR and number of snapshots are selected as SNR = 0 dB and N = 30. We consider L = 2 source signals with spatial frequencies µ = [0.…, 0.…]ᵀ and a source covariance matrix given as

$$\mathrm{E}(\Psi\Psi^H) = N \begin{bmatrix} 1 & \rho \\ \rho^* & 1 \end{bmatrix}, \qquad (58)$$

where the correlation coefficient ρ is assumed to be real-valued and is varied in the experiment. For the grid-based estimation methods we consider a grid of K = 200 candidate frequencies, defined as in the previous subsection. As seen from Figure 7, the spectral and root-RARE methods fail to properly estimate the spatial frequencies for high correlation (ρ > 0.6), while the grid-based and gridless COBRAS methods still show estimation performance close to the CRB, since these methods do not require subspace separation.

Figure 7: Frequency estimation performance for a PCA of M = 9 sensors in P = 3 subarrays for SNR = 0 dB, N = 30 snapshots and L = 2 source signals with varying real-valued correlation coefficient ρ.
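For illustration, correlated source snapshots with the covariance structure of (58) can be generated via a Cholesky factor of the correlation matrix. The following Python sketch is an illustrative assumption (unit per-snapshot signal power and our own variable names), not the original Matlab simulation code:

```python
import numpy as np

rng = np.random.default_rng(0)
N, rho = 200_000, 0.6        # snapshots and real-valued correlation coefficient

# Per-snapshot source covariance [[1, rho], [rho, 1]], cf. the structure in (58)
C = np.array([[1.0, rho], [rho, 1.0]])
L_chol = np.linalg.cholesky(C)

# Uncorrelated circular complex Gaussian snapshots with unit power per source
W = (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))) / np.sqrt(2)
Psi = L_chol @ W             # correlated L x N source signal matrix

# The sample covariance converges to C for large N
C_hat = Psi @ Psi.conj().T / N
```

As ρ approaches 1 the correlation matrix becomes singular and the two sources fully coherent, which is exactly the regime where the subspace separation required by the RARE methods degrades.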
Besides estimation of the spatial frequencies, the COBRAS method also admits estimation of the subarray shifts in ϕ as defined in (7). Since the RARE methods do not provide direct estimation of the subarray shifts, we use the method presented in equation (11) of [15], without the restriction that the complex phase terms must be of unit magnitude, on the basis of the spatial frequency estimates obtained by the RARE methods.

The setup under investigation consists of a PCA of M = 10 sensors partitioned into P = 4 subarrays of 3, 2, 3 and 2 sensors at positions r⁽¹⁾ = [0, …, …]ᵀ, r⁽²⁾ = [10.…, …]ᵀ, r⁽³⁾ = [27.…, …, …]ᵀ and r⁽⁴⁾ = [54.…, …]ᵀ. The subarray gain/phase offsets are set as α = [1, 0.… · e^{jπ…}, 0.… · e^{−jπ…}, 0.… · e^{−jπ…}]ᵀ in (7). We consider L = 3 uncorrelated source signals with spatial frequencies µ = [0.…, 0.…, −0.…]ᵀ, and the number of snapshots is set to N = 20.

Figure 8 displays the frequency estimation error of the different methods for varying SNR, where both COBRAS methods show the best thresholding performance but reach an estimation bias for SNR ≥ 15 dB. Similarly, the spectral RARE algorithm reaches an estimation bias, which is caused by the finite grid. On the other hand, the root-RARE performs asymptotically optimally and reaches the CRB for high SNR.

Figure 8: Frequency estimation performance for a PCA of M = 10 sensors in P = 4 subarrays, for N = 20 snapshots and L = 2 uncorrelated source signals.

The corresponding subarray shift estimation performance is displayed in Figure 9, where the root-mean-square error is computed according to

$$\mathrm{RMSE}(\hat\varphi) = \sqrt{ \frac{1}{LT(P-1)} \sum_{t=1}^{T} \sum_{l=1}^{L} \big\| \varphi_l - \hat\varphi_l(t) \big\|_2^2 }, \qquad (59)$$

with $\hat\varphi_l(t)$ being the displacement phase vector estimate for signal $l$ in Monte Carlo trial $t$. As can be observed from Figure 9, the subarray shift estimation method in [15], based on the frequency estimates obtained from the RARE methods, exhibits a relatively large estimation bias for high SNR. In contrast to that, the grid-based and gridless COBRAS methods show a significantly reduced estimation error, which demonstrates the advantage of joint frequency and displacement phase estimation.

Figure 9: Displacement phase estimation performance for a PCA of M = 10 sensors in P = 4 subarrays, for N = 20 snapshots and L = 2 uncorrelated source signals.

To investigate the computation time of the COBRAS formulation, we perform simulations in Matlab using the SeDuMi solver [35] with the CVX interface [42, 43] on a machine with an Intel Core i5-760 CPU @ 2.80 GHz. We consider L = 2 source signals with spatial frequencies µ₁ = 0.505 and µ₂ = −0.205, and a uniform linear PCA of M = 9 sensors partitioned into P = 3 identical and uniform linear subarrays of 3 sensors each. We neglect subarray gain/phase offsets, i.e., α = [1, 1, 1]ᵀ in (7).

For the first experiment the SNR is fixed at SNR = 0 dB while the number of snapshots N is varied. Figure 10 shows the average computation time over T = 100 Monte Carlo runs of the SDP implementation of the ℓ∗,₁ mixed-norm minimization (34) and the grid-based COBRAS formulations (35) and (38) with a grid size of K = 100, as well as of the gridless (GL-)COBRAS formulation in (84). The computation time is measured only for solving the corresponding optimization problem in CVX; pre-processing steps, such as computation of the sample covariance matrix, and post-processing steps, such as peak search or polynomial rooting, are not included in this consideration. As can be observed from Figure 10, for N < 40 snapshots the ℓ∗,₁ mixed-norm minimization problem has the largest computation time, while the COBRAS formulation (35) requires the longest computation time for N > 40, due to the large dimension of the semidefinite constraint (35b). Regarding the computation time of the grid-based COBRAS formulation using the sample covariance matrix (38), we observe that it is relatively constant for any number of snapshots N and lower than that of the other implementations, especially for large numbers of snapshots N > 10. The lowest computation time is required for solving the GL-COBRAS implementation (84).

Figure 11 shows the average computation time over T = 100 Monte Carlo runs for a varying number of grid points K and a fixed number of N = 9 signal snapshots, corresponding to the case where the signal subspace matching techniques are applied (compare [22, 34]). For all grid-based methods the number of SDP constraints grows nearly linearly with the number of grid points K. Clearly, the ℓ∗,₁ mixed-norm minimization approach has the largest computation time for any investigated number of grid points K. The grid-based COBRAS formulations (35) and (38) show approximately equal computation time, since both formulations have SDP constraints of identical dimension. Since the GL-COBRAS formulation is independent of the grid size K, its computation time is constant over the grid size in Figure 11, and it provides the fastest computation of all methods under investigation.
Figure 10: Average computation time of different SDP implementations for a uniform linear PCA with M = 9 sensors in P = 3 subarrays, with K = 100 grid points and varying number of snapshots N.
Figure 11: Average computation time of different SDP implementations for a uniform linear PCA with M = 9 sensors in P = 3 subarrays, with N = 9 signal snapshots and varying grid size K.

Conclusion
Partly calibrated arrays are attractive setups for direction-of-arrival estimation. Computationally efficient subspace-based methods such as RARE and ESPRIT show asymptotically optimal performance, but have problems in difficult scenarios such as low sample size, low signal-to-noise ratio, or correlated source signals. Sparse recovery in the form of ℓ∗,₁ mixed-norm minimization has been shown to be an attractive alternative to subspace-based methods in these difficult scenarios. In this paper we have derived a compact, equivalent formulation of the ℓ∗,₁ mixed-norm minimization, referred to as COmpact Block- and RAnk-Sparse recovery (COBRAS). The COBRAS formulation is attractive especially in the case of a large number of signal snapshots. For the special case of subarrays with a common baseline we have presented an extension to gridless estimation, referred to as gridless COBRAS (GL-COBRAS).

As shown by numerical results, the grid-based COBRAS significantly outperforms the spectral RARE method in terms of thresholding performance for closely spaced source signals. Furthermore, the COBRAS method outperforms the RARE method in the case of strongly correlated source signals and in terms of calibration (subarray shift estimation) performance. A drawback of the ℓ∗,₁ mixed-norm minimization approach and the COBRAS formulation is the estimation bias, which becomes significant in the asymptotic case of a large number of snapshots or high signal-to-noise ratio. However, if higher estimation accuracy is required, the COBRAS estimates can be used as initial estimates, e.g., for a subsequent maximum likelihood estimator.

Acknowledgment
This work was supported by the German Research Foundation (DFG) within the DFG priority program on Compressed Sensing in Information Processing (CoSIP, DFG-SPP 1798).
Appendix A - Proof of Problem Equivalence
The proof of Theorem 1 relies on the following lemma, see [30, 44]:
Lemma 1.
The nuclear norm of the $P \times N$ matrix $Q_k$ is equivalently computed by the minimization problem

$$\|Q_k\|_* = \min_{\Gamma_k, G_k} \Big\{ \tfrac{1}{2}\big( \|\Gamma_k\|_F^2 + \|G_k\|_F^2 \big) : \Gamma_k G_k = Q_k \Big\} \qquad (60)$$

where $\Gamma_k$ and $G_k$ are complex matrices of dimensions $P \times r$ and $r \times N$, respectively, with $r = \min(N, P)$.

Proof of Lemma 1. Let us define the compact singular value decomposition of the matrix $Q_k$ as

$$Q_k = U_k \Sigma_k V_k^H, \qquad (61)$$

such that the factorization terms of $Q_k = \Gamma_k G_k$ can be expressed as

$$\Gamma_k = U_k \Pi_{k,1} W_k^H \quad \text{and} \quad G_k = W_k \Pi_{k,2} V_k^H \qquad (62)$$

for $\Sigma_k = \Pi_{k,1} \Pi_{k,2}$ of size $r \times r$ and some arbitrary $r \times r$ unitary matrix $W_k$, i.e., $W_k^H W_k = I_r$. Based on (62) it holds that

$$\|Q_k\|_* = \|\Sigma_k\|_* = \|\Pi_{k,1} \Pi_{k,2}\|_* \le \|\Pi_{k,1}\|_F \, \|\Pi_{k,2}\|_F = \|\Gamma_k\|_F \, \|G_k\|_F, \qquad (63)$$

where the inequality stems from the Cauchy-Schwarz inequality and is fulfilled with equality if and only if $\Pi_{k,1} = \Pi_{k,2} = \Sigma_k^{1/2}$. In this case, the matrix factors in (62) are given as

$$\Gamma_k = U_k \Sigma_k^{1/2} W_k^H \quad \text{and} \quad G_k = W_k \Sigma_k^{1/2} V_k^H. \qquad (64)$$

Furthermore, by the arithmetic-geometric mean inequality it follows that

$$\|\Gamma_k\|_F \, \|G_k\|_F \le \tfrac{1}{2}\big( \|\Gamma_k\|_F^2 + \|G_k\|_F^2 \big), \qquad (65)$$

where equality holds if $\|\Gamma_k\|_F = \|G_k\|_F$, such that the minimum of (60) is given by $\|Q_k\|_* = \tfrac{1}{2}(\|\Gamma_k\|_F^2 + \|G_k\|_F^2)$ with $\Gamma_k$ and $G_k$ given by (64). □

Proof of Theorem 1.
Based on Lemma 1, the $\ell_{*,1}$ mixed-norm of the source signal matrix $Q = [Q_1^T, \ldots, Q_K^T]^T$, as defined in (25), is equivalently computed by

$$\|Q\|_{*,1} = \sum_{k=1}^{K} \|Q_k\|_* = \min_{\{\Gamma_k, G_k\}} \Big\{ \sum_{k=1}^{K} \tfrac{1}{2}\big( \|\Gamma_k\|_F^2 + \|G_k\|_F^2 \big) : \Gamma_k G_k = Q_k \Big\} = \min_{\Gamma \in \mathbb{B}_{P \times r}^{K},\, G} \Big\{ \tfrac{1}{2}\big( \|\Gamma\|_F^2 + \|G\|_F^2 \big) : Q = \Gamma G \Big\} \qquad (66)$$

where $r = \min(N, P)$, $\Gamma = \mathrm{blkdiag}(\Gamma_1, \ldots, \Gamma_K)$ is taken from the set $\mathbb{B}_{P \times r}^{K}$ of block-diagonal matrices composed of $K$ blocks of size $P \times r$ on the main diagonal, and $G = [G_1^T, \ldots, G_K^T]^T$ is a $(Kr) \times N$ complex matrix composed of the blocks $G_k$, for $k = 1, \ldots, K$. Inserting equation (66) into the $\ell_{*,1}$ mixed-norm minimization problem in (28), we formulate the minimization problem

$$\min_{\Gamma \in \mathbb{B}_{P \times r}^{K},\, G} \; \tfrac{1}{2} \|B \Gamma G - Y\|_F^2 + \frac{\lambda \sqrt{N}}{2} \big( \|\Gamma\|_F^2 + \|G\|_F^2 \big). \qquad (67)$$

For a fixed matrix $\Gamma$, the minimizer $\hat G$ of problem (67) has the closed-form expression

$$\hat G = (\Gamma^H B^H B \Gamma + \lambda \sqrt{N} I)^{-1} \Gamma^H B^H Y = \Gamma^H B^H (B \Gamma \Gamma^H B^H + \lambda \sqrt{N} I)^{-1} Y, \qquad (68)$$

where the last identity follows from the Woodbury matrix identity [45, p. 151]. Reinserting the optimal matrix $\hat G$ into (67) and using basic reformulations of the objective function results in the concentrated minimization problem

$$\min_{\Gamma \in \mathbb{B}_{P \times r}^{K}} \; \frac{\lambda \sqrt{N}}{2} \Big( \mathrm{Tr}\big( (B \Gamma \Gamma^H B^H + \lambda \sqrt{N} I)^{-1} Y Y^H \big) + \mathrm{Tr}\big( \Gamma \Gamma^H \big) \Big). \qquad (69)$$

Upon summarizing $Y Y^H / N = \hat R$ and defining the positive semidefinite block-diagonal matrix

$$S = \Gamma \Gamma^H / \sqrt{N} \in \mathbb{B}_{P+}^{K} \qquad (70)$$

we can rewrite (69) as

$$\min_{S \in \mathbb{B}_{P+}^{K}} \; \frac{\lambda N}{2} \Big( \mathrm{Tr}\big( (B S B^H + \lambda I)^{-1} \hat R \big) + \mathrm{Tr}(S) \Big). \qquad (71)$$

Neglecting the constant factor $\lambda N / 2$ yields the compact COBRAS problem formulation. Given a minimizer $\hat S = \mathrm{blkdiag}(\hat S_1, \ldots, \hat S_K)$ in (70), we conclude that

$$\hat S_k = \frac{1}{\sqrt{N}} \hat\Gamma_k \hat\Gamma_k^H = \frac{1}{\sqrt{N}} (\hat Q_k \hat Q_k^H)^{1/2} \qquad (72)$$

as given in (31). Making further use of (68) and the factorization in (66), we obtain

$$\hat Q = \hat\Gamma \hat G = \hat\Gamma \hat\Gamma^H B^H (B \hat\Gamma \hat\Gamma^H B^H + \lambda \sqrt{N} I)^{-1} Y = \hat S B^H (B \hat S B^H + \lambda I)^{-1} Y, \qquad (73)$$

which corresponds to relation (30). □

Appendix B - SDP Form of the Matrix Polynomial Constraint
As discussed in Section 6, the matrix $M(z)$ represents a matrix polynomial of degree $D$ in the variable $z$ [37]. To see the relation between the matrix $\Upsilon_0$ and the matrix coefficients $K_i$, let us define the $((D+1)P) \times P$ matrix

$$\Omega(z) = \big[\, I_P \;\; z I_P \;\; z^2 I_P \;\; \ldots \;\; z^D I_P \,\big]^T \qquad (74)$$

and introduce the $M \times ((D+1)P)$ permutation and selection matrix $J$ such that the subarray steering matrix can be expressed as

$$B(z) = J\, \Omega(z). \qquad (75)$$

Inserting (75) in (45) yields

$$M(z) = B^H(z)\, \Upsilon_0\, B(z) = \Omega^H(z)\, J^H \Upsilon_0 J\, \Omega(z) = \Omega^H(z)\, F\, \Omega(z), \qquad (76)$$

where $F = J^H \Upsilon_0 J$ is of size $((D+1)P) \times ((D+1)P)$ and is composed of the $P \times P$ blocks $F_{i,j}$, for $i, j = 1, \ldots, D+1$, as

$$F = \begin{bmatrix} F_{1,1} & \cdots & F_{1,D+1} \\ \vdots & \ddots & \vdots \\ F_{D+1,1} & \cdots & F_{D+1,D+1} \end{bmatrix}. \qquad (77)$$

Equation (76) is also referred to as the Gram matrix representation of the polynomial $M(z)$, and $F$ is referred to as the corresponding Gram matrix [37].

We define the block trace operator for the matrix $F$ as

$$\mathrm{blkTr}_P(F) = \sum_{i=1}^{D+1} F_{i,i}, \qquad (78)$$

i.e., the summation of the $P \times P$ submatrices $F_{i,i}$, for $i = 1, \ldots, D+1$, on the main diagonal of the matrix $F$. Furthermore, let us define the $(D+1) \times (D+1)$ elementary Toeplitz matrix $\Theta_i$, with ones on the $i$-th diagonal and zeros elsewhere, as well as the elementary block Toeplitz matrix $\Xi_i = \Theta_i \otimes I_P$. Using the block trace operator (78) and the elementary block Toeplitz matrices $\Xi_i$, the matrix coefficients $K_i$ in (45) can be computed from the Gram matrix $F$ in (76) as

$$K_i = \mathrm{blkTr}_P(\Xi_i F), \qquad (79)$$

i.e., the summation of the $P \times P$ submatrices on the $i$-th block diagonal of the Gram matrix $F$. Note that the mapping (45) is unique, i.e., for any PCA steering matrix block $B(z)$ and matrix $\Upsilon_0$ the coefficients $K_i$, for $i = 1, \ldots, D$, of the matrix polynomial $M(z)$ are unique.
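The relations (74)–(79) between the Gram matrix F and the matrix coefficients K_i are easy to verify numerically. The following Python sketch is an illustrative check with arbitrarily chosen dimensions P and D (not part of the paper); it builds Ω(z), sums the block diagonals of a random Hermitian Gram matrix F, and confirms that Ω^H(z) F Ω(z) agrees on the unit circle with the polynomial assembled from those block-diagonal sums:

```python
import numpy as np

rng = np.random.default_rng(0)
P, D = 2, 3                        # block size and polynomial degree (illustrative)
n = (D + 1) * P

# Random Hermitian Gram matrix F, viewed as (D+1) x (D+1) blocks of size P x P
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = (A + A.conj().T) / 2

def blk(F, m, l):
    """Return the P x P block F_{m,l} (zero-based block indices)."""
    return F[m * P:(m + 1) * P, l * P:(l + 1) * P]

def blk_trace(F, i):
    """Sum of the P x P blocks on the i-th block diagonal (i = 0: main diagonal)."""
    K = np.zeros((P, P), dtype=complex)
    for m in range(D + 1):
        if 0 <= m + i <= D:
            K += blk(F, m, m + i)
    return K

# Matrix coefficients of M(z) = Omega^H(z) F Omega(z) for |z| = 1
K = {i: blk_trace(F, i) for i in range(-D, D + 1)}

def M_direct(z):
    Omega = np.vstack([z ** d * np.eye(P) for d in range(D + 1)])   # cf. (74)
    return Omega.conj().T @ F @ Omega

def M_coeffs(z):
    return sum(z ** i * Ki for i, Ki in K.items())
```

The check makes the indexing convention explicit: on the unit circle, where $\bar z = z^{-1}$, the blocks on the $i$-th block diagonal of F collect exactly the coefficient of $z^i$ in $M(z)$, so `M_direct(z)` and `M_coeffs(z)` coincide for any $|z| = 1$.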
However, the Gram matrix $F$ in (76) is not unique, i.e., a matrix polynomial $M(z)$ generally admits different Gram matrix representations.

Let us define a second matrix polynomial which has constant value $I_P$ as

$$\Omega^H(z)\, H\, \Omega(z) = I_P, \qquad (80)$$

such that the corresponding Gram matrix $H$ of size $((D+1)P) \times ((D+1)P)$ fulfills

$$\mathrm{blkTr}_P(H) = I_P, \qquad (81a)$$
$$\mathrm{blkTr}_P(\Xi_i H) = 0 \quad \text{for } i \ne 0. \qquad (81b)$$

By using (76), (80) and (81) we can express the constraint (46) as

$$I_P - B^H(z)\, \Upsilon_0\, B(z) = \Omega^H(z) \big( H - J^H \Upsilon_0 J \big) \Omega(z) \succeq 0, \qquad (82)$$

which is fulfilled for

$$H - J^H \Upsilon_0 J \succeq 0. \qquad (83)$$

Applying (81) and (83) in problem (39), we can define the gridless frequency estimation problem

$$\max_{\Upsilon_0, \Upsilon_1, H} \; -2\,\mathrm{Re}\{\mathrm{Tr}(\Upsilon_1)\} - \lambda\, \mathrm{Tr}(\Upsilon_0) \qquad (84a)$$
$$\text{s.t.} \;\; \begin{bmatrix} \hat R & \Upsilon_1 \\ \Upsilon_1^H & \Upsilon_0 \end{bmatrix} \succeq 0 \qquad (84b)$$
$$H - J^H \Upsilon_0 J \succeq 0 \qquad (84c)$$
$$\mathrm{blkTr}_P(H) = I_P \qquad (84d)$$
$$\mathrm{blkTr}_P(\Xi_i H) = 0 \quad \text{for } i \ne 0. \qquad (84e)$$

Given a minimizer $\hat\Upsilon_0$ of problem (84), the frequency estimation problem reduces to finding the roots for which the constraint (46) becomes singular, as discussed in Section 6.

References

[1] H. Krim and M. Viberg, “Two decades of array signal processing research: the parametric approach,” IEEE Signal Processing Magazine, vol. 13, no. 4, pp. 67–94, July 1996.

[2] B. Porat and B. Friedlander, “Accuracy requirements in off-line array calibration,”
IEEE Transactions on Aerospace and Electronic Systems, vol. 33, no. 2, pp. 545–556, April 1997.

[3] Y. Rockah and P. Schultheiss, “Array shape calibration using sources in unknown locations — part I: Far-field sources,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 35, no. 3, pp. 286–299, March 1987.

[4] A. J. Weiss and B. Friedlander, “Self-calibration in high-resolution array processing,” in Advances in Spectrum Estimation and Array Processing, S. Haykin, Ed. Prentice-Hall, 1991, vol. II.

[5] M. Viberg and A. Swindlehurst, “A Bayesian approach to auto-calibration for parametric array signal processing,” IEEE Transactions on Signal Processing, vol. 42, no. 12, pp. 3495–3507, December 1994.

[6] B. C. Ng and C. M. S. See, “Sensor-array calibration using a maximum-likelihood approach,” IEEE Transactions on Antennas and Propagation, vol. 44, no. 6, pp. 827–835, June 1996.

[7] B. P. Flanagan and K. L. Bell, “Array self-calibration with large sensor position errors,” Signal Processing, vol. 81, no. 10, pp. 2201–2214, 2001.

[8] C. See and A. Gershman, “Direction-of-arrival estimation in partly calibrated subarray-based sensor arrays,” IEEE Transactions on Signal Processing, vol. 52, no. 2, pp. 329–338, 2004.

[9] M. Wax and T. Kailath, “Decentralized processing in sensor arrays,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, no. 5, pp. 1123–1129, 1985.

[10] P. Stoica, A. Nehorai, and T. Söderström, “Decentralized array processing using the MODE algorithm,” Circuits, Systems, and Signal Processing, vol. 14, no. 1, pp. 17–38, 1995.

[11] W. Suleiman and P. Parvazi, “Search-free decentralized direction-of-arrival estimation using common roots for non-coherent partly calibrated arrays,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, May 2014, pp. 2292–2296.

[12] M. F. Duarte, S. Sarvotham, D. Baron, M. B. Wakin, and R. G. Baraniuk, “Distributed compressed sensing of jointly sparse signals,” in Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, October 2005, pp. 1537–1541.

[13] Z. Lu, R. Ying, S. Jiang, P. Liu, and W. Yu, “Distributed compressed sensing off the grid,” IEEE Signal Processing Letters, vol. 22, no. 1, pp. 105–109, January 2015.

[14] A. Swindlehurst, P. Stoica, and M. Jansson, “Exploiting arrays with multiple invariances using MUSIC and MODE,” IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2511–2521, November 2001.

[15] P. Parvazi, M. Pesavento, and A. Gershman, “Direction-of-arrival estimation and array calibration for partly-calibrated arrays,” in , 2011, pp. 2552–2555.

[16] A. Swindlehurst, B. Ottersten, R. Roy, and T. Kailath, “Multiple invariance ESPRIT,”
IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 867–881, April 1992.

[17] W. Suleiman, M. Pesavento, and A. Zoubir, “Decentralized direction finding using partly calibrated arrays,” in , September 2013, pp. 1–5.

[18] W. Suleiman, P. Parvazi, M. Pesavento, and A. Zoubir, “Decentralized direction finding using Lanczos method,” in Sensor Array and Multichannel Signal Processing Workshop (SAM), 2014 IEEE 8th, June 2014, pp. 9–12.

[19] M. Pesavento, A. Gershman, and K. M. Wong, “Direction finding in partly calibrated sensor arrays composed of multiple subarrays,” IEEE Transactions on Signal Processing, vol. 50, no. 9, pp. 2103–2115, 2002.

[20] C. Steffens, W. Suleiman, A. Sorg, and M. Pesavento, “Gridless compressed sensing under shift-invariant sampling,” in The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2017.

[21] C. Steffens, P. Parvazi, and M. Pesavento, “Direction finding and array calibration based on sparse reconstruction in partly calibrated arrays,” in Sensor Array and Multichannel Signal Processing Workshop (SAM), 2014 IEEE 8th, June 2014, pp. 21–24.

[22] D. Malioutov, M. Çetin, and A. Willsky, “A sparse signal reconstruction perspective for source localization with sensor arrays,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 3010–3022, 2005.

[23] C. Steffens, M. Pesavento, and M. E. Pfetsch, “A compact formulation for the ℓ2,1 mixed-norm minimization problem,” ArXiv e-prints, arXiv:1606.07231v1, June 2016.

[24] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 58, pp. 267–288, 1996.

[25] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Journal on Scientific Computing, vol. 20, pp. 33–61, 1998.

[26] E. J. Candès and C. Fernandez-Granda, “Towards a mathematical theory of super-resolution,” Communications on Pure and Applied Mathematics, vol. 67, no. 6, pp. 906–956, 2014.

[27] ——, “Super-resolution from noisy data,” Journal of Fourier Analysis and Applications, vol. 19, no. 6, pp. 1229–1254, 2013.

[28] G. Tang, B. Bhaskar, P. Shah, and B. Recht, “Compressed sensing off the grid,” IEEE Transactions on Information Theory, vol. 59, no. 11, pp. 7465–7490, November 2013.

[29] M. Fazel, H. Hindi, and S. Boyd, “A rank minimization heuristic with application to minimum order system approximation,” in Proceedings of the American Control Conference, vol. 6, 2001, pp. 4734–4739.

[30] B. Recht, M. Fazel, and P. A. Parrilo, “Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,”
SIAM Review, vol. 52, no. 3, pp. 471–501, 2010.

[31] M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables,” Journal of the Royal Statistical Society. Series B (Statistical Methodology), vol. 68, no. 1, pp. 49–67, 2006.

[32] M. Kowalski, “Sparse regression using mixed norms,” Applied and Computational Harmonic Analysis, vol. 27, no. 3, pp. 303–324, 2009.

[33] C. Steffens, Y. Yang, and M. Pesavento, “Multidimensional sparse recovery for MIMO channel parameter estimation,” in Proceedings of the 2016 European Signal Processing Conference, Budapest, Hungary, September 2016.

[34] J. Steinwandt, C. Steffens, M. Pesavento, and M. Haardt, “Sparsity-aware direction finding for strictly non-circular sources based on rank minimization,” July 2016.

[35] J. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11–12, pp. 625–653, 1999.

[36] L. Vandenberghe and S. Boyd, “Semidefinite programming,” SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.

[37] B. Dumitrescu, Positive Trigonometric Polynomials and Signal Processing Applications. Berlin: Springer, 2007.

[38] M. Pesavento, “Fast algorithms for multidimensional harmonic retrieval,” Ph.D. dissertation, Ruhr-Universität Bochum, 2005.

[39] A. Barabell, “Improving the resolution performance of eigenstructure-based direction-finding algorithms,” in Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP ’83, vol. 8, April 1983, pp. 336–339.

[40] M. Pesavento, A. B. Gershman, and M. Haardt, “Unitary root-MUSIC with a real-valued eigendecomposition: a theoretical and experimental performance study,” IEEE Transactions on Signal Processing, vol. 48, no. 5, pp. 1306–1314, May 2000.

[41] E. Northardt, I. Bilik, and Y. Abramovich, “Spatial compressive sensing for direction-of-arrival estimation with bias mitigation via expected likelihood,” IEEE Transactions on Signal Processing, vol. 61, no. 5, pp. 1183–1195, 2013.

[42] M. Grant and S. Boyd, “Graph implementations for nonsmooth convex programs,” in Recent Advances in Learning and Control, ser. Lecture Notes in Control and Information Sciences, V. Blondel, S. Boyd, and H. Kimura, Eds. Springer-Verlag Limited, 2008, pp. 95–110.

[43] ——, “CVX: Matlab software for disciplined convex programming, version 2.1,” http://cvxr.com/cvx, March 2014.

[44] N. Srebro and A. Shraibman, “Rank, trace-norm and max-norm,” in Proceedings of the 18th Annual Conference on Learning Theory. Springer-Verlag, 2005, pp. 545–560.

[45] S. Searle,