Piecewise Sparse Recovery in Unions of Bases
Chong-Jun Li and Yi-Jun Zhong School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
Abstract
Sparse recovery is widely applied in many fields, since many signals or vectors can be sparsely represented under some frames or dictionaries. Most fast algorithms at present are based on solving ℓ0 or ℓ1 minimization problems, and they are efficient in sparse recovery. However, compared with practical results, the theoretical sufficient conditions on the sparsity of the signal for ℓ0 or ℓ1 minimization problems and algorithms are too strict. In many applications, signals have certain structures such as piecewise sparsity. Piecewise sparsity means that the sparse signal x is a union of several sparse sub-signals, i.e., x = (x_1^T, ..., x_N^T)^T, corresponding to a matrix A which is a union of bases, A = [A_1, ..., A_N]. In this paper, we consider the uniqueness and feasibility conditions for piecewise sparse recovery. We introduce the mutual coherence of the sub-matrices A_i (i = 1, ..., N) to study new upper bounds on ‖x‖_0 (the number of nonzero entries of the signal) recovered by ℓ0 or ℓ1 optimization. The structured information of the measurement matrix A is used to improve the sufficient conditions for successful piecewise sparse recovery, and also to improve the reliability of ℓ0 and ℓ1 optimization models in recovering global sparse vectors.

Keywords: piecewise sparse, coherence, greedy algorithm, BP method, union of bases
MR(2010) Subject Classification
This work was supported by the National Natural Science Foundation of China (Grant Nos. 11871137, 11572081) and the Fundamental Research Funds for the Central Universities of China (Grant No. QYWKC2018007). Corresponding author: Chong-Jun Li (Email: [email protected]).
DRAFT
I. INTRODUCTION
In this paper, we consider recovering a sparse signal x* ∈ R^n from an underdetermined system of linear equations

A x* = b,   (1)

where b ∈ R^m is a measurement vector and A ∈ R^{m×n} is a measurement matrix. If the vector x* has at most s ≤ m < n nonzero entries, then it is called an s-sparse vector, and the corresponding index set of nonzero entries is called its support, S = supp(x*). There are many theories, algorithms and applications for this problem of sparse recovery [1].

One approach to finding the sparsest solution of Eq. (1) is the greedy algorithm (GA), which solves the following ℓ0 minimization, named the P0 problem:

min_x ‖x‖_0, s.t. A x = b.   (2)

One of the most popular greedy methods is orthogonal matching pursuit (OMP), as proposed in [2], [3], [4]. It iteratively adds to the support of the approximation x_k the component whose correlation with the current residual is maximal. There are many other greedy methods for sparse recovery, for example, iterative hard thresholding (IHT) [5], stagewise OMP (StOMP) [6], regularized OMP (ROMP) [7], [8], compressive sampling matching pursuit (CoSaMP) [9], subspace pursuit (SP) [10], iterative thresholding with inversion (ITI) [11], and hard thresholding pursuit (HTP) [12].

Another approach is convex relaxation, which solves a convex program whose minimizer approximates the target signal. Basis pursuit, which has attracted much attention, determines the sparsest representation of x* by solving the following ℓ1 minimization, named the P1 problem or BP problem (method):

min_x ‖x‖_1, s.t. A x = b.
(3)

Many algorithms have been proposed to solve this optimization problem, including interior-point methods [13], projected gradient methods [14], and iterative thresholding [15].

There are three fundamental problems of concern in this paper:
1) Uniqueness of the solution of the P0 problem.
2) Feasibility of GA (or OMP) for solving the P0 problem.
3) Equivalence between the P0 problem and the P1 problem, i.e., feasibility of the BP method to obtain the sparsest solution.
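As a concrete illustration of the greedy approach, the OMP iteration described above can be written in a few lines of NumPy. This is a minimal sketch, not the implementation from any of the cited papers; the stopping tolerance and the usage example are our own choices.

```python
import numpy as np

def omp(A, b, max_iter):
    """Orthogonal matching pursuit: greedily build a support for A x = b.

    Each step selects the column of A most correlated with the current
    residual, then re-fits x on the enlarged support by least squares.
    """
    m, n = A.shape
    support = []
    x = np.zeros(n)
    residual = b.copy()
    for _ in range(max_iter):
        correlations = np.abs(A.T @ residual)
        correlations[support] = 0.0   # do not re-select chosen atoms
        k = int(np.argmax(correlations))
        support.append(k)
        # least-squares fit on the current support
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        residual = b - A @ x
        if np.linalg.norm(residual) < 1e-10:
            break
    return x, sorted(support)

# toy usage: with an orthonormal dictionary, OMP recovers the support exactly
x_hat, S = omp(np.eye(4), np.array([0.0, 2.0, 0.0, -1.0]), 2)
```
For a well-conditioned dictionary and a signal sparse enough to satisfy the coherence conditions discussed below, the recovered support coincides with supp(x*).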
Several tools have been proposed to formalize suitability for the above three problems, such as the mutual coherence [16], the spark [17], the cumulative coherence [18], the exact recovery condition (ERC) [18], and the restricted isometry constants (RICs) [19], [20], [21]. It is well known that a sufficient and necessary condition for the uniqueness of the solution of the P0 problem (2) is ([17])

‖x‖_0 < spark(A)/2,   (4)

or that the RIC of the matrix A satisfies δ_{2s} < 1 with ‖x‖_0 ≤ s ([19]). The equivalence between the P0 model and the P1 model is guaranteed by δ_{2s} < √2 − 1 [21]. However, it is difficult to compute the spark or verify the RIP conditions. By contrast, we can easily compute the mutual coherence of a matrix. For the general case discussed in [16], one sufficient condition for the uniqueness of the solution of the P0 problem (2) is

‖x‖_0 < (1/2)(1 + 1/µ),   (5)

where µ is the mutual coherence of the measurement matrix A. Furthermore, condition (5) is also a sufficient condition ensuring that the OMP (greedy method) and BP methods recover the optimal s-sparse solution [18]. However, in applications, the OMP or BP methods can work well even when condition (5) is not satisfied, i.e., when (1/2)(1 + 1/µ(A)) ≤ ‖x‖_0 < spark(A)/2, which means the sufficient condition (5) is somewhat strict for sparse recovery, or the "gap" between the optimal upper bound spark(A)/2 and the practical upper bound (1/2)(1 + 1/µ(A)) is big.

Remarkably, the result in [22] shows that uniqueness of the solution of the ℓ0 minimization P0 problem can be achieved under the improved condition ‖x‖_0 < 1/µ. The authors also showed that the solutions of the P0 and P1 problems coincide for ‖x‖_0 < (√2 − 0.5)/µ. These two improved conditions were obtained in the special case where A is a pair of orthogonal bases. It was also shown in [23], [18] that if the matrix A is a union of N (≥ 2) orthogonal bases, improved conditions are possible.
The sufficient condition for OMP to solve the P0 model for N orthogonal bases was improved to

‖x‖_0 < (1/2)(1 + 1/(N − 1)) · (1/µ),   (6)

and for BP to solve the P1 problem it was improved to

‖x‖_0 < (√2 − 1 + 1/(2(N − 1))) / µ.   (7)

[Fig. 1. A plot of the upper bounds for sparse recovery in Table I, for a fixed value of µ(A).]

Especially, when the vector x is (s_1, ..., s_N)-piecewise sparse (see Definition 1, ‖x‖_0 = s_1 + ··· + s_N), or the vector b is a superposition of s_i atoms from the i-th basis, with s_1 ≤ ··· ≤ s_N and A a union of N orthogonal bases, the exact recovery condition (ERC) is guaranteed by

∑_{j=2}^{N} µ s_j / (1 + µ s_j) < 1 / (2(1 + µ s_1)).   (8)

Thus, both OMP and BP can find the sparsest solution under condition (8) [18]. When N =
2, Eq. (8) becomes 2µ²s_1s_2 + µs_2 < 1 (s_1 ≤ s_2).

TABLE I. A
LIST OF THEORETICAL UPPER BOUNDS FOR SPARSE RECOVERY.

condition index | structure of A | upper bound on ‖x‖_0 (= s_1 + s_2)
condition 1 | general case | ‖x‖_0 < (1/2)(1 + 1/µ(A))
condition 2 | pair of orthogonal bases (uniqueness) | ‖x‖_0 < 1/µ(A)
condition 3 | pair of orthogonal bases (equivalence) | ‖x‖_0 < (√2 − 0.5)/µ(A)
condition 4 | pair of orthogonal bases (ERC) | 2µ(A)²s_1s_2 + µ(A)s_2 < 1
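To get a concrete feel for how the four bounds compare, they can be evaluated numerically. This is a small sketch of our own; the value µ = 0.1 used in the test is an arbitrary illustration, not a value taken from the paper's figures.

```python
import numpy as np

def bound_general(mu):
    # condition 1: ||x||_0 < (1/2)(1 + 1/mu)
    return 0.5 * (1.0 + 1.0 / mu)

def bound_pair_uniqueness(mu):
    # condition 2: ||x||_0 < 1/mu
    return 1.0 / mu

def bound_pair_equivalence(mu):
    # condition 3: ||x||_0 < (sqrt(2) - 0.5)/mu
    return (np.sqrt(2.0) - 0.5) / mu

def erc_pair_holds(mu, s1, s2):
    # condition 4 (ERC, s1 <= s2): 2 mu^2 s1 s2 + mu s2 < 1
    assert s1 <= s2
    return 2.0 * mu**2 * s1 * s2 + mu * s2 < 1.0
```
For µ = 0.1 the general bound guarantees only ‖x‖_0 ≤ 5, while the pair-of-orthogonal-bases bounds allow roughly twice as many nonzeros, which is the gap the figure illustrates.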
The four conditions for the case of A a pair of orthogonal bases, compared with the general case of A, are collected in Table I. It is observed from Fig. 1 (presented in [22]) that the sufficient conditions based on the mutual coherence can be improved by considering the structure of the matrix A.

Note that many natural and useful redundant dictionaries (measurement matrices A) cannot be written as a union of orthogonal bases. Thus, it is necessary to study the sufficient conditions for successful recovery when the measurement matrix A is in a general setting, i.e., A is a union of non-orthogonal bases. In this case, the corresponding vector x is partitioned into several parts according to the structure of the matrix A. Considering the different sparsity of x according to the structure of A makes it possible to study relaxed sufficient conditions for successful sparse recovery.

Consider a given vector

x = (x_1, ..., x_{d_1} | x_{d_1+1}, ..., x_{d_1+d_2} | ... | x_{n−d_N+1}, ..., x_n)^T = (x_1^T, x_2^T, ..., x_N^T)^T,

where n = ∑_{i=1}^N d_i and s_i = ‖x_i‖_0, i = 1, ..., N. There are three types of sparsity of the vector x:

1) global sparsity: x is assumed to have s = ‖x‖_0 = ∑_{i=1}^N ‖x_i‖_0 nonzero entries.

2) block sparsity ([24], [25], [26], [27]): a block s-sparse vector x is assumed to have at most s blocks with nonzero entries, i.e., the block ℓ0 or ℓ1 norm ‖x‖_{2,0} = ∑_{i=1}^N I(‖x_i‖_2 > 0) or ‖x‖_{2,1} = ∑_{i=1}^N ‖x_i‖_2 is minimized for recovery.

3) piecewise sparsity: as in the following definition.

Definition 1.
A vector x = (x_1^T, ..., x_N^T)^T ∈ R^n is partitioned into N components, and it is assumed that every x_i ∈ R^{n_i} containing nonzero entries is sparse, with s_i = ‖x_i‖_0, i = 1, ..., N. We call the vector x (s_1, ..., s_N)-piecewise sparse.

A piecewise sparse vector is a vector each part of which is sparse. Piecewise sparsity is different from block sparsity. Piecewise sparse recovery is common in applications, such as the decomposition of the texture part and cartoon part of an image in [28], i.e., b = A_n x_n + A_t x_t, where n and t represent the cartoon and texture parts. It is assumed that both parts can be represented in some given dictionaries, so x_n and x_t are two sparse vectors, and the coefficient vector x = (x_n^T, x_t^T)^T is a "piecewise" sparse vector. Another example is the problem of reconstructing a surface from scattered data in an approximation space H = ∪_{i=1}^N H_i, where H_j ⊆ H_{j+1} are principal shift-invariant (PSI) spaces generated by a single compactly supported function [29]; the fitting surface is g = ∑_{i=1}^N g_i, g_i ∈ H_i, with g_i = ∑_{j=1}^{n_i} c_{ij} φ_{ij}. The coefficient vector c = (c_1, c_2, ..., c_N)^T (with N pieces c_i = (c_{i1}, ..., c_{in_i})^T) is to be determined. Due to the property of PSI spaces, the coefficients determined by ℓ1 minimization in [29] are "piecewise" sparse structured, i.e., each c_i ∈ R^{n_i} is a sparse vector in H_i. In [30], we first tried to recover the piecewise sparse vector by the piecewise inverse scale space algorithm with a deletion rule.

It is obvious that piecewise sparsity is more general in applications, since the nonzero entries can appear in a scattered way. The corresponding matrix can be structured as a union of bases (orthogonal bases being a special case), A = [A_1, ..., A_N].
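To make the distinction between the three notions of sparsity concrete, they can all be read off directly from a partitioned vector. The sketch below and its toy vector are our own illustration, assuming only the definitions above.

```python
import numpy as np

def sparsity_profile(x, sizes):
    """For a vector partitioned into blocks of the given sizes, return
    (global sparsity s, piecewise sparsities s_i, block sparsity)."""
    pieces = np.split(np.asarray(x), np.cumsum(sizes)[:-1])
    s_i = [int(np.count_nonzero(p)) for p in pieces]
    # global sparsity counts nonzero entries; block sparsity counts
    # nonzero blocks; the tuple s_i is the piecewise sparsity
    return sum(s_i), s_i, sum(1 for s in s_i if s > 0)

# toy vector with N = 3 pieces of length 4: globally 3-sparse,
# (1, 2, 0)-piecewise sparse, block 2-sparse
profile = sparsity_profile([1, 0, 0, 0, 0, 2, 0, 3, 0, 0, 0, 0], [4, 4, 4])
```
Note that a vector can be piecewise sparse while having many active blocks, which is why block-sparsity tools do not directly cover this setting.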
In this paper, we use the mutual coherence and cumulative mutual coherence to give conditions for piecewise sparse recovery, which can be efficiently calculated for an arbitrary given matrix A and its sub-matrices A_i (i = 1, ..., N). Inspired by the works [23], [18], which provide improved sufficient conditions for unique sparse representation of signals in unions of orthogonal bases, we study the generalization of these sufficient conditions to unique sparse representation of signals in unions of general bases, corresponding to piecewise sparsity.

II. PRELIMINARIES
We use x to represent a vector and x_i to represent a scalar entry. Define ⟨x, y⟩ = x^T y. Let x = (x_1^T, ..., x_N^T)^T correspond to A = [A_1, ..., A_N]. For convenience, let S = supp(x) and let T be its complement, i.e., T = {i : x_i = 0}. A_S denotes the submatrix of A formed by the columns of A indexed by S. Similarly define A_T, so that [A_S A_T] = A up to a permutation of columns. Denote s = |S| = |supp(x)| and s_i = |S_i| = |supp(x_i)|.

A. Tools used in sparse recovery
In this part, we introduce the widely used tool for sparse signal recovery: the mutual coherence of a dictionary A ∈ R^{m×n}. Denote by a_{ik} the k-th column of the submatrix A_i ∈ R^{m×n_i}; the matrix A is assumed to have unit ℓ2 norm for each column, i.e., ‖a_{ik}‖_2 = 1, i = 1, ..., N, k = 1, ..., n_i.

Definition 2. [16] (mutual coherence of the dictionary)

µ := max_{i ≠ j} |⟨a_i, a_j⟩|.

Roughly speaking, the coherence measures how much two vectors in the dictionary can look alike. It is obvious that every orthogonal basis has coherence µ =
0. A union of two orthogonal bases has coherence µ ≥ m^{−1/2} [16].

Definition 3. [18] The cumulative mutual coherence (Babel function) is defined by

µ_1(s) = max_{|S| = s} max_{a_i ∈ S^c} ∑_{a_j ∈ S} |⟨a_i, a_j⟩|.
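Both µ and the Babel function µ_1(s) are directly computable from the Gram matrix of the dictionary. The sketch below is ours; the 2 × 4 example in the test, a union of the identity with a normalized Hadamard basis, is an assumed illustration that attains the µ = m^{−1/2} lower bound just mentioned.

```python
import numpy as np

def mutual_coherence(A):
    """mu = max_{i != j} |<a_i, a_j>| for a dictionary with unit-norm columns."""
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)   # ignore the trivial <a_i, a_i> = 1 terms
    return G.max()

def babel(A, s):
    """Cumulative coherence mu_1(s): worst total correlation between one
    atom outside an index set S and the s atoms inside it."""
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    # for each candidate outside atom, the maximizing S takes its s
    # largest off-diagonal correlations
    rows_desc = np.sort(G, axis=1)[:, ::-1]
    return rows_desc[:, :s].sum(axis=1).max()
```
As a sanity check, µ_1(1) equals µ and µ_1(s) ≤ sµ, matching Proposition 1 below.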
A close examination of the formula shows that µ_1(1) = µ and that µ_1 is a non-decreasing function of s.

Proposition 1. [18] If a dictionary A has coherence µ, then µ_1(m) ≤ m µ.

Theorem 1. [18]
(Exact Recovery Condition (ERC)) A sufficient condition for both Orthogonal Matching Pursuit and Basis Pursuit to successfully recover the s-sparse x with S = supp(x) is that

max_{j ∈ S^c} ‖(A_S^T A_S)^{−1} A_S^T a_j‖_1 < 1.

Definition 4. [17] The spark of a matrix (dictionary) A is the least number of columns that form a linearly dependent set:

spark(A) = min_{x ∈ Ker(A), x ≠ 0} ‖x‖_0,

where Ker(A) is the kernel of the dictionary, Ker(A) = {x : A x = 0}.

B. Tools used in piecewise sparse recovery

Assume A = [A_1, ..., A_N] is a union of N general bases. We generalize the concepts of mutual coherence and cumulative mutual coherence to the piecewise sparse case.

Definition 5.
The mutual coherence of the i-th sub-matrix A_i is

µ_{i,i} = max_{k ≠ l} |⟨a_{ik}, a_{il}⟩|, a_{ik}, a_{il} ∈ A_i, i = 1, ..., N.

It is clear that when A is a union of orthogonal bases, µ_{i,i} = 0.

Proposition 2.
The i-th sub-matrix coherence µ_{i,i} satisfies 0 ≤ µ_{i,i} = α_i µ ≤ µ, with α_i ∈ [0, 1].

The parameter α_i for the i-th block measures the ratio of the coherence within the i-th block to the coherence of the whole matrix.

Definition 6.
The cumulative coherence between two blocks A_i and A_j is

µ_{i,j}(m) = max_{|S_i| = m} max_{l ∈ {1, ..., n_j}} ∑_{k ∈ S_i} |⟨a_{ik}, a_{jl}⟩|,

where S_i is an index set of m columns of the sub-matrix A_i and n_j is the number of columns of A_j.

Remark 1.
Notice that the cumulative coherence between two blocks A_i and A_j is different from the cumulative block coherence defined in [24]. The cumulative block coherence µ_B(m) measures the coherence between m blocks; there m represents the number of blocks.

Corollary 1.
The cumulative coherence between two blocks A_i and A_j is bounded by µ_{i,j}(m) ≤ m µ.

Definition 7.
The cumulative coherence within A_i is

µ_{i,i}(m) = max_{|S_i| = m} max_{k ∉ S_i} ∑_{l ∈ S_i} |⟨a_{ik}, a_{il}⟩|.

Remark 2.
Notice that µ_{i,i}(m) only measures the cumulative coherence within the submatrix A_i, i.e., how much the atoms in the same block A_i are "speaking the same language".

Corollary 2. µ_{i,i}(m) ≤ m α_i µ.

III. PIECEWISE SPARSE RECOVERY IN UNION OF GENERAL BASES
In the piecewise sparse setting, the P0 problem (2) is equivalent to the following problem:

min_x ‖x_1‖_0 + ··· + ‖x_N‖_0, s.t. b = A_1 x_1 + ··· + A_N x_N,   (9)

where x = (x_1^T, ..., x_N^T)^T. We denote (9) as the piecewise P0 problem.

A. Uniqueness of piecewise sparse recovery via the piecewise P0 problem

Theorem 2.
Suppose the measurement matrix A = [A_1, ..., A_N] is a union of N bases (or frames) with overall coherence µ and sub-block coherence parameters α_i, i = 1, ..., N. If x is a solution of the piecewise P0 problem (9) and

‖x‖_0 < N(1 + α_max µ) / (2(N − 1 + α_max)µ),   (10)

then x is the unique solution of problem (9).

Proof. By the sufficient and necessary condition Eq. (4) for the P0 problem, we need to find a lower bound on spark(A) in the piecewise case.

Let x ∈ Ker(A) with x ≠
0. Denote the support of x by S = S_1 ∪ S_2 ∪ ··· ∪ S_N with s_i = |S_i|, corresponding to the blocks of A = [A_1, ..., A_N]. Then

spark(A) = min_{x ∈ Ker(A), x ≠ 0} ‖x‖_0 = min_{x ∈ Ker(A), x ≠ 0} (s_1 + ··· + s_N).
Step 1.
We start similarly to the proof of Lemma 3 in [23]. Let r_i = rank(A_i), i = 1, ..., N. We have A_S = [A_{S_1}, ..., A_{S_N}] and x_S = (x_{S_1}^T, ..., x_{S_N}^T)^T with x ∈ Ker(A). In order to find the minimum of s_1 + ··· + s_N, we may suppose that s_i ≤ r_i and that A_{S_i} has full column rank for i = 1, ..., N. Because ∑_{i=1}^N A_{S_i} x_{S_i} =
0, for every i we have A_{S_i} x_{S_i} = − ∑_{j ≠ i} A_{S_j} x_{S_j}, hence

x_{S_i} = − ∑_{j ≠ i} (A_{S_i}^T A_{S_i})^{−1} (A_{S_i}^T A_{S_j}) x_{S_j}.

Then we can deduce that

‖x_{S_i}‖_1 ≤ [1/(1 − µ_{i,i}(s_i − 1))] ∑_{j ≠ i} ‖A_{S_i}^T A_{S_j}‖_1 ‖x_{S_j}‖_1 ≤ ∑_{j ≠ i} [µ_{i,j}(s_i)/(1 − µ_{i,i}(s_i − 1))] ‖x_{S_j}‖_1.

Since ‖x_S‖_1 = ‖x_{S_1}‖_1 + ··· + ‖x_{S_N}‖_1, we get

(1 + max_{j ≠ i} µ_{i,j}(s_i)/(1 − µ_{i,i}(s_i − 1))) ‖x_{S_i}‖_1 ≤ [max_{j ≠ i} µ_{i,j}(s_i)/(1 − µ_{i,i}(s_i − 1))] ‖x_S‖_1,

which results in ‖x_S‖_1 ≤ (∑_{i=1}^N v_i / v̄_i) ‖x_S‖_1, where v_i = max_{j ≠ i} µ_{i,j}(s_i)/(1 − µ_{i,i}(s_i − 1)) and v̄_i = 1 + v_i. Thus

∑_{i=1}^N v_i / v̄_i ≥ 1.   (11)

Using the inequalities µ_{i,i}(s_i − 1) ≤ (s_i − 1)α_i µ and µ_{i,j}(s_i) ≤ s_i µ, inequality (11) becomes

∑_{i=1}^N s_i µ / (1 − (s_i − 1)α_max µ + s_i µ) ≥ ∑_{i=1}^N s_i µ / (1 − (s_i − 1)α_i µ + s_i µ) ≥ 1,

where α_max = max_{i=1,...,N} {α_i}.

Step 2.
In the following we evaluate spark(A), i.e., we determine when s = s_1 + ··· + s_N reaches its minimum.
Denote g_i = s_i µ / (1 − (s_i − 1)α_max µ + s_i µ); we consider the following minimization problem:

min (s_1 + ··· + s_N), s.t. ∑_{i=1}^N g_i − 1 ≥ 0.

Using the Lagrange function and the KKT conditions, we obtain that s = ∑_{i=1}^N s_i reaches its minimum when s_1 = ··· = s_N. Then

∑_{i=1}^N g_i = N s µ / (N(1 + α_max µ) + (1 − α_max) s µ) ≥ 1,

which results in

s ≥ N(1 + α_max µ) / ((N − 1 + α_max)µ).

By the definition of the spark, we have

spark(A) ≥ N(1 + α_max µ) / ((N − 1 + α_max)µ).

Thus by Eq. (4), if ‖x‖_0 < N(1 + α_max µ) / (2(N − 1 + α_max)µ), then x is the unique solution of the piecewise P0 problem (9). □

Remark 3. (1)
In particular, if α_max = 0, i.e., A is a union of N orthogonal bases, the result in Theorem 2 becomes ‖x‖_0 < N/(2(N − 1)µ), which corresponds to the upper bound of Eq. (6). (2) When α_1 = ··· = α_N = 1, i.e., each A_i has the same coherence as A, the result in Theorem 2 becomes ‖x‖_0 < (1 + µ)/(2µ), which corresponds to the upper bound of Eq. (5).

Example 1.
Consider the case N = 2, i.e., A = [A_1, A_2]. In this example we fix values of µ and α_max; the sufficient conditions which ensure uniqueness for the P0 problem and the piecewise P0 problem are listed as follows:

(condition 1) ‖x‖_0 < (1 + µ)/(2µ) (general condition);

(condition 2) ‖x‖_0 < 1/µ (A is a union of orthogonal bases);

(condition 5) ‖x‖_0 < (1 + α_max µ)/((1 + α_max)µ) (A is a union of general bases).

[Fig. 2. Comparison of the upper bounds for uniqueness in Example 1.]

From the observation of Fig. 2, in the general case (condition 1) one can only guarantee recovery of a 4-sparse vector. When it comes to piecewise sparse recovery, one can recover piecewise sparse vectors with global 7-sparsity by condition 5. This means the upper bound in Theorem 2 (condition 5) is more relaxed than the upper bound in Eq. (5) (condition 1). Condition 2, for a union of orthogonal bases, is the best case. The improved condition 5 also interpolates between the general case and the union of orthogonal bases. Thus the results in Theorem 2 enlarge the scope of the theoretical guarantees for sparse recovery by considering piecewise sparsity.
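The bound of Theorem 2 is easy to tabulate. The sketch below is our own illustration; the values µ = 0.1 and α_max = 0.5 in the test are arbitrary choices, used only to reproduce the ordering condition 1 ≤ condition 5 ≤ condition 2 discussed in Example 1.

```python
def uniqueness_bound(N, mu, alpha_max):
    """Right-hand side of Eq. (10):
    ||x||_0 < N (1 + alpha_max mu) / (2 (N - 1 + alpha_max) mu)."""
    return N * (1.0 + alpha_max * mu) / (2.0 * (N - 1 + alpha_max) * mu)

# alpha_max = 0 recovers condition 2 (1/mu), alpha_max = 1 recovers
# condition 1 ((1/2)(1 + 1/mu)); intermediate alpha_max interpolates.
bounds = {a: uniqueness_bound(2, 0.1, a) for a in (0.0, 0.5, 1.0)}
```
The bound decreases monotonically in α_max, i.e., the less coherent the individual blocks, the more nonzeros uniqueness can tolerate.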
B. Feasible conditions of algorithms for piecewise sparse recovery
Theorem 3.
Suppose the measurement matrix A = [A_1, ..., A_N] is a union of N bases (or frames) with overall coherence µ and sub-block coherence parameters α_i, i = 1, ..., N. If

∑_{i=1}^N µ s_i / (1 + α_i µ + (1 − α_i)µ s_i) < (1 + α_Z µ + 2(1 − α_Z)µ s_Z) / (2(1 + α_Z µ + (1 − α_Z)µ s_Z)),   (12)

where Z is chosen such that (1 + α_Z µ)/((1 − α_Z)s_Z) = max_{i=1,...,N} (1 + α_i µ)/((1 − α_i)s_i), then the exact recovery condition (ERC) holds, in which case both Orthogonal Matching Pursuit and Basis Pursuit recover the sparsest representation.

Proof. Follow the proof of Theorem 3.7 in [18] and the notation in the proof of Theorem 2. The
Gram matrix of the support satisfies

Φ_S = A_S^T A_S = [A_{S_i}^T A_{S_j}]_{i,j=1,...,N} = I_s − G,

where G has diagonal blocks I_{s_i} − A_{S_i}^T A_{S_i} (whose diagonal entries are 0 and whose off-diagonal entries are −⟨a_{ik}, a_{il}⟩, k ≠ l) and off-diagonal blocks −A_{S_i}^T A_{S_j}, i ≠ j.

Denote by |G| the entrywise absolute value of the matrix G. All entries in the off-diagonal blocks of |G| are bounded by µ, and the diagonal blocks satisfy the entrywise bound |I_{s_i} − A_{S_i}^T A_{S_i}| ≤ µ_{i,i}(1_{s_i×s_i} − I_{s_i}), i = 1, ..., N. Hence

|G| ≤ µ 1_{s×s} − µ B,

where 1_{s×s} is the s × s matrix with unit entries and B = diag(B_1, ..., B_N) is the block diagonal matrix with

B_i = α_i I_{s_i} + (1 − α_i) 1_{s_i×s_i},

i.e., B_i has ones on the diagonal and all off-diagonal entries equal to 1 − µ_{i,i}/µ = 1 − α_i, i = 1, ..., N. Hence we have the entrywise inequality

|Φ_S^{−1}| = |(I_s − G)^{−1}| = |I_s + ∑_{k=1}^∞ G^k| ≤ I_s + ∑_{k=1}^∞ |G|^k ≤ I_s + ∑_{k=1}^∞ (µ 1_{s×s} − µ B)^k = (I_s + µB − µ 1_{s×s})^{−1} = (I_s − µ(I_s + µB)^{−1} 1_{s×s})^{−1} (I_s + µB)^{−1}.

Step 1: compute

(I_s + µB)^{−1} = diag_{i=1,...,N} ( (1/(1 + α_i µ)) [ I_{s_i} − ((1 − α_i)µ / (1 + α_i µ + (1 − α_i)µ s_i)) 1_{s_i×s_i} ] ).

Step 2: compute

(I_s − µ(I_s + µB)^{−1} 1_{s×s})^{−1} = I_s + ∑_{k=1}^∞ (µ(I_s + µB)^{−1} 1_{s×s})^k,   (13)

with

µ(I_s + µB)^{−1} 1_{s×s} = v 1_s^T, where v = ( (µ/D_1) 1_{s_1}^T, ..., (µ/D_N) 1_{s_N}^T )^T and D_i = 1 + α_i µ + (1 − α_i)µ s_i.

Here 1_{s_i} (resp. 1_s) denotes the column vector of s_i (resp. s) unit entries.
Moreover, the inner product satisfies

1_s^T v = ∑_{i=1}^N µ s_i / D_i,

therefore the series

∑_{k=1}^∞ (v 1_s^T)^k = (v 1_s^T) ∑_{k=1}^∞ (1_s^T v)^{k−1} = [1/(1 − ∑_{i=1}^N µ s_i/D_i)] (v 1_s^T).   (14)

Combined with Eq. (13), we have

|Φ_S^{−1}| ≤ [I_s + (1/(1 − ∑_{i=1}^N µ s_i/D_i)) (v 1_s^T)] (I_s + µB)^{−1}.   (15)

Step 3: Assume the atom a_j is drawn from basis number Z. Then

|A_S^T a_j| ≤ (µ 1_{s_1}^T, ..., α_Z µ 1_{s_Z}^T, ..., µ 1_{s_N}^T)^T,

so

(I_s + µB)^{−1} |A_S^T a_j| ≤ w := ( (µ/D_1) 1_{s_1}^T, ..., (α_Z µ/D_Z) 1_{s_Z}^T, ..., (µ/D_N) 1_{s_N}^T )^T.   (16)

Step 4: We bound the ERC quantity |(A_S^T A_S)^{−1} A_S^T a_j| by combining Eqs. (15) and (16):

|(A_S^T A_S)^{−1} A_S^T a_j| ≤ |Φ_S^{−1}| |A_S^T a_j| ≤ w + [ (∑_{i ≠ Z} µ s_i/D_i + α_Z µ s_Z/D_Z) / (1 − ∑_{i=1}^N µ s_i/D_i) ] v.   (17)

Applying the ℓ1 norm to inequality (17) yields

‖(A_S^T A_S)^{−1} A_S^T a_j‖_1 ≤ (∑_{i ≠ Z} µ s_i/D_i + α_Z µ s_Z/D_Z) / (1 − ∑_{i=1}^N µ s_i/D_i).   (18)

Step 5: Since the ERC quantity is max_{j ∈ T} ‖(A_S^T A_S)^{−1} A_S^T a_j‖_1, we consider the maximum of the right-hand side of Eq. (18); rewriting it,

‖(A_S^T A_S)^{−1} A_S^T a_j‖_1 ≤ (∑_{i=1}^N µ s_i/D_i − f_Z) / (1 − ∑_{i=1}^N µ s_i/D_i),   (19)

where f_Z := (1 − α_Z)µ s_Z / D_Z. The right-hand side of (19) reaches its maximum when f_Z reaches its minimum, and

f_Z = (1 − α_Z)µ s_Z / ((1 − α_Z)µ s_Z + 1 + α_Z µ) = µ / (µ + (1 + α_Z µ)/((1 − α_Z) s_Z)),

so f_Z is minimal when (1 + α_Z µ)/((1 − α_Z) s_Z) is maximal.
Let Z be such that (1 + α_Z µ)/((1 − α_Z)s_Z) = max_{i=1,...,N} (1 + α_i µ)/((1 − α_i)s_i). If

2 ∑_{i=1}^N µ s_i/(1 + α_i µ + (1 − α_i)µ s_i) < 1 + (1 − α_Z)µ s_Z/(1 + α_Z µ + (1 − α_Z)µ s_Z),

i.e.,

∑_{i=1}^N µ s_i/(1 + α_i µ + (1 − α_i)µ s_i) < (1 + α_Z µ + 2(1 − α_Z)µ s_Z) / (2(1 + α_Z µ + (1 − α_Z)µ s_Z)),

then the Exact Recovery Condition holds, as max_{j ∈ T} ‖(A_S^T A_S)^{−1} A_S^T a_j‖_1 <
1, thus we complete the proof. □

In particular, when A is a union of orthogonal bases, α_Z = 0 and Z is chosen as the index of the minimal s_i, i = 1, ..., N; then condition Eq. (12) corresponds to condition Eq. (8) of [18].
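Condition (12) is straightforward to check numerically for a given (µ, α_i, s_i). The sketch below is ours, written directly from the statement of Theorem 3; the parameter values in the test are arbitrary illustrations, and in the orthogonal case α_i = 0 the verdict agrees with the pairwise form of Eq. (8).

```python
def piecewise_erc_holds(mu, alphas, sizes):
    """Sufficient condition (12): alphas[i] = mu_{i,i}/mu, sizes[i] = s_i."""
    D = [1.0 + a * mu + (1.0 - a) * mu * s for a, s in zip(alphas, sizes)]
    lhs = sum(mu * s / d for s, d in zip(sizes, D))
    # Z maximises (1 + alpha mu) / ((1 - alpha) s), the worst-case basis
    vals = [(1.0 + a * mu) / ((1.0 - a) * s) if a < 1.0 else float("inf")
            for a, s in zip(alphas, sizes)]
    Z = vals.index(max(vals))
    rhs = (1.0 + alphas[Z] * mu + 2.0 * (1.0 - alphas[Z]) * mu * sizes[Z]) \
        / (2.0 * D[Z])
    return lhs < rhs
```
With N = 2 orthogonal bases and µ = 0.1, the condition admits (s_1, s_2) = (3, 4) but rejects (5, 5), exactly as 2µ²s_1s_2 + µs_2 < 1 predicts.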
Example 2.
Consider the case N = 2, i.e., A = [A_1, A_2], with x an (s_1, s_2)-piecewise sparse vector. In this example we fix the overall coherence µ and the parameters α_1, α_2. The following sufficient conditions, which ensure the feasibility of the OMP and BP algorithms, are listed (as shown in Fig. 3):

(condition 1) ‖x‖_0 < (1/2)(1 + 1/µ) (general condition);

(condition 3) ‖x‖_0 < (√2 − 0.5)/µ (equivalence condition when A is a pair of orthogonal bases);

(condition 4) µs_2/(1 + µs_2) < 1/(2(1 + µs_1)) ⇔ 2µ²s_1s_2 + µs_2 − 1 < 0 (ERC condition when A is a pair of orthogonal bases);

(condition 6) (assume α_1 ≥ α_2 and s_1 ≤ s_2, so Z = 1)

∑_{i=1}^2 µs_i/(1 + α_iµ + (1 − α_i)µs_i) < (1 + α_1µ + 2(1 − α_1)µs_1)/(2(1 + α_1µ + (1 − α_1)µs_1))

⇔ (1 + α_2µ + (1 − α_2)µs_2)(2α_1µs_1 − α_1µ − 1) + 2µs_2(1 + α_1µ + (1 − α_1)µs_1) < 0   (20)

(ERC condition when A is a pair of general bases).

Example 3.
In this example we show the upper bounds when x is an (s_1, s_2)-piecewise sparse vector with different piecewise sparsity. We change the piecewise sparsity by varying the parameter pair (α_1, α_2). The upper bounds of condition 6, Eq. (20) in Example 2, are plotted in Fig. 4 for the following three cases:

(case 1) the sub-matrix coherences of A_1 and A_2 differ greatly, with one of the α_i quite close to 1;

(case 2) the sub-matrix coherences of A_1 and A_2 differ slightly, and both α_1 and α_2 are small;

(case 3) the sub-matrix coherences of A_1 and A_2 differ slightly, and both α_1 and α_2 are very small.

[Fig. 3. Comparison of upper bounds for feasible conditions of sparse recovery for Example 2.]
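The closed form (20) makes the α-dependence just described easy to probe directly. This is a minimal sketch of ours; the parameter values in the test are arbitrary illustrations, not the ones used to draw Fig. 4.

```python
def erc_pair_general(mu, a1, a2, s1, s2):
    """Eq. (20): ERC for A = [A_1, A_2] with alpha_1 >= alpha_2 and
    s1 <= s2 (so Z = 1). Returns True when the sufficient condition holds."""
    assert a1 >= a2 and s1 <= s2
    t1 = (1.0 + a2 * mu + (1.0 - a2) * mu * s2) \
        * (2.0 * a1 * mu * s1 - a1 * mu - 1.0)
    t2 = 2.0 * mu * s2 * (1.0 + a1 * mu + (1.0 - a1) * mu * s1)
    return t1 + t2 < 0.0
```
Setting α_1 = α_2 = 0 reduces (20) to condition 4, while increasing the within-block coherences shrinks the admissible (s_1, s_2) region, consistent with Remark 4 below.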
Remark 4.
It is observed from Fig. 4 that different settings of (α_1, α_2), i.e., different piecewise sparsity, may result in different global sparsity conditions. Especially, when the sub-matrix coherences of A_1 and A_2 are small, more relaxed conditions on the sparsity can be obtained. This phenomenon provides guidance on the choice of the piecewise structure for a given matrix in order to obtain an optimal piecewise sparsity condition, which is another interesting problem for our future work.

IV. CONCLUSION
In this paper, we introduce the piecewise sparsity of signals and use the mutual coherence of matrices that are unions of general bases (or frames) to study the conditions for piecewise sparse recovery. We generalize the results for the orthogonal case to the case of general bases. We provide new upper bounds on the global sparsity and piecewise sparsity of signals recovered by both ℓ0 and ℓ1 optimization when the measurement matrix A is a union of general bases. The structured information of the matrix A is used to improve the sufficient conditions for successful piecewise sparse recovery and the reliability of the greedy and BP algorithms.

REFERENCES

[1] M. Elad,
Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing , Springer, 2010.
[Fig. 4. Comparison of upper bounds for feasible conditions of sparse recovery for Example 3.]

[2] S.G. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries",
IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397–3415, 1993.
[3] Y.C. Pati, R. Rezaiifar, and P.S. Krishnaprasad, "Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition",
Proceedings of the 27th Annual Asilomar Conference on Signals, Systems, and Computers , 1993,pp. 40–44.[4] J.A.Tropp and A.C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit”,
IEEE Trans. Inform.Theory , vol. 53, no. 12, pp. 4655–4666, 2007.[5] T. Blumensath and M. E. Davies,“Iterative hard thresholding for compressed sensing”,
Appl. Comput. Harmon. Anal. , vol. 27,no.3, pp.265–274, 2009.[6] D.L. Donoho, Y. Tsaig, I. Drori, J.L. Starck, “Sparse solution of underdetermined linear equations by stagewise orthogonalmatching pursuit (StOMP)”,
IEEE Trans. Inform. Theory, vol. 58, no. 2, pp. 1094–1121, 2012.
[7] D. Needell and R. Vershynin, "Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit",
Found. Comput. Math. vol. 9, no.3, pp.317–334, 2009.[8] D. Needell and R. Vershynin, “Signal recovery from incomplete and inaccurate measurements via regularized orthogonalmatching pursuit”,
IEEE J-STSP , vol. 4, no. 2, pp.310–316, 2010.[9] D. Needell and J.A Tropp, “CoSaMP: iterative signal recovery from incomplete and inaccurate samples”,
Appl. Comput. Harmon. Anal., vol. 26, no. 3, pp. 301–321, 2009.
[10] W. Dai and O. Milenkovic, "Subspace pursuit for compressive sensing signal reconstruction",
IEEE Trans. Inform. Theory ,vol. 55, no. 5, pp.2230–2249, 2009.
[11] A. Maleki, "Coherence analysis of iterative thresholding algorithms",
Proceedings of the 47th Annual Allerton Conference onCommunication, Control, and Computing , IEEE Press, 2009, pp. 236–243.[12] S. Foucart, “Hard thresholding pursuit: an algorithm for compressive sensing”,
SIAM J. Numer. Anal., vol. 49, no. 6, pp. 2543–2563, 2011.
[13] S.J. Kim, K. Koh, M. Lustig, S. Boyd, and D. Gorinevsky, "An interior-point method for large-scale ℓ1-regularized least squares", IEEE J. Select. Top. Signal Process., vol. 1, no. 4, pp. 606–617, 2007.
[14] M.A.T. Figueiredo, R.D. Nowak, and S.J. Wright, "Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems",
IEEE J. Select. Top. Signal Process.: Special Issue on Convex Optimization Methods for SignalProcessing , vol. 1, no. 4, pp. 586–598, 2007.[15] I. Daubechies, M. Defrise, C.D. Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint”,
Comm. Pure Appl. Math. , vol. 57, pp. 1413–1457, 2004.[16] D.L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition”,
IEEE Trans. Inform. Theory, vol. 47, no. 7, pp. 2845–2862, 2001.
[17] D.L. Donoho and M. Elad, "Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization", Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 5, pp. 2197–2202, 2003.
[18] J.A. Tropp, "Greed is good: algorithmic results for sparse approximation",
IEEE Trans. Inform. Theory, vol. 50, no. 10, pp. 2231–2242, 2004.
[19] E.J. Candès and T. Tao, "Decoding by linear programming",
IEEE Trans. Inform. Theory, vol. 51, no. 12, pp. 4203–4215, 2005.
[20] E.J. Candès, J.K. Romberg, and T. Tao, "Stable signal recovery from incomplete and inaccurate measurements",
Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207–1223, 2006.
[21] E.J. Candès, "The restricted isometry property and its implications for compressed sensing",
Comptes Rendus Mathématique, vol. 346, no. 9, pp. 589–592, 2008.
[22] M. Elad and A.M. Bruckstein, "A generalized uncertainty principle and sparse representation in pairs of bases",
Comptes rendus-Mathsmatique ,vol. 346, no. 9, pp. 589–592, 2008.[22] M. Elad and A. M. Bruckstein,“A generalized uncertainty principle and sparse representation in pairs of bases”,
IEEE T.Inform. Theory , vol. 48, no. 9, pp. 2558–2567, 2002.[23] R. Gribonval and M. Nielsen, “Sparse representations in unions of bases”,
IEEE T. Inform. Theory , vol. 49, no.12, pp. 3320–3325, 2003.[24] L. Peotta and P. Vandergheynst,“Matching pursuit with block incoherent dictionaries”,
IEEE Trans. Signal Process., vol. 55, no. 9, pp. 4549–4557, 2007.
[25] Y.C. Eldar and M. Mishali, "Block sparsity and sampling over a union of subspaces", in
International Conference on DigitalSignal Processing . IEEE, 2009, pp. 1–8.[26] Y.C. Eldar, P. Kuppinger, H. Bolcskei, “ Block-sparse signals: uncertainty relations and efficient recovery”,
IEEE Trans. SignalProces. ,vol. 58, no.6, pp.3042–3054, 2010.[27] E. Elhamifar and R. Vidal, “Block-sparse recovery via convex optimization”,
IEEE Trans.Signal Proces. ,vol. 60, no. 8, pp.4094–4107, 2011.[28] J.L. Starck, M. Elad and D.L. Donoho, “Image decomposition via the combination of sparse representations and a variationalapproach”,
IEEE Trans. Image Process., vol. 14, no. 10, pp. 1570–1582, 2005.
[29] Y.X. Hao, C.J. Li, and R.H. Wang, "Sparse approximate solution of fitting surface to scattered points by MLASSO model",
Sci. China Math., vol. 7, pp. 1–18, 2018.
[30] Y.J. Zhong and C.J. Li, "Piecewise sparse recovery via piecewise inverse scale space algorithm with deletion rule", Journal of Computational Mathematics, to be published.