The high-order block RIP for non-convex block-sparse compressed sensing
Jianwen Huang$^{a}$, Xinling Liu$^{b,c}$, Jinyao Hou$^{b,c}$, Jianjun Wang$^{c,*}$

$^{a}$ School of Mathematics and Statistics, Tianshui Normal University, Tianshui 741001, China
$^{b}$ School of Mathematics and Statistics, Southwest University, Chongqing 400715, China
$^{c}$ College of Artificial Intelligence, Southwest University, Chongqing 400715, China

$^{*}$ Corresponding author. E-mail: [email protected], [email protected] (J.J. Wang); [email protected] (J. Huang)
Abstract.
This paper concentrates on the recovery, from linear measurements, of block-sparse signals, which are not only sparse but whose nonzero elements are arranged in blocks (clusters) rather than distributed arbitrarily over the vector. We establish high-order sufficient conditions based on the block RIP that ensure exact recovery of every block $s$-sparse signal in the noiseless case via the mixed $l_2/l_p$ minimization method, as well as stable and robust recovery when the signal is not exactly block-sparse and the measurements are corrupted by noise. Additionally, a lower bound on the number of random Gaussian measurements is obtained for the condition to hold with overwhelming probability. Furthermore, numerical experiments demonstrate the performance of the proposed algorithm.

Keywords.
Compressed sensing; block restricted isometry property; block sparsity; mixed $l_2/l_p$ minimization.

1. Introduction

Block-sparse signal recovery (BSR) appears in many fields of sparse modelling and machine learning, including color imaging [13], equalization of sparse communication channels [2], multi-response linear regression [3], image annotation [4], and so forth. Essentially, the central problem in BSR is how to reconstruct a block-sparse or approximately block-sparse signal from a linear system. Commonly, one considers the model
\[
y = \Phi x + e,
\]
where $y \in \mathbb{R}^n$ is the observation vector, $\Phi \in \mathbb{R}^{n \times N}$ is a known measurement (or sensing) matrix with $n < N$, and $e \in \mathbb{R}^n$ is a vector of measurement errors. Conventional compressed sensing (CS) exploits only the sparsity of the signal to be recovered and ignores any additional structure, namely that the nonzero elements may appear in blocks (or clusters) rather than being arbitrarily spread over the vector. We call such signals block-sparse. To define block sparsity, we need several additional notations. Fix a block index set $I = \{d_1, d_2, \dots, d_M\}$; then we can describe a signal $x \in \mathbb{R}^N$ as
\[
x = [\underbrace{x_1, \dots, x_{d_1}}_{x[1]},\ \underbrace{x_{d_1+1}, \dots, x_{d_1+d_2}}_{x[2]},\ \dots,\ \underbrace{x_{N-d_M+1}, \dots, x_N}_{x[M]}]^\top, \tag{1.1}
\]
where $x[i]$ denotes the $i$-th block of $x$ and $N = \sum_{i=1}^{M} d_i$. We call a vector $x$ block $s$-sparse over $I = \{d_1, d_2, \dots, d_M\}$ if $x[i]$ is nonzero for at most $s$ indices $i$ [5]. If the block structure of the signal is neglected, conventional CS does not treat such structured signals efficiently. To reconstruct block-sparse signals, researchers [5], [6] proposed the mixed $l_2/l_1$ minimization
\[
\hat{x} = \arg\min_{\tilde{x} \in \mathbb{R}^N} \|\tilde{x}\|_{2,1} \quad \text{s.t.} \quad y - \Phi\tilde{x} \in B, \tag{1.2}
\]
where $\|x\|_{2,1} = \sum_{i=1}^{M} \|x[i]\|_2$ is the mixed $l_2/l_1$ norm of $x$. The set $B$ models the noise structure, e.g.,
\[
B^{l_2}(\rho) := \{e : \|e\|_2 \le \rho\} \tag{1.3}
\]
and
\[
B^{DS}(\rho) := \{e : \|\Phi^\top e\|_\infty \le \rho\}, \tag{1.4}
\]
where $\Phi^\top$ denotes the (conjugate) transpose of the matrix $\Phi$. Problem (1.2) is a convex optimization problem and can be recast as a second-order cone program, so it can be solved efficiently.

To study the theoretical performance of mixed $l_2/l_1$ minimization, Eldar and Mishali [5] proposed the block restricted isometry property (block RIP).

Definition 1.1. (Block RIP [5]) Given a matrix $\Phi \in \mathbb{R}^{n \times N}$, if there is a positive number $\delta \in (0, 1)$ such that
\[
(1 - \delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta)\|x\|_2^2 \tag{1.5}
\]
for every block $s$-sparse $x \in \mathbb{R}^N$ over $I = \{d_1, d_2, \dots, d_M\}$, then the matrix $\Phi$ obeys the $s$-order block RIP over $I$. The block RIP constant (RIC) $\delta_{s|I}$ is defined as the smallest positive constant $\delta$ such that (1.5) holds for all block $s$-sparse $x \in \mathbb{R}^N$.

For the remainder of this paper, $\delta_s$ denotes the block RIP constant $\delta_{s|I}$ for simplicity. Eldar and Mishali [5] showed that the mixed $l_2/l_1$ minimization method exactly reconstructs any block $s$-sparse signal when the measurement matrix $\Phi$ fulfills the block RIP with $\delta_{2s} < \sqrt{2} - 1$. Lin and Li [7] improved the bound on $\delta_{2s}$ to $0.4931$ and established the condition $\delta_s < 0.307$ for accurate recovery.
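To make the block notation concrete, the following minimal Python sketch (our illustration, not code from the paper) computes the mixed $l_2/l_1$ norm of (1.2), its $l_p$ analogue used later, and the block sparsity level, for the equal-block-size case $d_i = d$:

```python
import numpy as np

def block_norms(x, d):
    """Block l2 norms ||x[i]||_2 for equal block sizes d_i = d, cf. (1.1)."""
    return np.linalg.norm(x.reshape(-1, d), axis=1)

def mixed_norm_p(x, d, p):
    """||x||_{2,p}^p = sum_i ||x[i]||_2^p; p = 1 gives the l2/l1 norm of (1.2)."""
    return float(np.sum(block_norms(x, d) ** p))

def block_sparsity(x, d):
    """Number of nonzero blocks, i.e. the block sparsity level."""
    return int(np.count_nonzero(block_norms(x, d)))

x = np.array([0.0, 0.0, 3.0, 4.0, 0.0, 0.0, 1.0, 0.0])
print(mixed_norm_p(x, 2, 1.0))  # ||x||_{2,1} = 5 + 1 = 6
print(block_sparsity(x, 2))     # block 2-sparse over M = 4 blocks
```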
In 2019, the results of [8] and [9] together provided a complete characterization of the block RIP condition on $\delta_{ts}$ under which the mixed $l_2/l_1$ minimization method guarantees block-sparse signal recovery.

Recently, many researchers [10], [11], [12] have shown that $l_p$ ($0 < p < 1$) minimization not only requires weaker RIP conditions, but can also guarantee exact recovery for smaller $p$ compared with $l_1$ minimization. In the present paper, we are interested in the stable reconstruction of block-sparse signals by the mixed $l_2/l_p$ ($0 < p < 1$) minimization
\[
\hat{x} = \arg\min_{\tilde{x} \in \mathbb{R}^N} \|\tilde{x}\|_{2,p}^p \quad \text{s.t.} \quad y - \Phi\tilde{x} \in B, \tag{1.6}
\]
where $\|x\|_{2,p}^p = \sum_{i=1}^{M} \|x[i]\|_2^p$. Simulation experiments [13], [14] indicate that fewer linear measurements are needed for accurate reconstruction when $0 < p < 1$ than when $p = 1$. More related work can be found in [15]–[21].

In this paper, we further investigate high-order block RIP conditions for the exact and stable reconstruction of (approximately) block-sparse signals via mixed $l_2/l_p$ minimization. The crux is to extend the sparse representation of an $l_p$-polytope (Lemma 2.2 of [22]) to the block scenario. With this technique, we obtain a sufficient condition on the RIC $\delta_{ts}$ that guarantees the exact and stable reconstruction of approximately block-sparse signals via mixed $l_2/l_p$ minimization, and we establish error bounds between the solution of (1.6) and the signal $x$ to be recovered. In particular, when $x$ is exactly block-sparse and $B = \{0\}$ (i.e., $y = \Phi x$), we obtain an exact reconstruction condition, and we determine how many random Gaussian measurements suffice for the condition to hold with high probability.

The remainder of the paper is organized as follows. In Section 2 we fix notation and collect a few lemmas. In Section 3 we present the main results; the associated proofs are given in Section 5. In Section 4 a series of numerical experiments is presented to support the theoretical results. Lastly, the conclusion is drawn in Section 6.

2. Preliminaries

Throughout this article we use the following notation unless otherwise specified. For a subset $T \subseteq \{1, 2, \dots, M\}$, $T^c$ denotes its complement in $\{1, 2, \dots, M\}$. For any vector $x \in \mathbb{R}^N$, $x_T$ denotes the vector that agrees with $x$ on the blocks indexed by $T$ and has zero blocks elsewhere. Denote by $T_0$ the set of block indices of the $s$ largest blocks of $x$ in $l_2$ norm, i.e., $\|x[i]\|_2 \ge \|x[j]\|_2$ for any $i \in T_0$ and $j \in T_0^c$. We write $x_{\max(s)}$ for $x$ with all but the $s$ largest blocks in $l_2$ norm set to zero, and $x_{-\max(s)} = x - x_{\max(s)}$. Henceforth we always take $h = \hat{x} - x_{\max(s)}$, where $\hat{x}$ is the minimizer of (1.6).

To prove our main results, we need the following lemma, which is a crucial technical tool. It extends the sparse representation of an $l_p$-polytope proposed in [22] to the block setting.

Lemma 2.1.
For a positive integer $s$, a positive number $\alpha$, and given $p \in (0, 1]$, define the block $l_p$-polytope $T(\alpha, s, p) \subseteq \mathbb{R}^N$ by
\[
T(\alpha, s, p) = \{x \in \mathbb{R}^N : \|x\|_{2,p}^p \le s\alpha^p,\ \|x\|_{2,\infty} \le \alpha\},
\]
where $\|x\|_{2,\infty} = \max_i \|x[i]\|_2$. Then any $x \in T(\alpha, s, p)$ can be expressed as a convex combination of block $s$-sparse vectors, i.e.,
\[
x = \sum_i \lambda_i u_i.
\]
Here $\lambda_i > 0$, $\sum_i \lambda_i = 1$, and $\|u_i\|_{2,0} \le s$, where $\|u\|_{2,0}$ counts the nonzero blocks of $u$. In addition,
\[
\sum_i \lambda_i \|u_i\|_2^2 \le \alpha^p \|x\|_{2,2-p}^{2-p}. \tag{2.1}
\]

Proof.
We prove the assertion by induction on the block sparsity of $x$. If $x$ is block $s$-sparse, we can set $u_1 = x$ and $\lambda_1 = 1$; then, since $\|x[i]\|_2 \le \alpha$ for every block,
\[
\|u_1\|_2^2 = \|x\|_2^2 = \sum_i \|x[i]\|_2^p\,\|x[i]\|_2^{2-p} \le \alpha^p \|x\|_{2,2-p}^{2-p}.
\]
Suppose the assertion holds for all block $(l-1)$-sparse vectors, where $l - 1 \ge s$. Consider any block $l$-sparse $x$ with $\|x\|_{2,p}^p \le s\alpha^p$ and $\|x\|_{2,\infty} \le \alpha$; without loss of generality assume $x$ is not block $(l-1)$-sparse. Then $x$ can be represented as $x = \sum_{i=1}^{l} c_i E_i$, where $c_1 \ge c_2 \ge \cdots \ge c_l > 0$: $c_1$ equals the largest value of $\|x[i]\|_2$ over $i \in \{1, 2, \dots, M\}$, $c_2$ the next largest, and so on. Here $E_i \in \mathbb{R}^N$ denotes the unit vector that equals $x/c_i$ on the block of $x$ with the $i$-th largest norm and is zero elsewhere. Set $c_0 = \alpha$ and fix $a = (a_0, a_1, \dots, a_l) \in \mathbb{R}_{+}^{l+1}$, where $a_i = c_i^{p-1}$, $i = 0, 1, \dots, l$. Then $\sum_{i=1}^{l} a_i c_i = \|x\|_{2,p}^p \le s\alpha^p$ and $\alpha^p = a_0 c_0 \ge a_1 c_1 \ge \cdots \ge a_l c_l$. Denote the set
\[
\Gamma = \Big\{1 \le j \le l-1 : \sum_{i=j}^{l} a_i c_i \le (l - j + 1)\, a_{j-1} c_{j-1}\Big\}. \tag{2.2}
\]
It is easy to see that $1 \in \Gamma$, hence $\Gamma$ is not empty. Set $j_0 = \max \Gamma$, which implies
\[
\sum_{i=j_0}^{l} a_i c_i \le (l - j_0 + 1)\, a_{j_0-1} c_{j_0-1}, \qquad \sum_{i=j_0+1}^{l} a_i c_i > (l - j_0)\, a_{j_0} c_{j_0}. \tag{2.3}
\]
It follows that
\[
(l - j_0 + 1)\, a_{j_0} c_{j_0} < \sum_{i=j_0}^{l} a_i c_i \le (l - j_0 + 1)\, a_{j_0-1} c_{j_0-1}.
\]
Set
\[
y_w = \sum_{i=1}^{j_0-1} c_i E_i + \frac{\sum_{i=j_0}^{l} a_i c_i}{l - j_0} \sum_{i=j_0,\, i \ne w}^{l} a_i^{-1} E_i, \qquad
\xi_w = 1 - \frac{(l - j_0)\, a_w c_w}{\sum_{i=j_0}^{l} a_i c_i},
\]
where $w = j_0, j_0+1, \dots, l$. Then, by simple calculations, $\sum_{w=j_0}^{l} \xi_w = 1$, $\xi_w \ge 0$ by (2.3), and $x = \sum_{w=j_0}^{l} \xi_w y_w$, where each $y_w$ is block $(l-1)$-sparse and again satisfies the constraints defining $T(\alpha, s, p)$. Finally, since $y_w$ is block $(l-1)$-sparse, the induction hypothesis gives $y_w = \sum_i \mu_{w,i} u_{w,i}$, where each $u_{w,i}$ is block $s$-sparse, $\mu_{w,i} \in [0, 1]$, and $\sum_i \mu_{w,i} = 1$. Hence $x = \sum_{w=j_0}^{l} \sum_i \xi_w \mu_{w,i} u_{w,i}$, which implies that the statement is true for block $l$-sparse vectors.

We will also use the following lemma, a useful elementary inequality, in the proofs of the main conclusions.

Lemma 2.2. (Lemma 5.3, [23]) Suppose that $M \ge s$, $a_1 \ge a_2 \ge \cdots \ge a_M \ge 0$, and $\sum_{i=1}^{s} a_i \ge \sum_{i=s+1}^{M} a_i$. Then for all $\alpha \ge 1$,
\[
\sum_{j=s+1}^{M} a_j^\alpha \le \sum_{i=1}^{s} a_i^\alpha.
\]
More generally, assume that $a_1 \ge a_2 \ge \cdots \ge a_M \ge 0$, $\lambda \ge 0$, and $\sum_{i=1}^{s} a_i + \lambda \ge \sum_{i=s+1}^{M} a_i$. Then for all $\alpha \ge 1$,
\[
\sum_{j=s+1}^{M} a_j^\alpha \le s \left(\sqrt[\alpha]{\frac{\sum_{i=1}^{s} a_i^\alpha}{s}} + \frac{\lambda}{s}\right)^{\!\alpha}.
\]
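Lemma 2.2 is easy to test numerically. The following sketch (our illustration, not from the paper) samples nonincreasing sequences satisfying the hypothesis and checks the second bound:

```python
import numpy as np

rng = np.random.default_rng(0)
M, s, alpha, lam = 12, 3, 2.0, 0.7  # alpha >= 1, lam >= 0

checked = 0
for _ in range(10000):
    a = np.sort(rng.random(M))[::-1]        # a_1 >= ... >= a_M >= 0
    if a[:s].sum() + lam < a[s:].sum():     # enforce the hypothesis
        continue
    lhs = np.sum(a[s:] ** alpha)
    rhs = s * ((np.sum(a[:s] ** alpha) / s) ** (1 / alpha) + lam / s) ** alpha
    assert lhs <= rhs + 1e-12
    checked += 1
print(f"inequality held on {checked} sampled instances")
```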
In view of the definitions of $h$, $\hat{x}$, and $x_{\max(s)}$, we obtain the following lemma.

Lemma 2.3.
Recall that $h = \hat{x} - x_{\max(s)}$, where $\hat{x}$ is the solution to (1.6). It holds that
\[
\|h_{-\max(s)}\|_{2,p}^p \le \|h_{\max(s)}\|_{2,p}^p.
\]

Proof.
Assume that $T_0$ is the block index set of the $s$ blocks of $x$ with largest $l_2$ norm, so that $x_{T_0} = x_{\max(s)}$. Since $x_{\max(s)}$ is feasible for (1.6) (as shown in the proofs of Section 5), the minimality of the solution $\hat{x} = x_{T_0} + h$ and the reverse triangle inequality for $\|\cdot\|_{2,p}^p$ give
\[
\|x_{T_0}\|_{2,p}^p \ge \|\hat{x}\|_{2,p}^p = \|x_{T_0} + h_{T_0}\|_{2,p}^p + \|h_{T_0^c}\|_{2,p}^p \ge \|x_{T_0}\|_{2,p}^p - \|h_{T_0}\|_{2,p}^p + \|h_{T_0^c}\|_{2,p}^p,
\]
which implies $\|h_{T_0^c}\|_{2,p}^p \le \|h_{T_0}\|_{2,p}^p$. Note that $\|h_{-\max(s)}\|_{2,p}^p \le \|h_{T_0^c}\|_{2,p}^p$ and $\|h_{T_0}\|_{2,p}^p \le \|h_{\max(s)}\|_{2,p}^p$. Combining these inequalities yields the desired result.

Lemma 2.4. (Lemma 5.1, [25]) Let $\Phi \in \mathbb{R}^{n \times N}$ be a random matrix whose entries obey one of the distributions given by (3.6) and which fulfills the concentration inequality
\[
P\big(\big|\|\Phi x\|_2^2 - \|x\|_2^2\big| \ge \epsilon \|x\|_2^2\big) \le 2e^{-n c_0(\epsilon)}, \quad \epsilon \in (0, 1), \tag{2.4}
\]
where the probability is taken over all $n \times N$ matrices $\Phi$ and $c_0(\epsilon)$ is a constant depending only on $\epsilon$, with $c_0(\epsilon) > 0$ for all $\epsilon \in (0, 1)$. Suppose that $1 \le s \le n$. Then, for any $\delta \in (0, 1)$ and any fixed index set of size $s$, we have
\[
(1 - \delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta)\|x\|_2^2 \tag{2.5}
\]
for all $s$-sparse vectors $x \in \mathbb{R}^N$ supported on that set, with probability at least
\[
1 - 2\left(\frac{12}{\delta}\right)^{\!s} e^{-c_0(\delta/2)\, n}. \tag{2.6}
\]

3. Main results

Based on the preparation above, we now present the main results: a high-order block RIP condition for the robust reconstruction of arbitrary block-structured signals via mixed $l_2/l_p$ minimization. When the signal to be recovered is block-sparse, the condition guarantees exact reconstruction in the noise-free case and stable recovery in the noisy case. When the original signal $x$ is not block-sparse and the linear measurements are corrupted by noise, the following result provides a sufficient condition for the recovery of structured signals.
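Lemma 2.4 turns the pointwise concentration (2.4) into a restricted-isometry bound. The per-vector behavior is easy to observe empirically; the sketch below (our illustration; the statement for a whole subspace additionally needs the covering and union-bound step of [25]) estimates how often (2.5) fails for random block-sparse vectors under the first distribution in (3.6):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, d, s, delta, trials = 128, 256, 2, 4, 0.5, 2000
M = N // d

fails = 0
for _ in range(trials):
    Phi = rng.standard_normal((n, N)) / np.sqrt(n)   # entries ~ N(0, 1/n), cf. (3.6)
    x = np.zeros(N)
    for i in rng.choice(M, size=s, replace=False):   # random block s-sparse vector
        x[i * d:(i + 1) * d] = rng.standard_normal(d)
    ratio = np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2
    fails += not (1 - delta <= ratio <= 1 + delta)
print("empirical failure rate:", fails / trials)     # decays exponentially in n, cf. (2.6)
```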
Theorem 3.1. Let $y = \Phi x + e$ be noisy measurements of a signal $x \in \mathbb{R}^N$ with $y, e \in \mathbb{R}^n$, $\Phi \in \mathbb{R}^{n \times N}$ ($n < N$), and $\|e\|_2 \le \rho$. Assume that $B = B^{l_2}(\varepsilon)$ in (1.6) with $\rho + \sigma(\Phi)\|x_{-\max(s)}\|_2 \le \varepsilon$, where $\sigma(\Phi)$ denotes the largest singular value of $\Phi$. If $\Phi$ satisfies the block RIP with
\[
\delta_{ts} < \frac{\mu^{2-p}}{t - 1 - \mu^2} =: \varphi(t, p) \tag{3.1}
\]
for some $1 < t \le 2$, where $\mu$ is the unique positive solution of the equation
\[
g(\mu, p) = p\mu^{p} + 2\mu - (2 - p)(t - 1) = 0, \tag{3.2}
\]
then the solution $\hat{x}_{l_2}$ of (1.6) fulfills
\[
\|\hat{x}_{l_2} - x\|_2 \le C_1(\varepsilon + \rho) + C_2\|x_{-\max(s)}\|_2, \tag{3.3}
\]
where
\[
C_1 = \frac{\sqrt{2}\,\varphi(t,p)}{\varphi(t,p) - \delta_{ts}}\left(\frac{(2-p)\big(1 - (t-1)\mu\big)}{2 - p - (t-1)\mu^2}\sqrt{1 + \delta_{ts}} + \varphi(t,p)\sqrt{\frac{2-p}{\varphi(t,p) - \delta_{ts}}}\right), \qquad C_2 = \sigma(\Phi)\, C_1 + 1.
\]

Remark 3.1.
In the case $d_i = 1$, $i = 1, 2, \dots, M$, (3.3) is the same as Theorem 2 in [24].

Theorem 3.2.
Let $y = \Phi x + e$ be noisy measurements of a signal $x \in \mathbb{R}^N$ with $y, e \in \mathbb{R}^n$, $\Phi \in \mathbb{R}^{n \times N}$ ($n < N$), and $\|\Phi^\top e\|_\infty \le \rho$. Assume that $B = B^{DS}(\varepsilon)$ in (1.6) with $\rho + \sigma(\Phi)\|x_{-\max(s)}\|_2 \le \varepsilon$, and that the blocks have equal size $d_i = d$. If $\Phi$ satisfies the block RIP with $\delta_{ts} < \varphi(t, p)$ for some $1 < t \le 2$, then the solution $\hat{x}_{DS}$ of (1.6) fulfills
\[
\|\hat{x}_{DS} - x\|_2 \le D_1(\varepsilon + \rho) + D_2\|x_{-\max(s)}\|_2, \tag{3.4}
\]
where
\[
D_1 = \frac{\sqrt{2ds}\,\varphi(t,p)}{\varphi(t,p) - \delta_{ts}}\left(\frac{(2-p)\big(1 - (t-1)\mu\big)}{2 - p - (t-1)\mu^2} + \big(1 + \sqrt{N - ds}\big)(1 - p)\,\varphi(t,p)\right), \qquad D_2 = \sigma(\Phi)\, D_1 + 1.
\]

Remark 3.2.
In the case $d_i = 1$, $i = 1, 2, \dots, M$, we obtain the same result as Theorem 3 in [24].

Corollary 3.1.
Under the same conditions as in Theorem 3.1, suppose that $e = 0$ and that $x$ is block $s$-sparse. Then $x$ can be exactly reconstructed via
\[
\hat{x} = \arg\min_{\tilde{x} \in \mathbb{R}^N} \|\tilde{x}\|_{2,p}^p \quad \text{s.t.} \quad y = \Phi\tilde{x}. \tag{3.5}
\]

In the following, we determine how many random Gaussian measurements are needed for (3.1) to be fulfilled with overwhelming probability. In the sequel, let $(\Omega, \tau)$ be a probability measure space and let $z$ be a random variable obeying one of the following distributions:
\[
z \sim N(0, 1/n); \quad \text{or} \quad z = \pm 1/\sqrt{n}, \ \text{each w.p. } 1/2; \quad \text{or} \quad z = \pm\sqrt{3/n} \ \text{w.p. } 1/6 \text{ each and } z = 0 \ \text{w.p. } 2/3. \tag{3.6}
\]
Given $n$ and $N$, random matrices $\Phi$ are produced by choosing the entries $\Phi_{i,j}$ as independent copies of $z$.

Theorem 3.3. Let $\Phi$ be an $n \times N$ matrix with $n < N$ whose entries are i.i.d. random variables defined by (3.6). If
\[
n \ge \frac{ts \log\frac{N}{ds}}{\frac{\varphi^2(t,p)}{16} - \frac{\varphi^3(t,p)}{48}},
\]
then the following assertion holds with probability exceeding
\[
1 - 2\exp\left\{ts\left(d\log\frac{12}{\varphi(t,p)} + \log\frac{e}{t} + \log\frac{M}{s}\right) - n\left(\frac{\varphi^2(t,p)}{16} - \frac{\varphi^3(t,p)}{48}\right)\right\}:
\]
every block $s$-sparse signal $x \in \mathbb{R}^N$ over $I = \{d_1 = d, d_2 = d, \dots, d_M = d\}$ with $Md = N$ is the unique solution of (3.5), since $\Phi$ satisfies $\delta_{ts} < \varphi(t, p)$.

4. Numerical experiments

In this section, we carry out numerical simulations to illustrate the application of our theoretical results. We first transform the constrained optimization problem (1.6) into the unconstrained form
\[
\min_{\tilde{x} \in \mathbb{R}^N} \lambda\|\tilde{x}\|_{2,p}^p + \frac{1}{2}\|y - \Phi\tilde{x}\|_2^2. \tag{4.1}
\]
To solve (4.1), we adopt the standard alternating direction method of multipliers (ADMM) [27], [28], [26]. Introducing an auxiliary variable $v \in \mathbb{R}^N$, we rewrite (4.1) as
\[
\min_{\tilde{x}, v \in \mathbb{R}^N} \lambda\|v\|_{2,p}^p + \frac{1}{2}\|y - \Phi\tilde{x}\|_2^2 \quad \text{s.t.} \quad \tilde{x} - v = 0. \tag{4.2}
\]
The augmented Lagrangian of this problem, with scaled multiplier $z \in \mathbb{R}^N$ and penalty parameter $\gamma > 0$, is
\[
L_\gamma(\tilde{x}, v, z) = \lambda\|v\|_{2,p}^p + \frac{1}{2}\|y - \Phi\tilde{x}\|_2^2 + \gamma\langle z, \tilde{x} - v\rangle + \frac{\gamma}{2}\|\tilde{x} - v\|_2^2. \tag{4.3}
\]
ADMM alternates the updates
\[
\tilde{x}^{k+1} = \arg\min_{\tilde{x}} \frac{1}{2}\|y - \Phi\tilde{x}\|_2^2 + \frac{\gamma}{2}\|z^k + \tilde{x} - v^k\|_2^2, \tag{4.4}
\]
\[
v^{k+1} = \arg\min_{v} \lambda\|v\|_{2,p}^p + \frac{\gamma}{2}\|z^k + \tilde{x}^{k+1} - v\|_2^2, \tag{4.5}
\]
\[
z^{k+1} = z^k + \tilde{x}^{k+1} - v^{k+1}. \tag{4.6}
\]
The solution of problem (4.4) is explicitly given by
\[
\tilde{x}^{k+1} = (\Phi^\top\Phi + \gamma I_N)^{-1}\big(\Phi^\top y - \gamma(z^k - v^k)\big). \tag{4.7}
\]
To apply existing results on the proximity operator of the $l_p$ quasi-norm ($0 < p < 1$), note that (4.5) decouples across blocks:
\[
v^{k+1} = \arg\min_{v} \sum_{i=1}^{M} \left[\lambda\|v[i]\|_2^p + \frac{\gamma}{2}\|z^k[i] + \tilde{x}^{k+1}[i] - v[i]\|_2^2\right], \tag{4.8}
\]
so each block reduces to a one-dimensional proximity problem in the block norm $\|v[i]\|_2$.
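A compact Python implementation of the iteration (4.4)–(4.8) might look as follows. This is a minimal sketch under our own choices (a grid search for the one-dimensional $l_p$ proximity step and a fixed iteration count); the paper's actual solver may differ.

```python
import numpy as np

def prox_lp_scalar(r, c, p, grid=1000):
    """argmin_{t >= 0} c*t^p + 0.5*(t - r)^2, found by grid search on [0, r]."""
    ts = np.linspace(0.0, r, grid + 1)
    return ts[np.argmin(c * ts ** p + 0.5 * (ts - r) ** 2)]

def admm_l2lp(Phi, y, d, p=0.5, lam=1e-4, gamma=1.0, iters=300):
    n, N = Phi.shape
    M = N // d
    x = np.zeros(N); v = np.zeros(N); z = np.zeros(N)    # z: scaled multiplier
    G = np.linalg.inv(Phi.T @ Phi + gamma * np.eye(N))   # cached solve for (4.7)
    for _ in range(iters):
        x = G @ (Phi.T @ y - gamma * (z - v))            # x-update (4.7)
        w = z + x
        for i in range(M):                               # blockwise prox (4.8)
            blk = w[i * d:(i + 1) * d]
            r = np.linalg.norm(blk)
            t = prox_lp_scalar(r, lam / gamma, p) if r > 0 else 0.0
            v[i * d:(i + 1) * d] = (t / r) * blk if r > 0 else 0.0
        z = z + x - v                                    # dual update (4.6)
    return x
```

For large $N$, one would cache a factorization of $\Phi^\top\Phi + \gamma I_N$ or solve (4.4) iteratively rather than forming the explicit inverse.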
In our experiments, without loss of generality, we consider block-sparse signals $x$ with equal block size, i.e., $d_i = d$, $i = 1, \dots, M$, and take the signal length $N = 1024$. For each experiment, we first randomly generate a block-sparse signal $x$ whose nonzero amplitudes are drawn from the Gaussian distribution. We use an $n \times N$ orthogonal Gaussian random matrix as the measurement matrix $\Phi$, and set the number of random measurements to $n = 128$ unless otherwise specified. With $x$ and $\Phi$, we generate the linear measurements as $y = \Phi x + e$, where $e$ is a Gaussian noise vector. Each reported result is an average over 100 independent trials.
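The data generation described above can be reproduced along the following lines (a sketch under our assumptions; in particular, the exact normalization of the "orthogonal Gaussian" matrix in the experiments may differ):

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, d, k = 1024, 128, 4, 32           # signal length, measurements, block size, nonzeros
M, sigma = N // d, 0.01

x = np.zeros(N)
for i in rng.choice(M, size=k // d, replace=False):  # k nonzero entries = k/d active blocks
    x[i * d:(i + 1) * d] = rng.standard_normal(d)    # Gaussian amplitudes

Phi = rng.standard_normal((n, N))
Phi = np.linalg.qr(Phi.T)[0].T           # orthonormalize the rows ("orthogonal Gaussian")
y = Phi @ x + sigma * rng.standard_normal(n)         # noisy measurements
```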
In Fig. 1a, we produce the signals by choosing the active blocks uniformly at random, with $n = 64$ and $N = 128$, i.e., block size $d = 2$. The relative recovery error $\|x - x^*\|_2/\|x\|_2$ is plotted versus the regularization parameter $\lambda$ for different values of $p$, namely $p = 0.2, 0.4, 0.6, 0.8, 1$, with $\lambda$ ranging from $10^{-8}$ to $10^{-2}$. The figure indicates a proper choice of $\lambda$ within this range. Fig. 1b presents experimental results on the performance of the non-block algorithm and the block algorithm with $p = 0.4$.
Two relative-error curves are provided, obtained via mixed $l_2/l_p$ minimization and via the orthogonal greedy algorithm (OGA) [29]. Fig. 1b reveals that exploiting the block structure is quite significant for signal recovery.

Figure 1: (a) Recovery performance of mixed $l_2/l_p$ minimization versus $\lambda$ for block size $d = 2$; (b) recovery performance of mixed $l_2/l_p$ minimization with $p = 0.4$, $k = 64$.

We also measure the signal-to-noise ratio ($\mathrm{SNR} = 20\log_{10}(\|x\|_2/\|x - x^*\|_2)$) versus the value of $p$ and the number of nonzero entries $k$; the results are given in Figs. 2a and 2b, respectively. In Fig. 2a, the values of $p$ vary from 0.01 to 1, and in Fig. 2b, the number of nonzero entries $k$ ranges from 8 to 48.
Figs. 2a and 2b show that mixed $l_2/l_p$ minimization performs better than standard $l_p$ minimization. Figs. 3a and 3b display the relationship between the relative error and the number of measurements $n$ for different block sizes $d = 1, 2, 4, 8$ and for different values $p = 0.2, 0.4, 0.6, 0.8, 1$, respectively. We further compare the proposed Group-Lp algorithm with the Block-OMP, Block-SL0, and Block-ADM [32] algorithms, using the SNR to measure algorithm efficiency. In Fig. 4a, we select as test signals block-sparse signals with block sizes $d = 2, 4, 8, 16,$
$32$ and $k = 32$ nonzero entries, and in Fig. 4b the block size is $d = 2$. One can see that, overall, the performance of Group-Lp ($p = 0.4$)
is much better than that of the other three algorithms.

Figure 2: SNR versus the value of $p$ in (a) and versus the number of nonzero entries $k$ in (b); (a) for $k = 8$, (b) for a fixed $p$.
Figure 3: Relative error versus the number of measurements $n$ for (a) fixed $p$, $k = 16$, with block sizes $d = 1, 2, 4, 8$; (b) $d = 2$, $k = 16$, with $p = 0.2, 0.4, 0.6, 0.8, 1$.

Figure 4: Comparison of recovery performance with respect to SNR for Block-OMP, Block-SL0, Block-ADM, and Group-Lp: (a) number of nonzero coefficients $k = 32$ (SNR versus block size $d$); (b) block size $d = 2$ (SNR versus $k$).

5. Proofs

Proof of Theorem 3.1.
First of all, suppose that $ts$ is an integer. Recall that $h = \hat{x}_{l_2} - x_{\max(s)}$, where $\hat{x}_{l_2}$ is the solution to (1.6) with $B = B^{l_2}(\varepsilon)$. Then, by the assumptions of Theorem 3.1, we have
\[
\|y - \Phi x_{\max(s)}\|_2 \le \|y - \Phi x\|_2 + \|\Phi(x - x_{\max(s)})\|_2 \le \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2 \le \varepsilon,
\]
that is, $y - \Phi x_{\max(s)} \in B^{l_2}(\varepsilon)$, so $x_{\max(s)}$ is feasible for (1.6). Denote $\alpha^p = \|h_{\max(s)}\|_{2,p}^p / s$. Then, by Lemma 2.3 and $1 < t \le 2$,
\[
\|h_{-\max(s)}\|_{2,\infty} \le \alpha \le \alpha(t-1)^{-1/p}, \quad \text{and} \quad \|h_{-\max(s)}\|_{2,p}^p \le (t-1)s\left(\frac{\alpha}{(t-1)^{1/p}}\right)^{\!p}. \tag{5.1}
\]
Applying Lemma 2.1 and combining with (5.1), we can write $h_{-\max(s)} = \sum_i \lambda_i u_i$, where each $u_i$ is block $(t-1)s$-sparse, $\lambda_i \in [0, 1]$, $\sum_i \lambda_i = 1$, and
\[
\sum_i \lambda_i \|u_i\|_2^2 \le \frac{\alpha^p}{t-1}\|h_{-\max(s)}\|_{2,2-p}^{2-p}. \tag{5.2}
\]
Hence,
\[
\sum_i \lambda_i \|u_i\|_2^2 \overset{(a)}{\le} \frac{\alpha^p}{t-1}\big(\|h_{-\max(s)}\|_2\big)^{\frac{4(1-p)}{2-p}}\big(\|h_{-\max(s)}\|_{2,p}^p\big)^{\frac{p}{2-p}} \overset{(b)}{\le} \frac{\alpha^p}{t-1}\big(\|h_{-\max(s)}\|_2\big)^{\frac{4(1-p)}{2-p}}(s\alpha^p)^{\frac{p}{2-p}} \overset{(c)}{\le} \frac{1}{t-1}\|h_{-\max(s)}\|_2^{\frac{4(1-p)}{2-p}}\|h_{\max(s)}\|_2^{\frac{2p}{2-p}}, \tag{5.3}
\]
where (a) follows from Hölder's inequality, (b) from (5.1), and (c) from $s\alpha^p = \|h_{\max(s)}\|_{2,p}^p \le s^{1-p/2}\|h_{\max(s)}\|_2^p$.

From the Cauchy–Schwarz inequality and the definition of the block RIP, we get
\[
\big\langle \Phi h_{\max(s)}, \Phi h\big\rangle \le \|\Phi h_{\max(s)}\|_2\|\Phi h\|_2 \le \sqrt{1 + \delta_{ts}}\,\|h_{\max(s)}\|_2\|\Phi h\|_2. \tag{5.4}
\]
Since
\[
\|\Phi h\|_2 = \|\Phi(\hat{x}_{l_2} - x_{\max(s)})\|_2 \le \|\Phi\hat{x}_{l_2} - y\|_2 + \|\Phi x_{\max(s)} - y\|_2 \le \varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2, \tag{5.5}
\]
we therefore obtain
\[
\big\langle \Phi h_{\max(s)}, \Phi h\big\rangle \le \sqrt{1 + \delta_{ts}}\,\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big)\|h_{\max(s)}\|_2. \tag{5.6}
\]
Take $\beta_i = h_{\max(s)} + (t-1)\mu u_i$. Then
\[
\sum_j \lambda_j \beta_j - \frac{p}{2}\beta_i - (t-1)\mu h = \Big(1 - (t-1)\mu - \frac{p}{2}\Big)h_{\max(s)} - \frac{p}{2}(t-1)\mu u_i. \tag{5.7}
\]
Furthermore, both $\sum_j \lambda_j \beta_j - \frac{p}{2}\beta_i - (t-1)\mu h$ and $\beta_i - \beta_j = (t-1)\mu(u_i - u_j)$ are block $ts$-sparse, because $h_{\max(s)}$ is block $s$-sparse and each $u_i$ is block $(t-1)s$-sparse.

Observe the following identity [22], valid whenever $\sum_i \lambda_i = 1$:
\[
\sum_i \lambda_i \Big\|\Phi\Big(\sum_j \lambda_j \beta_j - \frac{p}{2}\beta_i\Big)\Big\|_2^2 + \frac{1-p}{2}\sum_{i,j}\lambda_i\lambda_j\|\Phi(\beta_i - \beta_j)\|_2^2 = \Big(1 - \frac{p}{2}\Big)^2\sum_i \lambda_i\|\Phi\beta_i\|_2^2. \tag{5.8}
\]
First, we bound the left-hand side (LHS) of (5.8). Substituting (5.7) into the LHS and combining (5.6) with the definition of the block RIP, we get, with $\theta := \varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2$,
\[
\begin{aligned}
\mathrm{LHS} &= \sum_i \lambda_i \Big\|\Phi\Big[\Big(1 - (t-1)\mu - \frac{p}{2}\Big)h_{\max(s)} - \frac{p}{2}(t-1)\mu u_i\Big]\Big\|_2^2 + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\,\big\langle \Phi h_{\max(s)}, \Phi h\big\rangle \\
&\quad + (1-p)(t-1)^2\mu^2\|\Phi h\|_2^2 + \frac{1-p}{2}(t-1)^2\mu^2\sum_{i,j}\lambda_i\lambda_j\|\Phi(u_i - u_j)\|_2^2 \\
&\le (1+\delta_{ts})\Big[\Big(1 - (t-1)\mu - \frac{p}{2}\Big)^2\|h_{\max(s)}\|_2^2 + \Big(1-\frac{p}{2}\Big)^2(t-1)^2\mu^2\sum_i \lambda_i\|u_i\|_2^2\Big] - (1-p)(t-1)^2\mu^2(1+\delta_{ts})\|h_{-\max(s)}\|_2^2 \\
&\quad + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\,\|h_{\max(s)}\|_2 + (1-p)(t-1)^2\mu^2\theta^2. \tag{5.9}
\end{aligned}
\]
On the other hand, by the representation of $\beta_i$ and the block RIP,
\[
\begin{aligned}
\mathrm{RHS} &= \Big(1-\frac{p}{2}\Big)^2\sum_i \lambda_i\|\Phi\beta_i\|_2^2 = \Big(1-\frac{p}{2}\Big)^2\sum_i \lambda_i\|\Phi(h_{\max(s)} + (t-1)\mu u_i)\|_2^2 \\
&\ge \Big(1-\frac{p}{2}\Big)^2(1-\delta_{ts})\sum_i \lambda_i\|h_{\max(s)} + (t-1)\mu u_i\|_2^2 = \Big(1-\frac{p}{2}\Big)^2(1-\delta_{ts})\|h_{\max(s)}\|_2^2 + \Big(1-\frac{p}{2}\Big)^2(t-1)^2\mu^2(1-\delta_{ts})\sum_i \lambda_i\|u_i\|_2^2. \tag{5.10}
\end{aligned}
\]
Combining (5.9) and (5.10), we obtain
\[
\begin{aligned}
&\Big\{(1+\delta_{ts})\Big[1 - (t-1)\mu - \frac{p}{2}\Big]^2 - \Big(1-\frac{p}{2}\Big)^2(1-\delta_{ts})\Big\}\|h_{\max(s)}\|_2^2 + 2\Big(1-\frac{p}{2}\Big)^2(t-1)^2\mu^2\delta_{ts}\sum_i \lambda_i\|u_i\|_2^2 \\
&\quad - (1-p)(t-1)^2\mu^2(1+\delta_{ts})\|h_{-\max(s)}\|_2^2 + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\,\|h_{\max(s)}\|_2 + (1-p)(t-1)^2\mu^2\theta^2 \ge 0. \tag{5.11}
\end{aligned}
\]
Substituting (5.3) into (5.11), we get
\[
\begin{aligned}
&\Big\{(1+\delta_{ts})\Big[1 - (t-1)\mu - \frac{p}{2}\Big]^2 - \Big(1-\frac{p}{2}\Big)^2(1-\delta_{ts})\Big\}\|h_{\max(s)}\|_2^2 + 2\Big(1-\frac{p}{2}\Big)^2(t-1)\mu^2\delta_{ts}\,\|h_{-\max(s)}\|_2^{\frac{4(1-p)}{2-p}}\|h_{\max(s)}\|_2^{\frac{2p}{2-p}} \\
&\quad - (1-p)(t-1)^2\mu^2(1+\delta_{ts})\|h_{-\max(s)}\|_2^2 + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\,\|h_{\max(s)}\|_2 + (1-p)(t-1)^2\mu^2\theta^2 \ge 0. \tag{5.12}
\end{aligned}
\]
Regarding the left-hand side as a function of $\|h_{-\max(s)}\|_2$ and maximizing over it, we obtain
\[
\begin{aligned}
&\Big\{(1+\delta_{ts})\Big[1 - (t-1)\mu - \frac{p}{2}\Big]^2 - \Big(1-\frac{p}{2}\Big)^2(1-\delta_{ts})\Big\}\|h_{\max(s)}\|_2^2 + \frac{p}{2}(t-1)^2\mu^2(1+\delta_{ts})\left(\frac{(2-p)\delta_{ts}}{(t-1)(1+\delta_{ts})}\right)^{\!\frac{2-p}{p}}\|h_{\max(s)}\|_2^2 \\
&\quad + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\,\|h_{\max(s)}\|_2 + (1-p)(t-1)^2\mu^2\theta^2 \ge 0. \tag{5.13}
\end{aligned}
\]
Using (3.1) and (3.2), this reduces to
\[
\big[2 - p - (t-1)\mu^2\big]\big(\delta_{ts} - \varphi(t,p)\big)\|h_{\max(s)}\|_2^2 + 2\Big(1-\frac{p}{2}\Big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\,\|h_{\max(s)}\|_2 + (1-p)(t-1)^2\mu^2\theta^2 \ge 0. \tag{5.14}
\]
The condition $\delta_{ts} < \varphi(t,p)$ guarantees that (5.14) is a second-order inequality in $\|h_{\max(s)}\|_2$ with negative quadratic coefficient. Consequently,
\[
\|h_{\max(s)}\|_2 \le \frac{2\big(1-\frac{p}{2}\big)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta + \sqrt{\Delta}}{2\big[2 - p - (t-1)\mu^2\big]\big(\varphi(t,p) - \delta_{ts}\big)} \overset{(a)}{\le} \left\{\frac{\varphi(t,p)}{\varphi(t,p) - \delta_{ts}}\cdot\frac{(2-p)\big(1-(t-1)\mu\big)}{2-p-(t-1)\mu^2}\sqrt{1+\delta_{ts}} + \varphi(t,p)\sqrt{\frac{2-p}{\varphi(t,p)-\delta_{ts}}}\right\}\theta, \tag{5.15}
\]
where $\Delta = \big[2(1-\frac{p}{2})(1-(t-1)\mu)(t-1)\mu\sqrt{1+\delta_{ts}}\,\theta\big]^2 + 4\big[2-p-(t-1)\mu^2\big]\big(\varphi(t,p)-\delta_{ts}\big)(1-p)(t-1)^2\mu^2\theta^2$
and (a) follows from the fact that $\sqrt{u+v} \le \sqrt{u} + \sqrt{v}$ for $u, v \ge 0$. Combining Lemmas 2.2 and 2.3 yields $\|h_{-\max(s)}\|_2 \le \|h_{\max(s)}\|_2$. Accordingly, it is not difficult to check that
\[
\|\hat{x}_{l_2} - x\|_2 \le \|\hat{x}_{l_2} - x_{\max(s)}\|_2 + \|x - x_{\max(s)}\|_2 \le \sqrt{2}\,\|h_{\max(s)}\|_2 + \|x_{-\max(s)}\|_2 \le \sqrt{2}\left\{\frac{\varphi(t,p)}{\varphi(t,p) - \delta_{ts}}\cdot\frac{(2-p)\big(1-(t-1)\mu\big)}{2-p-(t-1)\mu^2}\sqrt{1+\delta_{ts}} + \varphi(t,p)\sqrt{\frac{2-p}{\varphi(t,p)-\delta_{ts}}}\right\}\theta + \|x_{-\max(s)}\|_2, \tag{5.16}
\]
which is (3.3). If $ts$ is not an integer, set $t' = \lceil ts\rceil/s$; then $t's$ is an integer and $t < t'$. Since $\partial\varphi(t,p)/\partial t > 0$, the function $\varphi(t,p)$ is increasing in $t$, and hence $\delta_{t's} = \delta_{ts} < \varphi(t,p) < \varphi(t',p)$. Analogously to the above, the result follows by working with $\delta_{t's}$.

Proof of Theorem 3.2.
Similarly to the proof for the previous noisy case, for the noise type $B = B^{DS}(\varepsilon)$, define $h = \hat{x}_{DS} - x_{\max(s)}$. We can derive
\[
\|\Phi^\top(y - \Phi x_{\max(s)})\|_\infty \le \|\Phi^\top(y - \Phi x)\|_\infty + \|\Phi^\top\Phi(x - x_{\max(s)})\|_\infty \le \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2 \le \varepsilon,
\]
which reveals that $y - \Phi x_{\max(s)} \in B^{DS}(\varepsilon)$. From the proof of Theorem 3.1, we have $\|h_{-\max(s)}\|_2 \le \|h_{\max(s)}\|_2$. By the inequalities $\|x\|_q \le \|x\|_p \le n^{1/p - 1/q}\|x\|_q$ (for $x \in \mathbb{R}^n$ and $0 < p \le q \le \infty$), we obtain $\|h_{-\max(s)}\|_1 \le \sqrt{N - ds}\,\|h_{-\max(s)}\|_2 \le \sqrt{N - ds}\,\|h_{\max(s)}\|_2$. Hence,
\[
\|h\|_1 \le \|h_{\max(s)}\|_1 + \|h_{-\max(s)}\|_1 \le \big(1 + \sqrt{N - ds}\big)\sqrt{ds}\,\|h_{\max(s)}\|_2. \tag{5.17}
\]
Then,
\[
\|\Phi h\|_2^2 = \langle h, \Phi^\top\Phi h\rangle \le \|h\|_1\|\Phi^\top\Phi h\|_\infty \le \|h\|_1\big(\|\Phi^\top(\Phi\hat{x}_{DS} - y)\|_\infty + \|\Phi^\top(\Phi x_{\max(s)} - y)\|_\infty\big) \le \big(1 + \sqrt{N - ds}\big)\sqrt{ds}\,\|h_{\max(s)}\|_2\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big) \tag{5.18}
\]
and
\[
\big\langle \Phi h_{\max(s)}, \Phi h\big\rangle = \big\langle h_{\max(s)}, \Phi^\top\Phi h\big\rangle \le \|h_{\max(s)}\|_1\|\Phi^\top\Phi h\|_\infty \le \sqrt{ds}\,\|h_{\max(s)}\|_2\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big). \tag{5.19}
\]
Proceeding as in the proof of Theorem 3.1, we arrive at the following second-order inequality for $\|h_{\max(s)}\|_2$:
\[
\begin{aligned}
&\big[2 - p - (t-1)\mu^2\big]\big(\varphi(t,p) - \delta_{ts}\big)\|h_{\max(s)}\|_2^2 - (2-p)\big(1-(t-1)\mu\big)(t-1)\mu\sqrt{ds}\,\|h_{\max(s)}\|_2\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big) \\
&\quad - \big(1 + \sqrt{N - ds}\big)\sqrt{ds}\,(1-p)(t-1)^2\mu^2\|h_{\max(s)}\|_2\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big) \le 0. \tag{5.20}
\end{aligned}
\]
Therefore,
\[
\|h_{\max(s)}\|_2 \le \frac{\varphi(t,p)}{\varphi(t,p) - \delta_{ts}}\left\{\frac{(2-p)\big(1-(t-1)\mu\big)\sqrt{ds}}{2 - p - (t-1)\mu^2} + \big(1 + \sqrt{N - ds}\big)\varphi(t,p)(1-p)\sqrt{ds}\right\}\big(\varepsilon + \rho + \sigma(\Phi)\|x_{-\max(s)}\|_2\big).
\]
The remainder of the proof is similar to the case of noise type $B^{l_2}$ and is omitted here.

Proof of Theorem 3.3.
Similarly to the proof of Theorem 5.2 in [25], by Lemma 2.4 and the union bound over the $\binom{M}{s}$ block index sets, for fixed $\delta \in (0, 1)$ the matrix $\Phi$ satisfies the block RIP (1.5) for all block $s$-sparse signals over $I = \{d_1 = d, d_2 = d, \dots, d_M = d\}$ with probability at least $1 - 2(12/\delta)^{sd}\binom{M}{s}e^{-c_0(\delta/2)n}$, where $c_0(\delta/2) = \delta^2/16 - \delta^3/48$ and $N = Md$. Therefore, we have
\[
P(\delta_s < \delta) \ge 1 - 2\left(\frac{12}{\delta}\right)^{\!sd}\binom{M}{s}e^{-c_0(\delta/2)n}. \tag{5.21}
\]
Corollary 3.1 shows that in the noise-free case, the guarantee for exact recovery of block $s$-sparse signals is $\delta_{ts} < \varphi(t,p)$. Fixing $t$ and applying (5.21) with $ts$ in place of $s$ and $\delta = \varphi(t,p)$, we obtain
\[
P\big(\delta_{ts} < \varphi(t,p)\big) \ge 1 - 2\left(\frac{12}{\varphi(t,p)}\right)^{\!tsd}\binom{M}{ts}e^{-n\left(\frac{\varphi^2(t,p)}{16} - \frac{\varphi^3(t,p)}{48}\right)} \overset{(a)}{\ge} 1 - 2e^{\,ts\left(d\log\frac{12}{\varphi(t,p)} + \log\frac{e}{t} + \log\frac{M}{s}\right) - n\left(\frac{\varphi^2(t,p)}{16} - \frac{\varphi^3(t,p)}{48}\right)}, \tag{5.22}
\]
where (a) follows from the inequality $\binom{u}{v} \le (eu/v)^v$ for integers $u > v > 0$. When $M/s \to \infty$, to ensure that $\delta_{ts} < \varphi(t,p)$ holds with overwhelming probability, the number of measurements must satisfy $n \ge ts\log\frac{N}{ds}\big/\big(\frac{\varphi^2(t,p)}{16} - \frac{\varphi^3(t,p)}{48}\big)$.

6. Conclusion

In recent years, research on non-convex block-sparse compressed sensing has become a hot topic. This paper discusses non-convex block-sparse compressed sensing by employing the block RIP. We establish a sufficient condition that guarantees stable and robust signal reconstruction via the mixed $l_2/l_p$ minimization method, together with an upper bound on the recovery error. In addition, we quantify the number of samples needed for the sufficient condition to hold with high probability. Finally, a series of numerical experiments illustrates our results; generally speaking, compared with other representative algorithms, the Group-Lp algorithm performs considerably better.

References

[1] A. Majumdar, R. Ward, Compressed sensing of color images, Signal Process. 90 (2010) 3122–3127.
[2] F. Parvaresh, H. Vikalo, S. Misra, B. Hassibi, Recovering sparse signals using sparse measurement matrices in compressed DNA microarrays, IEEE J. Sel. Topics Signal Process. 2 (3) (2008) 275–285.
[3] S. Cotter, B. Rao, Sparse channel estimation via matching pursuit with application to equalization, IEEE Trans. Commun. 50 (3) (2002) 374–377.
[4] J. Huang, X. Huang, D. Metaxas, Learning with dynamic group sparsity, in: IEEE 12th International Conference on Computer Vision, 2009, pp. 64–71.
[5] Y. Eldar, M. Mishali, Robust recovery of signals from a structured union of subspaces, IEEE Trans. Inf. Theory 55 (11) (2009) 5302–5316.
[6] Y. Eldar, P. Kuppinger, H. Bolcskei, Block-sparse signals: uncertainty relations and efficient recovery, IEEE Trans. Signal Process. 58 (6) (2010) 3042–3054.
[7] J. Lin, S. Li, Block sparse recovery via mixed $l_2/l_1$ minimization, Acta Mathematica Sinica 29 (7) (2013) 1401–1412.
[8] Y. Li, W. Chen, The high order block RIP condition for signal recovery, Journal of Computational Mathematics 37 (1) (2019) 61–75.
[9] J. Huang, J. Wang, W. Wang, et al., Sharp sufficient condition of block signal recovery via $l_2/l_1$-minimisation, IET Signal Processing 13 (5) (2019) 495–505.
[10] R. Chartrand, V. Staneva, Restricted isometry properties and nonconvex compressive sensing, Inverse Probl. 24 (2008) 1–14.
[11] Y. Shen, S. Li, Restricted $p$-isometry property and its application for nonconvex compressive sensing, Adv. Comput. Math. 37 (2012) 441–452.
[12] M. Lai, Y. Xu, W. Yin, Improved iteratively reweighted least squares for unconstrained smoothed $l_q$ minimization, SIAM J. Numer. Anal. 51 (2) (2013) 927–957.
[13] A. Majumdar, R. Ward, Compressed sensing of color images, Signal Process. 90 (2010) 3122–3127.
[14] Y. Wang, J. Wang, Z. Xu, On recovery of block-sparse signals via mixed $l_2/l_q$ ($0 < q \le 1$) norm minimization, EURASIP J. Adv. Signal Process. 76 (2013) 1–17.
[15] Y. Wang, J. Wang, Z. Xu, Restricted $p$-isometry properties of nonconvex block-sparse compressed sensing, Signal Process. 104 (2014) 188–196.
[16] J. Wen, Z. Zhou, Z. Liu, et al., Sharp sufficient conditions for stable recovery of block sparse signals by block orthogonal matching pursuit, Appl. Comput. Harmon. Anal. 47 (3) (2019) 948–974.
[17] Y. Gao, J. Peng, S. Yue, Stability and robustness of the $l_2/l_q$-minimization for block sparse recovery, Signal Process. 137 (2017) 287–297.
[18] J. Wang, J. Huang, F. Zhang, et al., Group sparse recovery in impulsive noise via alternating direction method of multipliers, Appl. Comput. Harmon. Anal. (2019).
[19] H. Ge, W. Chen, Recovery of signals by a weighted $l_2/l_1$ minimization under arbitrary prior support information, Signal Process. 148 (2018) 288–302.
[20] H. Li, J. Wen, A new analysis for support recovery with block orthogonal matching pursuit, IEEE Signal Process. Lett. 26 (2019) 247–251.
[21] W. Wang, J. Wang, Z. Zhang, Block-sparse signal recovery via $l_2/l_{1-2}$ minimisation method, IET Signal Processing 12 (4) (2017) 422–430.