AutoBCS: Block-based Image Compressive Sensing with Data-driven Acquisition and Non-iterative Reconstruction
Hongping Gan, Yang Gao, Chunyi Liu, Haiwei Chen, and Feng Liu
Abstract—Block compressive sensing is a well-known signal acquisition and reconstruction paradigm with widespread application prospects in science, engineering, and cybernetic systems. However, state-of-the-art block-based image compressive sensing (BCS) generally suffers from two issues. First, the sparsifying domain and the sensing matrices widely used for image acquisition are not data-driven, and thus ignore both the features of the image and the relationships among sub-block images. Second, image reconstruction requires solving a high-dimensional optimization problem with extensive computational complexity. In this paper, we provide a deep learning strategy for BCS, called AutoBCS, which incorporates the prior knowledge of images into the acquisition step and establishes a subsequent reconstruction model to achieve fast image reconstruction with low computational cost. More precisely, we present a learning-based sensing matrix (LSM) obtained from training data to accomplish image acquisition, thereby capturing and preserving more image characteristics. In particular, the generated LSM is proved to satisfy the theoretical requirements, such as the so-called restricted isometry property. Additionally, we build a non-iterative reconstruction network, which provides an end-to-end BCS reconstruction to eliminate blocking artifacts and maximize image reconstruction accuracy, in our AutoBCS architecture. Furthermore, we conduct comprehensive comparison studies with both traditional BCS approaches and newly developed deep learning methods. Compared with these approaches, our AutoBCS framework not only provides superior performance in both image quality metrics (SSIM and PSNR) and visual perception, but also offers faster reconstruction speed.
Index Terms—Deep Learning, Image Compressive Sensing, Block Diagonal Matrix, Data-driven Acquisition, Fast Image Reconstruction
I. INTRODUCTION

Compressive sensing (CS) has renewed explosive interest in essential sampling techniques, with the goal being to acquire and reconstruct signals at sub-Nyquist sampling rates. It automatically enables hardware data compression during the sampling of the signals of interest x ∈ R^n, which offers significant potential for lifting the energy efficiency of sensors in modern signal processing applications. Under the guidance of a series of landmark works by Candès [1] and Donoho [2], a large number of CS-based applications have emerged in recent years, such as the single-pixel camera [3], data security [4, 5], multilayer networks [6], and transcriptomic profiling [7].

Mathematically, CS states that x can be exactly reconstructed from a set of non-adaptive measurements formed by y = B·x, with B of size m × n (m ≪ n), assuming that x has the sparsity property and the sensing matrix (sampling patterns) B meets certain structural conditions. Note that the number of measurements, m, is on the order of the information-theoretic dimension of x instead of its linear-algebraic ambient dimension [8]. Obviously, the sparsity of x, the sensing matrix B, and nonlinear reconstruction are the three ingredients for perfect reconstruction in CS theory.

This work was supported in part by the National Natural Science Foundation of China under Grant 61372069, in part by the Chongqing Municipal Education Commission under Grant KJ1501105, and in part by the 111 Project of China under Grant B08038. Hongping Gan is with the School of Software, Northwestern Polytechnical University, Xi'an 710072, China (e-mail: [email protected]). Yang Gao, Chunyi Liu, Haiwei Chen and Feng Liu are with the School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4072, Australia (e-mail: {yang.gao; chunyi.liu; haiwei.chen; feng}@uq.edu.au).
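As a toy numerical illustration of the measurement model y = Bx above (a minimal numpy sketch with arbitrary toy dimensions, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 256, 64, 8            # ambient dimension, measurements (m << n), sparsity

# An s-sparse signal: only s of its n entries are nonzero
x = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x[support] = rng.standard_normal(s)

# A random Gaussian sensing matrix B of size m x n
B = rng.standard_normal((m, n)) / np.sqrt(m)

# Non-adaptive compressive measurements: m numbers describe an n-dim signal
y = B @ x
```

The point of CS theory is that, under the sparsity and sensing-matrix conditions discussed next, the original x can be recovered from the shorter vector y.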
Sparsity property.
Signals of interest can always be well-represented as a linear combination of just a few elements from a certain dictionary or basis. Let us focus on sparse signals first. A signal x is regarded as s-sparse when ‖x‖₀ ≤ s. Let

Σ_s = {x : ‖x‖₀ ≤ s} (1)

be the set of all s-sparse signals. In general, x is not itself sparse, but it admits a well-approximated sparse representation in a sparsifying basis Ψ ∈ R^{n×n}; if so, we can write x as x = Ψ·v with v being s-sparse. Originally, typical choices of transform basis for image sparse regularization included the discrete cosine transform [9], the wavelet domain [10], etc. However, these conventional sparse regularizations cannot seize the higher-order dependencies of the coefficient vector v. To capture such dependence, various specialized and sophisticated regularizations have been exploited for CS of images, most remarkably group/structured sparsity [11, 12], Bayesian/model-based sparsity [13], low-rank regularization [14], and non-local sparsity [15]. These sparse regularizations develop interpretable sparsity priors of x, so the corresponding reconstruction approaches often provide more accurate and efficient reconstruction. Unfortunately, for optimal CS recovery, the aforementioned sparsity regularizations are either not learned from data or simply predefined in CS-based image applications.

Sparsity property.

Sensing matrix.
The sensing matrix
B should satisfy certain structural properties in order to seize and preserve the salient information of x during the linear dimensionality reduction R^n → R^m. An elegant property of B that guarantees such signal acquisition, called the restricted isometry property (RIP) [16], holds if B acts on Σ_s as an approximate isometry. More formally, a matrix B satisfies the (s, δ_s)-RIP if, ∀x ∈ Σ_s,

(1 − δ) ≤ ‖Bx‖₂² / ‖x‖₂² ≤ (1 + δ), (2)

where the smallest constant δ_s ≤ δ obeying Eq. (2) is referred to as the restricted isometry constant. Other famous properties for selecting B include the coherence [17], the spark [8], and the null space property [18]. Based on these guiding properties, numerous insightful sensing matrices have by now been introduced for CS applications [19–21]. In particular, it is widely believed that random matrices (e.g., the Gaussian sensing matrix) allow us to perfectly reconstruct x from only O(s log(n/s)) measurements in polynomial time through different reconstruction algorithms [22]. However, the sensing matrices used in practical CS applications are often signal-independent, which discards some prior information of the signal.

Nonlinear reconstruction.
With the obtained measurements y and the transform sparsity of x, it is a well-established fact that we can reconstruct the original signal x via the following optimization problem:

x̃ = arg min_x ‖Bx − y‖₂² + λ‖Ψx‖₁, (3)

where λ denotes the regularization parameter. We can refer to this procedure as a nonlinear reconstruction mapping x̃ = f(y). An intuitive way to solve Eq. (3) is to utilize a convex programming method. However, such a method often suffers from high computational complexity when handling large signals, such as images. To avoid this, a variety of lower-cost iterative approaches have been developed, including the iterative shrinkage-thresholding algorithm (ISTA) [23], the alternating direction method of multipliers (ADMM) [24], and approximate message passing [25], to name just a few. Although these algorithms enjoy convergence guarantees and theoretical analysis, the involved parameters (e.g., penalty parameters and step sizes) are always hand-crafted in the solvers. Moreover, they usually need hundreds of iterations to converge to a (sub-)optimal solution, so the whole process is time-consuming and may take a few minutes.

In summary, the aforementioned three issues are the core issues of CS. From a theoretical perspective, general-purpose CS is always effective. However, practitioners in most engineering applications are often faced with industrial bottlenecks, such as big data, high memory requirements, and slow reconstruction speed. Many efficient strategies have been exploited to promote CS from theory to industry, including block CS [26], infinite-dimensional CS [27], and quantized CS [28], etc.

A. Motivation
Over the past decade, block CS has drawn particular interest, as it shows widespread application prospects in science and engineering, such as sparse subspace clustering [29], spectrum sensing [30], and multiple measurement vectors [31]. In particular, such a block-based CS framework has been successfully adapted to image acquisition systems to relieve the burdens on sensor memory and energy consumption as well as on reconstruction algorithms [26]. Consequently, block-based image compressive sensing (BCS) is extraordinarily desirable for low-power imaging devices (e.g., wireless visual sensor networks) because of their limited computation capabilities. Compared with traditional general-purpose CS of images, BCS usually benefits from the following aspects. First, block-based measurement is naturally suitable for real-time, energy-limited imaging applications since it avoids dealing with high-dimensional data. Second, we can accelerate image reconstruction since each sub-block image is handled independently. Third, the sensing operator only takes a small amount of memory and corresponds to feasible hardware architectures owing to its structured construction. The initial BCS framework introduced by Gan et al. [26] contains two separate phases: sampling and reconstruction. In the sampling phase, the original signal/image of interest is divided into non-overlapping sub-block images, and each sub-block is independently measured via the same sensing operator. In other words, BCS accomplishes image acquisition in a block-by-block style. We can consider the equivalent full sensing matrix in BCS as a block diagonal matrix whose sub-blocks are identical copies of the shared sensing operator. In the reconstruction phase, linear estimation coupled with two-stage iterative hard thresholding is devoted to reconstructing the original image.
Following this baseline framework, some sophisticated reconstruction algorithms have been proposed to further improve the performance of BCS by exploiting either extra optimization criteria [32, 33] or image priors [34, 35], such as smoothed projected Landweber reconstruction (SPL) [32, 36], collaborative sparsity (RCoS) [34], and group-based sparse representation (GbSR) [35].

Although the aforementioned algorithms have promising performance and benefit from theoretical analysis, they still encounter the challenge of adjusting parameters, because they are essentially iterative algorithms solving Eq. (3). Moreover, the transform domain and sensing operator in BCS are handcrafted (not data-driven) and therefore do not adequately make use of the prior knowledge of images, as described before. To overcome these problems, we are naturally motivated to seek a data-driven BCS strategy, that is, learning BCS from a set of training data. To this end, one should focus on the following questions:

• Can we design learning-based image acquisition without handcrafting both the sparsifying domain and the sensing operator?
• With data-dependent acquisition, is there any possibility that we can customize non-iterative reconstruction algorithms for BCS?
• How does the performance of data-driven BCS compare to that of state-of-the-art algorithms?

These questions naturally guide us to incorporate the deep learning (DL) concept into BCS. Roughly speaking, DL is a framework for automating feature learning and extraction, which has been broadly used in computer vision tasks. Inspired by this technique, different attempts have been made to adopt the DL concept for CS imaging. For one thing, improvements
Fig. 1: Schematic representation of our proposed AutoBCS architecture. AutoBCS replaces the traditional BCS approach with a unified image acquisition and reconstruction framework.

to iterative algorithms using the DL technique, such as ADMM-CSNet [37] and ISTA-Net [38], have been introduced for image reconstruction; however, these frameworks still employ the traditional sensing operator, and thus data acquisition is signal-independent. For another, block-based reconstruction networks for BCS, such as ReconNet [39] and DR²-Net [40], have been developed; nevertheless, these networks only utilize intra-block information to recover a sub-block, thereby yielding heavy blocking artifacts and thus requiring a post-processing algorithm. And thirdly, pure DL-based BCS pipelines [41–43] have been proposed; unfortunately, existing frameworks usually use undesirable fully connected networks, and they are not fully supported by theoretical analysis.

B. Main contributions
To address the previously described problems, motivated by the perceptual learning archetype, we propose a pure DL-based framework for BCS, from data acquisition to reconstruction, in this paper. Figure 1 shows the schematic representation of the proposed AutoBCS architecture, which replaces the traditional BCS approach with a unified image acquisition and reconstruction framework. The main contributions are summarized as follows:

• We build a bridge between traditional non-learning strategies and the prior knowledge contained in training sub-block image sets, and develop a learning-based sensing matrix (LSM) for image acquisition, thereby avoiding handcrafting the sparsifying domain and the sensing operator.
• The generated LSM is proved to satisfy theoretical guarantees, such as the RIP, and can thus be applied to the traditional BCS framework while significantly improving sampling efficiency.
• We develop a non-iterative image reconstruction strategy, mainly relying on our customized octave reconstruction, which establishes an end-to-end BCS reconstruction to eliminate blocking artifacts and maximize image reconstruction accuracy in the proposed AutoBCS architecture.

Experimental results on several public testing databases all demonstrate that, compared with other state-of-the-art approaches, our AutoBCS framework not only provides superior performance in both image quality metrics (PSNR and SSIM) and visual perception, but also offers fast reconstruction speed.

The rest of the paper is organized as follows. In Section II, we briefly review related works to highlight our motivations. Section III describes our AutoBCS framework in detail. Extensive experiments are presented in Section IV to demonstrate the superiority and effectiveness of the proposed framework. Section V provides the relevant discussion, and we conclude this paper in Section VI.

II.
RELATED WORK

In this section, we review related works on both traditional block-based and DL-based image compressive sensing from a general perspective to highlight our motivations and contributions, and we also introduce the octave convolution.
A. Traditional block-based image compressive sensing
Considering a full-size image x of n = H × W pixels, BCS divides x into non-overlapping sub-block images of size A × A and measures each sub-block using the same sensing operator. Let x_i denote the vectorized form of the i-th sub-block in raster-scan order. Using an m_a × A² sensing matrix B_A, we obtain the corresponding measurements y_i via

y_i = B_A x_i. (4)

This is equivalent to applying general-purpose CS to the whole image x using a block diagonal sensing operator B defined as:
B = diag(B_A, B_A, …, B_A), (5)

i.e., a block diagonal matrix with identical copies of B_A along the diagonal. Note that m_a = ⌊mA²/n⌋ and the sampling rate is τ = m/n, where m is the total number of measurements in y obtained through B.

As an inspiring work, Gan et al. [26] used linear MMSE estimation as the criterion to obtain an initial solution, and then employed two-stage iterative hard thresholding to further improve the quality of the initial image. Following this baseline framework, Mun et al. [32] developed a better image reconstruction strategy for BCS, dubbed D-SPL, which combines SPL with directional transforms to simultaneously promote smoothness and sparsity. It suggests that the dual-tree discrete wavelet transform (DWT) and contourlets can provide better reconstruction performance at low sampling rates. An improvement to this method, which utilizes multiple scales and sub-bands of the DWT, was proposed by Fowler et al. in [33]. After that, they used multi-hypothesis prediction to develop a different BCS strategy, dubbed MH-SPL, for images and videos [36]. For still images, MH-SPL first calculates a multiple prediction of a sub-block image from spatially surrounding sub-blocks as an initial reconstruction, and then obtains the final prediction of the sub-block through an optimal linear combination. For video, MH-SPL obtains the corresponding multi-hypothesis prediction of a frame from previously recovered adjacent frames. Comprehensive reviews of this kind of BCS method and its applications can be found in [44]. Moreover, block/patch-based methods that enforce nonlocal self-similarity and local sparsity, such as RCoS [34] and GbSR [35], were developed for natural images.
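The equivalence between block-by-block sampling (Eq. (4)) and the block diagonal operator of Eq. (5) can be checked numerically. The following numpy sketch uses toy sizes (A = 4, three blocks) rather than the paper's A = 32:

```python
import numpy as np

rng = np.random.default_rng(1)
A, m_a, num_blocks = 4, 6, 3              # toy sizes; the paper uses A = 32
n = num_blocks * A * A

B_A = rng.standard_normal((m_a, A * A))   # shared per-block sensing matrix
x = rng.standard_normal(n)                # vectorized image, blocks stacked

# Block-by-block sampling: y_i = B_A x_i for each sub-block (Eq. (4))
blocks = x.reshape(num_blocks, A * A)
y_blockwise = np.concatenate([B_A @ xi for xi in blocks])

# Equivalent full operator: block diagonal B with identical copies of B_A (Eq. (5))
B = np.kron(np.eye(num_blocks), B_A)
y_full = B @ x

assert np.allclose(y_blockwise, y_full)
```

The full operator B has num_blocks · m_a rows in total, which matches the sampling rate τ = m/n stated above while only B_A ever needs to be stored.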
These aforementioned approaches can usually obtain a higher-quality image, but at a low reconstruction speed.

In a nutshell, traditional BCS strategies generally enhance performance mainly by designing the sparsifying domain or exploiting the prior knowledge of the image. Different from them, our method is a pure data-driven BCS framework from image acquisition to reconstruction. Compared with the traditional BCS paradigm, our data-driven method automatically learns the features of each sub-block image and the relationships among sub-blocks, and develops a learning-based sensing matrix from training data to accomplish image acquisition. In addition, the proposed AutoBCS provides better image recovery accuracy with an extremely fast recovery speed.
B. Deep learning-based image compressive sensing
To substitute for the traditional iterative reconstruction approaches, several attempts have been devoted to adopting the DL technique for CS imaging. Roughly speaking, we can divide these DL-based BCS frameworks into three categories. First of all, interpretable optimization-inspired image reconstruction strategies have been developed by casting iterative optimization algorithms into DL form, in which the involved parameters are not hand-crafted but gradually learned end-to-end. For example, Yang et al. [37] introduced an effective iterative deep architecture, called ADMM-CSNet, which is inspired by the iterative ADMM solver for Eq. (3). Similarly, based on the ISTA solver, Zhang et al. [38] developed ISTA-Net to solve the proximal mapping related to the sparsity-inducing regularizer, which indeed improves recovery performance. Nevertheless, these frameworks mainly focus on learning the sparsifying domain and the reconstruction strategy without considering the sensing operator, and thus data acquisition is not data-dependent, which may weaken CS performance.

In the second place, learned block-based image reconstruction networks have been designed for CS imaging. At the beginning, Mousavi et al. [45] applied a stacked denoising autoencoder non-iterative network (SDA-Net) to reconstruct the image from its measurements. Following this work, Kulkarni et al. [39] and Yao et al. [40] respectively developed two non-iterative reconstruction frameworks, known as ReconNet and DR²-Net, by borrowing insights from traditional BCS reconstruction methods; both achieve competitive reconstruction performance with significantly short recovery times. However, such non-iterative reconstruction networks usually have two deficiencies. For one thing, these networks only employ intra-block information to recover a sub-block image, thereby yielding blocking artifacts and thus requiring an additional de-blocking algorithm with high computational complexity.
For another, the involved sensing matrix for image acquisition is hand-crafted as well, i.e., these networks still do not learn the sampling patterns for data acquisition.

To address these problems, finally, pure DL-based BCS pipelines [41–43] have been proposed for CS imaging, which train a non-iterative reconstruction network jointly with learning the sampling patterns. For example, Wu et al. [42] introduced a BCS framework, called CS-net, for jointly optimizing the sampling patterns and the reconstruction strategy via a convolutional neural network, which was further extended in [43]. In their research, CS-net provides substantial improvements in reconstruction accuracy over other algorithms, with a fast running speed. However, the existing pure DL-based BCS frameworks still suffer from many disadvantages, such as limited theoretical or comprehensive analysis and the use of undesirable fully connected or repetitive networks, which may hinder their practical applications.

Fundamentally, our proposed AutoBCS is a pure DL-based BCS framework, which utilizes networks to learn the mapping between a set of measurements obtained by the LSM and the high-quality image. Although our proposed AutoBCS and other pure DL-based frameworks have similar inspirations, they differ for the following reasons. On the one hand, our framework not only learns the prior information of images but also develops an LSM for image acquisition, and the generated LSM is proved to satisfy the theoretical guarantees on sampling patterns. On the other hand, existing reconstruction networks generally use either fully connected or repetitive convolutional layers to accomplish image reconstruction, while our customized non-iterative reconstruction module in AutoBCS goes beyond that to consistently boost image accuracy while reducing computational complexity and memory cost.

C. Octave convolution
In convolutional neural networks, we can consider the output feature maps of a convolution layer as a mixture of information at multiple spatial frequencies. Octave convolution, introduced by Chen et al. [46], is a novel frequency decomposition of the convolution operation, which stores and processes
Fig. 2: The deep neural network architecture of AutoBCS contains two parts: a data-driven image acquisition module and a non-iterative data reconstruction module. Note that we use different colors in the corresponding block processing to distinguish different sub-block images.

the mixed feature maps while decreasing spatial redundancy. As an alternative to vanilla convolution, it is a plug-and-play convolutional operator that can effectively reduce the resolution of low-frequency maps and enlarge the receptive field, thus saving both computation and storage. Achieving significant performance gains, octave convolution is perfectly suitable for a variety of backbone deep convolutional networks for image and video processing tasks, such as Res2Net [47] and stabilizing GANs [48], without any adjustment of the backbone network architecture.

For the sake of augmenting detail recovery and maximizing image reconstruction accuracy, we specially modify the original octave convolution and customize an octave transposed convolution to make the octave idea more suitable for our non-iterative reconstruction network in this work. The details of the modified octave convolution and the octave transposed convolution are introduced in Appendix A, Part I and Part II, respectively. Compared with general DL-based approaches, our customized octave reconstruction strategy contains no fully connected or repetitive network, nor does it require de-blocking post-processing.

III. PROPOSED METHOD

In this section, we develop a pure DL-based BCS framework with data-driven image acquisition and non-iterative data reconstruction to answer the questions raised in Section I, that is: how to efficiently measure an image without handcrafting both the sparsifying domain and the sensing operator, for better sampling; and how to use a non-iterative reconstruction network to reconstruct the image, for better recovery accuracy.
A. Framework of AutoBCS
The proposed AutoBCS is a pure DL framework incorporating data-driven image acquisition and reconstruction modules, as depicted in Fig. 2. In the data acquisition module, AutoBCS automatically captures the features of each sub-block image and the relationships among sub-block images, and correspondingly develops a learning-based sensing matrix (LSM) from training data. In the image reconstruction module, AutoBCS learns a reconstruction mapping between a set of measurements obtained by the LSM and high-quality images. As the mapping is learned, a low-dimensional, mutual joint manifold of these two types of data is implicitly learned, seizing an extremely expressive representation that maximizes image reconstruction accuracy. To be specific, the non-iterative reconstruction module used in AutoBCS includes two phases: an initial reconstruction sub-network and an octave reconstruction sub-network. The former aims to obtain an initial recovered image with the global structure, while the latter focuses on augmenting fine details and finally outputs a high-quality reconstructed image.

For training, the image acquisition module and the reconstruction module in AutoBCS are jointly optimized, forming an end-to-end network in order to maximize both sampling efficiency and recovery performance. For application implementation, the trained LSM is utilized to yield a set of measurements for each sub-block, and equivalently forms a block-diagonal matrix with constrained structure encoding the full-size image to accomplish image acquisition. Moreover, the trained non-iterative reconstruction module serves as a decoder to accomplish image reconstruction.
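The shape flow through the two modules can be sketched in numpy as follows. This is only a structural sketch under stated assumptions: the learned LSM and the trained decoder network are replaced by a random matrix and its pseudo-inverse, purely to show the interfaces, and the sizes are toy values rather than the paper's A = 32:

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 8; A = 4; tau = 0.25          # toy sizes; the paper uses A = 32
m_a = int(tau * A * A)                # measurements per A x A sub-block

img = rng.standard_normal((H, W))

# Acquisition: split into non-overlapping A x A sub-blocks and measure each
blocks = img.reshape(H // A, A, W // A, A).swapaxes(1, 2).reshape(-1, A * A)
P_A = rng.standard_normal((m_a, A * A))   # stand-in for the learned LSM
measurements = blocks @ P_A.T             # one m_a-vector per sub-block

# Reconstruction: a decoder maps measurements back to blocks, then tiles them
decoder = np.linalg.pinv(P_A)             # stand-in for the trained network
rec_blocks = measurements @ decoder.T
rec = rec_blocks.reshape(H // A, W // A, A, A).swapaxes(1, 2).reshape(H, W)
```

In AutoBCS, the random P_A is replaced by the jointly trained LSM, and the pseudo-inverse is replaced by the initial and octave reconstruction sub-networks described below in this section.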
B. Data-driven image acquisition module
As described in Eq. (4), the traditional BCS employs B_A to obtain a set of measurements y_i from each sub-block image x_i. In our AutoBCS, a convolutional layer with kernel size A × A, stride A, and zero bias is used to implement the linear, non-overlapping block-based sampling stage, since each row of B_A can be considered as a filter.¹ More precisely, let N_p denote the total number of sub-block images; the size of each filter used in the sampling layer is the same as that of a sub-block image, i.e., A × A. At a sampling rate τ = m/n, the sensing matrix B_A has m_a = ⌊mA²/n⌋ rows for image acquisition. Consequently, the convolution layer has m_a filters of size A × A to obtain m_a measurements. For example, if τ = 0.01, there are 10 filters of size 32 × 32. Let F_A represent the m_a filters of the sampling convolution layer. Mathematically, the non-overlapping block-based image acquisition can be formulated as a convolution operation:

y_i = conv(F_A, x_i) = F_A ∗ x_i, (6)

which corresponds to the sampling procedure of traditional BCS described by Eq. (4). When a sub-block image x_i is input to the sampling layer, the output y_i is a vector of size m_a, which can be considered as the corresponding measurements of x_i. In the AutoBCS architecture, the sampling layer automatically learns the sampling patterns from the training data, i.e., the weights of F_A are gradually optimized for better data acquisition. Once training is completed, we obtain the corresponding LSM, denoted by P_A. The LSM naturally captures the features of each sub-block image and the relationships among sub-blocks, and thus guarantees that y_i inherits more of the structural characteristics of x_i.

In order to analyse the properties of P_A, we trained four AutoBCS networks at four sampling rates, i.e., τ ∈ {0.2, 0.1, 0.05, 0.01}. The training details will be introduced in Section V. Fig. 3 plots the element distribution of the trained P_A for visualization, where the red line corresponds to the standard Gaussian distribution.

Fig. 3: Histograms of the element distribution of P_A in the cases: (a) τ = 0.2; (b) τ = 0.1; (c) τ = 0.05; (d) τ = 0.01.

From these histograms, it can be distinctly observed that the elements of P_A, P_{k,j} (1 ≤ k ≤ m_a, 1 ≤ j ≤ A²), indeed obey a Gaussian-like distribution, especially for AutoBCS with higher τ.

¹As done in other BCS frameworks, we employ blocks of size A = 32 in our work.

Although the elements of the LSM and a Gaussian matrix follow similar distributions, they are totally diverse in essence because of the following two aspects. First, the Gaussian matrix in the traditional framework is usually handcrafted without considering the prior information of the data, while our LSM is fundamentally learned according to this prior knowledge rather than manually set. Second, the statistical properties of the elements of a Gaussian matrix, such as the mean and variance, are fixed and signal-independent, while these characteristics of the elements of the LSM are gradually optimized from the training data to improve sampling efficiency. Roughly speaking, we can consider that
P_A is automatically learned from the training data, and its elements follow a Gaussian-like distribution.

As described before, random matrices based on Gaussian, Bernoulli, or more generally Gaussian-like distributions [22] have been shown to perfectly promote the theory of CS to satisfy real-world data acquisition demands. As a consequence, we can leverage the well-known theorems on random matrices to obtain many interesting conclusions for our trained LSM, which comes with theoretical guarantees. For brevity, we present the details in Appendix B. From another viewpoint, for the full-size image x, the equivalent sensing operator P can be considered to follow the form of Eq. (5) with constant block diagonal element P_A. As might be expected, the equivalent block sensing matrix P follows the RIP- or coherence-based performance guarantees as well. More theoretical details on block diagonal matrices can be found in [49, 50] and the references therein.

In short, the trained LSM implicitly learns the structural characteristics of images and meets asymptotically optimal theoretical guarantees. As a result, the proposed AutoBCS is data-driven, which automatically benefits the image acquisition.
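To make the convolutional view of the sampling stage concrete, the sketch below verifies on toy sizes that a stride-A convolution over one sub-block produces exactly the matrix product y_i = P_A x_i of Eq. (4), with each row of P_A being one flattened filter. The filters here are random stand-ins for the trained LSM:

```python
import numpy as np

rng = np.random.default_rng(3)
A, m_a = 4, 3                                  # toy sizes; the paper uses A = 32

F_A = rng.standard_normal((m_a, A, A))         # m_a filters of size A x A
x_block = rng.standard_normal((A, A))          # one non-overlapping sub-block

# A stride-A convolution evaluated on a single block reduces to an
# elementwise multiply-and-sum with each filter:
y_conv = np.array([(f * x_block).sum() for f in F_A])

# The same m_a numbers come from the matrix form y_i = P_A x_i,
# where each row of P_A is one flattened filter:
P_A = F_A.reshape(m_a, A * A)
y_mat = P_A @ x_block.reshape(-1)

assert np.allclose(y_conv, y_mat)
```

This equivalence is why training the sampling convolution's weights is the same as learning the sensing matrix P_A.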
C. Non-iterative data reconstruction module
We implement our non-iterative image reconstruction module with a deep neural network architecture consisting of an initial reconstruction sub-network followed by an octave reconstruction sub-network.
1) Initial reconstruction sub-network:
Using linear MMSE estimation as the optimization criterion, the traditional BCS frameworks [26] utilize an A² × m_a reconstruction matrix

B̂_A = ρ_{x_i,x_i} B_Aᵀ (B_A ρ_{x_i,x_i} B_Aᵀ)⁻¹

to obtain the initial reconstructed image, i.e., x̂_i = B̂_A y_i, where x̂_i is the reconstructed vector of the i-th sub-block image and ρ_{x_i,x_i} denotes the autocorrelation function of the input data. As a replacement for the reconstruction matrix, a convolutional layer with A² = 1024 kernels of size 1 × 1 is adopted in our AutoBCS:

x̂_i = conv(F_int, y_i) = F_int ∗ y_i, (7)

where x̂_i denotes the initial reconstruction vector of x_i and F_int represents the A² filters of size 1 × 1, so that x̂_i is a 1 × A² vector. The traditional BCS strategies always resize and concatenate the reconstruction vectors obtained by the reconstruction matrix to form the initial reconstruction of x. Following this baseline, we first resize each 1 × A² reconstruction vector x̂_i into an A × A sub-block and then concatenate all sub-blocks to obtain an initial reconstructed image x̂ in our AutoBCS. Let γ(·) and ζ(·) denote the reshape operator and the corresponding concatenation, respectively. We can formulate the following model:

x̂ = ζ( [ γ(x̂_{11}) γ(x̂_{12}) ⋯ γ(x̂_{1c}); γ(x̂_{21}) γ(x̂_{22}) ⋯ γ(x̂_{2c}); ⋮ ⋱ ⋮; γ(x̂_{r1}) γ(x̂_{r2}) ⋯ γ(x̂_{rc}) ] ), (8)

where r and c jointly denote the position of a sub-block image in the original full-size image.

In fact, exploiting only the initial reconstruction sub-network is not sufficient, because the sub-block image quality of the initial recovery is low, and the concatenation operator always yields blocking artifacts in the spatial domain. For the sake of exact reconstruction, we develop a customized octave reconstruction sub-network, which automatically draws on the information of both intra- and inter-sub-block images, to eliminate blocking artifacts and maximize image reconstruction accuracy.
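The resize-and-concatenate step of Eqs. (7) and (8) amounts to a per-block linear map followed by reshaping (γ) and tiling (ζ). A numpy sketch with toy sizes, where random weights stand in for the trained 1 × 1 layer F_int:

```python
import numpy as np

rng = np.random.default_rng(4)
A, m_a = 4, 3                                 # toy sizes; the paper uses A = 32
r, c = 2, 2                                   # grid of sub-blocks

# One m_a-dimensional measurement vector per sub-block
Y = rng.standard_normal((r * c, m_a))

# A 1x1 convolution with A*A kernels acts per block as an (A*A x m_a) matrix
F_int = rng.standard_normal((A * A, m_a))     # stand-in for the trained layer
X_hat = Y @ F_int.T                           # each row is a 1 x A^2 vector (Eq. (7))

# gamma: reshape each vector to an A x A block; zeta: tile the r x c grid (Eq. (8))
blocks = X_hat.reshape(r, c, A, A)
img = blocks.swapaxes(1, 2).reshape(r * A, c * A)
```

Because each block is recovered independently here, seams can appear at block boundaries, which is exactly the blocking-artifact problem the octave sub-network is designed to remove.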
2) Octave reconstruction sub-network:
As shown in the bottom part of Fig. 2, an octave-convolution modified U-net architecture is introduced to further improve the reconstruction quality of x. Similar to the traditional U-net [51], our proposed octave reconstruction sub-network consists of a contracting path and an expanding path as well. The goal of the former is to produce high-dimensional features from the local receptive field. Thus, this part can be considered the feature-extraction part, which is a set of octave-convolution and max-pooling operations (two convolutions followed by one max-pooling). For the contracting path, two separate operations F_e and F_d (denoted by the blue and red arrows in Fig. 2, respectively) are involved and can be summarized as:

F_e(χ̂) = ReLU(ModOctConv(χ̂, 3) + z),   (9)
F_d(χ̂) = MaxPool(χ̂, 2),   (10)

where χ̂ is the input feature maps, ModOctConv(·, 3) denotes the 3×3 modified octave convolution, z is the bias of this layer, ReLU(·) represents one of the most common activation functions (the rectified linear unit, max(0, x)), and MaxPool(·, 2) is the 2×2 max-pooling operation. For the expanding path, an up-sampling operation F_u, denoted by the purple arrow in Fig. 2, is included to oppose the effects of the pooling layers, and it can be expressed as:

F_u(χ̂) = ReLU(OctTransConv(χ̂, 2) + z),   (11)

where OctTransConv(·, 2) represents the 2×2 octave transposed convolution. In addition, feature maps from the contracting path are merged into the expanding path through a concatenation operation F_c, denoted by the black dashed arrow in Fig. 2:

F_c(χ̂₁, χ̂₂) = MatrixCat(χ̂₁, χ̂₂),   (12)

where χ̂₁ and χ̂₂ denote the corresponding feature maps of the layers of the contracting and expanding paths, respectively, and MatrixCat stacks the two feature maps, which share the same spatial resolution, along the channel dimension. Finally, the refined reconstruction x̃ can be obtained by:

x̃ = x̂ + OctReconNet(x̂),   (13)

where OctReconNet(x̂) denotes the output of the octave reconstruction sub-network.

D. Loss Function
The proposed AutoBCS is a pure DL-based BCS framework from image acquisition to reconstruction. Given an input image x, our aim is to estimate the parameters W of AutoBCS such that it can automatically and efficiently obtain samples y via the image acquisition module, and then rapidly and accurately reconstruct x from y through the data reconstruction module. To achieve this goal, we take the original image x as the ground truth and the corresponding reconstruction image x̃ generated by AutoBCS as the output. Specifically, we design two loss functions: one for training the whole AutoBCS, and the other for the initial reconstruction sub-network. For the whole AutoBCS, the loss function is

L(W) = (1/N) Σ_{j=1}^{N} ‖AutoBCS(x_j; W) − x_j‖²₂,   (14)

where N is the number of training images, x_j denotes the j-th original image, and AutoBCS(x_j; W) represents x̃_j. For the initial reconstruction sub-network, the loss function is

L_int(W_int) = (1/N) Σ_{j=1}^{N} ‖I(x_j; W_int) − x_j‖²₂,   (15)

where W_int and I(x_j; W_int) represent the parameters and the output of the initial reconstruction sub-network, respectively. Note that the image acquisition and reconstruction modules are jointly trained, which impels AutoBCS to overcome the two main challenges of CS theory.

IV. EXPERIMENTAL VALIDATION

The proposed AutoBCS will be verified with extensive experiments in this section. In addition, we will compare our AutoBCS with conventional BCS methods, including TVAL3 [52], D-SPL [32], MH-SPL [36], RCoS [34] and GBsR [35],

TABLE I: AutoBCS vs. different conventional BCS methods on three typical benchmark databases
Database  Sampling rate τ | TVAL3         | D-SPL         | MH-SPL        | RCoS          | GBsR          | AutoBCS
                          | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR
Set5      0.01            | 0.4553 15.51  | 0.1387  9.27  | 0.4352 18.01  | 0.4684 18.35  | 0.4901 18.78  |
          Avg.            | 0.7088 25.32  | 0.5960 22.58  | 0.7378 27.28  | 0.7634 28.04  | 0.7873 29.07  |
Set14     0.01            | 0.3989 15.26  | 0.0978  8.95  | 0.4211 17.22  | 0.4274 17.56  | 0.4330 17.81  |
          Avg.            | 0.6502 23.79  | 0.5360 21.30  | 0.6903 25.68  | 0.7064 26.18  | 0.7248 26.94  |
BSD100    0.01            | 0.3991 15.97  | 0.1065  9.65  | 0.4016 18.06  | 0.4258 18.31  | 0.4438 18.64  |
          Avg.            | 0.6227 23.41  | 0.5101 20.77  | 0.6394 24.57  | 0.6588 25.02  | 0.6788 25.55  |
Fig. 4: Sample images from the BSD500 database.

and the newly-developing DL approaches, including SDA-net [45], ReconNet [39], DR²-net [40], ISTA-net [38], and CS-net [43]. Similar to our AutoBCS, the block size for these state-of-the-art frameworks is set to A = 32 as well. The structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR, in dB) [53] are used to evaluate the reconstruction accuracy of network performance.
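For reference, PSNR as used in the tables below can be computed with the standard definition; a minimal sketch follows (the helper name is ours; SSIM is more involved and is usually taken from an existing image-quality library):

```python
import numpy as np

def psnr(ref, rec, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and its
    reconstruction; higher means a more accurate reconstruction."""
    mse = np.mean((np.asarray(ref, dtype=float) - np.asarray(rec, dtype=float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# A reconstruction off by one gray level everywhere has MSE = 1, i.e.,
# PSNR = 20 * log10(255), about 48.13 dB.
```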
A. Training Details

1) Implementation:
The proposed AutoBCS model is implemented with the PyTorch 1.0 framework on two Tesla V100 GPUs.
2) Experimental data:
Our training dataset is composed of 400 images drawn from the popular BSD500 database [54], namely its 200 training and 200 test images. Some sample images for training are illustrated in Fig. 4. To improve the network performance, data augmentation, including image flipping, rotation and their combinations [55], is also applied to cover more detailed cases of recovered images. In total, the training images are prepared as 89600 sub-images of size 96 × 96. The learning rate is decreased by powers of ten over three stages: one value for the first 50 epochs, a smaller one for epochs 51-80, and a smaller one still for the final 20 epochs. In addition, five well-known datasets, including Set5 [56], Set11 [39], Set14 [57], BSD68 and BSD100 [58], are utilized as test data to increase the diversity and generality of the benchmark. To facilitate visual perception, we only exploit the grayscale information of images for both training and testing, as in other BCS approaches.

B. Comparisons with traditional BCS methods
In this part, the superiority and effectiveness of the proposed AutoBCS are validated by comparison with five conventional BCS methods, namely TVAL3, D-SPL, MH-SPL, RCoS, and GBsR, on three typical benchmark databases: Set5, Set14 and BSD100. For these conventional algorithms, we use the default setups declared on the authors' websites, and run them on an Intel(R) Core(TM) i7-4770 CPU @ 3.40 GHz.

Fig. 5: Comparison of reconstruction time in logarithmic order for traditional BCS methods and AutoBCS on Set5 and BSD100. Please zoom in for better comparison.
Fig. 6: Illustration of reconstructed images by using AutoBCS at different sampling rates. The first column: original image; the reconstruction columns from left to right correspond to τ = 0.3, τ = 0.25, τ = 0.1, τ = 0.04, and τ = 0.01, respectively. Please zoom in for better comparison.

Compared with the conventional methods, AutoBCS delivers consistent SSIM gains at different sampling rates, where τ ∈ {0.01, 0.04, 0.1, 0.25, 0.3}, and the PSNR improvements are about 5.31 dB, 5.33 dB, 2.45 dB, 1.69 dB, and 1.82 dB, respectively. From these results, our proposed AutoBCS obviously has great benefits at all sampling rates. Even when the sampling rate is as low as 0.01, our method still provides superior performance in terms of both SSIM and PSNR. Moreover, we plot the comparison of recovery time on the Set5 and BSD100 databases in the case of τ = 0.1; please refer to Fig. 5, where the vertical axis is on a logarithmic scale of the real recovery time t_real. The real reconstruction time of AutoBCS is on the order of milliseconds. Fig. 6 illustrates AutoBCS reconstructions for τ from 0.01 to 0.3, which once again verifies that AutoBCS can retain rich semantic content even at a particularly low sampling rate. As a consequence, our proposed AutoBCS is extraordinarily desirable for low-power imaging devices. Finally, Fig. 7 depicts the comparison of the reconstructed images at various sampling rates. From these visual illustrations, we can evidently observe that AutoBCS is superior to the traditional BCS methods, especially in recovering detail and texture.

In summary, our proposed AutoBCS improves the performance in both reconstruction accuracy and recovery speed, and brings about better results than the conventional BCS methods in different cases on these public benchmark datasets.

TABLE II: Average PSNR (dB) comparison for AutoBCS and various DL-based BCS methods on Set11

Alg.       τ = 0.01   0.04    0.10    0.25    0.30  | Avg.
DR²-net      17.35    21.14   24.98   25.87   29.12 | 23.69
ISTA-net     17.30    21.23   25.80   31.53   32.91 | 25.75
CS-net       20.94    24.91   28.10   32.12   33.86 | 27.99
AutoBCS

C. Comparisons with DL-based BCS methods
In this part, several DL-based BCS methods, namely SDA-net, ReconNet, DR²-net, ISTA-net, and CS-net, are compared with our AutoBCS to further demonstrate the effectiveness of the proposed approach. Following the previous work [38], we select the Set11 and BSD68 datasets for testing, and summarize the PSNR comparisons for these approaches in Table II and Table III, respectively. For a fair comparison, the listed results are taken from the corresponding works or reproduced with the optimal settings provided by the corresponding papers. The denoising post-processing operator of SDA-net, ReconNet and DR²-net, i.e., the BM3D denoiser, is also employed.

TABLE III: Average PSNR (dB) comparison for AutoBCS and various DL-based BCS methods on BSD68

Alg.       τ = 0.01   0.04    0.10    0.25    0.30  | Avg.
DR²-net      21.87    24.34   27.77   29.27   29.96 | 26.64
ISTA-net     22.12    25.02   29.93   31.85   33.60 | 28.50
CS-net       24.03    27.10   31.45   32.53   34.89 | 30.00
AutoBCS

Fig. 7: Comparison of reconstructed images for traditional BCS methods and AutoBCS at various sampling rates. The first to third rows correspond to τ = 0.1, τ = 0.2, and τ = 0.3, respectively. Please zoom in for better comparison.
Fig. 8: Comparison of average reconstruction time (unit: ms) to reconstruct a 256 × 256 image for different DL-based BCS methods and AutoBCS on Set5 in the cases of τ = 0.01 and τ = 0.1.

We also compare the average reconstruction time of these approaches at τ = 0.01 and τ = 0.
1, which are shown in Fig. 8. We can evidently conclude that, compared with traditional BCS approaches, the DL-based BCS algorithms achieve better performance, which verifies the superiority of non-iterative reconstruction methods. Among the aforementioned DL-based approaches, our proposed AutoBCS obtains significantly better recovery accuracy with comparable reconstruction speed. In particular, several visual comparisons of reconstructed images are provided in Fig. 9, where SDA-net, ReconNet and DR²-net exhibit heavy blocking artifacts. On the contrary, our proposed AutoBCS smoothly reconstructs the image with more details and sharper edges. Overall, we can conclude from the results that AutoBCS significantly improves the recovery performance, in both quantitative validation and qualitative visualisation, over the other strategies compared on the aforementioned databases.

V. DISCUSSION

In this section, we will first examine the LSM from a comparative perspective to show whether it can indeed enhance sampling efficiency, and then discuss whether our proposed AutoBCS is robust to noise.

A. LSM versus traditional sensing matrices
As previously described, we develop the LSM for data-driven image acquisition in our proposed AutoBCS architecture. The generated LSM automatically captures the features of each block image and the relationship among sub-block images, and is proved to satisfy the theoretical guarantees. As a consequence, the LSM can significantly improve the sampling efficiency, and can be generally extended to the traditional BCS frameworks. To quantify the performance, we apply the LSM and four traditional sensing matrices (GSM [8], BSM [22], BcSM [28] and CbSM [26]) in the aforementioned BCS algorithms, while keeping the other parameters unchanged, in the case of τ = 0.1. Note that GSM and BSM are two typical random sensing matrices, namely the Gaussian and Bernoulli sensing matrices; BcSM is a widely used deterministic binary code-based sensing matrix [28]; and CbSM represents the Chebyshev chaotic binary matrix introduced in [26].
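The random baselines in this comparison are easy to instantiate; below is a minimal sketch of block-wise acquisition with Gaussian (GSM) and Bernoulli (BSM) sensing matrices. The sizes are illustrative, following the A = 32, τ = 0.1 setting, and the 1/sqrt(m_a) column scaling is a common normalization rather than the paper's exact choice:

```python
import numpy as np

A, tau = 32, 0.1
m_a = int(round(tau * A * A))              # measurements per sub-block
rng = np.random.default_rng(0)

# GSM: i.i.d. Gaussian entries; BSM: i.i.d. +/-1 (Bernoulli) entries.
gsm = rng.standard_normal((m_a, A * A)) / np.sqrt(m_a)
bsm = rng.choice([-1.0, 1.0], size=(m_a, A * A)) / np.sqrt(m_a)

x_block = rng.standard_normal(A * A)       # one vectorized A x A sub-block
y_gsm = gsm @ x_block                      # m_a compressive samples per block
y_bsm = bsm @ x_block
```

The learned LSM plays the same role as these fixed matrices, but its entries are trained jointly with the reconstruction network.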
Fig. 9: Comparison of image reconstructions for various DL-based BCS approaches and AutoBCS at sampling rate τ = 0.1; the panels correspond to SDA-net, ReconNet, DR²-net, ISTA-net, CS-net and AutoBCS, respectively. Please zoom in for better comparison.

TABLE IV: Comparison results for various sensing matrices with traditional BCS methods at τ = 0.1

Database  Sensing Matrix | TVAL3         | D-SPL         | MH-SPL        | RCoS          | GBsR          | Avg.
                         | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR   | SSIM   PSNR
Set5      GSM            | 0.7855 26.85  | 0.7641 24.66  | 0.8217 28.63  | 0.8522 29.56  | 0.8679 30.12  | 0.8183 27.96
          BSM            | 0.7786 26.76  | 0.7747 25.16  | 0.8202 28.56  | 0.8498 29.42  | 0.8624 29.86  | 0.8171 27.95
          BcSM           | 0.7858 26.97  | 0.7892 26.08  | 0.8238 28.81  | 0.8582 29.67  | 0.8701 30.34  | 0.8254 28.37
          CbSM           | 0.7781 26.73  | 0.7571 24.47  | 0.8176 28.40  | 0.8487 29.46  | 0.8608 29.73  | 0.8125 27.76
          LSM            |
Fig. 10: Comparison of image reconstruction using three sensing matrices combined with the traditional GBsR algorithm. First column: the original image; the reconstruction columns are based on GSM, BcSM and our trained LSM, respectively. Please zoom in for better comparison.

From the comparison results, we can distinctly observe that the traditional BCS methods equipped with our generated LSM greatly enhance the sampling efficiency for better reconstruction, improving the reconstruction accuracy by roughly 5.48% on SSIM and 10.39% on PSNR on average. To make the comparison more convincing, let us now validate the competing methods visually. We select "bird" and "butterfly" as the original images of interest. Following the previous setting, we show the comparison results in Fig. 10. From these visual

TABLE V: Average PSNR (dB) comparison of the competing approaches for various noise levels on dataset Set5
Noise        Alg.      Sampling rate τ
σ_n = 0.01   D-SPL      9.26   12.70   24.61   32.49   33.45
             MH-SPL    17.96   22.48   28.62   33.02   33.87
             GBsR      18.76   23.80   29.73   35.59   36.68
             AutoBCS
σ_n = 0.05   D-SPL      9.26   12.71   24.83   30.90   31.66
             MH-SPL    17.89   22.28   28.02   31.60   32.22
             GBsR      18.16   23.46   28.92   33.13   33.79
             AutoBCS
σ_n = 0.1

plots, we can distinctly observe that the reconstructed images appear much clearer and contain richer textures, with more edges and details. Overall, the LSM can indeed improve the recovery performance by a large margin compared with other classic sensing matrices.

B. AutoBCS for noisy reconstruction
This part aims to measure the reconstruction performance of AutoBCS in noisy settings. To this end, three conventional BCS methods (D-SPL, MH-SPL, and GBsR) are compared with AutoBCS at noise levels σ_n = 0.01, σ_n = 0.05, and σ_n = 0.1, respectively. It is worth noting that here we are still using the previously trained network instead of retraining a new AutoBCS under a noisy environment.

Fig. 11: Comparison of noisy image reconstruction for traditional BCS approaches (D-SPL and GBsR) and AutoBCS on Set5, together with the corresponding error maps.

The average PSNR results at various noise levels are shown in Table V. According to these results, we can clearly see that our proposed AutoBCS consistently outperforms the other competing approaches. For σ_n = 0.01 and σ_n = 0.05, AutoBCS enhances PSNR over D-SPL, MH-SPL and GBsR on average by approximately 9.49 dB, 4.96 dB and 3.87 dB, respectively. For σ_n = 0.1, our proposed framework surpasses the D-SPL, MH-SPL, and GBsR methods by 8.74 dB, 4.85 dB and 4.18 dB. In particular, we illustrate the visual comparisons in Fig. 11.
Fig. 12: The concept of the modified octave convolution.

VI. CONCLUSION

The proposed AutoBCS develops a learning-based sensing pattern to accomplish data acquisition without handcrafting the sparsifying domain and the related parameters, and customizes a subsequent inference model to accomplish fast image reconstruction with low computational cost. In a nutshell, through our framework the generated LSM is data-driven with theoretically guaranteed sampling efficiency, and the computational complexity of image reconstruction is significantly reduced. Our proposed AutoBCS is verified on several image databases, and the corresponding results demonstrate its effectiveness and superiority compared with traditional BCS approaches as well as other newly-developing DL-based methods.

In future work, we will further investigate why the sampling efficiency of the LSM can outperform that of random matrices, although we have shown that the LSM meets the theoretical requirements. Moreover, more robust modified DL strategies will be considered for noisy image reconstruction. Meanwhile, extending our proposed AutoBCS to other image inverse problems, such as inpainting and deconvolution, is another direction of our future work.

APPENDIX A

Part I: Modified octave convolution
Let X ∈ ℝ^{h×w×u} represent the input feature tensor of a convolution layer, where h, w, and u denote the height, width, and number of channels of the feature maps, respectively. Unlike traditional vanilla convolutions, the octave convolution explicitly factorizes the feature maps X along the channel dimension into {X^H, X^L}, where X^H ∈ ℝ^{h×w×(1−t)u} denotes the high-resolution features, X^L ∈ ℝ^{(h/2)×(w/2)×tu} represents the low-resolution features, and t ∈ [0, 1]. Typically, t = 0 for the input layer of the octave network, and t = 0.5 for the following layers. In our work, the original octave convolution is specially modified by replacing the nearest-neighbor interpolation with a transposed convolution. As a consequence, the corresponding outputs of the modified octave convolution, {Y^H, Y^L}, can be obtained as follows:

Y^H = conv(X^H, W^{H−H}) + convT(conv(X^L, W^{L−H}), 2),
Y^L = conv(X^L, W^{L−L}) + conv(pool(X^H, 2), W^{H−L}),

where convT(·, k) is the traditional transposed convolution operation with kernels of size k × k, pool(X, k) denotes the pooling operation with kernel size k × k, and W denotes the learnable weights of the different layers. For visual observation, we illustrate the concept of the modified octave convolution in Fig. 12.

Note that replacing the nearest-neighbor interpolation with a transposed convolution turns the up-sampling operation into a learnable block that can be optimized during the training procedure of our network. Such a modification further reduces redundancy along the spatial dimension, and obtains better multi-scale representation learning than vanilla convolutions.

Part II: Octave transposed convolution
An octave transposed convolution is designed for our customized octave reconstruction sub-network. Let {X^H, X^L} be the input feature maps, and let {Y^H, Y^L}, with Y^H ∈ ℝ^{2h×2w×(1−t)u} and Y^L ∈ ℝ^{h×w×tu}, represent the output features. Then we can formulate the octave transposed convolution as:

Y^H = convT(X^H, W^{H−H}, 2) + convT(X^L, W^{L−H}, 4),
Y^L = convT(X^L, W^{L−L}, 2) + conv(X^H, W^{H−L}),

where each W again denotes learnable weights. Such an octave transposed convolution allows the feature maps {X^H, X^L} to double their spatial resolution, and helps build the expanding part of the octave reconstruction sub-network.

APPENDIX B

To begin with, we present the RIP-based guarantee for the LSM P_A, following the works of [8, 22].

Theorem 1. An m_a × A LSM, trained from the AutoBCS network, satisfies the (s_a, δ_a)-RIP with the prescribed δ_a and any s_a ≤ c_1 A / log(m_a/s_a), with probability exceeding 1 − e^{−c_2 m_a}, where c_1, c_2 > 0 are two constants and s_a is the sparsity of an arbitrary signal in ℝ^A.

Theorem 1 implies that the LSM meets asymptotically optimal sampling performance. Moreover, this property straightforwardly allows us to pose other guarantees for the LSM via the Geršgorin circle theorem, such as the coherence and spark properties. For example, the coherence property can be bridged to the (s_a, δ_a)-RIP. Therefore, the following condition, which guarantees sampling efficiency, can be directly posed on our trained LSM.

Theorem 2.
For an m_a × A LSM satisfying the (s_a, δ_a)-RIP, the coherence µ(P_A) and the spark of the LSM satisfy µ(P_A) = δ_a/(s_a − 1) and spark(P_A) > s_a, respectively.

Very similar results on sampling efficiency hold for random matrices, such as Gaussian and Bernoulli matrices. For each s_a-sparse signal to be uniquely represented by its samples y_s, we can show that P_A meets the (2s_a, δ_{2s_a})-RIP with δ_{2s_a} ∈ (0, 1), which requires that any 2s_a columns of P_A be linearly independent, that is, spark(P_A) > 2s_a. From a theoretical aspect, the RIP of the LSM enables recovery guarantees that are more robust than those based on the coherence and spark properties. In addition, we can also draw similar conclusions by analyzing the characteristics of the elements of P_A.

REFERENCES

[1] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.
[2] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[3] M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, "Single-pixel imaging via compressive sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 83–91, 2008.
[4] L. Y. Zhang, K. Wong, Y. Zhang, and J. Zhou, "Bi-level protected compressive sampling," IEEE Transactions on Multimedia, vol. 18, no. 9, pp. 1720–1732, 2016.
[5] Y. Zhang, Y. Xiang, L. Y. Zhang, Y. Rong, and S. Guo, "Secure wireless communications based on compressive sensing: A survey," IEEE Communications Surveys and Tutorials, vol. 21, no. 2, pp. 1093–1111, 2019.
[6] G. Mei, X. Wu, Y. Wang, M. Hu, J.-A. Lu, and G. Chen, "Compressive-sensing-based structure identification for multilayer networks," IEEE Transactions on Cybernetics, vol. 48, no. 2, pp. 754–764, 2017.
[7] S. Zhang, X. Li, Q. Lin, and K. Wong, "Nature-inspired compressed sensing for transcriptomic profiling from random composite measurements," IEEE Transactions on Cybernetics, pp. 1–12, 2019.
[8] Y. C. Eldar and G. Kutyniok, Compressed Sensing: Theory and Applications. Cambridge University Press, 2012.
[9] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Transactions on Computers, vol. 100, no. 1, pp. 90–93, 1974.
[10] L. He and L. Carin, "Exploiting structure in wavelet-based Bayesian compressive sensing," IEEE Transactions on Signal Processing, vol. 57, no. 9, pp. 3488–3497, 2009.
[11] M. Usman, C. Prieto, T. Schaeffter, and P. Batchelor, "K-t group sparse: A method for accelerating dynamic MRI," Magnetic Resonance in Medicine, vol. 66, no. 4, pp. 1163–1176, 2011.
[12] Z. Lai, X. Qu, Y. Liu, D. Guo, J. Ye, Z. Zhan, and Z. Chen, "Image reconstruction of compressed sensing MRI using graph-based redundant wavelet transform," Medical Image Analysis, vol. 27, pp. 93–104, 2016.
[13] R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, "Model-based compressive sensing," IEEE Transactions on Information Theory, vol. 56, no. 4, pp. 1982–2001, 2010.
[14] S. Ravishankar, B. E. Moore, R. R. Nadakuditi, and J. A. Fessler, "Low-rank and adaptive sparse signal (LASSI) models for highly accelerated dynamic imaging," IEEE Transactions on Medical Imaging, vol. 36, no. 5, pp. 1116–1128, 2017.
[15] W. Dong, G. Shi, X. Li, Y. Ma, and F. Huang, "Compressive sensing via nonlocal low-rank regularization," IEEE Transactions on Image Processing, vol. 23, no. 8, pp. 3618–3632, 2014.
[16] E. J. Candès, "The restricted isometry property and its implications for compressed sensing," Comptes Rendus Mathematique, vol. 346, no. 9-10, pp. 589–592, 2008.
[17] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer Science & Business Media, 2010.
[18] A. Cohen, W. Dahmen, and R. DeVore, "Compressed sensing and best k-term approximation," Journal of the American Mathematical Society, vol. 22, no. 1, pp. 211–231, 2009.
[19] S. Li and G. Ge, "Deterministic construction of sparse sensing matrices via finite geometry," IEEE Transactions on Signal Processing, vol. 62, no. 11, pp. 2850–2859, 2014.
[20] H. Gan, S. Xiao, Y. Zhao, and X. Xue, "Construction of efficient and structural chaotic sensing matrix for compressive sensing," Signal Processing: Image Communication, vol. 68, pp. 129–137, 2018.
[21] H. Gan, S. Xiao, and F. Liu, "Chaotic binary sensing matrices," International Journal of Bifurcation and Chaos, vol. 29, no. 09, p. 1950121, 2019.
[22] R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, "A simple proof of the restricted isometry property for random matrices," Constructive Approximation, vol. 28, no. 3, pp. 253–263, 2008.
[23] A. Maleki and D. L. Donoho, "Optimally tuned iterative reconstruction algorithms for compressed sensing," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp. 330–341, 2010.
[24] J. Yang and Y. Zhang, "Alternating direction algorithms for ℓ1-problems in compressive sensing," SIAM Journal on Scientific Computing, vol. 33, no. 1, pp. 250–278, 2011.
[25] D. L. Donoho, A. Maleki, and A. Montanari, "Message-passing algorithms for compressed sensing," Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18914–18919, 2009.
[26] L. Gan, "Block compressed sensing of natural images," in , pp. 403–406, IEEE, 2007.
[27] B. Adcock and A. C. Hansen, "Generalized sampling and infinite-dimensional compressed sensing," Foundations of Computational Mathematics, vol. 16, no. 5, pp. 1263–1323, 2016.
[28] Z. Yang, L. Xie, and C. Zhang, "Variational Bayesian algorithm for quantized compressed sensing," IEEE Transactions on Signal Processing, vol. 61, pp. 2815–2824, June 2013.
[29] V. Cevher, P. Indyk, C. Hegde, and R. G. Baraniuk, "Recovery of clustered sparse signals from compressive measurements," tech. rep., Rice Univ Houston Tx Dept of Electrical and Computer Engineering, 2009.
[30] Y. L. Polo, Y. Wang, A. Pandharipande, and G. Leus, "Compressive wide-band spectrum sensing," in , pp. 2337–2340, IEEE, 2009.
[31] S. Ji, D. Dunson, and L. Carin, "Multitask compressive sensing," IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 92–106, 2008.
[32] S. Mun and J. E. Fowler, "Block compressed sensing of images using directional transforms," in , pp. 3021–3024, IEEE, 2009.
[33] J. E. Fowler, S. Mun, and E. W. Tramel, "Multiscale block compressed sensing with smoothed projected Landweber reconstruction," in , pp. 564–568, IEEE, 2011.
[34] J. Zhang, D. Zhao, C. Zhao, R. Xiong, S. Ma, and W. Gao, "Image compressive sensing recovery via collaborative sparsity," IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 2, no. 3, pp. 380–391, 2012.
[35] J. Zhang, D. Zhao, and W. Gao, "Group-based sparse representation for image restoration," IEEE Transactions on Image Processing, vol. 23, no. 8, pp. 3336–3351, 2014.
[36] C. Chen, E. W. Tramel, and J. E. Fowler, "Compressed-sensing recovery of images and video using multihypothesis predictions," in , pp. 1193–1198, IEEE, 2011.
[37] Y. Yang, J. Sun, H. Li, and Z. Xu, "ADMM-CSNet: A deep learning approach for image compressive sensing," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 3, pp. 521–538, 2020.
[38] J. Zhang and B. Ghanem, "ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1828–1837, 2018.
[39] K. Kulkarni, S. Lohit, P. Turaga, R. Kerviche, and A. Ashok, "ReconNet: Non-iterative reconstruction of images from compressively sensed measurements," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 449–458, 2016.
[40] H. Yao, F. Dai, S. Zhang, Y. Zhang, Q. Tian, and C. Xu, "DR²-net: Deep residual reconstruction network for image compressive sensing," Neurocomputing, vol. 359, pp. 483–493, 2019.
[41] A. Adler, D. Boublil, and M. Zibulevsky, "Block-based compressed sensing of images via deep learning," in , pp. 1–6, 2017.
[42] W. Shi, F. Jiang, S. Zhang, and D. Zhao, "Deep networks for compressed image sensing," in , pp. 877–882, 2017.
[43] W. Shi, F. Jiang, S. Liu, and D. Zhao, "Image compressed sensing using convolutional neural network," IEEE Transactions on Image Processing, vol. 29, pp. 375–388, 2019.
[44] J. E. Fowler, S. Mun, E. W. Tramel, et al., "Block-based compressed sensing of images and video," Foundations and Trends in Signal Processing, vol. 4, no. 4, pp. 297–416, 2012.
[45] A. Mousavi, A. B. Patel, and R. G. Baraniuk, "A deep learning approach to structured signal recovery," in , pp. 1336–1343, IEEE, 2015.
[46] Y. Chen, H. Fan, B. Xu, Z. Yan, Y. Kalantidis, M. Rohrbach, S. Yan, and J. Feng, "Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution," in Proceedings of the IEEE International Conference on Computer Vision, pp. 3435–3444, 2019.
[47] S. Gao, M. Cheng, K. Zhao, X. Zhang, M. Yang, and P. H. S. Torr, "Res2Net: A new multi-scale backbone architecture," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–10, 2019.
[48] R. Durall, F. Pfreundt, and J. Keuper, "Stabilizing GANs with octave convolutions," CoRR, vol. abs/1905.12534, 2019.
[49] A. Eftekhari, H. L. Yap, C. J. Rozell, and M. B. Wakin, "The restricted isometry property for random block diagonal matrices," Applied and Computational Harmonic Analysis, vol. 38, no. 1, pp. 1–31, 2015.
[50] N. Koep, A. Behboodi, and R. Mathar, "The restricted isometry property of block diagonal matrices for group-sparse signal recovery," arXiv preprint arXiv:1901.06214, 2019.
[51] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241, Springer, 2015.
[52] C. Li, W. Yin, and Y. Zhang, "User's guide for TVAL3: TV minimization by augmented Lagrangian and alternating direction algorithms," CAAM Report, vol. 20, no. 46-47, p. 4, 2009.
[53] A. Hore and D. Ziou, "Image quality metrics: PSNR vs. SSIM," in , pp. 2366–2369, IEEE, 2010.
[54] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, "Contour detection and hierarchical image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 898–916, 2011.
[55] J. Kim, J. Kwon Lee, and K. Mu Lee, "Accurate image super-resolution using very deep convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654, 2016.
[56] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel, "Low-complexity single-image super-resolution based on nonnegative neighbor embedding," in Proceedings of the British Machine Vision Conference, pp. 135.1–135.10, BMVA Press, 2012.
[57] R. Zeyde, M. Elad, and M. Protter, "On single image scale-up using sparse-representations," in International Conference on Curves and Surfaces, pp. 711–730, Springer, 2010.
[58] D. Martin and C. Fowlkes, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in