Sparsity and 'Something Else': An Approach to Encrypted Image Folding
James Bowley and Laura Rebollo-Neira
Mathematics Department, Aston University, Birmingham B4 7ET, UK
Abstract—A property of sparse representations in relation to their capacity for information storage is discussed. It is shown that this feature can be used for an application that we term Encrypted Image Folding. The proposed procedure is realizable through any suitable transformation. In particular, in this paper we illustrate the approach by recourse to the Discrete Cosine Transform and a combination of redundant Cosine and Dirac dictionaries. The main advantage of the proposed technique is that both storage and encryption can be achieved simultaneously using simple processing steps.
I. INTRODUCTION
The problem of reducing the dimensionality of a piece of data without losing its information content is of paramount importance in signal processing. Well-established transforms, from the classical Fourier and Cosine Transforms to Wavelets, Wavelet Packets, and Lapped Transforms, to mention only the most popular, are usually applied to generate the transformed domain where the processing tasks are realized. Signals amenable to transformation into data sets of smaller cardinality are said to be compressible. Natural images, for instance, provide a typical example of compressible data. In the last fifteen years, emerging techniques for signal representation have addressed the matter by means of highly nonlinear methodologies which decompose the signal into a superposition of vectors, normally called 'atoms', selected from a large redundant set called a 'dictionary'. The representation qualifies as sparse if the number of atoms required for a satisfactory signal approximation is considerably smaller than the dimension of the original data. Available methodologies for highly nonlinear approximations are known as Pursuit Strategies. These comprise Basis Pursuit [1], [2] and Matching-Pursuit-like algorithms, including Orthogonal Matching Pursuit (OMP) and variations of these methods [3], [4], [5], [6], [7], [8], [9], [10]. The other ingredient of highly nonlinear approximations is, of course, the dictionary providing the atoms for the selection. In this respect, Gabor dictionaries have been shown to be useful for image and video processing [11], [12]. Combined dictionaries, arising by merging, for instance, orthogonal bases, have received consideration in relation to the theoretical analysis of Pursuit Strategies [13], [14], [15], [16], [17]. From a different perspective, other approaches are based on dictionaries learned from large data sets [18], [19]. This communication exploits an inherent side effect of sparse representations.
Since sparsity entails a projection onto a subspace of lower dimensionality, a null space is generated. Extra information can be embedded in such a space and then stably extracted. In particular, we discuss an application involving the null space yielded by the sparse representation of an image, to store part of the image itself in encrypted form. We term this application Encrypted Image Folding (EIF). The main advantage of this proposal, in relation to standard techniques, is that storage and encryption can be achieved simultaneously by means of simple data processing steps. The proposed procedure can be carried out through any suitable transformation. In particular, we consider here the Discrete Cosine Transform (DCT) and a mixed dictionary composed of a Redundant Discrete Cosine (RDC) dictionary and a discrete Dirac Basis (DB). RDC and DB dictionaries are considered separately in [1]. A theoretical discussion with regard to a random collection of elements of a Discrete Sine basis and a DB is presented in [20]. In this letter we would simply like to draw attention to the suitability of mixed dictionaries composed of RDC and DB for image representation. As far as sparsity is concerned, at the visually acceptable level of 40 dB PSNR, they may render a significant improvement in comparison to established fast transforms such as the DCT and the Wavelet Transform (WT). An additional advantage of these dictionaries is that Matching-Pursuit-like strategies for selecting the atoms can be implemented at a reduced complexity cost by means of the DCT. For these reasons, we illustrate our approach to EIF using a mixed RDC-DB dictionary, in addition to the standard DCT.

The paper is organized as follows: Sec. II motivates the use of a mixed RDC-DB dictionary within the present framework. Sec. III discusses the fact that a sparse representation can be used for embedding information.
Based on such a possibility, a scheme for image folding and a simple encryption procedure, fully implementable by data processing, are discussed in Sec. IV. The conclusions are presented in Sec. V.

II. SPARSE IMAGE REPRESENTATION BY RDC-DB DICTIONARIES
Let us start by introducing the dictionaries and methodology which will be used in Section III for illustrating the present approach. Consider the set $D_a$ defined as
$$D_a = \left\{v_i;\; v_{j,i} = p_i \cos\frac{\pi(2j-1)(i-1)}{2M},\; j = 1,\ldots,N\right\}_{i=1}^{M},$$
with $p_i,\ i = 1,\ldots,M$ normalization factors and the notation $v_{j,i}$ indicating the component $j$ of vector $v_i \in \mathbb{R}^N$. If $M = N$ this set is a Discrete Cosine (DC) orthonormal basis for $\mathbb{R}^N$. If $M = 2lN$, with $l$ a positive integer, the set is a DC dictionary with redundancy $2l$. We further consider the set $D_b$, which is a discrete DB, also known as the standard orthonormal basis, i.e., $D_b = \{e_i \in \mathbb{R}^N;\; e_{j,i} = \delta_{i,j},\; j = 1,\ldots,N\}_{i=1}^{N}$, where $\delta_{i,j} = 1$ if $i = j$ and zero otherwise. From the joint dictionary $D_{ab} = D_a \cup D_b$ a redundant dictionary $D$ for $\mathbb{R}^{N\times N}$ is obtained as the Kronecker product $D = D_{ab} \otimes D_{ab}$. We denote by $d_n \in \mathbb{R}^{N\times N},\ n = 1,\ldots,J$, where $J = (M+N)^2$, the elements of dictionary $D$ and use them to construct the atomic decomposition of an image $I \in \mathbb{R}^{N\times N}$ as
$$I_K = \sum_{i=1}^{K} c_i^K d_{\ell_i}. \qquad (1)$$
The atoms $d_{\ell_i},\ i = 1,\ldots,K$ are to be selected from the dictionary $D$ by a Pursuit Strategy. In the examples we give here we have used OMP, which evolves as follows. Setting $R_0 = I$, at iteration $k+1$ the OMP algorithm selects the atom, $d_{\ell_{k+1}}$ say, as the one maximizing the absolute value of the Frobenius inner products $\langle d_i, R_k \rangle_F,\ i = 1,\ldots,J$, i.e.,
$$\ell_{k+1} = \arg\max_{i=1,\ldots,J} |\langle d_i, R_k \rangle_F|, \quad \text{with } R_k = I - \sum_{i=1}^{k} c_i^k d_{\ell_i}. \qquad (2)$$
The coefficients $c_i^k,\ i = 1,\ldots,k$ in (2) are such that the Frobenius norm $\|R_k\|_F$ is minimal. Our implementation is based on Gram-Schmidt orthonormalization and adaptive biorthogonalization, as proposed in [5]. The complexity is dominated by the calculation of the quantities $\langle d_i, R_k \rangle_F,\ i = 1,\ldots,J$ in (2) at each iteration step. For the present dictionaries these quantities can be evaluated by fast DCT.
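The greedy selection loop of (2) can be sketched generically as follows. This is a simplified NumPy illustration, not the paper's implementation: atoms are vectorized so that Frobenius inner products become ordinary dot products, and the coefficients minimizing the residual norm are obtained by least squares rather than the Gram-Schmidt/adaptive-biorthogonalization procedure of [5].

```python
import numpy as np

def omp(I, dictionary, tol=1e-6, max_atoms=None):
    """Greedy OMP: at each step pick the atom maximizing |<d_i, R_k>_F|, as in (2)."""
    y = I.ravel()                                          # vectorize the image block
    A = np.stack([d.ravel() for d in dictionary], axis=1)  # atoms as columns
    A = A / np.linalg.norm(A, axis=0)                      # ensure unit-norm atoms
    max_atoms = max_atoms or A.shape[1]
    sel, r, c = [], y.copy(), np.array([])
    while len(sel) < max_atoms and np.linalg.norm(r) > tol:
        i = int(np.argmax(np.abs(A.T @ r)))                # Frobenius inner products
        if i in sel:                                       # no further progress
            break
        sel.append(i)
        # coefficients minimizing ||R_k||_F over the selected atoms
        c, *_ = np.linalg.lstsq(A[:, sel], y, rcond=None)
        r = y - A[:, sel] @ c
    return sel, c
```

For the RDC-DB dictionaries of this section, the products `A.T @ r` would instead be evaluated via fast DCT, as in (3)-(6) below; the generic matrix product shown here is only for exposition.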
In order to discuss the matter, let us re-name the dictionary atoms as follows:
for $n = 1,\ldots,M^2$: $d_n \to v_i \otimes v_j$, $i = 1,\ldots,M$, $j = 1,\ldots,M$;
for $n = M^2+1,\ldots,M^2+MN$: $d_n \to v_i \otimes e_j$, $i = 1,\ldots,M$, $j = 1,\ldots,N$;
for $n = M^2+MN+1,\ldots,M^2+2MN$: $d_n \to e_i \otimes v_j$, $i = 1,\ldots,N$, $j = 1,\ldots,M$;
for $n = M^2+2MN+1,\ldots,J$: $d_n \to e_i \otimes e_j$, $i = 1,\ldots,N$, $j = 1,\ldots,N$.
Hence, denoting by $R_k(s,r)$ the element $(s,r)$ of matrix $R_k$ and defining $\psi_{j,i} = \cos\frac{\pi(2j-1)(i-1)}{2M}$, the inner products $\langle d_i, R_k \rangle_F,\ i = 1,\ldots,J$ are calculated as
$$\langle v_i \otimes v_j, R_k \rangle_F = p_i p_j \sum_{s,r=1}^{N} R_k(s,r)\,\psi_{s,i}\psi_{r,j} \qquad (3)$$
$$\langle v_i \otimes e_j, R_k \rangle_F = p_i \sum_{s=1}^{N} R_k(s,j)\,\psi_{s,i} \qquad (4)$$
$$\langle e_i \otimes v_j, R_k \rangle_F = p_j \sum_{r=1}^{N} R_k(i,r)\,\psi_{r,j} \qquad (5)$$
$$\langle e_i \otimes e_j, R_k \rangle_F = R_k(i,j). \qquad (6)$$
If $M = N$, (3) is the 2D DCT of the residual $R_k$, whilst (4) and (5) are the 1D DCTs of the rows and columns of $R_k$, respectively. If $M = 2lN$, for some positive integer $l$, the calculations can also be carried out through fast DCT by zero padding. Thus, the complexity required for the evaluation of the inner products in (2) is $O(M \log M)$. In order to highlight the capacity of RDC-DB dictionaries to achieve sparse representation of natural images, we use them to represent the popular test images listed in the first column of Table I and the photo of Bertrand Russell shown in Fig. 1. For the actual processing we divide each image into blocks of × pixels. The sparsity measure we use is the Sparsity Ratio (SR), defined as
$$\mathrm{SR} = \frac{\text{total number of pixels}}{\text{total number of coefficients}}.$$
In all the cases the number of coefficients is determined so as to produce a PSNR of 40 dB in the image reconstruction, and the dictionary is a mixed RDC (redundancy 2) and DB. The results are given in the second column of Table I. For comparison, the third column of this table shows results produced by the DCT implemented using the same blocking scheme. For further comparison, the results produced by the Cohen-Daubechies-Feauveau 9/7 DWT (applied to the whole image at once) are displayed in the last column of Table I. Notice that, while for the fixed PSNR of 40 dB the DCT and DWT yield comparable SR, the corresponding SR obtained by the mixed dictionaries is, for all the images, significantly higher. This motivates the use of RDC-DB dictionaries in the application we are proposing.

TABLE I
SPARSITY RATIO (FOR PSNR OF 40 dB) ACHIEVED BY THE MIXED RDC-DB DICTIONARY AND THAT YIELDED BY DCT AND DWT.

Image          Dictionary   DCT    DWT
Barbara        7.09         4.05   3.92
Boat           6.03         3.63   3.65
Bridge         3.70         2.06   2.20
Film Clip      8.06         4.53   4.81
Jester         6.28         3.60   3.88
Lena           10.06        6.50   6.97
Mandrill       3.32         1.91   1.90
Peppers        7.74         4.36   3.39
Photo (Fig 1)  5.28         3.01   3.15

III. ROOM FOR INFORMATION EMBEDDING
Since a sparse representation involves a projection onto a lower-dimensional subspace, it also creates room for storing 'something else'. The subspace, say $S_K$, spanned by the $K$ dictionary atoms $\{d_{\ell_i}\}_{i=1}^{K}$ rendering a sparse representation of an image is a proper subspace of the image space $\mathbb{R}^{N\times N}$. Thus, denoting by $S_K^\perp$ the orthogonal complement of $S_K$ in $\mathbb{R}^{N\times N}$, we have $\mathbb{R}^{N\times N} = S_K \oplus^\perp S_K^\perp$, where $\oplus^\perp$ indicates the orthogonal sum. Hence, if we take an element $F \in S_K^\perp$ and add it to the image, forming $G = I + F$, the image $I$ can be recovered from $G$ through the operation
$$P_{S_K} G = P_{S_K}(I + F) = I, \qquad (7)$$
where $P_{S_K}$ is the orthogonal projection matrix onto the subspace $S_K$. This suggests the possibility of using the sparse representation of an image to embed the image with additional information stored in a matrix $F \in S_K^\perp$. In order to do this, we apply the earlier proposed scheme to embed redundant representations [21], which in this case operates as described below.

Fig. 1. The small pictures at the top are the folded image by DCT (left) and RDC-DB dictionary (right). The middle pictures are the corresponding unfolded images without knowledge of the private key to initialize the rotation. The bottom pictures are the unfolded images when the correct key is used.

Embedding Scheme:
Consider that $I_K$ as in (1) is the reconstruction of a sparse representation of an image $I$. We embed $L = N^2 - K$ numbers $h_i,\ i = 1,\ldots,L$ into a matrix $F \in S_K^\perp$ as prescribed below.
• Take an orthonormal basis $u_i,\ i = 1,\ldots,L$ for $S_K^\perp$ and form the matrix $F$ as the linear combination
$$F = \sum_{i=1}^{L} h_i u_i. \qquad (8)$$
• Add $F$ to $I_K$ to obtain $G = I_K + F$.

Information Retrieval:
Given $G$, retrieve the numbers $h_i,\ i = 1,\ldots,L$ as follows.
• Construct the orthogonal projection matrix $P_{S_K}$ onto the subspace $S_K = \mathrm{span}\{d_{\ell_i}\}_{i=1}^{K}$ and extract the image $\tilde{I}_K$ from $G$ as $\tilde{I}_K = P_{S_K} G$.
• From the given $G$ and the extracted $\tilde{I}_K$ obtain $F$ as $F = G - \tilde{I}_K$. Use $F$ and the orthonormal basis $u_i,\ i = 1,\ldots,L$ to retrieve the embedded numbers $h_i,\ i = 1,\ldots,L$:
$$h_i = \langle u_i, F \rangle_F, \quad i = 1,\ldots,L. \qquad (9)$$
One can encrypt the embedding procedure simply by randomly controlling the order of the orthonormal basis $u_i,\ i = 1,\ldots,L$, or by applying some random rotation to the basis. An example is given in the next section.

IV. APPLICATION TO ENCRYPTED IMAGE FOLDING (EIF)

We now apply the above embedding scheme to fold and encrypt an image. For this we process the image by dividing it into $Q$ blocks $I_q,\ q = 1,\ldots,Q$, of $N_q \times N_q$ pixels each, and compute their sparse representations
$$I_q^{K_q} = \sum_{i=1}^{K_q} c_i^{K_q} d_{\ell_i}^q, \quad q = 1,\ldots,Q. \qquad (10)$$
We keep a number, $H$, of these blocks of pixels as hosts for embedding the coefficients of the remaining equations (10). Each host block $I_q^{K_q}$ is embedded as follows. Taking $L_q = N_q^2 - K_q$ of the coefficients to be embedded, we build a block of pixels $F_q$ as in (8) and add it to the host block to obtain $G_q = I_q^{K_q} + F_q$. Since the number $H$ of host blocks is the superior integer part of $Q/\mathrm{SR}$, as sparsity increases fewer host blocks are needed to embed the remaining ones. In the example presented here, for each host block $q$, with $q = 1,\ldots,H$, we have built the orthonormal basis $u_i^q,\ i = 1,\ldots,L_q$ (cf. (8)) by randomly generating matrices $y_i^q \in \mathbb{R}^{N_q \times N_q},\ i = 1,\ldots,L_q$, using a public initialization seed $q$ for the random generator. Through a projection matrix $P_{S_{K_q}}$ onto $S_{K_q} = \mathrm{span}\{d_{\ell_i}^q\}_{i=1}^{K_q}$, we compute matrices $o_i^q \in S_{K_q}^\perp$ as
$$o_i^q = y_i^q - P_{S_{K_q}} y_i^q, \quad i = 1,\ldots,L_q. \qquad (11)$$
Setting an initialization key, which remains unknown to an unauthorized user, we apply a random transformation $\Pi_{\mathrm{key}}$ on these matrices to obtain a private set of matrices
$$\Pi_{\mathrm{key}}: (o_i^q,\ i = 1,\ldots,L_q) \to \{\tilde{o}_i^q\}_{i=1}^{L_q}. \qquad (12)$$
Next, through an orthogonalization procedure $\mathrm{Orth}(\cdot)$ we obtain the orthonormal basis
$$\{u_i^q\}_{i=1}^{L_q} = \mathrm{Orth}(\tilde{o}_i^q,\ i = 1,\ldots,L_q), \qquad (13)$$
which we use for embedding the coefficients of the remaining $Q - H$ blocks.

We illustrate the results on an 8-bit × photo of Bertrand Russell divided into blocks of × pixels, using both the standard DCT and the RDC-DB dictionary discussed in Sec. II. The top pictures of Fig. 1 are the folded images using the DCT (left) and the RDC-DB dictionary (right). Each block of × pixels in these figures is the superposition $G_q = I_q^{K_q} + F_q$ described above. In both cases the method applied for finding the sparse representation $I_q^{K_q}$ is nonlinear, but the DCT case is $O(K)$ faster than the mixed-dictionary one ($K$ being the average number of coefficients per block). Since the SR for the DCT is smaller than the SR for the mixed dictionary, the corresponding folded image is larger. The middle pictures are the unfolded images when an incorrect security key is used. They are obtained as follows. Each block $G_q$ in the top pictures is used to recover the host blocks $\tilde{I}_q^{K_q},\ q = 1,\ldots,H$, as $\tilde{I}_q^{K_q} = P_{S_{K_q}} G_q,\ q = 1,\ldots,H$ (top piece of the image correctly reconstructed). Subtracting these pixels from the corresponding $G_q$ of the top picture, we obtain the pixels $F_q$, which are used to retrieve the embedded coefficients as in (9), but with matrices $u_i^q,\ i = 1,\ldots,L_q$, constructed with an incorrect key (cf. (13)). As seen in the largest portion of the middle pictures, with these coefficients the image cannot be reconstructed at all. The bottom pictures are obtained in the same way but using the correct key.
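The keyed construction (11)-(13), together with the embedding and retrieval steps of Sec. III, can be sketched for a single block as follows. This is a simplified NumPy illustration under stated assumptions, not the authors' MATLAB implementation: atoms are vectorized, $\mathrm{Orth}(\cdot)$ is realized by a QR factorization, and $\Pi_{\mathrm{key}}$ is taken as a secret column permutation; all names and dimensions are illustrative.

```python
import numpy as np

def keyed_basis(atoms, L, public_seed, secret_key):
    """Keyed orthonormal basis of the complement of S_K, sketching (11)-(13)."""
    n, _ = atoms.shape
    Q, _ = np.linalg.qr(atoms)                    # orthonormal basis of S_K
    Y = np.random.default_rng(public_seed).standard_normal((n, L))
    O = Y - Q @ (Q.T @ Y)                         # o_i = y_i - P_{S_K} y_i, as in (11)
    perm = np.random.default_rng(secret_key).permutation(L)
    U, _ = np.linalg.qr(O[:, perm])               # Pi_key then Orth(.), (12)-(13)
    return U                                      # columns play the role of u_1..u_L

rng = np.random.default_rng(0)
n, K, L = 64, 8, 6                                # 8x8 block, K atoms, L hidden numbers
atoms = rng.standard_normal((n, K))               # vectorized atoms spanning S_K
I_K = atoms @ rng.standard_normal(K)              # "sparse" host block, lies in S_K
h = rng.standard_normal(L)                        # coefficients to be folded in

U = keyed_basis(atoms, L, public_seed=1, secret_key=42)
G = I_K + U @ h                                   # folded block, G = I_K + F

# Unfolding: remove the projection onto S_K, then read off the coefficients (9).
Q, _ = np.linalg.qr(atoms)
F = G - Q @ (Q.T @ G)                             # F = G - P_{S_K} G
h_good = keyed_basis(atoms, L, 1, secret_key=42).T @ F   # correct key: h recovered
h_bad = keyed_basis(atoms, L, 1, secret_key=7).T @ F     # wrong key: rotated, garbled
```

With the correct key the basis is reproduced exactly (the seed and key drive deterministic generators) and `h_good` equals `h`; with a wrong key the orthonormalization runs over a differently ordered set, so the retrieved numbers are a rotated version of `h` and the block cannot be reconstructed, mirroring the middle pictures of Fig. 1. Note that the seed is public while only the permutation key is secret, in the spirit of Remark 2.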
Let us point out that, for reconstructing the image from the coefficients, additional space has to be allowed to store the indices of the atoms in the decomposition (10). This is a requirement of nonlinear approximations for general dictionaries.

Remark 1.
In order to store the folded image at the same bit depth as the original image, we need to quantize the blocks of pixels $G_q$ to convert them into integer numbers, which implies some loss of information. However, the quantization step does not prevent us from recovering the coefficients corresponding to the folded pixels with enough accuracy to produce a good representation of those blocks of the image. The PSNR of the recovered image $\tilde{I}_K$ in Fig. 1 (after folding it with the RDC-DB dictionary and subsequent rounding) is . dB, while the PSNR of the original approximation $I_K$ is . dB. This implies a relative error due to quantization of . . In further tests, involving forty-five 8-bit images of different sizes and formats, the mean relative error due to quantization was . with standard deviation . .

Remark 2.
Let us emphasize that, since the proposed encryption scheme is based on orthogonal matrices which define a linear transformation (cf. (8)), as pointed out in [22] it might be vulnerable to a plaintext attack. This means that an attacker could discover the matrices $u_i^q,\ i = 1,\ldots,L_q$ by collecting, for each block, $L_q$ correctly decrypted sets of $L_q$ numbers $h_{i,j}^q,\ i,j = 1,\ldots,L_q$, encrypted with the identical matrices $u_i^q,\ i = 1,\ldots,L_q$. This would indeed allow the attacker to pose $L_q$ equations of the form (8) and, for invertible systems, disclose the operator used for the encryption. However, the initial step for the construction of this operator involves matrices $y_i^q,\ i = 1,\ldots,L_q$ (cf. (11)) which are randomly generated using a public seed $q$. Hence, for a fixed secret key, it is enough to change the public seed $q$ to avoid $L_q$ encryptions with the identical matrices $u_i^q,\ i = 1,\ldots,L_q$. Thereby, a simple initial setup for the public random initialization of the encryption process prevents the possibility of a plaintext attack.

V. CONCLUSIONS
A bonus of sparse image representation has been discussed: the capability for simultaneous storage and encryption by simple processing steps. It was shown that this feature can be used for EIF. The proposed procedure is applicable through any appropriate transformation. The example given here has been produced by a) the DCT and b) a combination of RDC and DB dictionaries, which is suitable for image processing by blocking. The latter was shown to improve sparsity performance through nonlinear approximation techniques such as OMP. The gain in sparsity also implies that the processing time for the actual folding and unfolding operations is less in the RDC-DB case, as it involves fewer host blocks to be processed. On the whole, the time spent in both cases is comparable. Using a 2.8 GHz AMD processor with 3 GB of RAM, the running time for producing the example of Fig. 1 with MATLAB is (average of ten independent runs) a) 2.39 seconds for the DCT and b) 7.05 seconds for RDC-DB. Using a MEX file implementing OMP in C++, the time of b) is reduced to 1.28 seconds. These results suggest that advances in matters of sparse representations may benefit this application.
Acknowledgements
Support from EPSRC, UK, grant (EP/D062632/1) is acknowledged. We would like to thank Prof. Tony Constantinides, from Imperial College, London, for the enjoyable discussions which inspired and further encouraged the application of Image Folding. The software for reproducing the example is available from [23], in section EIFS.

REFERENCES

[1] S.S. Chen, D.L. Donoho, and M.A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20:33-61, 1998.
[2] D. Donoho and J. Tanner. Sparse nonnegative solution of underdetermined linear equations by linear programming. Proceedings of the National Academy of Sciences, 102:9446-9451, 2005.
[3] S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41:3397-3415, 1993.
[4] Y.C. Pati, R. Rezaiifar, and P.S. Krishnaprasad. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In Proceedings of the 27th Annual Asilomar Conference on Signals, Systems and Computers, volume 1, pages 40-44, 1993.
[5] L. Rebollo-Neira and D. Lowe. Optimized orthogonal matching pursuit approach. IEEE Signal Processing Letters, 9:137-140, 2002.
[6] M. Andrle and L. Rebollo-Neira. A swapping-based refinement of orthogonal matching pursuit strategies. Signal Processing, 86:480-495, 2006.
[7] P. Jost, P. Vandergheynst, and P. Frossard. Tree-based pursuit: algorithm and properties. IEEE Transactions on Signal Processing, 54:4685-4695, 2006.
[8] D. Donoho, Y. Tsaig, I. Drori, and J.-L. Starck. Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit. Technical Report TR-2006-2, Stanford Statistics Department, 2006.
[9] D. Needell and R. Vershynin. Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit. Foundations of Computational Mathematics, 2009.
[10] D. Needell and J.A. Tropp. CoSaMP: iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26:301-321, 2009.
[11] S. Fischer, G. Cristobal, and R. Redondo. Sparse overcomplete Gabor wavelet representation based on local competitions. IEEE Transactions on Image Processing, 15:265-272, 2006.
[12] R. Figueras i Ventura, P. Vandergheynst, and P. Frossard. Low-rate and flexible image coding with redundant representations. IEEE Transactions on Image Processing, 15:726-739, 2006.
[13] D. Donoho and X. Huo. Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory, 47:2845-2862, 2001.
[14] M. Elad and A.M. Bruckstein. A generalized uncertainty principle and sparse representations of pairs of bases. IEEE Transactions on Information Theory, 48:2558-2567, 2002.
[15] A. Feuer and A. Nemirovski. On sparse representation in pairs of bases. IEEE Transactions on Information Theory, 49:1579-1581, 2003.
[16] R. Gribonval and M. Nielsen. Sparse representations in unions of bases. IEEE Transactions on Information Theory, 49:3320-3325, 2003.
[17] J.A. Tropp. Greed is good: algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50:2231-2242, 2004.
[18] B.A. Olshausen and D.J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607-609, 1996.
[19] M. Aharon, M. Elad, and A.M. Bruckstein. K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54:4311-4322, 2006.
[20] J.A. Tropp. On the linear independence of spikes and sines. Journal of Fourier Analysis and Applications, 14:838-858, 2008.
[21] J. Miotke and L. Rebollo-Neira. Oversampling of Fourier coefficients for hiding messages. Applied and Computational Harmonic Analysis, 16:203-207, 2004.
[22] G. Bhatt, L. Kraus, L. Walters, and E. Weber. On hiding messages in oversampled Fourier coefficients. Journal of Mathematical Analysis and Applications, 320:492-498, 2006.
[23] Highly nonlinear approximations for sparse signal representation.