Associations among Image Assessments as Cost Functions in Linear Decomposition: MSE, SSIM, and Correlation Coefficient
Jianji Wang, Nanning Zheng, Badong Chen, Jose C. Principe
Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, [email protected]
Abstract
The traditional methods of image assessment, such as mean squared error (MSE), signal-to-noise ratio (SNR), and peak signal-to-noise ratio (PSNR), are all based on the absolute error of images. Pearson's inner-product correlation coefficient (PCC) is also commonly used to measure the similarity between images. The structural similarity (SSIM) index is another important measure, which has been shown to be more effective for the human visual system (HVS). Although there are many essential differences among these image assessments, this paper discusses some important associations among them when they are used as cost functions in linear decomposition. First, the bases selected from a basis set for a target vector are the same in the linear decomposition schemes with the cost functions MSE, SSIM, and PCC. Moreover, for a target vector, the ratio of the corresponding affine parameters in the MSE-based linear decomposition scheme and the SSIM-based scheme is a constant, which is exactly the value of PCC between the target vector and its estimated vector.
Key words: Image Quality Assessment (IQA), Mean Square Error (MSE), Structural Similarity Index (SSIM), Pearson's Correlation Coefficient (PCC), linear decomposition.
1. Introduction
In the big data era, images play an increasingly important role in our lives. We can easily obtain images with the cameras in various intelligent devices, and also from the network. Errors usually appear when images are obtained. For example, when we take an image with a camera, distortion usually occurs because of the lens; when an image is downloaded from the internet, errors may also appear for several reasons, such as image transmission and image compression. In these cases, image quality assessments (IQA) are important tools to measure the effectiveness of different hardware and software systems in preserving image quality.

The absolute error-based image assessments, such as the mean square error (MSE), the signal-to-noise ratio (SNR), and the peak signal-to-noise ratio (PSNR), are the most common measurements of image error. Intuitively, the sum of the pixel-by-pixel squared errors between two images is the square of the distance between them, and MSE is the average squared error over the corresponding pixels of the two images. For two image blocks x and y, if the pixels in x are $x_1, x_2, \cdots, x_p$, and the pixels in y are $y_1, y_2, \cdots, y_p$, then the MSE value between x and y can be calculated as follows:

$$\mathrm{MSE}(\mathbf{x},\mathbf{y}) = \frac{1}{p}\sum_{i=1}^{p}(y_i - x_i)^2. \qquad (1)$$

Therefore, MSE is a pixel-based error measurement. SNR and PSNR are both derived from MSE. These absolute error-based assessments are used not only to measure image quality, but also to measure almost all kinds of signals.

Pearson's inner-product correlation coefficient (PCC) is an important statistic which is used to measure the correlation between two vectors.
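As a quick illustration, Eq. (1) can be computed in a few lines of NumPy; the block values below are hypothetical, chosen only to make the arithmetic easy to check:

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two image blocks, Eq. (1)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    return np.mean((y - x) ** 2)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[1.0, 2.0], [3.0, 6.0]])
# Only one pixel differs, by 2, so MSE = 2^2 / 4 = 1.
print(mse(x, y))  # → 1.0
```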
For non-zero-variance image blocks x and y discussed above, if $\sigma_x$ and $\sigma_y$ are the standard deviations of the pixels in the image blocks x and y, respectively, $\bar{x}$ and $\bar{y}$ are the means of the pixels in x and y, respectively, $\theta$ is the angle between the mean-removed vectors $\mathbf{x}-\bar{x}\mathbf{1}$ and $\mathbf{y}-\bar{y}\mathbf{1}$ when we take x and y as two column vectors, and $\sigma_{xy}$ is the covariance between x and y, then the correlation coefficient $r_{xy}$ between x and y can be calculated as follows:

$$r_{xy} = \frac{\sum_{i=1}^{p}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{p}(x_i-\bar{x})^2}\sqrt{\sum_{i=1}^{p}(y_i-\bar{y})^2}} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = \cos\theta. \qquad (2)$$

The correlation coefficient is also often used to measure the similarity between images [1, 2]. Essentially, the correlation coefficient measures the correlation between the structures of two signals. Hence, the correlation coefficient is actually a correlation-based assessment.

The structural similarity (SSIM) index, proposed by Wang and Bovik [3], aims to improve the effectiveness of IQA for human visual systems (HVS). In SSIM, the errors are divided into three parts: the luminance error, the contrast error, and the structure error. By combining the three parts, SSIM takes the form

$$\mathrm{SSIM}(\mathbf{x},\mathbf{y}) = \frac{(2\mu_x\mu_y+\varepsilon_1)(2\sigma_{xy}+\varepsilon_2)}{(\mu_x^2+\mu_y^2+\varepsilon_1)(\sigma_x^2+\sigma_y^2+\varepsilon_2)},$$

where $\mu_x = \bar{x}$, $\mu_y = \bar{y}$, and $\varepsilon_1$, $\varepsilon_2$ are small positive constants that stabilize the divisions. If the variance of y is zero, y can be losslessly linearly expressed by the block with all ones. Because this paper focuses on linear decomposition, here we only consider image blocks with non-zero variance. In this case, we can set $\varepsilon_1 = \varepsilon_2 = 0$, and SSIM takes the simpler form

$$\mathrm{SSIM}(\mathbf{x},\mathbf{y}) = \frac{4\mu_x\mu_y\sigma_{xy}}{(\mu_x^2+\mu_y^2)(\sigma_x^2+\sigma_y^2)}. \qquad (3)$$

SSIM has experienced fast development from the baseline SSIM to various forms [4, 5, 6]. Although some researchers do not think that SSIM does much to improve the effectiveness of image quality assessment for the HVS [6, 7], SSIM has been widely accepted.

As described above, MSE is a pixel-based image assessment, the correlation coefficient is a correlation-based image assessment, and SSIM is a structure-based image assessment for the HVS.
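The quantities in Eqs. (2) and (3) are straightforward to compute; the sketch below (the function names are ours) uses the biased 1/p variance, matching the definitions above:

```python
import numpy as np

def pcc(x, y):
    """Pearson's correlation coefficient, Eq. (2)."""
    x, y = np.ravel(x).astype(float), np.ravel(y).astype(float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))   # covariance, 1/p normalization
    return sxy / (x.std() * y.std())

def ssim_simplified(x, y):
    """SSIM with eps1 = eps2 = 0 for non-zero-variance blocks, Eq. (3)."""
    x, y = np.ravel(x).astype(float), np.ravel(y).astype(float)
    mx, my = x.mean(), y.mean()
    sxy = np.mean((x - mx) * (y - my))
    return 4 * mx * my * sxy / ((mx**2 + my**2) * (x.var() + y.var()))
```

For instance, `pcc(x, 2*x + 1)` is 1 for any non-constant x (affine transformations preserve correlation), and `ssim_simplified(x, x)` is 1 for any non-constant positive x.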
Thus, there are many significant and essential differences among them. In this paper, we do not focus on these important differences among the image quality assessments. On the contrary, some interesting associations are discussed when we take the image quality assessments as the cost functions in linear decomposition.
2. Linear Decomposition
Linear decomposition plays an important role in various fields such as linear approximation, sparse coding [8], and portfolio selection [9]. In particular, sparse coding is an important tool in image processing. Suppose we have a vector set X with n vectors $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n\}$, where each vector is an image block of size $l \times l$, and $p = l \times l$. For an image block y of size $l \times l$, we need to find a linear transformation

$$\mathbf{x} = s_1\mathbf{x}_1 + s_2\mathbf{x}_2 + \cdots + s_n\mathbf{x}_n + o\mathbf{1}$$

to linearly approximate y, where $s_1, s_2, \cdots, s_n$ and $o$ are the linear coefficients and $\mathbf{1}$ is the block with all ones. In linear approximation, the vector set X is called a codebook, and each vector in the codebook is called a codeword; in sparse coding, X is called a basis set and a codeword is called a basis. X can be trained by the K-SVD algorithm [10]. Here, we use the names from sparse coding.

Suppose that the image block y is linearly decomposed with m bases from the basis set; then linear decomposition is the following optimization problem:

$$\min_{\mathbf{x}_{i_1},\mathbf{x}_{i_2},\cdots,\mathbf{x}_{i_m}\in\{\mathbf{x}_1,\mathbf{x}_2,\cdots,\mathbf{x}_n\}} d(\mathbf{y},\mathbf{x}), \qquad \mathbf{x} = s_{i_1}\mathbf{x}_{i_1} + s_{i_2}\mathbf{x}_{i_2} + \cdots + s_{i_m}\mathbf{x}_{i_m} + o\mathbf{1}, \qquad (4)$$

where $d(\mathbf{y},\mathbf{x})$ is the distance between x and y under a cost function. The traditional linear decomposition scheme takes the MSE between x and y as the cost function; in this paper, we discuss linear decomposition schemes with different cost functions.

Strictly speaking, Pearson's correlation coefficient and the SSIM index are not distance functions, because neither satisfies the triangle inequality of a distance function. However, it is meaningful to take the absolute value of the correlation coefficient and the absolute value of the SSIM index as cost functions in image quality assessment.
Moreover, the results we obtain in this paper show that these cost functions are equivalent to MSE in selecting basis vectors from the same basis set for a given image block to be encoded, which further supports the validity of taking the absolute values of the correlation coefficient and the SSIM index as cost functions in image quality assessment.

Many methods have been proposed to search for the basis vectors with non-zero coefficients from the basis set X for a target vector y. Matching Pursuit (MP) [11] and Orthogonal Matching Pursuit (OMP) [12] are two commonly used techniques. MP is an iterative method which selects a basis vector from the basis set at each step. OMP is an improved version of MP with better convergence.

In linear approximation, if only one $s_i$ in $s_1, s_2, \cdots, s_n$ is non-zero, $i = 1, 2, \cdots, n$, then linear approximation degrades to vector quantization (VQ). Some works have studied the MSE-based VQ scheme and the SSIM-based VQ scheme [13, 14, 15]. Several works have also discussed the application of SSIM to linear approximation [17, 18, 16]. Here we will discuss the associations among the linear decomposition schemes based on the cost functions Pearson's correlation coefficient, MSE, and SSIM.
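The greedy basis-selection step can be illustrated with a minimal matching-pursuit-style sketch (our own simplification for illustration, not the exact MP/OMP of [11, 12]): at each step, pick the mean-removed, normalized basis vector most correlated with the current residual.

```python
import numpy as np

def matching_pursuit(y, basis, m):
    """Greedy sketch: select m basis columns by maximum |correlation|
    with the residual. Means are removed, since the offset o absorbs them."""
    y = np.asarray(y, dtype=float)
    residual = y - y.mean()
    # Center and normalize each basis vector (column)
    B = basis - basis.mean(axis=0)
    B = B / np.linalg.norm(B, axis=0)
    chosen, coeffs = [], []
    for _ in range(m):
        corr = B.T @ residual
        k = int(np.argmax(np.abs(corr)))   # best-correlated basis vector
        chosen.append(k)
        coeffs.append(corr[k])
        residual = residual - corr[k] * B[:, k]
    return chosen, coeffs
```

If y is an exact multiple of one basis vector, the first step recovers that vector's index, since its normalized correlation with the residual is maximal.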
3. Linear Decomposition with Different Cost Functions
According to linear decomposition, a linear transformation $s_1\mathbf{x}_1 + s_2\mathbf{x}_2 + \cdots + s_n\mathbf{x}_n + o\mathbf{1}$ with a few non-zero $s_i$ needs to be found to approximate a target vector y, $i = 1, 2, \cdots, n$.

Without loss of generality, assume $\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_m$ are the selected basis vectors with non-zero $s_i$, and $\mathbf{x} = s_1\mathbf{x}_1 + s_2\mathbf{x}_2 + \cdots + s_m\mathbf{x}_m + o\mathbf{1}$ is the best linear approximation of the target vector y. Let $\sigma_i$ be the standard deviation of $\mathbf{x}_i$, let $\sigma_x$ and $\sigma_y$ be the standard deviations of x and y, respectively, let $\mu_i$ and $\mu_y$ be the means of $\mathbf{x}_i$ and y, respectively, let $\sigma_{ij}$ be the covariance between $\mathbf{x}_i$ and $\mathbf{x}_j$, let $\sigma_{iy}$ be the covariance between $\mathbf{x}_i$ and y, and let $\sigma_{xy}$ be the covariance between x and y, $i = 1, 2, \cdots, n$. Let

$$\sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_{mm} \end{bmatrix}, \qquad \mathbf{s} = \begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_m \end{bmatrix}, \qquad \sigma_{\bullet y} = \begin{bmatrix} \sigma_{1y} & \sigma_{2y} & \cdots & \sigma_{my} \end{bmatrix}^T.$$

The linear decomposition schemes with the different cost functions are discussed below.

Although the parameters $s_1, s_2, \cdots, s_m$ in the linear transformation cannot be calculated by taking Pearson's inner-product correlation coefficient as the cost function, we can still take the correlation coefficient as a cost function to decide which basis vector should be chosen for the target vector at each step of MP or OMP searching.

As is known, the correlation coefficient takes the values -1 or 1 if two vectors are linearly dependent, and it takes the value zero if two vectors are orthogonal to each other. In the above linear transformation, we can set $s_i$ to any negative or positive value, $i = 1, 2, \cdots, n$, so here we need only consider the absolute value of Pearson's correlation coefficient, which measures the strength of correlation between two vectors.

According to the analysis above, for a given image block y, we only need to maximize $|r_{xy}|$ to find the best x for y when we take the correlation coefficient as the cost function.
The MSE-based linear decomposition scheme is the traditional linear decomposi-tion. Here we firstly analyze the relationship between x and y under the cost function4f MSE. We have MSE( x , y ) = p k y − x k = p k y − ( s x + s x + · · · + s m x m + o ) k . To minimize MSE( x , y ), we need to solve the below equations ∂ MSE( x , y ) ∂o = 0 ∂ MSE( x , y ) ∂s i = 0 . (5)From the first equation in Eq. (5), we can obtain o = µ y − s µ − s µ − · · · − s m µ m . (6)Substitute Eq. (6) into MSE( x , y ), then we haveMSE( x , y ) = p k y − x k = p ( σ y − σ xy + σ x )= p ( σ y − P i s i σ i y + P i P j s i s j σ ij ) . (7)According to Eq. (7) and the second equation in Eq. (5), we have σ i y = X j s j σ ij . (8)From Eq. (8) we can calculate all the s i for x , and we can also obtain σ xy = X i s i σ i y = X i s i X j s j σ ij = σ x . (9)It shows in Eq. (9) that the variance of the best x chosen for y in the MSE-basedscheme is kept the same with the covariance between x and y . Thus, according toEq. (7), MSE( x , y ) = p ( σ y − σ xy )= σ y p (1 − σ xy σ y σ xy σ x ) = σ y p (1 − r xy ) . (10)Hence, for a given image block y to be linear decomposed, we havemin MSE( x , y ) ⇔ max | r xy | . (11)This interesting result shows that the linear transformations x chosen for y in thecorrelation coefficient-based linear decomposition scheme and the MSE-based schemeboth have the maximum absolute value of r xy in their own schemes.From Eq. (8), we have s = σ − σ • y (12)5nd σ xy = σ T • y σ − σ • y . (13)Hence, according to Eq. (10), for a given image block y to be encoded, we havemin MSE( x , y ) ⇔ max σ xy = max( σ T • y σ − σ • y ) . (14) Here we discuss the structural similarity (SSIM) index-based linear decompositionscheme.The value of SSIM index belongs to [-1,1]. 
When the value of SSIM is near 1, the two images are almost the same; when the value of SSIM is -1, the two images have the same mean, and the sum of the corresponding pixels in the two images equals double that mean. Hence, both the values 1 and -1 are targets for linear decomposition. To maximize or minimize the value of the SSIM index, we need to solve the equations

$$\frac{\partial\,\mathrm{SSIM}(\mathbf{x},\mathbf{y})}{\partial o} = 0, \qquad \frac{\partial\,\mathrm{SSIM}(\mathbf{x},\mathbf{y})}{\partial s_i} = 0. \qquad (15)$$

From the first equation in Eq. (15), we obtain the same result as Eq. (6) for the coefficient o. Substituting Eq. (6) into the expression of the SSIM index, we have

$$\mathrm{SSIM}(\mathbf{x},\mathbf{y}) = \frac{2\sigma_{xy}}{\sigma_x^2+\sigma_y^2} = \frac{2\sum_i s_i\sigma_{iy}}{\sum_i\sum_j s_i s_j\sigma_{ij} + \sigma_y^2}. \qquad (16)$$

Solving the second equation in Eq. (15), we obtain

$$\sigma_{iy}\Big(\sum_i s_i\sum_j s_j\sigma_{ij} + \sigma_y^2\Big) = 2\sum_i s_i\sigma_{iy}\sum_j s_j\sigma_{ij}.$$

Hence,

$$\sigma_{iy}(\sigma_x^2+\sigma_y^2) = 2\sigma_{xy}\sum_j s_j\sigma_{ij}. \qquad (17)$$

From Eq. (17),

$$(\sigma_x^2+\sigma_y^2)\sum_i s_i\sigma_{iy} = 2\sigma_{xy}\sum_i s_i\sum_j s_j\sigma_{ij}.$$

Because $\sigma_{xy} = \sum_i s_i\sigma_{iy}$ and $\sigma_x^2 = \sum_i s_i\sum_j s_j\sigma_{ij}$, we have

$$\sigma_y^2 = \sigma_x^2. \qquad (18)$$

As Eq. (18) shows, the variance of the best x chosen for y in the SSIM-based linear decomposition scheme is kept the same as the variance of y. Lastly, we have

$$\mathrm{SSIM}(\mathbf{x},\mathbf{y}) = \frac{2\sigma_{xy}}{\sigma_x^2+\sigma_y^2} = \frac{\sigma_{xy}}{\sigma_x^2} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} = r_{xy}.$$

Because both the maximization and the minimization of SSIM are equivalent to maximizing the absolute value of SSIM, we have

$$\max |\mathrm{SSIM}(\mathbf{x},\mathbf{y})| \;\Leftrightarrow\; \max |r_{xy}|. \qquad (19)$$

From these results, we get the first association as follows: Association 1.
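Eq. (18) and the identity $\mathrm{SSIM} = r_{xy}$ are easy to verify numerically for centered vectors; the random data below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 64
yc = rng.normal(size=p); yc -= yc.mean()   # centered target
xc = rng.normal(size=p); xc -= xc.mean()   # centered approximation direction

def ssim_centered(x, y):
    # Eq. (16): means already removed by the offset o
    return 2 * np.mean(x * y) / (x.var() + y.var())

r = np.mean(xc * yc) / (xc.std() * yc.std())

# Rescale x so that sigma_x = sigma_y, as Eq. (18) requires;
# SSIM then equals r_xy exactly.
x_scaled = xc * (yc.std() / xc.std())
assert np.isclose(ssim_centered(x_scaled, yc), r)

# Any other scale of the same direction gives a smaller |SSIM|.
assert abs(ssim_centered(2 * x_scaled, yc)) <= abs(r)
```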
Maximizing $|r_{xy}|$ in the correlation coefficient-based linear decomposition scheme, minimizing $\mathrm{MSE}(\mathbf{x},\mathbf{y})$ in the MSE-based linear decomposition scheme, and maximizing $|\mathrm{SSIM}(\mathbf{x},\mathbf{y})|$ in the SSIM-based linear decomposition scheme are all equivalent to maximizing $|r_{xy}|$ in their own schemes.

According to Eq. (17) and Eq. (18), we have

$$\mathbf{s} = \frac{\sigma_y^2}{\sigma_{xy}}\,\sigma^{-1}\sigma_{\bullet y}, \qquad \sigma_{xy} = \mathbf{s}^T\sigma_{\bullet y}. \qquad (20)$$

Thus,

$$\mathbf{s}\mathbf{s}^T\sigma_{\bullet y} = \sigma_y^2\,\sigma^{-1}\sigma_{\bullet y}.$$

Premultiplying both sides by $\sigma_{\bullet y}^T$, we have

$$\sigma_{xy}^2 = \sigma_y^2\,\sigma_{\bullet y}^T\,\sigma^{-1}\sigma_{\bullet y}. \qquad (21)$$

Hence, for a given target vector y, according to Eq. (16) and Eq. (18), we obtain

$$\max |\mathrm{SSIM}(\mathbf{x},\mathbf{y})| \;\Leftrightarrow\; \max |\sigma_{xy}| \;\Leftrightarrow\; \max\big(\sigma_{\bullet y}^T\,\sigma^{-1}\sigma_{\bullet y}\big). \qquad (22)$$
4. The Associations
For comparison of the different schemes, we consider the case of the same number of basis vectors used to linearly approximate the target vector y.

Although the target functions in the different linear decomposition schemes are all equivalent to maximizing the absolute value of the correlation coefficient, we cannot conclude that these schemes are entirely the same. For example, for a given image block y to be encoded, according to Eq. (9) the variance of the final approximation x equals the covariance between x and y in the MSE-based linear decomposition scheme, but according to Eq. (18) the variance of x equals the variance of y in the SSIM-based scheme. Moreover, it is possible that the affine parameters, the selected basis vectors for the given target vector, and the values of $r_{\mathbf{x}_{\mathrm{MSE}}\mathbf{y}}$ and $r_{\mathbf{x}_{\mathrm{SSIM}}\mathbf{y}}$ differ between the schemes.

4.1 The Selected Basis Vectors

There are no affine parameters in the expression $\sigma_{\bullet y}^T\,\sigma^{-1}\sigma_{\bullet y}$, which is only relevant to $\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_m$, and y. Therefore, according to Eq. (14) and Eq. (22), we have the second association of the schemes as follows: Association 2.
The selected basis vectors from the same basis set for the target vector y in the MSE-based linear decomposition scheme and in the SSIM-based linear decomposition scheme are exactly the same.

Assume the best linear approximations of y are $\mathbf{x}_{\mathrm{MSE}}$ and $\mathbf{x}_{\mathrm{SSIM}}$ in the MSE-based linear decomposition scheme and the SSIM-based linear decomposition scheme, respectively. According to Eq. (9) and Eq. (13), we have

$$\big|r_{\mathbf{x}_{\mathrm{MSE}}\mathbf{y}}\big| = \frac{\sigma_{\mathbf{x}_{\mathrm{MSE}}\mathbf{y}}}{\sigma_{\mathbf{x}_{\mathrm{MSE}}}\sigma_y} = \frac{\sqrt{\sigma_{\mathbf{x}_{\mathrm{MSE}}\mathbf{y}}}}{\sigma_y} = \frac{\sqrt{\sigma_{\bullet y}^T\,\sigma^{-1}\sigma_{\bullet y}}}{\sigma_y}. \qquad (23)$$

According to Eq. (18) and Eq. (21), we can also obtain

$$\big|r_{\mathbf{x}_{\mathrm{SSIM}}\mathbf{y}}\big| = \frac{\big|\sigma_{\mathbf{x}_{\mathrm{SSIM}}\mathbf{y}}\big|}{\sigma_{\mathbf{x}_{\mathrm{SSIM}}}\sigma_y} = \frac{\big|\sigma_{\mathbf{x}_{\mathrm{SSIM}}\mathbf{y}}\big|}{\sigma_y^2} = \frac{\sqrt{\sigma_{\bullet y}^T\,\sigma^{-1}\sigma_{\bullet y}}}{\sigma_y}. \qquad (24)$$

Thus, $\big|r_{\mathbf{x}_{\mathrm{MSE}}\mathbf{y}}\big| = \big|r_{\mathbf{x}_{\mathrm{SSIM}}\mathbf{y}}\big|$. Then we have the third association:
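The common value in Eqs. (23) and (24) can be checked directly: for a random basis and target (illustrative data only), the correlation attained by the MSE-optimal approximation equals $\sqrt{\sigma_{\bullet y}^T\sigma^{-1}\sigma_{\bullet y}}/\sigma_y$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, m = 64, 3
X = rng.normal(size=(p, m))
y = rng.normal(size=p)

Xc = X - X.mean(axis=0)
yc = y - y.mean()
Sigma = (Xc.T @ Xc) / p
sig_y = (Xc.T @ yc) / p

x_mse = Xc @ np.linalg.solve(Sigma, sig_y)          # MSE-optimal approximation
r_mse = np.mean(x_mse * yc) / (x_mse.std() * yc.std())

# Eqs. (23)-(24): both schemes attain |r| = sqrt(sigma_y^T Sigma^-1 sigma_y) / sigma_y
quad = sig_y @ np.linalg.solve(Sigma, sig_y)
assert np.isclose(abs(r_mse), np.sqrt(quad) / yc.std())
```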
Association 3.
Minimizing $\mathrm{MSE}(\mathbf{x},\mathbf{y})$ in the MSE-based linear decomposition scheme is equivalent to maximizing $|\mathrm{SSIM}(\mathbf{x},\mathbf{y})|$ in the SSIM-based linear decomposition scheme, and both are equivalent to maximizing $|r_{xy}|$.

Assume the linear parameters are $s_1^{\mathrm{MSE}}, s_2^{\mathrm{MSE}}, \cdots, s_m^{\mathrm{MSE}}$, and $o^{\mathrm{MSE}}$ in the MSE-based linear decomposition scheme, and $s_1^{\mathrm{SSIM}}, s_2^{\mathrm{SSIM}}, \cdots, s_m^{\mathrm{SSIM}}$, and $o^{\mathrm{SSIM}}$ in the SSIM-based linear decomposition scheme. We have

$$\mathbf{x}_{\mathrm{MSE}} = s_1^{\mathrm{MSE}}\mathbf{x}_1 + s_2^{\mathrm{MSE}}\mathbf{x}_2 + \cdots + s_m^{\mathrm{MSE}}\mathbf{x}_m + o^{\mathrm{MSE}}\mathbf{1},$$
$$\mathbf{x}_{\mathrm{SSIM}} = s_1^{\mathrm{SSIM}}\mathbf{x}_1 + s_2^{\mathrm{SSIM}}\mathbf{x}_2 + \cdots + s_m^{\mathrm{SSIM}}\mathbf{x}_m + o^{\mathrm{SSIM}}\mathbf{1}.$$

Eqs. (8) and (17) give the method to calculate the affine parameters in the MSE-based scheme and the SSIM-based scheme, respectively. We list them below for comparison:

$$\sigma_{iy} = \sum_j s_j^{\mathrm{MSE}}\sigma_{ij}, \qquad \sigma_{iy} = \sum_j r_{xy}\,s_j^{\mathrm{SSIM}}\sigma_{ij}. \qquad (25)$$

The second equation in Eq. (25) is obtained from Eq. (17) by use of Eq. (18). Because $\sigma_{iy}$ and $\sigma_{ij}$ are the same in both equations in Eq. (25), we have

$$\frac{s_i^{\mathrm{MSE}}}{s_i^{\mathrm{SSIM}}} = r_{xy}. \qquad (26)$$

This equation offers the fourth association between the schemes: Association 4.
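Eq. (26) can also be verified numerically. Below, the SSIM-optimal parameters are obtained from the MSE-optimal ones by the rescaling implied by Eqs. (18) and (20); the random data is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m = 64, 3
X = rng.normal(size=(p, m))
y = rng.normal(size=p)

Xc = X - X.mean(axis=0)
yc = y - y.mean()
Sigma = (Xc.T @ Xc) / p
sig_y = (Xc.T @ yc) / p

s_mse = np.linalg.solve(Sigma, sig_y)        # Eq. (12)
x_mse = Xc @ s_mse
r = np.mean(x_mse * yc) / (x_mse.std() * yc.std())

# The SSIM-optimal parameters rescale s_mse so that sigma_x = sigma_y, Eq. (18)
s_ssim = (yc.std() / x_mse.std()) * s_mse

# Association 4 / Eq. (26): the componentwise ratio is the constant r_xy
assert np.allclose(s_mse / s_ssim, r)
```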
Although the affine parameters differ between the MSE-based scheme and the SSIM-based scheme, the ratio of the affine parameters in the MSE-based scheme to the corresponding affine parameters in the SSIM-based scheme is a constant, and the constant is exactly the correlation coefficient between x and y.

By using Associations 2 and 4, the data storage in sparse coding can be optimized. Because data storage is beyond the scope of this paper, we will discuss it in another paper.
5. Conclusion
Although there are many essential differences among the image assessments MSE, SSIM, and Pearson's correlation coefficient, in this work we have shown several interesting and important associations among them when they are used as cost functions in linear decomposition. First, the minimization of the MSE value between the target vector and its estimated vector in the MSE-based linear decomposition scheme is equivalent to maximizing the absolute value of SSIM in the SSIM-based linear decomposition scheme, and both are equivalent to maximizing the absolute value of the correlation coefficient between them. Second, the basis vectors selected from the same basis set for a given target vector are the same in the linear decomposition schemes with the cost functions MSE and SSIM. Moreover, the ratio of the affine parameters in the MSE-based scheme to the corresponding affine parameters in the SSIM-based scheme is a constant, and the constant is exactly the correlation coefficient between the target vector and its estimated vector. By using these associations, the data storage of sparse coding can be optimized.
References

[1] H. Liu, Y. Sun, Z. Jin, L. Yang, and J. Liu, "Capillarity-constructed reversible hot spots for molecular trapping inside silver nanorod arrays light up ultrahigh SERS enhancement," Chem. Sci., vol. 4, no. 9, pp. 3490-3496, 2013.
[2] A. Galdran, J. Vazquez-Corral, D. Pardo, and M. Bertalmio, "Fusion-based variational image dehazing," IEEE Signal Process. Lett., vol. 24, no. 2, pp. 151-155, 2017.
[3] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600-612, 2004.
[4] Z. Wang and Q. Li, "Information content weighting for perceptual image quality assessment," IEEE Trans. Image Process., vol. 20, no. 5, pp. 1185-1198, 2011.
[5] Z. Wang and E. P. Simoncelli, "Translation insensitive image similarity in complex wavelet domain," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), Philadelphia, Mar. 2005, vol. 2, pp. 573-576.
[6] S. H. Bae and M. Kim, "A novel SSIM index for image quality assessment using a new luminance adaptation effect model in pixel intensity domain," in Proc. IEEE Int. Conf. Visual Commun. Image Processing (VCIP), Dec. 2015, pp. 1-4.
[7] R. Dosselmann and X. D. Yang, "A comprehensive assessment of the structural similarity index," Signal Image Video Process., vol. 5, no. 1, pp. 81-91, 2011.
[8] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: A strategy employed by V1?," Vision Research, vol. 37, no. 23, pp. 3311-3325, 1997.
[9] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77-91, 1952.
[10] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, 2006.
[11] S. G. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Trans. Signal Process., vol. 41, pp. 3397-3415, 1993.
[12] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," in Proc. IEEE Asilomar Conf. Signals, Systems, Computers, 1993, pp. 40-44.
[13] J. Wang, Y. Liu, P. Wei, Z. Tian, Y. Li, and N. Zheng, "Fractal image coding using SSIM," in Proc. IEEE Conf. Image Process., Sep. 2011, pp. 241-244.
[14] J. Wang, N. Zheng, Y. Liu, and G. Zhou, "Parameter analysis of fractal image compression and its applications in image sharpening and smoothing," Signal Process. Image Commun., vol. 28, no. 6, pp. 681-687, 2013.
[15] D. Brunet, E. R. Vrscay, and Z. Wang, "Structural similarity-based affine approximation and self-similarity of images revisited," in Proc. Int. Conf. Image Anal. Recognit., June 2011, pp. 264-275, Springer: Heidelberg, Germany.
[16] D. Brunet, E. R. Vrscay, and Z. Wang, "On the mathematical properties of the structural similarity index," IEEE Trans. Image Process., vol. 21, no. 4, pp. 1488-1499, 2012.
[17] D. Brunet, E. R. Vrscay, and Z. Wang, "Structural similarity-based approximation of signals and images using orthogonal bases," in Proc. Int. Conf. Image Anal. Recognit., M. Kamel and A. Campilho, Eds., 2010, vol. 6111, LNCS, pp. 11-22, Springer: Heidelberg, Germany.
[18] A. Rehman, M. Rostami, Z. Wang, D. Brunet, and E. R. Vrscay, "SSIM-inspired image restoration using sparse representation,"