The relation between color spaces and compositional data analysis demonstrated with magnetic resonance image processing applications
AAJS
Austrian Journal of Statistics
The relation between color spaces andcompositional data analysis demonstrated withmagnetic resonance image processing applications
Omer Faruk Gulban
Maastricht University
Abstract
This paper presents a novel application of compositional data analysis methods inthe context of color image processing. A vector decomposition method is proposed toreveal compositional components of any vector with positive components followed bycompositional data analysis to demonstrate the relation between color space conceptssuch as hue and saturation to their compositional counterparts. The proposed methodsare applied to a magnetic resonance imaging dataset acquired from a living human brainand a digital color photograph to perform image fusion. Potential future applications inmagnetic resonance imaging are mentioned and the benefits/disadvantages of the proposedmethods are discussed in terms of color image processing.
Keywords : compositional data analysis, color, magnetic resonance imaging, image fusion.
1. Introduction
Compositional data analysis can be applied to color images in the context of image processingand analysis. Color images are stored as triplets of non-negative integers representing theadditive primary colors red, green and blue (referred as RGB channels). Once the color imageis formed, the transformation from RGB to hue, saturation and intensity (HSI) coordinatesystem is an often used first step to improve the image visualization (Smith 1978; Ledley,Buas, and Golab 1990; Grasso 1993) (for the mathematical expression of this transformationsee Appendix 6.2). HSI color space is commonly used in computer applications becauseof the commonly accepted property of its components relating to human color perception(Joblove and Greenberg 1978; Levkowitz and Herman 1993; Pohl and van Genderen 2016).Hue relates to the dominant component of the primary colors (red, green, blue) in the mixture,saturation relates to the distance from an equal mixture (gray), and intensity relates todistance from total darkness (black to white). In this article, inspiration is drawn fromRGB to HSI color space transformation to propose an analogous and more general methodbased on compositional data analysis. During this process, a simple vector decompositionis outlined to justify the use of compositional data analysis (Aitchison 1982; Pawlowsky-Glahn, Egozcue, and Tolosana-Delgado 2015) and the simplex space is leveraged to manipulatevector fields to perform image enhancement. The proposed method is demonstrated using a a r X i v : . [ q - b i o . Q M ] J un Colors and compositional data analysis magnetic resonance imaging (MRI) dataset. The results are also visualized using a digitalcolor photograph to provide a general intuition. Due to the interdisciplinary nature of thiswork a glossary is provided in Appendix 6.1 to accompany readers from different fields.
2. Materials and Methods
Whole head T1 weighted (T1w), proton density weighted (PDw) and T2* weighted (T2*w)images at 0.7 mm isotropic volumetric cubic element (voxel) resolution were acquired inone male participant using a three dimensional magnetization prepared rapid acquisitiongradient echo sequence with a 32-channel head coil (Nova Medical) on a 7 Tesla whole-bodyscanner (Siemens) in Maastricht Brain Imaging Center. These measurements reflect differentintrinsic properties of the tissues, for instance T1w image shows the optimal contrast betweenwhite matter and gray matter, PDw image shows the density of the hydrogen atoms, T2*wimage shows the iron content. For the detailed report of the data acquisition parameters seeAppendix 6.3. For a review on ultra high field MRI (( ≥ × × ∼ . ∼ . The image analysis illustrated in this paper starts from the following vector decomposition,where the real space of n dimensions is indicated with R n and the simplex space is indicatedwith S n symbols: (cid:126)v = [ v , v , . . . , v D ], where v , v , . . . , v D ∈ R D> ,(cid:126)v = 1 k C ( (cid:126)v ) s , where C ( (cid:126)v ) ∈ S D , and s ∈ R > , (1)where R D> indicates positive real numbers, k is an arbitrary scalar and the letter C stands forthe closure operation used in compositional data analysis (Aitchison 2002; Pawlowsky-Glahn et al. C ( (cid:126)v ) = k (cid:126)vs . (2)The letter s is another scalar that is the sum of the vector components: s = D (cid:88) i =1 v i . (3)Note that k disappears from the expression in Eq. 1 and 2 when selected as one. Thedecomposition (Eq. 1) of the vector (cid:126)v into its barycentric coordinates ( C ( (cid:126)v )) and a scalar( s ) allows the application of the compositional data analysis to the barycentric coordinates.Historically, the closure operation was used by August Ferdinand Mobius (Fauvel, Flood, andWilson 1993) and the resulting vector was interpreted as relating to the barycenter (center ofmass) of a simplex, hence giving the name to the vector decomposition (Eq 1) demonstratedhere. ustrian Journal of Statistics In this section, operations performed on the the barycentric coordinates are laid out. Theseoperations are mostly adapted from Pawlowsky-Glahn et al. (2015) to fit the context of theanalyzed vector field. Tensor notation is used for explicit formulations:Let A be a tensor with three dimensions, A x,y,z , where the coordinates x, y, z of an elementrepresents the spatial location. As mentioned the in Section 2.1, there are ∼ . A indicate different MRI measurements acquired at every voxel; A T wx,y,z , A P Dwx,y,z , A T ∗ wx,y,z . As the first step, all three tensors are vectorized (flattened): vec ( A t ) = [ a , , , . . . , a x, , , a , , , . . . , a ,y, , a , , , . . . , a , ,z ] T . (4) T stands for the transpose operator and t ∈ T w, P Dw, T ∗ w .Concatenation denoted by (cid:107) is used to matricize the vectorized tensors as the next step: V = vec ( A T w ) (cid:107) vec ( A P Dw ) (cid:107) vec ( A T w ) , (5)where number of rows of the matrix V is x × y × z ( n = ∼ . V indicate the set of voxels where the vector v i is the composition of T1w, PDw andT2*w measurements: V = [ v i , . . . , v n ] where i ∈ [1 , , . . . , n ] and v i = [ v T wi , v P Dwi , v T ∗ wi ] . (6)At this stage, V ∈ R . Voxel-wise (i.e. row-wise) closure is applied to V to acquire thebarycentric coordinates of every composition: X = C ( V ) = [ C ( v i ) , . . . , C ( v n )] , X ∈ S . (7)The set of compositions X was centered by finding the sample center and perturbing eachcomposition with the inverse of the sample center:ˆ X = X ⊕ cen ( X ) − , (8)where ⊕ denotes the perturbation operator (analogous to addition in real space): x ⊕ y = C [ x T w y , x P Dw y , x T ∗ w y ] . (9)Multipliers of the components are indicated with y , y and y . The term cen ( X ) is a vectorthat stands for the component-wise (i.e. column-wise) geometric mean across all voxels: cen ( X ) = [ g T w , g P Dw , g T ∗ w ] = (cid:32) n (cid:89) i =1 x ij (cid:33) /n , j = [ T w, P Dw, T ∗ w ] . (10)After centering, the data is standardized:ˆˆ X = ˆ X (cid:12) totvar [ X ] − / , (11)where (cid:12) symbol stands for the power operator (analogous to scaling in real space). Theexponent p is applied to every component: x (cid:12) p = C [ x pT w , x pP Dw , x pT ∗ w ] , (12)and the total variance is computed by: totvar [ X ] = 1 n n (cid:88) i =1 d a ( x i , cen ( X )) , (13) Colors and compositional data analysis where d a indicates squared Aitchison distance: d a ( x, y ) = (cid:118)(cid:117)(cid:117)(cid:116) D D (cid:88) j =1 D (cid:88) k =1 (cid:18) ln x j x k − ln y j y k (cid:19) . (14)In the current example D = 3 because of operating on a set of three part compositions X (see Equation 7).At this stage some visual intuition could be gained with regards to the vector field (MRimages) that is decomposed and processed via compositional data analysis methods. Forillustration purposes, this is done by generating virtual image contrasts computed throughmetrics of simplex space (see Figure 2).The norm in S image in Figure 2 is generated by computing Aitchison norm voxel-wise: (cid:107) X (cid:107) a = [ (cid:107) x i (cid:107) a , . . . , (cid:107) x n (cid:107) a ] . (15)The subscript a stands for the vector norm defined in simplex space (analogous to Euclideannorm in real space). Following Pawlowsky-Glahn et al. (2015), Aitchison norm is defined as: (cid:107) x (cid:107) a = (cid:118)(cid:117)(cid:117)(cid:116) D D (cid:88) j =1 D (cid:88) k =1 (cid:18) ln x j x k (cid:19) , (16)where D = 3 considering three different MRI measurements (T1w, PDw, T2*w).It should be noted that this norm image is analogous to saturation dimension in the afore-mentioned RGB to HSI color space transformation (see Appendix 6.2).The angular difference in S image in Figure 2 is the voxel-wise angular difference ( ∠ ) betweenthe set of compositional vectors ( X ) and a reference vector ( r ): X (cid:93) = [ ∠ x i r, . . . , ∠ x n r ] , (17)where ∠ x i r = arccos (cid:18) (cid:104) x i , r (cid:105) a (cid:107) x i (cid:107) a (cid:107) r (cid:107) a (cid:19) . (18) (cid:107) x (cid:107) a stands for Aitchison norm of the vector x as defined in Equation 16 and (cid:104) x, r (cid:105) a standsfor inner product in simplex space: (cid:104) x, r (cid:105) a = 12 D D (cid:88) j =1 D (cid:88) k =1 ln x j x k ln r j r k . (19)Here, D = 3 again and it should be noted that the choice of the reference vector r is arbitrary.In this example the reference vector is selected as r = [0 . , . , . truncate ( x ) = x (cid:12) (cid:18) (cid:107) x (cid:107) a λ (cid:19) , (cid:107) x (cid:107) a > λx , (cid:107) x (cid:107) a ≤ λ (20)This method is inspired from other simple color balance methods that makes use of the dy-namic range of RGB channels of color images (Limare, Lisani, Morel, Petro, and Sbert 2011). ustrian Journal of Statistics λ = 3 based on visual inspection of the resulting color enhancement. It is conceivable thatother operations based on percentiles can also be used to the make the decision less arbitrary(e.g. 99 th percentile of Aitchison norm distribution consisting of all vectors of X ).The ilr transformation (Egozcue, Pawlowsky-Glahn, Mateu-Figueras, and Barcelo-Vidal 2003;Pawlowsky-Glahn et al. X ilr = [ ilr ( x i ) , . . . , ilr ( x n )] , (21)where the ilr transformation is defined as: ilr ( x i ) = ln( x i ) · H . (22) H indicates the Helmert sub-matrix, chosen by following Tsagris, Preston, and Wood (2011);Lancaster (1965). In this case H consists of 3 rows and 2 columns and selected as the following: H = √ √ − √ √ − (cid:113) . (23)To explore the usefulness of the ilr coordinates and probe physically interpretable compo-sitional characteristics, a real-time interactive visualization method is used with joint ilr-coordinates and image space representations of the data. This method is similar to usingpre-defined modulation transfer functions for mapping the bins of a 2D histogram data rep-resentation to 3D image space used in volume rendering software (Kniss, Kindlmann, andHansen 2005). Major brain tissues such as gray matter, white matter, cerebrospinal fluid ar-teries and sinuses are identified manually by the author and delineated in both ilr-coordinatesand brain images using neuro-anatomical expertise.All analysis steps demonstrated in this work are implemented in a free and open sourcePython package (available at https://github.com/ofgulban/compoda; Gulban (2018)) usingNumpy (Van Der Walt, Colbert, and Varoquaux 2011), Scipy (Jones, Oliphant, Peterson et al.
3. Results
Figure 1 shows the similarity between HSI and the analogous metrics demonstrated in methodssection. When comparing Fig. 1 rows 2-3 column 1 to column 3, it can be seen that hue andangular difference in simplex space are both invariant to intensity gradient visible in thesky (brightness change from upper right corner to lower left). The same observation canalso be made for the saturation and norm in simplex space images (compare Fig. 1 rows 2-3column 2 to column 3). These observations can be related to the scale invariance principleof compositional data analysis. By applying the barycentric decomposition (Eq.1) to colortriplets in each pixel, scale information is separated, revealing compositional vectors invariantto the underlying multiplicative scalar field (i.e. brightness). These image contrasts based on
Colors and compositional data analysis
Original Red channel Green channel Blue channelHue Saturation IntensityAngular di ff erence in S Norm in S Sum of channels ustrian Journal of Statistics
Colors and compositional data analysis ilr OriginalAfter color balance
SuperiorInferiorAnterior Posterior Artery SinusWhiteMatter GrayMatter CerebrospinalFluid ArterySinus WhiteMatterGrayMatterCerebrospinalFluidPDw T1wT2*wPDw T1wT2*w Nr.voxels
Figure 3: MRI measurements rendered as a color image. Red channel is assigned to T1w,green to PDw and blue to T2*w measurements. Left column shows the 2D histograms of thecorresponding ilr coordinates (Eq. 22). The projection of primary axes of RGB color cube (in R ) to ilr coordinates are embedded to provide an intuitive reference for the characteristicsof the compositions. The effect of centering and standardization inside the simplex (Eq. 8,11) and the compositional truncation (Eq. 20) is visible as color balance improvement inthe rendered brain slice. The labels pointing to the tissues in brain image and circles in2D ilr coordinate histograms shows the relation of the compositional characteristics with thecoloration. The circles in 2D histograms are the edges of the 2D transfer functions used tomanually probe the tissue-ilr coordinate relationship.
4. Discussion
In this work, compositional data analysis methods are used to reformulate RGB-HSI colorspace transformation. It is visually demonstrated that the compositional metrics such asangular difference and norm in simplex space relate to hue and saturation concepts in colorspace literature. This reformulation of hue and saturation would be advantageous for havinga well-principled framework when operating on images with n-dimensions. For instance, apotential future application would be to analyze MRI datasets with more than three typesof measurements by adding cerebral blood volume measurements (Uludag and Blinder 2017)or multi-echo echo planar imaging (Poser and Norris 2009) to T1, T2 and PD weightedmeasurements at ultra high fields. Another application area would be processing of multi-spectral images for image fusion purposes (Pohl and van Genderen 2016). In cases where morethan three measurements are acquired for each element in an image, dimensionality reductionmethods such as principal component analysis in simplex space (Wang, Shangguan, Guan,and Billard 2015) would become relevant to maximize the visualized information contentof color images. This is the disadvantage of being limited to three primary additive colors ustrian Journal of Statistics
5. Acknowledgements
The data was acquired thanks to Federico De Martino, under the project supported by NWOVIDI grant 864-13-012. The author O.F.G. was also supported by the same grant. I thankIngo Marquardt for language editing in the initial version of the manuscript and AlbertoCassese for the advice on mathematical notations in Section 2.2. In addition, I wish toexpress my appreciation for the comments and suggestions of the anonymous reviewers whichI believe have helped to improve the manuscript.
6. Appendix
Barycentric coordinates : Center of mass-centric coordinates. Used in relation to the centerof mass of an n-simplex.
Color balance : Used to indicate centering and standardizing compositional vectors in thiswork.
HSI : A three dimensional space with named axes relating to human color perception: hue,saturation, intensity.
MRI : Magnetic resonance imaging.
PDw : Proton density weighted MRI signal acquisition. Indicates density of the hydrogenatoms in different tissues.
RGB : A three dimensional space with named axes relating to color cube: red, green, blue
Simplex : Generalized geometrical notion of triangles. For example, 0-simplex is a point,1-simplex is a line segment, 2-simplex is a triangle, 3-simplex is a tetrahedron etc.
T1w : T1 weighted MRI signal acquisition. Optimized to give the highest image contrastbetween white matter and gray matter brain tissues.
T2*w : T2* weighted MRI signal acquisition. Optimized to give the highest image contrastrelated to iron concentration between brain tissues.
Voxel : Volumetric cubic element, in other words a three dimensional pixel.0
Colors and compositional data analysis
Intensity = I = (cid:18) R max( R, G, B ) + G max( R, G, B ) + B max( R, G, B ) (cid:19) ÷ .Saturation = S = , I = 01 − min( R, G, B ) I , I > .Hue = H = ◦ , ∆ = 060 ◦ × (cid:18) G (cid:48) − B (cid:48) ∆ mod 6 (cid:19) , C max = R (cid:48) ◦ × (cid:18) B (cid:48) − R (cid:48) ∆ + 2 (cid:19) , C max = G (cid:48) ◦ × (cid:18) R (cid:48) − G (cid:48) ∆ + 4 (cid:19) , C max = B (cid:48) where ∆ = max( R, G, B ) − min( R, G, B )and R (cid:48) = R max( R, G, B ) , G (cid:48) = G max( R, G, B ) , B (cid:48) = B max( R, G, B ) . Whole head images were acquired using a three dimensional magnetization prepared rapidacquisition gradient echo (MPRAGE) sequence. The data consisted of a T1w image (repeti-tion time [TR] = 3100 ms ; time to inversion [TI] = 1500 ms [adiabatic non-selective inversionpulse]; time echo [TE] = 2 . ms ; flip angle = 5 ◦ ; generalized auto-calibrating partially par-allel acquisitions [GRAPPA] = 3 (Griswold, Jakob, Heidemann, Nittka, Jellus, Wang, Kiefer,and Haase 2002); field of view [FOV] = 224 × mm ; matrix size = 320 × ms ; TE= 2 . ms ; flip angle = 5 ◦ ; GRAPPA = 3; FOV = 224 × mm ; matrix size = 320 × ms ; TE = 16 ms ; flip angle = 5 ◦ ;GRAPPA = 3; FOV = 224 × mm ; matrix size = 320 × References
Aitchison J (1982). “The Statistical Analysis of Compositional Data.”
Journal of the RoyalStatistical Society , (2), 139–177.Aitchison J (2002). “A Concise Guide to Compositional Data Analysis.” CDA WorkshopGirona , , 73–81. ustrian Journal of Statistics doi:10.5281/zenodo.1011207 . URL https://doi.org/10.5281/zenodo.1011207 .Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barcelo-Vidal C (2003). “IsometricLogratio Transformations for Compositional Data Analysis.” Mathematical Geology , (3),279–300. ISSN 08828121. doi:10.1023/A:1023818214614 . URL http://link.springer.com/10.1023/A:1023818214614 .Fauvel J, Flood R, Wilson R (1993). Mobius and his band: mathematics and astronomy innineteenth-century Germany . Oxford University Press. URL https://books.google.de/books?id=z-XuAAAAMAAJ .Grasso DN (1993). “Applications of the IHS Color Transformation for 1:24,000-Scale Geo-logic Mapping: A Low Cost SPOT Alternative.”
Photogrammetric Engineering & Remotesensing , (1), 73–80.Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, Haase A(2002). “Generalized Autocalibrating Partially Parallel Acquisitions (GRAPPA).” MagneticResonance in Medicine , (6), 1202–1210. ISSN 07403194. doi:10.1002/mrm.10171 .Gulban OF (2018). “Compoda v0.3.1.” doi:10.5281/zenodo.1207293 . URL https://doi.org/10.5281/zenodo.1207293 .Gulban OF, Schneider M (2017). “Segmentator v1.3.0.” doi:10.5281/zenodo.999487 . URL https://doi.org/10.5281/zenodo.999487 .Hunter JD (2007). “Matplotlib: A 2D graphics environment.” Computing In Science &Engineering , (3), 90–95. doi:10.1109/MCSE.2007.55 .Joblove GH, Greenberg D (1978). “Color spaces for computer graphics.” ACMSIGGRAPH Computer Graphics , (3), 20–25. ISSN 00978930. doi:10.1145/965139.807362 . URL .Jones E, Oliphant T, Peterson P, et al. (2001–). “SciPy: Open source scientific tools forPython.” [Online; accessed 2015-07-13], URL .Kniss J, Kindlmann G, Hansen CD (2005). “Multidimensional transfer functions for vol-ume rendering.” Visualization Handbook , pp. 189–209. ISSN 1077-2626. doi:10.1016/B978-012387582-2/50011-3 .Lancaster HO (1965). “The Helmert Matrices.”
The American Mathematical Monthly , (1), 4. ISSN 00029890. doi:10.2307/2312989 . URL .Ledley R, Buas M, Golab T (1990). “Fundamentals of true-color image processing.” In [1990] Proceedings. 10th International Conference on Pattern Recognition , volume i, pp.791–795. IEEE Comput. Soc. Press. ISBN 0-8186-2062-5. doi:10.1109/ICPR.1990.118218 . URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=118218http://ieeexplore.ieee.org/document/118218/ .Levkowitz H, Herman G (1993). “GLHS: A Generalized Lightness, Hue, and Saturation ColorModel.” CVGIP: Graphical Models and Image Processing , (4), 271–285. ISSN 10499652. doi:10.1006/cgip.1993.1019 .2 Colors and compositional data analysis
Limare N, Lisani Jl, Morel Jm, Petro AB, Sbert C (2011). “Simplest Color Balance.”
ImageProcessing On Line , (1), 125–133. ISSN 2105-1232. doi:10.5201/ipol.2011.llmps-scb .URL .Pawlowsky-Glahn V, Egozcue JJ, Tolosana-Delgado R (2015). Modelling and Analysis ofCompositional Data . John Wiley & Sons, Ltd, Chichester, UK. ISBN 9781119003144. doi:10.1002/9781119003144 . URL http://doi.wiley.com/10.1002/9781119003144 .Pohl C, van Genderen J (2016).
Remote Sensing Image Fusion . CRC Press, Taylor & FrancisGroup, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742. ISBN978-1-4987-3002-0. doi:10.1201/9781315370101 . URL .Poser BA, Norris DG (2009). “Investigating the benefits of multi-echo EPI for fMRI at 7T.”
NeuroImage , (4), 1162–1172. ISSN 10538119. doi:10.1016/j.neuroimage.2009.01.007 .Smith AR (1978). “Color gamut transform pairs.” ACM SIGGRAPH Computer Graphics , (3), 12–19. ISSN 00978930. doi:10.1145/965139.807361 . URL http://portal.acm.org/citation.cfm?doid=965139.807361 .Smith S, Jenkinson M, Woolrich M (2004). “Advances in functional and structural MR imageanalysis and implementation as FSL.” Neuroimage , (November), S208–S219. ISSN 1053-8119. doi:10.1016/j.neuroimage.2004.07.051 .Smith SM (2002). “Fast robust automated brain extraction.” Human Brain Mapping , (3),143–155. ISSN 10659471. doi:10.1002/hbm.10062 . NIHMS150003 .Tsagris MT, Preston S, Wood ATA (2011). “A data-based power transformation for compo-sitional data.” arXiv preprint , (1), 1–9. , URL http://arxiv.org/abs/1106.1451 .Ugurbil K (2014). “Magnetic Resonance Imaging at Ultrahigh Fields.”
IEEE Transactionson Biomedical Engineering , (5), 1364–1379. ISSN 0018-9294. doi:10.1109/TBME.2014.2313619 . , URL http://ieeexplore.ieee.org/document/6778087/ .Uludag K, Blinder P (2017). “Linking brain vascular physiology to hemodynamic response inultra-high field MRI.” NeuroImage , (December 2016). ISSN 1095-9572. doi:10.1016/j.neuroimage.2017.02.063 . URL .Van Der Walt S, Colbert SC, Varoquaux G (2011). “The NumPy array: a structure forefficient numerical computation.”
Computing in Science & Engineering , (2), 22–30.van der Walt S, Sch¨onberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, GouillartE, Yu T, the scikit-image contributors (2014). “scikit-image: image processing in Python.” PeerJ , , e453. ISSN 2167-8359. doi:10.7717/peerj.453 . URL http://dx.doi.org/10.7717/peerj.453 .Wang H, Shangguan L, Guan R, Billard L (2015). “Principal component analysis for com-positional data vectors.” Computational Statistics , (4), 1079–1096. ISSN 16139658. doi:10.1007/s00180-015-0570-1 . ustrian Journal of Statistics Affiliation:
Omer Faruk GulbanCognitive Neuroscience Department, Faculty of Psychology and NeuroscienceMaastricht University6224 EV Maastricht, NetherlandsE-mail: [email protected]
URL: https://orcid.org/0000-0001-7761-3727
Austrian Journal of Statistics published by the Austrian Society of Statistics
Volume VV
Submitted: yyyy-mm-ddMMMMMM YYYY