[PDF] LISA: a MATLAB package for Longitudinal Image Sequence Analysis

Abstract

Large sequences of images (or movies) can now be obtained on an unprecedented scale, which poses fundamental challenges to the existing image analysis techniques. The challenges include heterogeneity, (automatic) alignment, multiple comparisons, potential artifacts, and hidden noises. This paper introduces our MATLAB package, Longitudinal Image Sequence Analysis (LISA), as a one-stop ensemble of image processing and analysis tool for comparing a general class of images from either different times, sessions, or subjects. Given two contrasting sequences of images, the image processing in LISA starts with selecting a region of interest in two representative images, followed by automatic or manual segmentation and registration. Automatic segmentation de-noises an image using a mixture of Gaussian distributions of the pixel intensity values, while manual segmentation applies a user-chosen intensity cut-off value to filter out noises. Automatic registration aligns the contrasting images based on a mid-line regression whereas manual registration lines up the images along a reference line formed by two user-selected points. The processed images are then rendered for simultaneous statistical comparisons to generate D, S, T, and P-maps. The D map represents a curated difference of contrasting images, the S map is the non-parametrically smoothed differences, the T map presents the variance-adjusted, smoothed differences, and the P-map provides multiplicity-controlled p-values. These maps reveal the regions with significant differences due to either longitudinal, subject-specific, or treatment changes. A user can skip the image processing step to dive directly into the statistical analysis step if the images have already been processed. Hence, LISA offers flexibility in applying other image pre-processing tools. LISA also has a parallel computing option for high definition images.

Full PDF

LLISA : A MATLAB

PACKAGE FOR L ONGITUDINAL I MAGE S EQUENCE A NALYSIS

A P

REPRINT

Jang Ik Cho

Eli Lilly and CompanyGlobal Headquarters Lilly Corporate CenterIndianapolis, Indiana 46285 USAE-mail: [email protected]

Xiaofeng Wang

Department of Quantitative Health SciencesCleveland Clinic Lerner Research Institute9500 Euclid Ave. JJN3Cleveland, OH 44139E-mail: [email protected]

Yifan Xu

Amazon.com, Inc.410 Terry Avenue NSeattle WA 98109E-mail: [email protected]

Jiayang Sun

Department of Epidemiology and BiostatisticsCase Western Reserve University10900 Euclid Ave. SOM, WG-43Cleveland, OH 44106-4945E-mail: [email protected] 19, 2019 A BSTRACT

Large sequences of images (or movies) can now be obtained on an unprecedented scale, whichpose fundamental challenges to the existing image analysis techniques. The challenges includeheterogeneity, (automatic) alignment, multiple comparisons, potential artifacts, and hidden noises.This paper introduces our

MATLAB package, L ongitudinal I mage S equence A nalysis ( LISA ), as aone-stop ensemble of image processing and analysis tool for comparing a general class of imagesfrom either different times, sessions, or subjects. Given two contrasting sequences of images, theimage processing in

LISA starts with selecting a region of interest in two representative images,followed by automatic or manual segmentation and registration. Automatic segmentation de-noisesan image using a mixture of Gaussian distributions of the pixel intensity values, while manualsegmentation applies a user-chosen intensity cut-off value to ﬁlter out noises. Automatic registrationaligns the contrasting images based on a mid-line regression whereas manual registration lines upthe images along a reference line formed by two user-selected points. The processed images arethen rendered for simultaneous statistical comparisons to generate D, S, T, and P-maps. The D maprepresents a curated difference of contrasting images, the S map is the non-parametrically smootheddifferences, the T map presents the variance-adjusted, smoothed differences, and the P-map providesmultiplicity-controlled p-values. These maps reveal the regions with signiﬁcant differences due toeither longitudinal, subject-speciﬁc, or treatment changes. A user can skip the image processing stepto dive directly into the statistical analysis step if the images have already been processed. Hence,

LISA offers the ﬂexibility in applying other image pre-processing tools.

LISA also has a parallelcomputing option for high deﬁnition (HD) images. K eywords longitudinal images · high-dimensional data · image sequence · multiple comparison · non-parametricsmoothing · MATLAB a r X i v : . [ s t a t . C O ] F e b PREPRINT - F

EBRUARY

19, 2019

Rapid development in imaging technologies has made medical imaging one of the most important techniques inproviding essential evidence for clinical diagnosis and medical intervention. We can now collect unprecedented amountsof image data in both spatial and temporal dimensions per patient, from one or more modalities. The challenges inextracting valuable information from these images motivate modern updates and developments in image processingand analysis methods and tools. For example, three recent imaging books by Russ and Neal[1], Birkfellner[2], andBurger and Burge[3] show the fundamentals of image processing methods developed in the last score. Industries havealso invested in developing image processing tools to accommodate many different modalities with types of imagesranging from document imaging [4] and multimedia imaging [5] to medical imaging [6]. The National Institute ofHealth (NIH)’s Center for Information Technology is leading the

Medical Image Processing, Analysis, and Visualization (MIPAV) project for quantitative analysis of medical images [7]. Most of the functionalities in MIPAV are for image(pre)processing of PET, MRI, CT, microscopy, 3D images and other widely-used images in medicine. One of the mostwidely used image analysis that MIPIV provides, uses summary statistics of the volume of interest (VOI) within asingle image to generate contour VOI plots for all of the images to be analyzed.Comparing images or assessing their similarities is one of the most important objectives in image analysis. The methodsfor such comparisons include the methods for pixel by pixel comparison and those based on a speciﬁc image measure [8].The measure-based methods include Least Square Image Matching (LSM) [9], Hausdorff distance [10] and componentsof a matrix decomposition [11]. Among these measures, LSM has been used extensively because of its simplicity, but itmay suffer from inaccurate initial input value and outliers [9]. To obtain comprehensive details of the differences overentire imaging region, a statistical pixel by pixel comparison is needed.Denote an image sequence as spatial and temporal data Y ( s, t, n ) , where Y is the intensity value at the pixel orspatial location s ∈ S and time t ∈ T for the subject n ∈ N . Here S , T , N are the sizes of sets S , T , and N respectively. Then, the number of measurements per subject, or the data dimension is p = S × T . This p can be huge, leading to a large-p data problem. If N is much smaller than p , we have a large-p-small-n problem.Large-p data challenges statistical multiple comparisons, no matter the number of subjects N is large or small. Howto obtain an objective threshold for multiple comparisons of large, noisy and potentially heterogeneous images haschallenged statisticians, especially for analyzing overﬂowing simultaneous images. Statistical parametric mapping(SPM) [12] is an effective and widely-used pixel by pixel image comparison approach. The motivation for developing L ongitudinal I mage S equence A nalysis ( LISA ) MATLAB package comes from generalizing and making a one-stopensemble of image processing and analysis tool for complex, longitudinal images, by transforming the components ofthe Longitudinal Analysis and Self-Registration (LASR) procedure [13]. The LASR was developed to overcome theheterogeneity, and the large-p-small-n issue in assessing the effect of Neuromuscular Electrical Stimulation (NMES)experiments [14] in reducing the risk of developing pressure ulcers. The data for LASR were movies composed fromsequences of seated pressure mat images obtained under a static and a dynamic stimulation, respectively, in each sessionfrom each visit by a patient, for eight patients observed up to 5 years. There were artifacts, heterogeneity, noises, andsubstantial spatial and temporal alignment issues from different sessions even on the same patient. The data was massivewith a huge p and a small n of 8. These data required extensive imaging preprocessing and general statistical analysismethod, Statistical Non-parametric Mapping (SNM) that is more general than the SPM, for each set of comparisons.Various code for the components of LASR was developed individually and speciﬁcally to the pressure data formate. Itwould have been easier if the LISA , a general purpose MATLAB package was available then. With increasing amountsof imaging data from different modalities in nowadays, it becomes imperative to implement and generalize the LASRmethod into a stand-alone MATLAB package for general application of images in standard image formate. Making

LISA available to public as a one-stop MATLAB package sets a basis for future development for even large data andmore generalized case. See the discussion in Section 6.Figure 1 above shows the schematic structure of

LISA tool, where the ‘image processing’ is called ‘image pre-processing,’ and ‘statistical analysis’ is called ‘image post-processing.’ The top part of Figure 1 presents the overallﬂow of

LISA . A complete process of

LISA goes through 2 steps: “I. Image Pre-processing” step as introduced inthe left bottom ﬂowchart of Figure 1 and “II. Image Post-processing (Statistical Analysis)” step as introduced in rightbottom ﬂowchart of Figure 1. However, if image pre-processing is unnecessary or a user chooses to use other

MATLAB built-in tools to de-noise, segment or register their two sequences of images,

LISA has the option to skip the imagepre-processing step to go directly to image post-processing (statistical analysis). Image pre-processing is done by 1)selecting a region of interest (ROI), 2) de-noising the images by automatic or manual image segmentation, and 3)aligning the images by also automatic or manual image registration. Image post-processing is done automatically by 1)calculating curated differences, 2) de-noising the differences obtained from Step 1 by a non-parametric smoothing, 3)deriving variance-adjusted, smoothed values to get T-type statistics, and 4) providing ﬁnal signiﬁcances of the differences2

PREPRINT - F

EBRUARY

19, 2019Figure 1: Structure of

LISA based on the multiplicity adjusted p-values. These four post-processing steps provide Statistical Nonparametric Mapping(SNM) with minimal assumptions on the images. Here we use the False Discovery Rate (FDR) to adjust the multiplicity.The development of

MATLAB package

LISA (Longitudinal Image Sequence Analysis) has implemented and expandedthe functionality of LASR to general cases for any images that can be aligned using a reference line (estimable fromthe image data directly) or general image processing tools in Matlab. A graphical user interface in

LISA providesconvenience for a user to quickly remove noise and artifacts, decide on the region of interests and choose a quickreference line if feasible. In addition,

LISA provides a parallel computing option for a large sequence or high-resolutionimages.In the remaining paper, we provide the

LISA installation guide (§2), options for pre-processing (§3) and post-processing(statistical analysis) with a brief introduction of methodologies (§4), demonstrate the software

LISA via an application(§5), and then conclude the article with a discussion, including limitations and further extensions (§6). For the detailsand interpretations of of the D, S, T and P maps, see Section 4. With features introduced in

LISA , we improve theaccuracy of statistical comparison for both static and dynamic sequences of images.

LISA can tackle two sequencesof time-series images in general, including seated pressure mat images used in the NMES study, time-series satelliteimage[15], and time-series chemical imaging data [16].

A step-by-step installation guide is provided in this section.

1. Download LISA AppLISA program “LISA.zip” can be downloaded from the following two locations. · MathWorks File Exchange website : PREPRINT - F

EBRUARY

19, 2019 · Center for Statistical Research, Computing, and Collaboration(SR2c) website : http://sr2c.case.edu/software.html

The archive “LISA.zip” contains three ﬁles : LISAxxx.mlappinstall, LISAxxx.prj, and ReadMe where xxx is thepackage version number. “LISAxxx.mlappinstall” ﬁle is the installation ﬁle, “LISAxxx.prj"’ is a project ﬁle containingpackage information and “ReadMe” ﬁle is a quick starting guide.

2. Install and load LISA App

With a newer version of

MATLAB (version 8 or above), one can install and load

LISA by simply running theLISAxxx.mlappinstall ﬁle [17] and follow the three steps below. However, with an older version of

MATLAB (version7 or below) one can manually install

LISA using the instructions provided in the following MathWorks website .1. Run LISAxxx.mlappinstall ﬁle. Then

MATLAB will be launched and a “Install” window will pop-up as shownin Figure 2 (a)2. Continue by clicking the “Install” button.

MATLAB will automatically install the package and will notify theinstaller from the “APP” tab when the installation is complete, as shown in Figure 2 (b)3. Load

LISA by clicking on the

LISA

App icon under “APPS - MY APPS” as shown in Figure 3 (a) Installation Window (b) Installation Complete

Figure 2: Installation ProcessFigure 3: Where

LISA can be found and launchedWhen

LISA is loaded, a user can start using the program right away by running the lisa function. Typing “helpLISA”into

MATLAB command window will provide a vignette including the syntax and examples of the lisa function.

LISA is bulit to compare two sequences of images by contrasting each sequence with representing images taken in onesession. The ﬁrst stable image of each sequence is treated as its reference image, which will be used throughout theimage processing. Data and image processing consists of the following 5 steps.1. Data Scan : Scan image frames from “csv" ﬁles2. Region of Interest(ROI) Selection : Crop images to the region of interest4

PREPRINT - F

EBRUARY

19, 20193. Image Segmentation : Detect noise4. Image Registration : Calibrate Images5. Finer ROI Selection(optional)

The current version of

LISA accepts “csv" ﬁles as inputs. A user can convert various image formats to a “csv” formatin

MATLAB by reading images using imread and saving the image to a “csv” ﬁle using csvwrite . In general, theoutput of an imaging tool in a “csv” ﬁle includes information on the date, time, frame numbers, intensity values of theimages and metadata such as equipment model, user information etc. Therefore, it is very important to distinguish theimage intensity matrix in a numerical form to conduct statistical comparison.

LISA is able to extract image intensitymatrices from the raw data ﬁles that are in the following three formats as shown in ﬁgures 4(a), 4(b), and 4(c). (a) Row Identiﬁer (b) Column Identiﬁer (c) Blank Row

Figure 4: Data Matrix Scanning MethodThe data structure in Figure 4(a) requires a user to extract the data matrix using a row identiﬁer. A user needs to set theparameter ‘scantype’ to ‘row’ and a row identiﬁer character string needs to be passed over to the parameter ‘rowid’ . LISA skips the header rows until it encounters the ﬁrst identiﬁer. It then scans next ‘nrow’ number ofrows to extract as a numeric matrix and deﬁne it as the ﬁrst image frame.

LISA continues the same steps until itreaches the last identiﬁer. Here is an example of the syntax for scanning the data using the row identiﬁer having allother arguments with their default values. lisa(‘file1.csv’,‘file2.csv’,‘row’,

The data structure in Figure 4(b) requires a user to extract the data matrix using a column identiﬁer. A user needs tospecify the column that contains information on the frame numbers by setting the parameter ‘scantype’ to ‘col’ and a column number needs to be passed over to the parameter ‘colid’ . LISA reads the number in that column toextract numeric matrix and save each matrix to its corresponding frames. Here is an example of the syntax for scanningthe data using the column identiﬁer having all other arguments with their default values. lisa(‘file1.csv’,‘file2.csv’,‘col’,

The data structure in Figure 4(c) requires a user to extract the data matrix using a blank row right before the numericintensity matrix. A user needs to set the parameter ‘scantype’ to ‘blank’ and LISA skips the header rows untilit encounters the ﬁrst blank row. It then scans next ‘nrow’ number of rows to extract as a numeric matrix and deﬁne itas the ﬁrst image frame.

LISA continues the same steps until it reaches the last blank row. Here is an example of the5

PREPRINT - F

EBRUARY

19, 2019syntax for scanning the data using a blank row having all other arguments with their default values. lisa(‘file1.csv’,‘file2.csv’,‘blank’,

This is the ﬁrst interactive step for a user. A user is given the opportunity to select a rectangular region of interest fromthe ﬁrst images of each sequence. Cropping images to a speciﬁc area of interest enhances the result of data segmentationand data registration because this naturally eliminates some part of the noise.

LISA uses the “ imcrop " function withinthe

MATLAB Image Processing Toolbox to select the region of interest.

It is critical to reduce noises and outliers by segmenting an image, which sets the noisy intensity values to zero beforeperforming an image registration. LISA provides data-driven automatic and manual segmentation. A pixel is identiﬁedas noise if its intensity value is less than a threshold C . Let Y ( i, j ) denote the pixel intensity value at the i th row and j th column of a representative image. Then the intensity values in the segmented image are (cid:101) Y ( i, j ) below: ˜ Y ( i, j ) = (cid:26) Y ( i, j ) if Y ( i, j ) > C otherwise . (1)The threshold value C is derived differently for automatic segmentation and manual segmentation, as shown below. Automatic segmentation ﬁnds an optimal C that separates data into two groups with the maximum separation, based onan estimated mixture of Gaussian distributions of the pixel intensity values, as introduced in [13]. The mixture density f ( y ) has the following form: f ( y ) = G (cid:88) g =1 α g σ g φ (cid:18) y − µ g σ g (cid:19) ≡ α · f ( y ) + (1 − α ) · f ( y ) (2)where φ is the standard normal density, G is the number of mixtures, and { α g > , g = 1 , ..., G } are mixingparameters satisfying (cid:80) Gg =1 α g = 1 . The ﬁrst-component density f ( y ) = φ [( y − µ ) /σ ] /σ represents thebackground distribution, while the second-component density f ( y ) = (cid:80) Gg =2 ( α g /σ g ) · φ [( y − µ g ) /σ g ] / (1 − α ) represents the distribution of pixel intensity inside the region of interest, also as a mixture of Gaussian distributions.The C value automatically computed by LISA separates f and f at the valley bottom between the two densities - seethe black vertical line in Figure 5.Figure 5: Data Segmentation from Histogram and Density Plot6 PREPRINT - F

EBRUARY

19, 2019Figure 5 is an example of the histogram and density plot for the pixel intensity values in a seated pressure map. Thedistribution of the intensity values is modeled by a mixture of 3-component Gaussian distributions. The threshold C isdetermined by the vertical line that separates the ﬁrst and second components in the ﬁtted parametric Gaussian mixturemodel. Manual segmentation allows a user to choose C visually based on a histogram of pixel intensity values. Figure 6 is ahistogram of intensity values. It is natural to select the threshold C at the gap where the target cross is located becauseat that threshold value the intensity values are distinctively divided into two groups. Once the threshold C is selected,intensity values below C will be replaced with zero, deﬁned as the background noise.Figure 6: Manual Segmentation from Histogram There are two steps in image registration: temporal registration and spatial registration. Temporal registration excludesunstable images from each sequence and identiﬁes the most similar subset of images to compare between the twosequences. Spatial image registration is done using a geometric transformation of the images.

The correlation coefﬁcient of the intensities between two images is a reasonable measure of similarity between the twoimages.

LISA temporally aligns two image sequences (of same speed) by shifting one sequence forward j frames tomatch temporally the conditions of another sequence, using an Intensity-based Correlation Registration ( ICR ) below.1. Exclude or the ﬁrst 5 % of images from each sequence, whichever the number is larger. This step removesnoise images at the start of recording. Let S , ..., S n and S , ..., S n be the emaining n sequential images intwo image sequences S and S .2. Compute the shift- j correlation coefﬁcient of two sequences by averaging correlation coefﬁcients of the datafrom the respective i -th and ( i + j ) th frames of two sequences over i = 1 , ..., n − j , for each j = 0 , ..., n − to obtain: AvgCor j = 1 n − j (cid:88) i cor ( S i , S i + j ) Then ﬁnd j max that maximizes the correlations coefﬁcients vgCor j over j . In other words j max = min { j : AvgCor j = max j ( AvgCor j ) } .3. Align S i and S i + j max for frames i = 1 , ..., n − j max . provides automatic registration and manual registration for the sequences after a temporal registration,7 PREPRINT - F

EBRUARY

19, 2019

Automatic Registration

The following midline registration algorithm from Algorithm 3.1 in

Wang et al.(2006) [13]is applicable to registering images that can be aligned by a rigid transformation using a midline (and a point) as a naturallandmark.

Automatic Registration Algorithm :1. Determine, column by column, the mid-points( m c ) by counting non-zero values in each column of an imageusing m c = r + ( u − l )2 , where r, u, l are the number of rows, number of non-zero intensity values from theupper half and lower half of the column respectively for c = 1 , ..., of columns .2. Fit a regression line through the mid-points and call this line the mid-line.3. Conduct a rigid transformation by aligning each mid-line with a horizontal line as shown in Figure 7 using arotation and location shift.Figure 7: Transformation in Automatic RegistrationIn other words, we deﬁne the transformation matrix R using the slope and intercept estimates of the mid-linefrom step 2 as R = (cid:34) cosθ − sinθ s x sinθ cosθ s y (cid:35) where tanθ is the slope of the mid-line and s x and s y are the shifting parameters. Using R we change X to X (cid:48) = RX where X contains the information of the coordinate of the pixel deﬁned as X = [ x, y, .Figure 8 shows an example of automatic registration. The image is translated to be aligned with the horizontal centerline. The mid-line registration algorithm given above is fast to implement automatically when comparing sequencesof many images. For images that require a rescaling or other types of registration, we suggest a user to use other MATLAB built-in image processing tools or manual registration in the next section.8

PREPRINT - F

EBRUARY

19, 2019 (a) Before Registration:Image and midline (b) After Registration:Image and midline

Figure 8: An example of automatic spatial image registration

Manual Registration

Manual registration aligns the images using a line deﬁned by two user-selected points. Usersselect two points on the ﬁrst images from each sequence which makes four point selections in total. The ﬁrst pointsselected for each sequence are used as a reference point which is aligned at (5, (a) Points selection for the 1 st sequence (b) Points selection for the 2 nd sequence Figure 9: An example of point selection in manual registrationThen

LISA will perform a rigid transformation based on a shift and rotation matrix of R deﬁned by the dotted blueline and 1 st points from Figure 9, where R = (cid:34) cosθ − sinθ usinθ cosθ v (cid:35) . The mathematics behind the transformation usingthe matrix R us identical to step 3 in the automatic registration. provides an optional region of interest selection after the image registration as shown in Figure 10. This step isdone by connecting a polygon around the ROI to distinguish ROI from the background by setting intensity valuesoutside of the ROI to zero. Figure 10 shows an example of optional ROI selection after registration. Comparingﬁgure 10(a) and ﬁgure 10(b), regions at y = 7 to 15 at x = 30 of the image in ﬁgure 10(b) is trimmed off after theselection. 9 PREPRINT - F

EBRUARY

19, 2019 (a) Before optional ROI Selection (b) After optional ROI Selection

Figure 10: Example of optional ROI selection.This step is optional to be used only when it is necessary because a user needs to be cautious when selecting theROI at this step. If the ROI is accurately selected, it will improve the accuracy of our statistical analysis. However,if the ROI selection includes artifacts or strong noises by accident, it will corrupt the image and possibly lead tomisleading signiﬁcance regions.

Our primary objective for

LISA is to identify the areas where there have been signiﬁcant changes. The statisticalmethods to achieve this goal are given in the following four steps, by revising the Algorithm 4.1 in our early work[13].1. Curated difference2. Non-parametric smoothing map3. Generalized T-Map4. FDR-controlled P-Map

The initial step is to calculate curated difference of intensity values between the pairs of images in the two sequencesbeing compared. For an image with a resolution of n by m , the image contains N number of pixels where N = n × m . Let Y ( i, j, t ) and Y ( i, j, t ) be the pixel intensity values at coordinate ( i, j ) of the corresponding images attime t in sequence 1 and sequence 2 respectively. Then the curated difference d ( i, j, t ) is a pixel by pixel subtractiondeﬁned as d ( i, j, t ) = Y ( i, j, t ) − Y ( i, j, t ) for i = 1 , ..., n , j = 1 , ..., m and ∀ t are calculated for every pair ofimage comparison from the two image sequences and are used to generate the time series D-maps. Our non-parametric smoothing map ﬁrst pads the border of curated difference and then smooth the curated difference, d j , in the region of interest. Padding the edge will prevent outstanding false-positive differences due to imagealignment and noise. Our non-parametric smoothing method uses a local polynomial regression. The method isreproduced from 4.1.T-type Statistics [13] as follows.For a single set of curated difference image data { ( x j , x j , d j ); j = 1 , ..., N } , consider the model d j = m ( x j , x j ) + (cid:15) j where m ( · ) is an unknown smooth function and (cid:15) i is an error term representing the random errors of the curateddifferences derived from the image observations. Then the smooth function m at s = ( x , x ) can be approximatedby the value of a ﬁtted local polynomial function using the intensity values in the neighborhood of s . Speciﬁcally,10 PREPRINT - F

EBRUARY

19, 2019for any s = ( x , x ) of interest, the least squares problem seeks ˆ β = ( ˆ β , ..., ˆ β ) such that ˆ β = − min β N (cid:88) j =1 (cid:26) d j − β − β ( x j − x ) − β ( x j − x ) − β x j − x ) − β x j − x ) − β ( x j − x )( x j − x ) } w ( x j − x ; h ) w ( x j − x ; h ) and then deﬁnes the estimate of the regression surface as ˆ m ( x , x ) = ˆ β [18, 19]. The smooth weight function w ( z ; h ) peaks at z = 0 and the smoothness of ˆ m is controlled by the bandwidth h . A normal density function withmean 0 and standard deviation h is a typical choice. In our procedure, the bandwidths h and h are selected basedon the approach of 10 fold cross validation with 20 iterations [20]. The method used for generalized T-Map is reproduced from 4.1.T-type Statistics [13] as follows.The hypothesis we are testing is H : m ( s ) = 0 vs. H : m ( s ) > at s = ( x , x ) . Where m can be written as alinear combination of the actual curated differences d j of pixel values, ˆ m ( s ) = N (cid:88) j =1 p j ( s ) y j where p ( s ) T = ( p ( s ) , ..., p N ( s )) are the rows of the hat matrix speciﬁed by the local quadratic approximation. Theestimated standard deviation of the local estimate ˆ m is ˆ S ( s ) = ˆ σ (cid:107) ( s ) (cid:107) . From these estimates, a T-type statistics isderived as, T ( s ) = ˆ m ( s )ˆ S ( s ) T ( s ) follows t-distribution and the null hypothesis is rejected at s when T ( s ) > t − α ( σ /σ ) under a signiﬁcancelevel of α with the degrees of freedom n − , and σ and σ obtained by two-moment chi-square approximations[18, 21, 22]. T-type statistics is different from the conventional T-statistics because it uses the weighted average ofthe values near s whereas conventional T-statistics uses simple average. Therefore, T ( s ) and T ( s (cid:48) ) can be correlatedfor our T-type statistics whereas the test statistics are independent to each other in the conventional T-tests. The P-Map is generated from multiple comparison controlled p-values derived from the distribution of the T-typestatistics. The most important problem in generating P-Map is overcoming the multiplicity problem for simultaneouscomparisons. Family-wise error rate (FWER) such as Bonferroni correction and false discovery rate(FDR) are themost common methods used to control for multiple comparisons. Section 4.2 of

Zhang(2005) [23] discussed therelationship between FWER and FDR and we decided to use FDR for

LISA . Speciﬁcally, we used the Benjamini andHochberg’s FDR-controlled method [24]. The details of the method are introduced in section 4.2 Multiple testingproblem of

Wang et al.(2006) [13].

LISA provides ‘2D’ and ‘3D’ options in displaying the P-map.

In this section, an application of

LISA on Neuromuscular Electrical Stimulation(NMES) experiment [14] isintroduced. NMES is a treatment proposed to improve pressure distribution at the seating or lying support area forpatients with mobile difﬁculties. Patients are at a risk of developing pressure ulcers when seated or lying down for along time due to the blocking of the blood ﬂow. NMES is intended to improve the blood ﬂow for healthier regionaltissue. In this application, the effect of NMES is evaluated by comparing the sequences of the pressure mat imagesbefore and after the treatment. A step by step process of using

LISA will be introduced to conduct the comparison.Generic

LISA function :l i s a ( ’ f i l e n a m e 1 ’ , ’ f i l e n a m e 2 ’ , ’ s c a n t y p e ’ , ’ nframe ’ , ’ nrow ’ , ’ n c o l ’ ,’ rowid ’ , ’ c o l i d ’ , ’ p r e p r o c e s s ’ , ’ s e g t y p e ’ , ’ r e g i s t t y p e ’ , ’ r o i ’ ,’ d i s p l a y m ’ , ’ dimenpmap ’ , ’ parcomp ’ )11

PREPRINT - F

EBRUARY

19, 2019 filename1

Name of the csv ﬁle that contains the ﬁrst sequence of images filename2

Name of the csv ﬁle that contains the second sequence of images scantype

Data scanning method. Possible options include ‘row’ : using a row identiﬁer (required arguments : nrow, rowid) ‘col’ : using a column identiﬁer (required arguments : colid) ‘no’ : using a blank row nframe

Number of frames in each sequence nrow

Number of rows in one frame or the height of the image ncol

Number of columns in one frame or the width of the image rowid

A characters string that identiﬁes the start of a frame colid

An integer number that identiﬁes the column where the frame numbers are stored preprocess

Logical value to inform if the images have been pre-processed. Possible options include ‘T’ : True ‘F’ : False (default) segtype

Type of image segmentation. Possible options include ‘auto’ : De-noises the image by choosing a cutoff value to divide noise to the regionof interest by using a mixture of normal distributions of the intensity values (default) ‘manual’ : De-noises the image through user-chosen cutoffs for the intensity valuesto divide noise to the region of interest registtype

Type of image registration. Possible options include ‘auto’ : Rotates and/or shifts images based on a mid-line found by a linear regression(no user interface is required) (default) ‘manual’ : Rotates and/or shifts images based on a line speciﬁed by two pointsselected manually by a user (the ﬁrst click sets a reference point and the second clickdraws the reference line) roi

Select region of interest after image registration. Possible options include ‘T’ : True ‘F’ : False (default) display

Output option. Possible options include ‘basic’ : Only displays the P-Movie (movie of multiple comparison adjusted p-values) (default) ‘all’ : Displays O-Movies (movies of the original two sequences of images), R-Movies (movies of the registered sequence of images), D-Movie (movie of the curateddifference), S-Movie (movie of the non-parametrically smoothed differences), T-Movie(movie of variance-adjusted smoothed differences), and P-Movie dimenpmap

Select the dimension of P-Movie. Possible options include : Two dimensions (default) : Three dimensions parcomp Parallel computing. Possible options include ‘T’ : True ‘F’ : False (default)

The argument ‘scantype’ takes one of the three possible values. When scantype = ‘row’ , LISA ﬁnds theﬁrst row that matches ‘rowid’ , then ‘nrow’ rows after the ‘rowid’ are stored as a frame. The search continuesuntil the frame after the last ‘rowid’ is read in. When scantype = ‘col’ , LISA scans the csv ﬁle, and readin frames using the frame numbers found under the ‘colid’ column. When scantype = ‘no’ , LISA usesblank rows as row ids, and read in ‘nrow’ rows after each blank row as individual frames. After all data are readin,

LISA will have two arrays with nframe number of cells having ‘nrow’ by ‘ncol’ matrix of the intensityvalues in each cell. A user can skip the image processing step if it’s unnecessary by setting ‘preprocess’ to ‘T’ . Otherwise, a usercan set ‘preprocess’ to ‘F’ to process the images as follows.12 PREPRINT - F

EBRUARY

19, 2019 shows the ﬁrst images of each sequence to select ROI as the ﬁrst step of image processing. An example isdisplayed in Figure 11 and speciﬁc instructions on how to choose ROI are provided on the screen. A user needs todrag a square to roughly select the ROI. It is advised to choose a region large enough to enclose all ROI. Between thetwo regions selected from each sequence, the size of the larger region will be used to crop the images. (a) Select ROI for 1 st sequence (b) Select ROI for 2 nd sequence Figure 11: Example of ROI selection

In the image segmentation step, noises to ROI are identiﬁed using the pixel intensity values. Automatic and manualimage segmentation options are available in

LISA as described in section 3.3. The following is an example.(Automatic segmentation)1. Select ROI : A user can use a multi-point polygon to select detailed ROI as shown in Figure 12. (a) Select detalied ROI for 1 st sequence (b) Select detailed ROI for 2 nd sequence Figure 12: Example of ROI selection for automatic image segmentation13

PREPRINT - F

EBRUARY

19, 20192. Select groups for mixture of Gaussian distribution : A user can change the number of bins and deﬁne visuallydistinct groups from the distribution of intensity as shown in Figure 13. (a) Select groups for 1 st sequence (b) Select groups for 2 nd sequence Figure 13: Example of groups selection for automatic image segmentation(Manual segmentation)1. Pick cutoff value : A user should select one cutoff value from the histogram of the intensity values to distinguishthe noise and ROI as shown in Figure 14. (a) Select cutoff value for 1 st sequence (b) Select cutoff value for 2 nd sequence Figure 14: Example of selecting cutoff value for manual image segmentation

Image registration can be done automatically or manually as described in section 3.4. Automatic registration doesnot require any additional input from a user. Manual registration requires a user to select two points : a referencepoint and a second point to form a reference line using the ﬁrst image of each sequence as shown in Figure 15.14

PREPRINT - F

EBRUARY

19, 2019 (a) Select reference point and line for 1 st se-quence (b) Select reference point and line for 2 nd se-quence Figure 15: Example of points selection for manual image registration

Before conducting statistical comparison,

LISA overlaps the ﬁrst images of the two sequences which have beenprocessed to conﬁrm whether the image segmentation and image registration is satisﬁed by a user. If a user is satisﬁedand selects ‘Yes’, then

LISA continues to statistical analysis but if a user selects ‘No’

LISA provides suggestions tothe user on image processing in a pop-up window and terminates the process. An example of the overlapped imageis shown in Figure 16. Figure 16: Checking image alignment before statistical analysis

Depending on the option a user selects,

LISA can provide 8 different movies :1. O-movie 1 (Figure 17-a) : Movie of the actual images from the ﬁrst sequence2. O-movie 2 (Figure 17-b) : Movie of the actual images from the second sequence3. R-movie 1 (Figure 17-c) : Movie of the registered images from the ﬁrst sequence15

PREPRINT - F

EBRUARY

19, 2019 (a) O-movie 1 (b) O-movie 2 (c) R-movie 1 (d) R-movie 2(e) D-movie (f) S-movie (g) T-movie (h) P-movie

Figure 17: Output movies4. R-movie 2 (Figure 17-d) : Movie of the registered images from the second sequence5. D-movie (Figure 17-e) : Movie of the curated difference6. S-movie (Figure 17-f) : Movie of the non-parametrically smoothed difference7. T-movie (Figure 17-g) : Movie of the T-type statistics8. P-movie (Figure 17-h) : Movie of FDR-controlled p-valuesWhen ‘basic’ is selected only P-map will be generated but when ‘all’ is selected, all 8 movies will begenerated for display. In D-movie, S-movie and T-movie, a blue peak signiﬁes positive differences which indicatehigher pressure in the ﬁrst sequence compared to the second sequence, whereas, a red peak signiﬁes the opposite. InP-movie, red areas are statistically signiﬁcantly activated/different areas. The snapshots of the movies resulting fromour data application using the seated pressure mats are introduced in Figure 17.

LISA is an image analysis package that processes and analyzes two sequences of images to ﬁnd signiﬁcant activationregions between them. In

LISA , the essential components of image processing functionality include ROI selection,image segmentation and image registration and that of statistical analysis functionality include non-parametricmapping, smoothing and multiple comparison adjusted testing.The current version of

LISA was developed for and focused on sequence of images from seated pressure mats butthe package can be extended to analyze other types of time series images such as satellite images, hyper-spectralchemical images, longitudinal magnetic resonance images. The interactive function during the image processingstep makes it easier for users to use the package and provides users with visual evidence to derive more accurateimage alignment and statistical comparison results. Statistical analysis is designed to reduce possible errors resultingfrom edge effects and to account for multiple comparisons using false discovery rate which makes the ﬁnal resultconservative and accurate.

LISA also has a parallel computing function to help with computing issues for big data.High-resolution or long image sequences can be analyzed with the parallel computing function in a multi-coreprocessing environment.Depending on a user’s needs and preference, other image processing tools can enhance the overall analysis experience,such as the Image Processing Toolbox(IPT) within

MATLAB . Among the many options in IPT, a function such asmulti-point registration could help users to derive more accurate image alignments depending on the characteristicsof the image. The underlying method is identical to manual registration in

LISA , but users can select multi-referencepoints for image translation. For details, please refer to the MathWorks website [25]. Any other tools can be used forimage processing as long as the images are well aligned and the results are saved in ‘csv’ format to be consumed by

LISA . 16

PREPRINT - F

EBRUARY

19, 2019Development of a standalone

MATLAB package

LISA as an ensemble of elements from the LASR algorithm [13]with additional features for image processing and graphical user interface is the start of our project in building animage comparison analysis package for large dynamic sequence of high-resolution images. The current version canhandle relatively simple and static sequence of images. However, we aim to develop a next generation of

LISA tobe able to handle more complex and dynamic images, such as the entire body of a human and comparing imagesat many different time points. If we take a pressure map of a entire body of a human as an example, our futurework will focus on segmented image processing and relevant statistical comparisons. This will allow us to compareand ﬁnd signiﬁcant activation regions independently separating by upper body, lower body, arms, legs and for anyspeciﬁc areas.Another area of our future development includes reducing the computing time. Even with the parallel computingoption, the computing time for

LISA can be relatively long depending on the size of the image and sequence and thenumber of cores in the computer because the image comparison is done pixel-by-pixel. Pixel-by-pixel comparison isthe most comprehensive and exhaustive way to compare images but can also provide most accurate results relative tousing measures. Therefore, we will be aiming to develop methods such as to distribute the statistical calculation toreduce the computing time working under the essential framework of pixel-by-pixel comparison.17

PREPRINT - F

EBRUARY

19, 2019

References [1] J. Russ and B. Neal.

The Image Processing Hadbook Seventh Edition . CRC Press Taylor and Francis Group, 6000Broken Sound Parkway NW, Suite 300 Boca Raton, Fl 33487-2742, 2016.[2] W. Birkfellner.

Applied Medical Image Processing Second Edition . CRC Press Taylor and Francis Group, 6000Broken Sound Parkway NW, Suite 300 Boca Raton, Fl 33487-2742, 2014.[3] W. Burger and M. Burge.

Digital Image Processing Second Edition . Springer, Noblis, Inc. Washington, DC, 2016.[4] Inc. CVISION Technologies. Document image processing : Smarter document capture, 1998-2017.[5] ARMLtd. Multimedia processing from arm, 1995-2017.[6] Inc. LEAD Technologies. Leadtools, 1990-2017.[7] NIHCIT. Medical image processing, analysis, and visualization(mipav), 2016.[8] R. Katukam and P. Sindhoora. Image comparison methods and tools : A review. , pages 35–42, 2015.[9] W. Förstner. On the geometric precision of digital correlation.

International Archives of Photogrammetry andRemote Sensing , 24(3):176–189, 1982.[10] D.P. Huttenlochter, G.A. Klanderman, and W.J. Rucklidge. Comparing images using the hausdorff distance.

IEEETransactions on Pattern Analysis and Machine Intelligence , 15:850–863, 1993.[11] Z. Hocenski and A. Baumgartner. Image comparison method for visual quality control based on matrix decompo-sition.

Proceedings of the 2000 IEEE International on Industrial Electornics , 2000.[12] K.J. Worsley, C. Liao, Aston J., V. Petre, G.H. Duncan, F. Morales, and A.C. Evans. A general statistical analysisfor fmri data.

NeuroImage , 15:1–15, 2002.[13] X. Wang, J. Sun, and K. Bogie. Spatial-temporal data mining procedure : Lasr.

IMS LNMS Series, "RecentDevelopments in Nonparametric Inference and Probability" , 50:213–231, 2006.[14] N. Bergstrom, M. Bennet, and C.E. Carlson.

Treatment of pressure ulcers: Clinical practice guideline . No.95-0852 1994. AHCPR Publication, US Department of Health and Human Services. Rockville, MD. Public HealthService, Agency for Health Care Policy and Research., 2004.[15] J. Verbesselt, R.J. Hyndman, G. Newnham, and Culvenor D. Detecting trend and seasonal changes in satelliteimage time series.

Remote Sensing of Environment , 114:106–115, 2010.[16] A.A. Gowen, F. Marini, C. O’Donnell, C. Downew, and J. Burger. Time series hyperspectral chemical imagingdata: challenges, solutions and applications.

Analyitica Chimica Acta , 705:272–282, 2011.[17] MathWorks. Packaging and installing matlab apps, 2013.[18] W. Cleveland and S. Delvin. Locally weighted regresion : An approach to regression analysis by local ﬁtting.

Journal of the American Statistical Association , 83:596–610, 1988.[19] X. F. Wang and D. Ye. On nonparametric comparison of images and regression surfaces.

Journal of statisticalplanning and inference , 140:2875–2884, 2010.[20] T. Hastie, R. Tibshirani, and J. Friedman.

The Elements of Statistical Learning . Springer, New York, 2001.[21] J. Sun and C. Loader. Simultaneous conﬁdence bands for linear regression and smoothing.

The Annals of Statistics ,22:627–643, 1994.[22] X. Wang.

New procedures for data mining and measurement error models with medical imaging applications.

Ph.d. disseratation., Case Western Reserve University, Cleveland OH, 2005.[23] Z. Zhang.

Multiple hypothesis testing for ﬁnite and inﬁnite number of hypothesis.

Ph.d. disseratation., CaseWestern Reserve University, Cleveland OH, 2005.[24] Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multipletesting.