Predicting Future Cognitive Decline with Hyperbolic Stochastic Coding



arXiv [eess.IV], Feb

Jie Zhang a, Qunxi Dong a, Jie Shi a, Qingyang Li a, Cynthia M. Stonnington b, Boris A. Gutman c, Kewei Chen d, Eric M. Reiman d, Richard J. Caselli e, Paul M. Thompson f, Jieping Ye g, Yalin Wang a, and for the Alzheimer's Disease Neuroimaging Initiative*

a School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
b Department of Psychiatry and Psychology, Mayo Clinic Arizona, Scottsdale, AZ, USA
c Armour College of Engineering, Illinois Institute of Technology, Chicago, IL, USA
d Banner Alzheimer's Institute, Phoenix, AZ, USA
e Department of Neurology, Mayo Clinic Arizona, Scottsdale, AZ, USA
f Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
g Department of Computational Medicine and Bioinformatics & Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA

Abstract

Hyperbolic geometry has been successfully applied in modeling brain cortical and subcortical surfaces with general topological structures. However, such approaches, similar to other surface-based brain morphology analysis methods, usually generate high-dimensional features. This limits their statistical power in cognitive decline prediction research, especially in datasets with limited subject numbers. To address this limitation, we propose a novel framework termed hyperbolic stochastic coding (HSC). We first compute diffeomorphic maps between general topological surfaces by mapping them to a canonical hyperbolic parameter space with consistent boundary conditions and extract critical shape features. Secondly, in the hyperbolic parameter space, we introduce a farthest point sampling with breadth-first search method to obtain ring-shaped patches. Thirdly, stochastic coordinate coding and max-pooling algorithms are adopted for feature dimension reduction. We further validate the proposed system by comparing its classification accuracy with some other methods on two brain imaging datasets for Alzheimer's disease (AD) progression studies. Our preliminary experimental results show that our algorithm achieves superior results on various classification tasks. Our work may enrich surface-based brain imaging research tools and potentially result in a diagnostic and prognostic indicator useful in individualized treatment strategies.

* Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: https://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf

E-mail: [email protected]

Keywords:

Alzheimer's disease (AD), Hyperbolic Space, Ring-shaped Patches, Sparse Coding, Classification

1. Introduction

Alzheimer's disease (AD), an irreversible neurodegenerative disease, is the most common cause of dementia among older adults. It is generally agreed that accurate presymptomatic diagnosis and preventive treatment of AD could have enormous public health benefits. Brain structural magnetic resonance imaging (sMRI) analysis has the potential to provide valid diagnostic biomarkers of the preclinical stage as well as symptomatic AD (Frisoni et al., 2010). For example, a single-valued sMRI-based atrophy measure is used as a neurodegeneration marker in the recently proposed AD descriptive "A/T/N" (amyloid, tau, neurodegeneration) system (Jack et al., 2016) to clinically define AD. Tosun et al. proposed MRI-based approaches to impute Abeta status (Tosun et al., 2014, 2016). Their results demonstrated that sMRI can be used to predict the amyloid status of individuals with mild cognitive impairment (MCI) and of mild AD patients. Recently, brain morphology measures have been integrated with machine learning algorithms to classify individual subjects into different diagnostic groups (e.g. Sun et al., 2009; Ferrarini et al., 2008; Wang et al., 2013; Li et al., 2014). This offers a promising approach to computer-aided cognitive decline prediction by leveraging both sensitive brain image features and powerful machine learning techniques.

Although most brain sMRI analysis approaches use cortical and subcortical volumes (e.g. Jack et al., 2003; Vemuri et al., 2008; den Heijer et al., 2010; Wolz et al., 2010), recent research has demonstrated that surface-based analyses (e.g. Styner et al., 2005; Thompson et al., 2004b; Ferrarini et al., 2008; Qiu et al., 2010; Costafreda et al., 2011) can offer advantages over volume measures, due to their sub-voxel accuracy and their capability of detecting subtle subregional changes. In surface-based brain imaging research, a practical approach to model brain landmark curves is to model them as surface boundaries by cutting open cortical surfaces along these landmarks.
Thus they are modeled as open boundaries to be matched across subjects (Tsui et al., 2013; Shi and Wang, 2020) or used as shape indices (Zeng et al., 2013; Shi et al., 2017). Similarly, adding open boundaries has proved useful in modeling ventricular surfaces, which have a concave shape and complex branching topology (Wang et al., 2010; Shi et al., 2015). We call these genus-zero surfaces with more than two open boundaries general topological surfaces, and hyperbolic geometry has been demonstrated to be useful for modeling them. However, most current hyperbolic space-based brain imaging methods have focused on studying group differences between diagnostic groups. To develop brain imaging methods for personalized medicine research, it would be advantageous to design powerful machine learning methods that work on general topological surface features for the identification of AD symptoms on an individual basis.

One of the major challenges in directly applying vertex-wise surface features, such as surface tensor-based morphometry (TBM) (Thompson et al., 2000; Chung et al., 2008), to cognitive decline prediction research is that the surface feature dimension is usually much larger than the number of subjects, the so-called high dimension-small sample problem. Existing feature dimension reduction approaches include feature selection (Fan et al., 2005; Jain and Zongker, 1997), feature extraction (Saadi et al., 2007; Guyon et al., 2008; Scholkopft and Mullert, 1999; Jolliffe, 2011) and sparse coding-based methods (Vounou et al., 2010; Donoho, 2006; Wang et al., 2013). In most cases, information is lost when mapping high-dimensional features into a lower-dimensional space.
However, by defining a better lower-dimensional subspace, sparse coding (Lee et al., 2006; Mairal et al., 2009) may limit such information loss. Sparse coding has been previously proposed to learn an over-complete set of basis vectors (dictionary) to represent input vectors efficiently and concisely (Donoho and Elad, 2003). It has been shown to be efficient for many tasks such as image deblurring (Yin et al., 2008), super-resolution (Yang et al., 2010), classification (Mairal et al., 2009), functional connectivity (Zhang et al., 2018b; Lv et al., 2015b, 2017; Jiang et al., 2015; Lv et al., 2015a) and structural morphometry analysis (Zhang et al., 2017c; Li et al., 2017). However, solving sparse coding remains a computationally challenging problem, especially when dealing with large-scale datasets and learning large dictionaries (Lin et al., 2014).

To generalize sparse coding to process general topological surface features (Mairal et al., 2009), we propose a novel pipeline to extract sparse hyperbolic features for classification, termed hyperbolic stochastic coding (HSC). It consists of our farthest point sampling with breadth-first search (FPSBS) method for ring-shaped surface patch extraction, followed by stochastic coordinate coding (SCC) and max-pooling for feature dimension reduction, and it extracts critical low-dimensional shape features from the hyperbolic TBM maps. We call such features HSC measures. The AdaBoost classifier (Freund and Schapire, 1997) is then applied to these HSC measures for AD clinical group classification and cognitive decline prediction. We hypothesize that our HSC measures may outperform volume, area and shape-based cortical structural measures in discriminating clinical groups related to AD (Li et al., 2014; Ferrarini et al., 2008; Jack et al., 1999; Leung et al., 2010). We validate our system on a publicly available brain image dataset, the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort (Weiner et al., 2013).
With the sMRI baseline data of 133 mild cognitive impairment (MCI) subjects, consisting of 71 MCI converter (MCIc) and 62 MCI stable (MCIs) subjects (Shi et al., 2015), and of 115 subjects (30 AD, 45 MCI and 40 cognitively unimpaired (CU) subjects) (Shi and Wang, 2020), we set out to test our hypothesis by comparing classification accuracy with three other popular structural measures (volume, area, and shape-based biomarkers).

2. Subjects and Methods

Dataset I contains 133 MCI subjects: 71 subjects (age: … ± ….81) who develop incident AD during the subsequent 36 months, which we call the MCI converter (MCIc) group, and 62 subjects (age: 75.… ± ….83 years) who do not during the same period, which we call the MCI stable (MCIs) group. These subjects are chosen on the basis of having at least 36 months of longitudinal data. A subject who develops incident AD more than 36 months after baseline is assigned to the MCIs group. All subjects undergo thorough clinical and cognitive assessments at the time of acquisition, including the Mini-Mental State Examination (MMSE) score, the Alzheimer's Disease Assessment Scale-Cognitive (ADAS-COG) (Rosen et al., 1984) and the Auditory Verbal Learning Test (AVLT) (Rey, 1964). The demographic statistical information of this dataset is shown in Table 1.

In Dataset II, we study cortical morphometry for tracking AD progression. Dataset II has 115 T1-weighted MRIs from the ADNI-1 (Weiner et al., 2013) baseline dataset, including 30 AD patients, 45 MCI subjects and 40 CU subjects (Shi and Wang, 2020). All subjects underwent the MMSE (Folstein et al., 1975). The demographic statistics with matched gender, education, age and MMSE are shown in Table 2.

The major computational steps of our proposed work are illustrated in Fig. 1, where we take a left ventricular surface as an example. There are two major stages in the process. In the first stage, we perform ventricular surface reconstruction from MRI data, surface registration and surface TBM feature computation. The second stage is for HSC measure computation. Specifically, we build ring-shaped patches on the hyperbolic parameter space by FPSBS to initialize the original dictionary. Dictionary learning and max-pooling are performed for feature dimension reduction. Following that, AdaBoost is adopted to diagnose different clinical groups and predict future AD conversions.

Group   Gender (F/M)   Education    Age      MMSE
MCIc    26/45          15.99 ± …    … ± …    … ± …

Table 1: Demographic statistic information of Dataset I.

Table 2: Demographic statistical information of Dataset II.

Figure 1: The major processing steps in the proposed framework.

Algorithm 1: Brain surface registration with hyperbolic Ricci flow and harmonic map

Input: Brain surface $S$ with more than 2 open boundaries.
Output: Klein model of $S$.
Compute the hyperbolic uniformization metric of $S$ with hyperbolic Ricci flow.
Compute the fundamental group of paths on $S$ and, together with the original boundaries, obtain the simply connected domain $\bar{S}$.
Embed $S$ onto the Poincaré disk with its hyperbolic metric and its simply connected domain $\bar{S}$ to obtain the fundamental domain of $S$.
Tile the fundamental domain of $S$ with its Fuchsian group of transformations to get a finite portion of the universal covering space of $S$.
Compute the positions of the paths in the fundamental group as geodesics in the universal covering space. By slicing the universal covering space along the geodesics, obtain the canonical fundamental domain of $S$.
Convert the canonical Poincaré disk to the Klein model and construct the harmonic map between $S$ and a selected template surface.

Taking a left ventricular surface $S$ as an example, the corresponding framework is summarized in Algorithm 1 (Shi et al., 2015) and Fig. 1 (c). Its critical steps are shown in Fig. 2. Following our prior work (Shi et al., 2015), three horns of a ventricular surface are identified and three cuts $\{\gamma_1, \gamma_2, \gamma_3\}$ are made on these horns (Fig. 2 (a)). We term this step topology optimization. As a result, each ventricular surface becomes a topologically multiply connected surface and admits hyperbolic geometry. We apply the hyperbolic Ricci flow method to compute its discrete hyperbolic uniformization metric. With the hyperbolic uniformization metric, we can embed $S$ onto the Poincaré disk. In the obtained Poincaré disk, we apply the geodesic curve lifting algorithm (Shi et al., 2015) to obtain a canonical parameter space (Fig. 2 (b)). Furthermore, we convert the Poincaré disk to the Klein model. This converts the canonical fundamental domains of the ventricular surfaces to a Euclidean octagon, as shown in Fig. 2 (c).
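The Poincaré-to-Klein conversion mentioned above has a simple closed form: a point $p$ of the Poincaré disk maps to $2p / (1 + |p|^2)$ in the Klein disk. The sketch below illustrates this standard identity only. The function name `poincare_to_klein`, the tuple representation of points and the sample points are assumptions made for illustration and are not taken from the authors' implementation.

```python
import math

def poincare_to_klein(p):
    """Map a point of the Poincare disk to the Klein disk.

    Standard identity: k = 2p / (1 + |p|^2). Geodesics, which are circular
    arcs in the Poincare disk, become straight chords in the Klein disk.
    """
    x, y = p
    scale = 2.0 / (1.0 + x * x + y * y)
    return (scale * x, scale * y)

# Hypothetical sample points (not data from the paper).
origin = poincare_to_klein((0.0, 0.0))
inner = poincare_to_klein((0.5, 0.0))
boundary = poincare_to_klein((math.cos(1.0), math.sin(1.0)))
print(origin, inner, boundary)
```

Because geodesics become straight segments under this map, a fundamental domain bounded by geodesics appears as a Euclidean polygon, which is consistent with the octagon described in the text.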
Then we compute the surface harmonic map with the Klein disk as the canonical parameter space for the following surface morphometry analysis (Shi et al., 2015).

Suppose $\phi: S_1 \to S_2$ is a map from surface $S_1$ to surface $S_2$. The derivative map of $\phi$ is the linear map between the tangent spaces $d\phi: TM(p) \to TM(\phi(p))$, induced by the map $\phi$, which also defines the Jacobian matrix of $\phi$. The derivative map $d\phi$ is approximated by the linear map from one face $[v_1, v_2, v_3]$ to another one $[w_1, w_2, w_3]$. First, we isometrically embed the triangles $[v_1, v_2, v_3]$ and $[w_1, w_2, w_3]$ onto the Klein disk; the planar coordinates of the vertices are denoted by $v_i, w_i$, $i = 1, 2, 3$, which represent the 3D positions of the points. Then, the Jacobian matrix for the derivative map $d\phi$ can be computed as

$J = d\phi = [w_3 - w_1, w_2 - w_1][v_3 - v_1, v_2 - v_1]^{-1}$.

Based on the derivative map $J$, the surface TBM is defined as $\sqrt{\det(J)}$, which measures the amount of local area change in a surface under the map $\phi$ (Fig. 1 (d)). As pointed out in (Chung et al., 2005), each step in the processing pipeline, including MRI acquisition, surface deformation, etc., is expected to introduce noise in the deformation measurement. The deformation is applied to map each subject's surface to a template surface. The Jacobian matrices of the transformation were used per subject. To account for the noise effects, we apply the surface heat kernel smoothing algorithm proposed in (Chung et al., 2005) to improve SNR in the TBM features and boost the sensitivity of statistical analysis. The vertex-wise surface TBM features are used as the inputs for dictionary learning. We use the coordinates of these vertices to localize the ring-shaped patches and we use 3-dimensional TBM features as the feature map of ring-shaped patches.

Figure 2: Modeling ventricular surface with hyperbolic geometry. (a) shows three identified open boundaries, $\gamma_1, \gamma_2, \gamma_3$, on the ends of three horns. After that, ventricular surfaces can be conformally mapped to the hyperbolic space. (b) and (c) show the hyperbolic parameter space, where (b) is the Poincaré disk model and (c) is the Klein model.

The hyperbolic space is different from the original Euclidean space. The common rectangle patch construction developed in Euclidean space (Zhang et al., 2017c) cannot be directly applied to the hyperbolic space. Therefore, we propose FPSBS on hyperbolic space to initialize dictionaries for sparse coding (Fig. 1 (e)). The intuition of the algorithm is that we want to select patches without losing the geometry information, with all vertices on the hyperbolic space selected at least once. This guarantees that we learn complete information from the hyperbolic space. Fig. 3 (right) is the visualization of patch selection on the hyperbolic parameter domain. Fig. 3 (left) projects the selected patches on the hyperbolic parameter domain back to the original ventricular surface, which still maintains the same topological structure as the parameter domain. In Fig. 3, each patch has a unique color and patches may overlap with each other. Together all patches cover the entire surface. In the following paragraph, we explain how these patches are selected.

Figure 3: Visualization of computed image patches on the ventricle surface (left) and hyperbolic space (right). The zoom-in pictures show some overlapping areas between image patches.

We first randomly select a patch center point $c_1$ on the hyperbolic space $V$, where $c_1 \in V$ and $V$ is the set of all discrete vertices on the hyperbolic space. We then find all $u$ vertices connected with the center point, $c_{1,i}$ ($i = 1, 2, \dots, u$), where $c_{1,i}$ is the $i$-th vertex connected with $c_1$. The procedure is called breadth-first search (BFS) (Patel et al., 2015), which is an algorithm for searching graph data structures. It starts at the tree root and explores the neighbor nodes first, before moving to the next level neighbors. We use the same BFS procedure to find all vertices connected with $c_{1,i}$, which are $c_{1,i_j}$ ($j = 1, 2, \dots, w_i$), where $w_i$ is the number of vertices connected with the point $c_{1,i}$. Finally, we get a vertex set (no duplicate vertices) $x_1$ as follows; we call it a selected ring-shaped patch on hyperbolic space, and the patch center is $c_1$.

$x_1 = \{ c_1, c_{1,1}, \cdots, c_{1,1_{w_1}}, \cdots, c_{1,u}, \cdots, c_{1,u_{w_u}} \}$ (1)

The dimension of $x_1$ is $u + w_1 + \cdots + w_u = m$ and $x_1 \in R^m$. We construct the topological patches based on hyperbolic geometry and the edge connections among different points from $x_1$. $x_1$ is the first selected patch. To select the second patch center, we sample the farthest point from $c_1$, s.t. radius $r = \max_{c_v \in V} d_V(c_v, c_1)$.
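The patch construction and center selection described above can be sketched as follows. The sketch gathers a patch with a two-level breadth-first search (Eq. 1) and picks the next center by farthest point sampling under the Klein-model distance, written here with the cross-ratio of the chord endpoints (the form the paper gives as Eq. 2). The names `ring_patch`, `klein_distance` and `next_center`, the adjacency-dictionary mesh representation, the reading of "distance to the set of centers" as the distance to the nearest center, and the toy five-vertex mesh are all assumptions for illustration; this is not the authors' code.

```python
import math
from collections import deque

def ring_patch(adjacency, center, depth=2):
    """Vertices within `depth` edge hops of `center`, center first, no duplicates."""
    seen = {center}
    patch = [center]
    queue = deque([(center, 0)])
    while queue:
        vertex, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the second ring
        for neighbor in adjacency[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                patch.append(neighbor)
                queue.append((neighbor, hops + 1))
    return patch

def klein_distance(p, q):
    """Hyperbolic distance in the Klein disk via the cross-ratio of the chord ends."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    a = dx * dx + dy * dy
    if a == 0.0:
        return 0.0
    # Solve |p + t (q - p)| = 1 for the two chord endpoints on the unit circle.
    b = 2.0 * (p[0] * dx + p[1] * dy)
    c = p[0] * p[0] + p[1] * p[1] - 1.0
    root = math.sqrt(b * b - 4.0 * a * c)
    t_a, t_b = (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)
    end_a = (p[0] + t_a * dx, p[1] + t_a * dy)  # endpoint on the side of p
    end_b = (p[0] + t_b * dx, p[1] + t_b * dy)  # endpoint on the side of q
    ratio = (math.dist(end_a, q) * math.dist(end_b, p)) / (
        math.dist(end_a, p) * math.dist(end_b, q))
    return 0.5 * math.log(ratio)

def next_center(coords, centers):
    """Vertex whose distance to the nearest selected center is largest."""
    def gap(v):
        return min(klein_distance(coords[v], coords[c]) for c in centers)
    return max(coords, key=gap)

# Toy mesh (hypothetical): five vertices on a line inside the unit disk.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
coords = {0: (0.0, 0.0), 1: (0.2, 0.0), 2: (0.4, 0.0), 3: (0.6, 0.0), 4: (0.8, 0.0)}
first_patch = ring_patch(adjacency, 0)
second_center = next_center(coords, [0])
third_center = next_center(coords, [0, second_center])
print(first_patch, second_center, third_center)
```

On the toy mesh, the third center is vertex 2 rather than vertex 3 because distances are hyperbolic: points near the unit circle are farther apart than their Euclidean spacing suggests.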
We now find the second patch center $c_2 \in V$ at the farthest distance $r$ from $c_1$. We follow the farthest point sampling scheme (Moenning and Dodgson, 2003), whose sampling principle is to repeatedly place the next sample point in the middle of the least known area of the sampling domain, which can guarantee the randomness of the patch selection.

Here, $d$ is the hyperbolic distance in the Klein model. Given two points $v'$ and $v''$, draw a straight line between them; the straight line intersects the unit circle at points $a$ and $b$, so $d$ is defined as follows:

$d(v', v'') = \frac{1}{2} \log \frac{|av''| \, |bv'|}{|av'| \, |bv''|}$ (2)

where $|av''| > |av'|$ and $|bv'| > |bv''|$.

$V_r$ denotes the set of selected patch centers ($V_r = \{c_1\}$ when we compute $c_2$). After selecting the second patch $x_2$, we add $c_2$ into $V_r$ ($V_r = \{c_1, c_2\}$). We iterate the patch selection procedure to obtain $p$ patches per subject. The details of FPSBS are summarized in Algorithm 2.

Algorithm 2: Farthest Point Sampling with Breadth-first Search (FPSBS)

Input: Hyperbolic parameter space.
Output: A collection of patches on the topological structure that overlap by varying amounts.
Start with $V_r = \{c_1\}$; $V$ denotes all discrete vertices on the hyperbolic space and $V_r$ denotes the set of selected patch centers.
for $T = 1$ to $n$ do
    for $r$ determine sampling radius do
        Find set $x_T$ by following Eq. 1 and two times BFS.
        $r = \max_{c_v \in V} d_V(c_v, c_T)$
        if $r \le e^{-\ldots}$ then
            STOP
        end
        Find the farthest point $c_{T+1}$
        Add $c_{T+1} = \arg\max_{c_v \in V} d_r(c_v, V_r)$ to $V_r$
    end
end

We model surface TBM features as a sparse linear combination of atoms selected from a dictionary which is initialized by FPSBS on the hyperbolic parameter space. This modeling procedure is known as sparse coding (Mairal et al., 2009). Our aim is to reduce the original surface dimension with the over-complete dictionary and find a linear combination of the dictionary bases to reconstruct the original surface statistics. The problem statement of sparse coding is described below.

We are given a finite training set of ring-shaped patches (as described in Sec II. C) $X = (x_1, x_2, \cdots, x_n) \in R^{m \times n}$, with $x_i \in R^m$, $i = 1, 2, \cdots, n$, where $m$ is the dimension of each ring-shaped patch and $n$ is the total number of patches. In this paper, we use a superscript to represent the $k$-th epoch and a subscript to represent the $i$-th coordinate. We use boldface lower case letters $x$ to denote vectors and boldface upper case letters $X$ to denote matrices. We then learn the dictionary and sparse codes for these input patch features $x_i$ using sparse coding.

We use $f_i(\cdot)$ to represent the optimization problem of sparse coding for each patch $x_i$:

$\min_{D \in R^{m \times t},\, z_i \in R^t} f_i(D, z_i) = \frac{1}{2} \|D z_i - x_i\|_2^2 + \lambda \|z_i\|_1$ (3)

where $\lambda$ is the regularization parameter, $\|\cdot\|_2$ is the standard Euclidean norm and $\|z_i\|_1 = \sum_{j=1}^{t} |z_{i,j}|$. In Eq. 3, each input vector will be represented by a linear combination of a few basis vectors of a dictionary. The first term of Eq. 3 is the reconstruction error, which measures how well the new feature represents the input vector. The second term of Eq. 3 ensures the sparsity of the learned feature $z_i$. Each $z_i$ is often called the sparse code. Since $z_i$ is sparse, only a few entries in $z_i$ are non-zero. We call its non-zero entries its support, i.e., $\mathrm{supp}(z_i) = \{ z_{i,j} : z_{i,j} \neq 0,\ j = 1, \cdots, t \}$. $D = (d_1, d_2, \cdots, d_t) \in R^{m \times t}$ is the so-called dictionary; each column represents a basis vector.

Specifically, suppose there are $t$ atoms $d_j \in R^m$, $j = 1, 2, \cdots, t$, where the number of atoms is much smaller than $n$ (the total number of training image patches) but larger than $m$ (the dimension of the image patches).
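To make the objective in Eq. 3 concrete, the sketch below evaluates the reconstruction term, the sparsity term and the support of a code for a tiny over-complete dictionary. The function names and the numeric values are hypothetical and chosen only for illustration; vectors are plain Python lists and the dictionary is stored as a list of its columns (atoms).

```python
def reconstruct(atoms, z):
    """Return D z, where `atoms` lists the columns d_j of the dictionary D."""
    m = len(atoms[0])
    return [sum(z[j] * atoms[j][row] for j in range(len(atoms))) for row in range(m)]

def objective(atoms, z, x, lam):
    """f(D, z) = 0.5 * ||D z - x||_2^2 + lam * ||z||_1."""
    residual = [r - xi for r, xi in zip(reconstruct(atoms, z), x)]
    fit = 0.5 * sum(r * r for r in residual)
    sparsity = lam * sum(abs(zj) for zj in z)
    return fit + sparsity

def support(z):
    """Indices of the non-zero entries of a sparse code."""
    return [j for j, zj in enumerate(z) if zj != 0.0]

# Hypothetical toy problem: m = 2 dimensions, t = 3 atoms (over-complete).
atoms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
x = [1.2, 1.6]
dense_code = [1.2, 1.6, 0.0]   # uses two atoms
sparse_code = [0.0, 0.0, 2.0]  # uses one atom
f_dense = objective(atoms, dense_code, x, lam=0.1)
f_sparse = objective(atoms, sparse_code, x, lam=0.1)
print(f_dense, f_sparse, support(sparse_code))
```

Both codes reconstruct `x` exactly, so the reconstruction term vanishes in each case; the objective then prefers the code with the smaller l1 norm, which here is the one supported on a single atom.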
$x_i$ can be represented by $x_i = \sum_{j=1}^{t} z_{i,j} d_j$. In this way, the $m$-dimensional vector $x_i$ is represented by a $t$-dimensional vector $z_i = (z_{i,1}, \cdots, z_{i,t})^T$ ($Z = (z_1, \cdots, z_n) \in R^{t \times n}$). To prevent an arbitrary scaling of the sparse codes, the columns $d_j$ are constrained by $C \triangleq \{ D \in R^{m \times t},\ \text{s.t.}\ \forall j = 1, \cdots, t,\ d_j^T d_j \le 1 \}$. Thus, we use $F(\cdot)$ to represent the sparse coding problem for $X$, and we rewrite $F(\cdot)$ as a matrix factorization problem:

$\min_{D \in C,\, Z} F(D, Z) \equiv \frac{1}{n} \sum_{i=1}^{n} f_i(D, z_i) = \frac{1}{2} \|X - DZ\|_F^2 + \lambda \|Z\|_1$ (4)

where $\|\cdot\|_F$ is the Frobenius norm. Eq. 4 is a non-convex problem. However, it is a convex problem when either $D$ or $Z$ is fixed. When the dictionary $D$ is fixed, solving each sparse code $z_i$ is a Lasso problem (Tibshirani, 1994). Otherwise, when $Z$ is fixed, it becomes a simple quadratic problem. Here we adopt the SCC method (Lin et al., 2014) to optimize Eq. 4, which has been studied in a number of prior works (e.g., Lin et al., 2014; Lv et al., 2015a,b, 2017; Zhang et al., 2017a, 2016a, 2017c, 2018a). Following Lin et al. (2014), we update $z_i^k$ via one or a few steps of coordinate descent (CD) (Wu and Lange, 2008):

$z_i^k = \mathrm{CD}(D_i^k, z_i^{k-1}, x_i)$ (5)

The updated sparse code is then denoted by $z_i^k$. A detailed derivation of CD utilizing the soft thresholding shrinkage function (Combettes and Wajs, 2005) can be found in Lin et al. (2014). We then update the dictionary $D$ by using stochastic gradient descent (SGD) (Bottou, 1998):

$D_{i+1}^k = P_C\left( D_i^k - \eta_i^k \nabla_{D_i^k} f_i(D_i^k, z_i^k) \right)$ (6)

where $P$ is the shrinkage function, $C$ is the feasible set of $D$ and $\eta_i^k$ is the learning rate of the $i$-th step in the $k$-th epoch. We set the learning rate as an approximation of the inverse of the Hessian matrix $H$. We illustrate the algorithmic framework in Fig. 4.

Figure 4: Illustration of hyperbolic stochastic coding (HSC) framework.
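One iteration in the spirit of Eqs. 5 and 6 can be sketched for a single patch as below: a coordinate descent sweep with soft thresholding updates the code, then a gradient step followed by rescaling of over-long columns updates the dictionary. This is a simplified illustration under stated assumptions, not the implementation of Lin et al. (2014): it uses a fixed scalar learning rate `eta` instead of the Hessian-based rate, sweeps over all coordinates rather than only the support, and realizes $P_C$ as rescaling any column whose norm exceeds 1. All names and values are hypothetical.

```python
import math

def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`."""
    return math.copysign(max(abs(value) - lam, 0.0), value)

def cd_sweep(atoms, z, x, lam):
    """One coordinate descent pass on 0.5*||Dz - x||^2 + lam*||z||_1."""
    z = list(z)
    m = len(x)
    for j, d_j in enumerate(atoms):
        # Residual with atom j's own contribution removed.
        partial = [x[r] - sum(z[k] * atoms[k][r] for k in range(len(atoms)) if k != j)
                   for r in range(m)]
        rho = sum(d_j[r] * partial[r] for r in range(m))
        norm_sq = sum(v * v for v in d_j)
        z[j] = soft_threshold(rho, lam) / norm_sq if norm_sq > 0.0 else 0.0
    return z

def dictionary_step(atoms, z, x, eta):
    """Gradient step on D for one patch, then rescale columns so ||d_j|| <= 1."""
    m = len(x)
    residual = [sum(z[k] * atoms[k][r] for k in range(len(atoms))) - x[r] for r in range(m)]
    updated = []
    for j, d_j in enumerate(atoms):
        # Column j of the gradient (Dz - x) z^T is residual * z_j.
        step = [d_j[r] - eta * residual[r] * z[j] for r in range(m)]
        norm = math.sqrt(sum(v * v for v in step))
        updated.append([v / max(1.0, norm) for v in step])
    return updated

# Hypothetical toy data: identity dictionary, one 2-dimensional patch.
atoms = [[1.0, 0.0], [0.0, 1.0]]
x = [3.0, 0.5]
code = cd_sweep(atoms, [0.0, 0.0], x, lam=1.0)
new_atoms = dictionary_step(atoms, code, x, eta=0.1)
print(code, new_atoms)
```

In this toy case the second coordinate is thresholded to zero, so only the first atom moves in the dictionary step; atoms outside the support receive a zero gradient, which is the property SCC exploits when it updates only the supports.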
At each iteration, with a ring-shaped patch $x_i$, we perform one step of CD to find the supports of the sparse code $z_i^{k-1}$. Next, we perform a few steps of CD on the supports to obtain a new sparse code $z_i^k$. Then we update the supports of the dictionary by second-order SGD to obtain a new dictionary $D_{i+1}^k$.

With a trained dictionary $D$, for a set of ring-shaped patches from a new subject, $x_i$, $i = 1, 2, \dots, p$, where $p$ is the patch number of an individual subject, we can learn its sparse features $z_i \in R^t$, $i = 1, 2, \dots, p$. In theory, one could use all learned features as input data of a classifier, but this poses intractable computational challenges. Thus, to describe our surface features efficiently, one natural approach is to aggregate statistics of these features at various locations. A key component of deep learning models, max-pooling (Boureau et al., 2010) takes the most responsive node of a given region of interest. In our system, we borrow the idea of max-pooling and apply it to the extracted sparse coding surface features (sparse codes) from HSC (Fig. 1 (g)). Specifically, one could compute the max value for each feature ($t$ features obtained from HSC) over all patches ($p$ patches per subject), which is equivalent to applying a high-pass filter to the learned sparse codes. These summary statistics have a much lower dimension ($t$) than the full set of learned surface patch features and reduce overfitting. Finally, an AdaBoost (Rojas, 2009) classifier is used for binary classification, as shown in Fig. 1 (h).
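The max-pooling step reduces the $p$ sparse codes of one subject (each of length $t$) to a single $t$-dimensional vector. A minimal sketch follows, with a hypothetical function name and made-up values; it takes the signed maximum as the text states, and pooling absolute values instead would be a different design choice.

```python
def max_pool(codes):
    """Feature-wise maximum over the patches of one subject.

    `codes` is a list of p sparse codes, each a list of t values;
    the result has length t.
    """
    return [max(column) for column in zip(*codes)]

# Hypothetical example: p = 3 patches, t = 4 dictionary atoms.
subject_codes = [
    [0.0, 1.5, 0.0, 0.2],
    [0.7, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.9],
]
pooled = max_pool(subject_codes)
print(pooled)  # [0.7, 1.5, 0.0, 0.9]
```

The pooled vector has one entry per atom regardless of how many patches a subject contributes, which is what makes the feature dimension independent of $p$.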

3. Results

In Dataset I, we use ventricular morphometry features to discriminate between MCIc and MCIs subjects. To extract hyperbolic surface features, we automatically segment lateral ventricular volumes with the multi-atlas fluid image alignment (MAFIA) method (Chou et al., 2010) from each MRI scan. We then use a topology-preserving level set method (Han et al., 2009) to build surface models, and the marching cubes algorithm (Lorensen and Cline, 1987) is applied to construct triangular surface meshes (Fig. 1 (b)). After the topology optimization, we apply the hyperbolic Ricci flow method, conformally map the ventricular surfaces to the Poincaré disk and register them via harmonic map (Shi et al., 2015). We finally compute the surface TBM features (Shi et al., 2015) and smooth them with the surface heat kernel method (Chung et al., 2005).

Here we select 2,000 ring-shaped patches (300-vertex, Fig. 3) by the FPSBS method (Alg. 2) on each side of the ventricle for each subject and finally have $n = 532,000$ ring-shaped patches. The generated patches are consistent across all subjects since the surfaces are already registered (Shi et al., 2015). Surface TBM is a scalar feature defined on each vertex on the hyperbolic domain, so the feature number of each subject is 1,200,000 ($m = 300$, $p = 4,000$ in the notation of Sec. 2.6, 2.7). We initialize the dictionary via selecting random patches (Coates and Ng, 2011), which has been shown to be a very efficient method in practice. Then we learn the dictionary and sparse codes by HSC using the initial dictionary. All our experiments involve training for ten epochs with a batch size of one. After the optimization, all subjects use the same dictionary ($m = 300$, $t = 2,000$ in the notation of Sec. 2.6). With the sparse coding, we obtain 4,000 samples per subject, each of which has 2,000 sparse codes. After that, max-pooling is adopted to choose the maximum value for each sparse code as a feature over these 4,000 samples. Our final feature dimension for classification is 2,000 per subject.

We take a nested cross-validation approach by pre-separating training, validation and testing sets. Specifically, we use the ratio of 7:1:2 for training, validation and testing. We select the hyperparameters based on the validation set and test all methods on the same testing set. We also compare our work with some other measures and methods. We compute bilateral ventricular volumes and surface areas, which are used as MRI biomarkers in AD research. We also compare HSC with a ventricular surface shape method (Ferrarini et al., 2008, Shape), which automatically generates comparable meshes of all ventricles. The deformations based on the morphometry model are employed with repeated permutation tests and then used as geometry features. With our ventricle surface registration results, we follow the Shape work (Ferrarini et al., 2008) for selecting biomarkers and use a support vector machine (SVM) for classification on the same dataset. We implement the low-rank shared dictionary learning (LRSDL) method, based on the paper (Vu and Monga, 2017) and the GitHub source code. We select the hyperparameters for LRSDL by using the same strategy as HSC on the training set. We run LRSDL for 50 iterations with $\lambda_1 = 0.\ldots$, $\lambda_2 = 0.\ldots$, $\eta = 0.\ldots$, $k = 20$, $k_0 = 10$. As with HSC, we apply LRSDL on ring-shaped patches and apply max-pooling as post-processing on the learnt sparse codes. The same classifier is applied to the learnt features on the same test set as HSC. We test HSC, Shape, volume, area and LRSDL measures on the left, right and whole ventricle, respectively. Accuracy (ACC), Sensitivity (SEN), Specificity (SPE) and Area Under the Curve (AUC) are computed to evaluate classification results. Table 3 shows the classification performances of the compared methods.

Dataset I        HSC      Shape     Volume   Area     LRSDL
whole   ACC      85.19%   70.37%    66.67%   59.26%   77.78%
        SEN      76.92%   100.00%   85.71%   78.57%   75.00%
        SPE      81.25%   57.89%    63.16%   57.89%   80.00%
        AUC      0.8516   0.7857    0.6731   0.5934   0.7775
left    ACC      74.07%   81.48%    62.96%   59.26%   70.37%
        SEN      76.92%   76.92%    61.54%   83.33%   76.92%
        SPE      71.43%   75.00%    56.52%   52.63%   64.29%
        AUC      0.7418   0.8159    0.6264   0.5907   0.706
right   ACC      70.37%   66.67%    62.96%   59.26%   70.37%
        SEN      53.85%   84.62%    84.62%   78.57%   69.23%
        SPE      78.57%   57.14%    57.89%   57.89%   71.43%
        AUC      0.7088   0.6703    0.6319   0.5879   0.7033

Table 3: The comparison results by nested cross-validation on Dataset I.

From the experimental results, we find that the best accuracy (85.19%) and the best specificity (81.25%) are achieved when we use TBM features on the ventricle hyperbolic space of both sides (whole) for training and testing. The comparison shows that our new framework selects better features and achieves better and more meaningful classification results. The HSC algorithm with whole ventricle TBM features achieves the best AUC (0.8516). The comparison demonstrates that our proposed algorithm may be useful for AD diagnosis and prognosis research.
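For reference, ACC, SEN and SPE follow from the confusion counts of a binary classifier. The sketch below uses the usual definitions with MCIc treated as the positive class, which is an assumption since the text does not state the convention. The function name is hypothetical and the labels are made up; they are not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity from paired label lists."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "ACC": (tp + tn) / len(y_true),
        "SEN": tp / (tp + fn),  # true positive rate
        "SPE": tn / (tn + fp),  # true negative rate
    }

# Made-up labels: 1 = converter (MCIc), 0 = stable (MCIs).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting SEN and SPE alongside ACC matters when the two classes are unbalanced, since a high accuracy can otherwise hide poor performance on the smaller class.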

Many studies have shown that cortical surface morphometry is a valid imaging biomarker for AD (Shi and Wang, 2020; Thompson et al., 2004b; Chung et al., 2008). In Dataset II, we apply HSC to analyze cortical morphometry for AD related clinical group classification. We use the left hemispheric cerebral cortices and follow (Shi and Wang, 2020) to preprocess cortical surface data. We first use FreeSurfer software (Fischl, 2012) to preprocess the MRIs of 115 subjects and reconstruct their left cortical surfaces. The Caret software (Van Essen, 2012) is then used to automatically label six major brain landmarks, which include the Central Sulcus, Anterior Half of the Superior Temporal Gyrus, Sylvian Fissure, Calcarine Sulcus, Medial Wall Ventral Segment and Medial Wall Dorsal Segment. Fig. 5 (a) shows an example of the landmark curves on the left cortical surface, where the six landmark curves are modeled as open boundaries and denoted as $\gamma_1, \cdots, \gamma_6$. The fundamental group of paths is computed by connecting one of the boundaries to every other boundary, and the paths are denoted as $\tau_1, \tau_2, \tau_3, \tau_4, \tau_5$. Fig. 5 (b) shows that they are embedded into the Poincaré disk. After we cut the cortical surfaces along the delineated landmark curves, the cortical surfaces become genus-0 surfaces with six open boundaries. We finally randomly select the left cortical surface of a CU subject, who is not in the studied subject dataset, as the template surface, and perform the processing steps described in Sec. 2.3 and Sec. 2.4 to get the hyperbolic surface TBM features.

Figure 5: Modeling cortical surface with hyperbolic geometry. (a) shows six identified open boundaries, $\gamma_1, \cdots, \gamma_6$. (b) shows the hyperbolic parameter space, which is the Poincaré disk model.

LRSDL source code: https://github.com/tiepvupsu/DICTOL_python

For Dataset II, since there are only 115 subjects for three classes, we use the same hyperparameters as in Dataset I for training.
We apply five-fold cross-validation to evaluate our algorithm, which guarantees that the model is tested on all subjects. All experiments are trained for k = 10 epochs with a batch size of 1. The regularization parameter $\lambda$ is set to approximately $1.2/\sqrt{m}$, where $1/\sqrt{m}$ is a classical normalization factor (Bickel et al., 2009) and the constant 1.2 has been shown to produce about 10 non-zero coefficients. We select p = 2,000 ring-shaped patches on each cortical surface by FPSBS, as shown in Fig. 6, which gives n = 230,000 ring-shaped patches for Dataset II (115 subjects × 2,000 patches). Fig. 6 (right) visualizes cortical morphometry on the hyperbolic parameter domain, and Fig. 6 (left) projects the patches selected on the hyperbolic parameter domain back to the original cortical surface. Our FPSBS patch selection algorithm maintains the same topological structure as the parameter domain. After learning the sparse codes via HSC, we apply max-pooling (Boureau et al., 2010) for further dimension reduction. Finally, we employ AdaBoost (Rojas, 2009) to perform binary classification and distinguish individuals from different groups. We report the classification results of (1) AD vs. CU, (2) AD vs. MCI and (3) MCI vs. CU in Table 4.

Figure 6: Visualization of computed image patches on the cortical surface (left) and hyperbolic space (right). Each patch has a unique color. The zoom-in pictures show some overlapping areas between image patches.
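
The regularization weight and the max-pooling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `m` is assumed to be the dimension of one vectorized patch, the function names and toy sparse codes are hypothetical, and pooling is taken as a plain element-wise maximum over the signed codes of one subject's patches.

```python
import math


def lasso_lambda(m, c=1.2):
    """Regularization weight c / sqrt(m); m is assumed to be the patch dimension."""
    return c / math.sqrt(m)


def max_pool(codes):
    """Element-wise maximum over a subject's patch codes.

    codes: list of p sparse-code vectors, each of length d (the dictionary size).
    Returns one length-d feature vector, so the feature dimension no longer
    depends on the number of patches p.
    """
    return [max(column) for column in zip(*codes)]


# Patch count stated in the text: 115 subjects with 2,000 patches each.
subjects, patches_per_subject = 115, 2000
total_patches = subjects * patches_per_subject

# Hypothetical example: patch dimension 100, three patches coded over four atoms.
lam = lasso_lambda(100)
toy_codes = [
    [0.0, 0.5, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.7],
]
pooled = max_pool(toy_codes)
```

Pooling reduces each subject from p × d coefficients to d values, which is what makes a classifier trainable when the number of subjects is small relative to the number of patches.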

Dataset II    AD vs. CU    AD vs. MCI    MCI vs. CU
ACC           88.57%       82.67%        80%
SEN           89.29%       84.00%        79.17%
SPE           83.33%       70.00%        84.44%
AUC           0.879        0.8111        0.7944

Table 4: The classiﬁcation results by ﬁve-fold cross-validation on Dataset II.
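
The property that five-fold cross-validation tests the model on every subject exactly once can be illustrated with the sketch below. It is not the authors' pipeline: stratifying the folds by class is an assumption, the group sizes are hypothetical, and a majority-class baseline stands in for the AdaBoost classifier so that the example needs only the Python standard library.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each subject index to one of k folds, spreading each class across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


def cross_validate(labels, fit, predict, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, and repeat for every fold."""
    predictions = [None] * len(labels)
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit(train)
        for i in held_out:
            predictions[i] = predict(model, i)
    return predictions


# Hypothetical group sizes, not the paper's.
labels = [1] * 30 + [0] * 40


def fit_majority(train_indices):
    """Placeholder learner: remember the majority label of the training subjects."""
    ones = sum(labels[i] for i in train_indices)
    return 1 if 2 * ones > len(train_indices) else 0


def predict_majority(model, index):
    """Placeholder predictor: always return the remembered majority label."""
    return model


folds = stratified_folds(labels, k=5)
predictions = cross_validate(labels, fit_majority, predict_majority, k=5)
```

Because the folds partition the subject indices, every entry of `predictions` is filled by a model that never saw that subject during training.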

In our prior work (Shi and Wang, 2020), we have shown that the hyperbolic surface features are significantly associated with diagnostic disease severity. However, it is difficult to use hyperbolic surface features directly to classify different disease stages because of the large number of features and the limited number of subjects. Table 4 shows that HSC overcomes this issue and that FPSBS generalizes well in capturing meaningful features from ring-shaped patches. HSC also works well on the more subtle classification problem (CU vs. MCI) compared with AD vs. CU. Our new framework achieves meaningful and high performance on different groups and may be useful for AD diagnosis and prognosis research.
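
For readers unfamiliar with the patch selection idea referred to above, the following is a rough sketch of farthest point sampling combined with breadth-first search on a mesh graph. It is a generic illustration, not the FPSBS algorithm as implemented in this work: the adjacency-list graph, the use of hop count as distance, the tie-breaking rule and the `rings` radius are all assumptions, and the toy graph is a simple path rather than a triangulated surface.

```python
from collections import deque


def bfs_hops(adjacency, source, max_hops=None):
    """Hop distance from source to every vertex reached (optionally capped at max_hops)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if max_hops is not None and dist[v] == max_hops:
            continue  # do not expand beyond the requested number of rings
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def farthest_point_centers(adjacency, num_centers, start=0):
    """Greedy farthest point sampling under hop distance (assumes a connected graph)."""
    centers = [start]
    nearest = bfs_hops(adjacency, start)  # distance to the closest chosen center
    while len(centers) < num_centers:
        # Pick the vertex farthest from all centers; ties go to the smallest id.
        nxt = max(nearest, key=lambda v: (nearest[v], -v))
        centers.append(nxt)
        for v, d in bfs_hops(adjacency, nxt).items():
            if d < nearest[v]:
                nearest[v] = d
    return centers


def ring_patches(adjacency, centers, rings):
    """One patch per center: all vertices within `rings` hops, found by BFS."""
    return [sorted(bfs_hops(adjacency, c, max_hops=rings)) for c in centers]


# Hypothetical toy graph: a path 0 - 1 - ... - 8.
adjacency = {v: [w for w in (v - 1, v + 1) if 0 <= w <= 8] for v in range(9)}
centers = farthest_point_centers(adjacency, num_centers=3)
patches = ring_patches(adjacency, centers, rings=2)
```

In this toy example the patches around neighbouring centers share vertices, which mirrors the overlapping areas between patches noted in the caption of Fig. 6.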

4. Discussion

The current work presents our initial efforts to develop efficient machine learning methods to work with brain sMRI features computed from general topological surfaces. We validate our proposed FPSBS and HSC methods on two datasets, and the preliminary experimental results demonstrate that the proposed algorithms outperform some other works on classification accuracy. By reducing the dimension of hyperbolic TBM features with the novel HSC algorithm, the present study is capable of applying the low-dimensional HSC measures to diagnose AD and its prodromal stages. In Dataset I, the proposed system successfully distinguishes the ventricular HSC measures of MCIc subjects from MCIs subjects with a higher accuracy ( > >

Table 5: Studies to distinguish MCI converters (MCIc) from nonconverters (MCInc).

significant ventricular morphometry differences between the MCI converter group and the MCI stable group using a novel ventricular morphometry analysis system. Using the same MRI cohorts, this work further proposes HSC to extract effective structural features and classify 71 MCI converters versus 62 nonconverters. The AUC is 0.85. Moradi et al. (2015) proposed a low density separation scheme to learn aggregate biomarkers to discriminate 164 MCI converters from 100 stable MCI patients. They achieved an AUC of 0.7661. Sørensen et al. (2016) applied hippocampal texture features and a support vector machine (SVM) classifier to predict 8 MCI converters versus 17 nonconverters. The best prognostic AUC was 0.83. Chincarini et al. (2011) applied medial temporal lobe intensity and textural features and an SVM classifier to separate 136 MCI converters from 166 MCI nonconverters with AUC = 0.74. Collij et al. (2016) applied an SVM classifier to arterial spin labeling perfusion maps to distinguish 12 patients with an MCI diagnosis who converted to AD from 12 subjects with stable MCI. The AUC was 0.71. Table 5 presents the AUC values of this work and the above studies.
Compared to other single-modality neuroimaging-based classifiers, our proposed system has a larger or comparable AUC for predicting MCI converters versus nonconverters. There are also studies demonstrating that multimodality machine learning models outperform single-modality classifiers (Varatharajah et al., 2019; Rathore et al., 2017; Moradi et al., 2015). We have developed a series of surface-based biomarkers of various brain structures for AD research (Dong et al., 2020b; Wang et al., 2010, 2011; Dong et al., 2019; Fan et al., 2018). Our latest work (Dong et al., 2020a) has indicated that combining these biomarkers could empower the prediction of AD progression. In future work, we expect to improve the MCI conversion prediction performance by introducing these effective multiple statistics. We also note that the study (Edmonds et al., 2019) proposed a neuropsychological approach to improve the reliability of staging early and late MCI. We expect that our biomarkers for predicting MCI converters versus nonconverters will also work well with reliable neuropsychological tests for staging early and late MCI. We will study this in our future work.

There are three important caveats when applying the proposed framework to AD diagnosis and prognosis. First, because of the overlapping patch selection and max-pooling scheme, we generally cannot visualize the selected features, which decreases the comprehensibility, although we may always visualize statistically significant regions as in our prior group difference studies (Shi et al., 2013; Wang et al., 2013). However, our recent work (Zhang et al., 2018a) made some progress that may better address this problem. There, instead of randomly selecting patches to build the initial dictionary, we use group lasso screening to select the most significant features. Therefore, the features used in sparse coding may be visualized on the surface map.
In the future, we will incorporate this idea into our current framework to improve its interpretability. Second, our current work, similar to several other works (e.g. Fan et al., 2007; Colliot et al., 2008; Klöppel et al., 2008; Gerardin et al., 2009; Magnin et al., 2009; Cuingnet et al., 2011; Liu et al., 2011; Shen et al., 2012; Ben Ahmed et al., 2015), uses clinical diagnoses as the “ground truth” diagnoses for training and cross-validation. However, some recent work (e.g. Beach et al., 2012) has reported that clinical diagnoses had only limited accuracy (e.g. only 80 - 90% of the labels are correct) when confirmed with AD histopathology. Under this limitation, we should be cautious when making inferences and drawing conclusions from our work for AD diagnosis, since our discovered features are not necessarily real AD biomarkers. Even so, our recent work (Wu et al., 2018) has studied hippocampal morphometry on a cohort consisting of Aβ positive AD (N = 151) and matched Aβ negative cognitively unimpaired subjects (N = 271), where Aβ positivity was determined using the mean-cortical standard uptake value ratio (SUVR) with the cerebellum as the reference region over the amyloid PET images. With our Euclidean SCC work (Zhang et al., 2016b) integrating the proposed HSC and MP methods, we achieved an accuracy rate of 90.48% in that task (Wu et al., 2018). The results demonstrate that our proposed framework may potentially help discover pathology-confirmed AD biomarkers. Third, as our initial attempt to integrate geometry analysis and machine learning methods for AD diagnosis, the current work reports very limited experimental results, since we mainly reuse the data in our prior published work (Shi et al., 2015; Shi and Wang, 2020). The geometry analysis part involves multiple steps, including image segmentation, surface registration, surface parameterization, etc. Our ongoing work, e.g. (Mi et al., 2020), is developing novel approaches which will make the whole process more automatic and more accurate. Once they are available, we will apply the proposed method to analyze more longitudinal ADNI data.

5. Conclusions and Future Works

Here we present a hyperbolic sparse coding with ring-shaped patch selection algorithm, which may improve the accuracy of AD diagnosis and prognosis with sMRI biomarkers. In the future, we will explore whether the proposed framework works with other shape statistics, such as spherical harmonics and radial distance. In our previous preclinical AD study (Dong et al., 2019), we found APOE-e4 dose effects on the hippocampal morphometry of cognitively unimpaired subjects. In our future work, we will further explore whether the proposed FPSBS and HSC methods are useful for such preclinical AD studies.

Acknowledgements

References

Beach, T.G., Monsell, S.E., Phillips, L.E., Kukull, W., 2012. Accuracy of the clinical diagnosis of Alzheimer disease at National Institute on Aging Alzheimer Disease Centers, 2005-2010. J. Neuropathol. Exp. Neurol. 71, 266–273.
Ben Ahmed, O., Mizotin, M., Benois-Pineau, J., Allard, M., Catheline, G., Ben Amar, C., 2015. Alzheimer’s disease diagnosis on structural MR images using circular harmonic functions descriptors on hippocampus and posterior cingulate cortex. Comput Med Imaging Graph 44, 13–25.
Bickel, P.J., Ritov, Y., Tsybakov, A.B., 2009. Simultaneous analysis of lasso and dantzig selector. The Annals of Statistics, 1705–1732.
Bottou, L., 1998. Online learning and stochastic approximation. Online Learning and Neural Networks, Cambridge University Press, Cambridge, UK.
Boureau, Y.L., Ponce, J., LeCun, Y., 2010. A theoretical analysis of feature pooling in visual recognition, in: Proceedings of the 27th international conference on machine learning (ICML-10), pp. 111–118.
Brooks, B.L., Iverson, G.L., Holdnack, J.A., Feldman, H.H., 2008. Potential for misclassification of mild cognitive impairment: A study of memory scores on the Wechsler Memory Scale-III in healthy older adults. Journal of the International Neuropsychological Society 14, 463–478.
Brooks, B.L., Iverson, G.L., White, T., 2007. Substantial risk of “Accidental MCI” in healthy older adults: Base rates of low memory scores in neuropsychological assessment. Journal of the International Neuropsychological Society 13, 490–500.
Chincarini, A., Bosco, P., Calvini, P., Gemme, G., Esposito, M., Olivieri, C., Rei, L., Squarcia, S., Rodriguez, G., Bellotti, R., et al., 2011. Local MRI analysis approach in the diagnosis of early and prodromal Alzheimer’s disease. Neuroimage 58, 469–480.
Chou, Y.Y., Leporé, N., Saharan, P., Madsen, S.K., Hua, X., Jack, C.R., Shaw, L.M., Trojanowski, J.Q., Weiner, M.W., Toga, A.W., et al., 2010. Ventricular maps in 804 ADNI subjects: correlations with CSF biomarkers and clinical decline. Neurobiology of aging 31, 1386–1400.
Chung, M.K., Dalton, K.M., Davidson, R.J., 2008. Tensor-based cortical surface morphometry via weighted spherical harmonic representation. Medical Imaging, IEEE Transactions on 27, 1143–1151.
Chung, M.K., Robbins, S.M., Dalton, K.M., Davidson, R.J., Alexander, A.L., Evans, A.C., 2005. Cortical thickness analysis in autism with heat kernel smoothing. NeuroImage 25, 1256–1265.
Coates, A., Ng, A.Y., 2011. The importance of encoding versus training with sparse coding and vector quantization, in: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 921–928.
Collij, L.E., Heeman, F., Kuijer, J.P., Ossenkoppele, R., Benedictus, M.R., Möller, C., Verfaillie, S.C., Sanz-Arigita, E.J., van Berckel, B.N., van der Flier, W.M., Scheltens, P., Barkhof, F., Wink, A.M., 2016. Application of machine learning to arterial spin labeling in mild cognitive impairment and Alzheimer disease. Radiology 281, 865–875.
Colliot, O., Chételat, G., Chupin, M., Desgranges, B., Magnin, B., Benali, H., Dubois, B., Garnero, L., Eustache, F., Lehéricy, S., 2008. Discrimination between alzheimer disease, mild cognitive impairment, and normal aging by using automated segmentation of the hippocampus. Radiology 248, 194–201.
Combettes, P.L., Wajs, V.R., 2005. Signal Recovery by Proximal Forward-Backward Splitting. Multiscale Modeling & Simulation 4, 1168–1200.
Costafreda, S.G., Dinov, I.D., Tu, Z., Shi, Y., Liu, C.Y., Kloszewska, I., Mecocci, P., Soininen, H., Tsolaki, M., Vellas, B., Wahlund, L.O., Spenger, C., Toga, A.W., Lovestone, S., Simmons, A., 2011. Automated hippocampal shape analysis predicts the onset of dementia in Mild Cognitive Impairment. Neuroimage 56, 212–219.
Cuingnet, R., Gerardin, E., Tessieras, J., Auzias, G., Lehéricy, S., Habert, M.O., Chupin, M., Benali, H., Colliot, O., 2011. Automatic classification of patients with Alzheimer’s disease from structural MRI: a comparison of ten methods using the ADNI database. Neuroimage 56, 766–781.
De Rotrou, J., Wenisch, E., Chausson, C., Dray, F., Faucounau, V., Rigaud, A.S., 2005. Accidental MCI in healthy subjects: a prospective longitudinal study. European Journal of Neurology 12, 879–885.
Dong, Q., Zhang, J., Li, Q., Wang, J., Lepore, N., Thompson, P.M., Caselli, R.J., Ye, J., Wang, Y., 2020a. Integrating convolutional neural networks and multi-task dictionary learning for cognitive decline prediction with longitudinal images. J. Alzheimers Dis. 75, 971–992.
Dong, Q., Zhang, W., Stonnington, C.M., Wu, J., Gutman, B.A., Chen, K., Su, Y., Baxter, L.C., Thompson, P.M., Reiman, E.M., Caselli, R.J., Wang, Y., 2020b. Applying surface-based morphometry to study ventricular abnormalities of cognitively unimpaired subjects prior to clinically significant memory decline. Neuroimage Clin 27, 102338.
Dong, Q., Zhang, W., Wu, J., Li, B., Schron, E.H., McMahon, T., Shi, J., Gutman, B.A., Chen, K., Baxter, L.C., Thompson, P.M., Reiman, E.M., Caselli, R.J., Wang, Y., 2019. Applying surface-based hippocampal morphometry to study APOE-E4 allele dose effects in cognitively unimpaired subjects. Neuroimage Clin 22, 101744.
Donoho, D.L., 2006. Compressed sensing. IEEE Transactions on information theory 52, 1289–1306.
Donoho, D.L., Elad, M., 2003. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc. Natl. Acad. Sci. U.S.A. 100, 2197–2202.
Edmonds, E.C., McDonald, C.R., Marshall, A., Thomas, K.R., Eppig, J., Weigand, A.J., Delano-Wood, L., Galasko, D.R., Salmon, D.P., Bondi, M.W., 2019. Early versus late MCI: Improved MCI staging using a neuropsychological approach. Alzheimer’s & Dementia 15, 699–708.
Fan, Y., Shen, D., Davatzikos, C., 2005. Classification of structural images via high-dimensional image warping, robust feature extraction, and SVM. Med Image Comput Comput Assist Interv 8, 1–8.
Fan, Y., Shen, D., Gur, R.C., Gur, R.E., Davatzikos, C., 2007. COMPARE: classification of morphological patterns using adaptive regional elements. IEEE Trans Med Imaging 26, 93–105.
Fan, Y., Wang, G., Lepore, N., Wang, Y., 2018. A tetrahedron-based heat flux signature for cortical thickness morphometry analysis, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer. pp. 420–428.
Ferrarini, L., Palm, W.M., Olofsen, H., van der Landen, R., van Buchem, M.A., Reiber, J.H., Admiraal-Behloul, F., 2008. Ventricular shape biomarkers for Alzheimer’s disease in clinical MR images. Magn Reson Med 59, 260–267.
Fischl, B., 2012. Freesurfer. Neuroimage 62, 774–781.
Folstein, M.F., Folstein, S.E., McHugh, P.R., 1975. “mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. Journal of psychiatric research 12, 189–198.
Freund, Y., Schapire, R.E., 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences 55, 119–139.
Frisoni, G.B., Fox, N.C., Jack, C.R., Scheltens, P., Thompson, P.M., 2010. The clinical use of structural MRI in Alzheimer disease. Nat Rev Neurol 6, 67–77.
Gerardin, E., Chételat, G., Chupin, M., Cuingnet, R., Desgranges, B., Kim, H.S., Niethammer, M., Dubois, B., Lehéricy, S., Garnero, L., Eustache, F., Colliot, O., 2009. Multidimensional classification of hippocampal shape features discriminates Alzheimer’s disease and mild cognitive impairment from normal aging. Neuroimage 47, 1476–1486.
Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.A., 2008. Feature extraction: foundations and applications. volume 207. Springer.
Han, X., Xu, C., Prince, J.L., 2009. A Moving Grid Framework for Geometric Deformable Models. Int J Comput Vis 84, 63–79.
den Heijer, T., van der Lijn, F., Koudstaal, P.J., Hofman, A., van der Lugt, A., Krestin, G.P., Niessen, W.J., Breteler, M.M., 2010. A 10-year follow-up of hippocampal volume on magnetic resonance imaging in early dementia and cognitive decline. Brain 133, 1163–1172.
Jack, C., Petersen, R.C., Xu, Y.C., O’Brien, P.C., Smith, G.E., Ivnik, R.J., Boeve, B.F., Waring, S.C., Tangalos, E.G., Kokmen, E., 1999. Prediction of AD with MRI-based hippocampal volume in mild cognitive impairment. Neurology 52, 1397–1397.
Jack, C., Slomkowski, M., Gracon, S., Hoover, T.M., Felmlee, J.P., Stewart, K., Xu, Y., Shiung, M., O’Brien, P.C., Cha, R., Knopman, D., Petersen, R.C., 2003. MRI as a biomarker of disease progression in a therapeutic trial of milameline for AD. Neurology 60, 253–260.
Jack, C.R., Bennett, D.A., Blennow, K., Carrillo, M.C., Feldman, H.H., Frisoni, G.B., Hampel, H., Jagust, W.J., Johnson, K.A., Knopman, D.S., Petersen, R.C., Scheltens, P., Sperling, R.A., Dubois, B., 2016. A/T/N: an unbiased descriptive classification scheme for Alzheimer disease biomarkers. Neurology 87, 539–547.
Jain, A., Zongker, D., 1997. Feature selection: Evaluation, application, and small sample performance. IEEE transactions on pattern analysis and machine intelligence 19, 153–158.
Jiang, X., Li, X., Lv, J., Zhang, T., Zhang, S., Guo, L., Liu, T., 2015. Sparse representation of HCP grayordinate data reveals novel functional architecture of cerebral cortex. Hum Brain Mapp 36, 5301–5319.
Jolliffe, I., 2011. Principal component analysis, in: International encyclopedia of statistical science. Springer, pp. 1094–1096.
Klöppel, S., Stonnington, C.M., Chu, C., Draganski, B., Scahill, R.I., Rohrer, J.D., Fox, N.C., Jack Jr, C.R., Ashburner, J., Frackowiak, R.S., 2008. Automatic classification of MR scans in Alzheimer’s disease. Brain 131, 681–689.
Lee, H., Battle, A., Raina, R., Ng, A., 2006. Efficient sparse coding algorithms, in: Advances in neural information processing systems, pp. 801–808.
Leung, K.K., Barnes, J., Ridgway, G.R., Bartlett, J.W., Clarkson, M.J., Macdonald, K., Schuff, N., Fox, N.C., Ourselin, S., 2010. Automated cross-sectional and longitudinal hippocampal volume measurement in mild cognitive impairment and Alzheimer’s disease. NeuroImage 51, 1345–1359.
Li, S., Yuan, X., Pu, F., Li, D., Fan, Y., Wu, L., Chao, W., Chen, N., He, Y., Han, Y., 2014. Abnormal changes of multidimensional surface features using multivariate pattern classification in amnestic mild cognitive impairment patients. Journal of Neuroscience 34, 10541–10553.
Li, Y., Chen, H., Jiang, X., Li, X., Lv, J., Li, M., Peng, H., Tsien, J.Z., Liu, T., 2017. Transcriptome Architecture of Adult Mouse Brain Revealed by Sparse Coding of Genome-Wide In Situ Hybridization Images. Neuroinformatics 15, 285–295.
Lin, B., Li, Q., Sun, Q., Lai, M.J., Davidson, I., Fan, W., Ye, J., 2014. Stochastic coordinate coding and its application for drosophila gene expression pattern annotation. arXiv preprint arXiv:1407.8147.
Liu, Y., Paajanen, T., Zhang, Y., Westman, E., Wahlund, L.O., Simmons, A., Tunnard, C., Sobow, T., Mecocci, P., Tsolaki, M., Vellas, B., Muehlboeck, S., Evans, A., Spenger, C., Lovestone, S., Soininen, H., 2011. Combination analysis of neuropsychological tests and structural MRI measures in differentiating AD, MCI and control groups–the AddNeuroMed study. Neurobiol. Aging 32, 1198–1206.
Lorensen, W.E., Cline, H.E., 1987. Marching cubes: A high resolution 3D surface construction algorithm, in: ACM siggraph computer graphics, ACM. pp. 163–169.
Lv, J., Jiang, X., Li, X., Zhu, D., Chen, H., Zhang, T., Zhang, S., Hu, X., Han, J., Huang, H., Zhang, J., Guo, L., Liu, T., 2015a. Sparse representation of whole-brain fMRI signals for identification of functional networks. Med Image Anal 20, 112–134.
Lv, J., Jiang, X., Li, X., Zhu, D., Zhang, S., Zhao, S., Chen, H., Zhang, T., Hu, X., Han, J., Ye, J., Guo, L., Liu, T., 2015b. Holistic atlases of functional networks and interactions reveal reciprocal organizational architecture of cortical function. IEEE Trans Biomed Eng 62, 1120–1131.
Lv, J., Lin, B., Li, Q., Zhang, W., Zhao, Y., Jiang, X., Guo, L., Han, J., Hu, X., Guo, C., Ye, J., Liu, T., 2017. Task fMRI data analysis based on supervised stochastic coordinate coding. Med Image Anal 38, 1–16.
Magnin, B., Mesrob, L., Kinkingnéhun, S., Pélégrini-Issac, M., Colliot, O., Sarazin, M., Dubois, B., Lehéricy, S., Benali, H., 2009. Support vector machine-based classification of Alzheimer’s disease from whole-brain anatomical MRI. Neuroradiology 51, 73–83.
Mairal, J., Bach, F., Ponce, J., Sapiro, G., 2009. Online dictionary learning for sparse coding, in: Proceedings of the 26th annual international conference on machine learning, ACM. pp. 689–696.
Mi, L., Zhang, W., Wang, Y., 2020. Regularized wasserstein means for aligning distributional data, in: Proceedings of 34th Conference on Artificial Intelligence (AAAI-20), pp. 5166–5173.
Moenning, C., Dodgson, N.A., 2003. Fast Marching farthest point sampling. Technical Report UCAM-CL-TR-562. University of Cambridge, Computer Laboratory.
Moradi, E., Pepe, A., Gaser, C., Huttunen, H., Tohka, J., 2015. Machine learning framework for early MRI-based Alzheimer’s conversion prediction in MCI subjects. Neuroimage 104, 398–412.
Patel, J.R., Shah, T.R., Shingadiya, V.P., Patel, V.B., 2015. Comparison between breadth first search and nearest neighbor algorithm for waveguide path planning. Int. J. Research and Scientific Innovation 2, 19–21.
Petersen, R.C., Aisen, P., Beckett, L.A., Donohue, M., Gamst, A., Harvey, D.J., Jack, C., Jagust, W., Shaw, L., Toga, A., et al., 2010. Alzheimer’s disease neuroimaging initiative (ADNI): clinical characterization. Neurology 74, 201–209.
Qiu, A., Brown, T., Fischl, B., Ma, J., Miller, M.I., 2010. Atlas generation for subcortical and ventricular structures with its applications in shape analysis. IEEE Trans Image Process 19, 1539–1547.
Rathore, S., Habes, M., Iftikhar, M.A., Shacklett, A., Davatzikos, C., 2017. A review on neuroimaging-based classification studies and associated feature extraction methods for alzheimer’s disease and its prodromal stages. NeuroImage 155, 530–548.
Rey, A., 1964. L’examen clinique en psychologie. Presses Universitaires de France; Paris.
Rojas, R., 2009. Adaboost and the super bowl of classifiers a tutorial introduction to adaptive boosting. Freie University, Berlin, Tech. Rep.
Rosen, W.G., Mohs, R.C., Davis, K.L., 1984. A new rating scale for Alzheimer’s disease. The American journal of psychiatry.
Saadi, K., Talbot, N.L., Cawley, G.C., 2007. Optimally regularised kernel Fisher discriminant classification. Neural Netw 20, 832–841.
Scholkopft, B., Mullert, K.R., 1999. Fisher discriminant analysis with kernels. Neural networks for signal processing IX 1, 1.
Shen, K.K., Fripp, J., Mériaudeau, F., Chételat, G., Salvado, O., Bourgeat, P., 2012. Detecting global and local hippocampal shape changes in Alzheimer’s disease using statistical shape models. Neuroimage 59, 2155–2166.
Shi, J., Stonnington, C.M., Thompson, P.M., Chen, K., Gutman, B., Reschke, C., Baxter, L.C., Reiman, E.M., Caselli, R.J., Wang, Y., 2015. Studying ventricular abnormalities in Mild Cognitive Impairment with hyperbolic Ricci flow and Tensor-based morphometry. NeuroImage 104, 1–20.
Shi, J., Thompson, P.M., Gutman, B., Wang, Y., 2013. Surface fluid registration of conformal representation: application to detect disease burden and genetic influence on hippocampus. Neuroimage 78, 111–134.
Shi, J., Wang, Y., 2020. Hyperbolic Wasserstein Distance for Shape Indexing. IEEE Trans Pattern Anal Mach Intell 42, 1362–1376.
Shi, J., Zhang, W., Tang, M., Caselli, R.J., Wang, Y., 2017. Conformal invariants for multiply connected surfaces: Application to landmark curve-based brain morphometry analysis. Med Image Anal 35, 517–529.
Sørensen, L., Igel, C., Liv Hansen, N., Osler, M., Lauritzen, M., Rostrup, E., Nielsen, M., 2016. Early detection of Alzheimer’s disease using MRI hippocampal texture. Human Brain Mapping 37, 1148–1161.
Styner, M., Lieberman, J.A., McClure, R.K., Weinberger, D.R., Jones, D.W., Gerig, G., 2005. Morphometric analysis of lateral ventricles in schizophrenia and healthy controls regarding genetic and disease-specific factors. Proc. Natl. Acad. Sci. U.S.A. 102, 4872–4877.
Sun, D., van Erp, T.G., Thompson, P.M., Bearden, C.E., Daley, M., Kushan, L., Hardt, M.E., Nuechterlein, K.H., Toga, A.W., Cannon, T.D., 2009. Elucidating a magnetic resonance imaging-based neuroanatomic biomarker for psychosis: classification analysis using probabilistic brain atlas and machine learning algorithms. Biol. Psychiatry 66, 1055–1060.
Thompson, P.M., Giedd, J.N., Woods, R.P., MacDonald, D., Evans, A.C., Toga, A.W., 2000. Growth patterns in the developing brain detected by using continuum mechanical tensor maps. Nature 404, 190–193.
Thompson, P.M., Hayashi, K.M., De Zubicaray, G.I., Janke, A.L., Rose, S.E., Semple, J., Hong, M.S., Herman, D.H., Gravano, D., Doddrell, D.M., Toga, A.W., 2004a. Mapping hippocampal and ventricular change in Alzheimer disease. Neuroimage 22, 1754–1766.
Thompson, P.M., Hayashi, K.M., Sowell, E.R., Gogtay, N., Giedd, J.N., Rapoport, J.L., de Zubicaray, G.I., Janke, A.L., Rose, S.E., Semple, J., Doddrell, D.M., Wang, Y., van Erp, T.G., Cannon, T.D., Toga, A.W., 2004b. Mapping cortical change in Alzheimer’s disease, brain development, and schizophrenia. Neuroimage 23 Suppl 1, 2–18.
Tibshirani, R., 1994. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, Series B 58, 267–288.
Tosun, D., Chen, Y.F., Yu, P., Sundell, K.L., Suhy, J., Siemers, E., Schwarz, A.J., Weiner, M.W., 2016. Amyloid status imputed from a multimodal classifier including structural mri distinguishes progressors from nonprogressors in a mild alzheimer’s disease clinical trial cohort. Alzheimer’s & Dementia 12, 977–986.
Tosun, D., Joshi, S., Weiner, M.W., 2014. Multimodal MRI-based imputation of the Aβ+ in early mild cognitive impairment. Annals of clinical and translational neurology 1, 160–170.
Tsui, A., Fenton, D., Vuong, P., Hass, J., Koehl, P., Amenta, N., Coeurjolly, D., DeCarli, C., Carmichael, O., 2013. Globally optimal cortical surface matching with exact landmark correspondence. Inf Process Med Imaging 23, 487–498.
Van Essen, D.C., 2012. Cortical cartography and caret software. Neuroimage 62, 757–764.
Varatharajah, Y., Ramanan, V.K., Iyer, R., Vemuri, P., 2019. Predicting short-term MCI-to-AD progression using imaging, CSF, genetic factors, cognitive resilience, and demographics. Scientific reports 9, 1–15.
Vemuri, P., Gunter, J.L., Senjem, M.L., Whitwell, J.L., Kantarci, K., Knopman, D.S., Boeve, B.F., Petersen, R.C., Jack, C.R., 2008. Alzheimer’s disease diagnosis in individual subjects using structural MR images: validation studies. Neuroimage 39, 1186–1197.
Vounou, M., Nichols, T.E., Montana, G., Initiative, A.D.N., et al., 2010. Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach. Neuroimage 53, 1147–1159.
Vu, T.H., Monga, V., 2017. Fast low-rank shared dictionary learning for image classification. IEEE Transactions on Image Processing 26, 5160–5175.
Wang, Y., Song, Y., Rajagopalan, P., An, T., Liu, K., Chou, Y.Y., Gutman, B., Toga, A.W., Thompson, P.M., Initiative, A.D.N., et al., 2011. Surface-based TBM boosts power to detect disease effects on the brain: an N=804 ADNI study. Neuroimage 56, 1993–2010.
Wang, Y., Yuan, L., Shi, J., Greve, A., Ye, J., Toga, A.W., Reiss, A.L., Thompson, P.M., 2013. Applying tensor-based morphometry to parametric surfaces can improve MRI-based disease diagnosis. Neuroimage 74, 209–230.
Wang, Y., Zhang, J., Gutman, B., Chan, T.F., Becker, J.T., Aizenstein, H.J., Lopez, O.L., Tamburo, R.J., Toga, A.W., Thompson, P.M., 2010. Multivariate tensor-based morphometry on surfaces: application to mapping ventricular abnormalities in HIV/AIDS. Neuroimage 49, 2141–2157.
Wee, C.Y., Liu, C., Lee, A., Poh, J.S., Ji, H., Qiu, A., Initiative, A.D.N., et al., 2019. Cortical graph neural network for ad and mci diagnosis and transfer learning across populations. NeuroImage: Clinical 23, 101929.
Weiner, M.W., Veitch, D.P., Aisen, P.S., Beckett, L.A., Cairns, N.J., Green, R.C., Harvey, D., Jack, C.R., Jagust, W., Liu, E., Morris, J.C., Petersen, R.C., Saykin, A.J., Schmidt, M.E., Shaw, L., Shen, L., Siuciak, J.A., Soares, H., Toga, A.W., Trojanowski, J.Q., 2013. The Alzheimer’s Disease Neuroimaging Initiative: a review of papers published since its inception. Alzheimers Dement 9, e111–194.
Wolz, R., Heckemann, R.A., Aljabar, P., Hajnal, J.V., Hammers, A., Lotjonen, J., Rueckert, D., 2010. Measurement of hippocampal atrophy using 4D graph-cut segmentation: application to ADNI. Neuroimage 52, 109–118.
Wu, J., Zhang, J., Shi, J., Chen, K., Caselli, R.J., Reiman, E.M., Wang, Y., 2018. Hippocampus morphometry study on pathology-confirmed alzheimer’s disease patients with surface multivariate morphometry statistics, in: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), IEEE. pp. 1555–1559.
Wu, T.T., Lange, K., 2008. Coordinate Descent Algorithms for LASSO Penalized Regression. The Annals of Applied Statistics 2, 224–244.
Yang, J., Wright, J., Huang, T.S., Ma, Y., 2010. Image super-resolution via sparse representation. IEEE Trans Image Process 19, 2861–2873.
Yin, W., Osher, S., Goldfarb, D., Darbon, J., 2008. Bregman iterative algorithms for ℓ1