Dynamic Functional Connectivity and Graph Convolution Network for Alzheimer's Disease Classification
Xingwei An
Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China 300072 [email protected]
Yang Di
Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China 300072 [email protected]
Yutao Zhou
Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China 300072 [email protected]
Dong Ming
Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China 300072 [email protected]
ABSTRACT
Alzheimer’s disease (AD) is the most prevalent form of dementia. Traditional methods cannot diagnose AD efficiently and accurately. In this paper, we introduce a novel method based on dynamic functional connectivity (dFC), which can effectively capture changes in the brain over time. We compare and combine four types of features: amplitude of low-frequency fluctuation (ALFF), regional homogeneity (ReHo), dFC, and the adjacency matrix describing the similarity of brain structure between subjects. We use a graph convolution network (GCN), which considers the similarity of brain structure between patients, to solve this classification problem in a non-Euclidean domain. The proposed method achieved an accuracy of 91.3% and an area under the receiver operating characteristic curve of 98.4%. These results demonstrate that the proposed method can be used for detecting AD.
CCS Concepts • Applied computing ➝ Life and medical sciences ➝ Computational biology ➝ Imaging
Keywords
Alzheimer’s disease (AD); graph convolutional network (GCN); functional connectivity (FC); classification
INTRODUCTION
Alzheimer’s disease (AD) is an irreversible, progressive neurodegenerative disease that primarily affects the elderly and is the most prevalent form of dementia [1]. Its initial clinical manifestations are common symptoms such as memory loss, speech impairment and cognitive deficit, which severely limit daily life and, as the disease progresses, even pose a grave threat to patients. It is estimated that, with the aging of the global population, 1 out of 85 people will be affected by AD in the future [2]. The diagnosis of AD is therefore crucial. The approach commonly used in this field combines resting-state functional magnetic resonance imaging (rs-fMRI) with machine learning and stationary functional connectivity (sFC). Deep learning methods are also applied to identify and detect AD. However, previous studies have pointed out several drawbacks of these methods. First, machine learning fails to consider the association between patients. Second, the traditional convolutional neural network (CNN) cannot effectively handle graph-structured data, even though the brain is regarded as a complex network with small-world attributes. Deep learning methods also contain a large number of parameters, which leads to long training times and high hardware requirements. Third, sFC cannot reflect time-varying dynamic behavior and ignores local dynamic changes of the brain over the whole time series [3]. The graph convolution network (GCN) model was proposed to handle problems in non-Euclidean domains; it takes the similarity between individuals into account by aggregating information from adjacent subjects [4, 5]. Compared with other deep learning methods, GCN has a simple structure, fewer parameters and a shorter training time, which is conducive to helping doctors make timely and accurate decisions.
Dynamic functional connectivity captures changes in the relationships between brain regions across different time sub-segments. In this paper, we propose a novel method to classify AD based on the GCN model and dynamic functional connectivity. Specifically, we combine different types of features, namely amplitude of low-frequency fluctuation (ALFF), regional homogeneity (ReHo) and thresholded dynamic functional connectivity (dFC), into a feature set and use it to classify AD with full consideration of individual similarity and of the association between subjects’ brain information. Our results indicate that the method improves prediction performance and execution speed.
MATERIALS AND METHODS
Data Acquisition
The rs-fMRI time-series data used in this paper were collected at Xuanwu Hospital, Beijing, China. All patients provided written informed consent. The full dataset comprises 423 scans (120 AD and 303 normal control (NC)) from 317 subjects (106 AD and 211 NC). From these, we selected 246 scans (98 AD, 148 NC) from 204 subjects (94 AD and 110 NC); the other 177 scans were excluded because their total number of time points differed. Some subjects were scanned two or more times, with scans separated by at least one year.
Data Preprocessing
Preprocessing comprised the following steps: (1) removal of time points; (2) slice timing correction; (3) head motion realignment, in which 20 AD and 3 NC scans with a maximum head motion over 3.0 mm translation or 3° rotation were discarded; (4) normalization; (5) smoothing; (6) detrending; (7) nuisance covariate regression; and (8) temporal filtering at 0.01-0.08 Hz. Notably, regional homogeneity (ReHo) was obtained without spatial smoothing. Subject information for this study is shown in Table 1.
Table 1. Subjects’ information of this study
Group                AD          NC
Gender (F/M)         47/31       79/66
Age (mean ± std)     71.1 ± …    … ± …
Feature Extraction and Selection
Choosing the appropriate feature set can improve the performance of the classifier and reduce the training time. This section will introduce feature extraction and selection from four aspects.
ALFF and ReHo Feature Extraction
A two-sample t-test was used to extract feature subsets from the ALFF and ReHo feature sets, with false discovery rate (FDR) correction at $p < 0.05$ and cluster size > 50. In the end, 4 ALFF and 5 ReHo features were retained.
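The text does not say which FDR procedure was applied. As a minimal sketch, assuming the common Benjamini-Hochberg step-up procedure, the correction over a vector of t-test p-values could look like the following. The function name `fdr_bh` and the example p-values are hypothetical, and the cluster-size criterion is not covered here.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Returns a boolean mask marking the hypotheses that are rejected
    while controlling the false discovery rate at level alpha.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)              # indices that sort p ascending
    ranked = p[order]
    # Compare the k-th smallest p-value with (k / m) * alpha.
    below = ranked <= (np.arange(1, m + 1) / m) * alpha
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest rank passing the test
        reject[order[:k + 1]] = True       # reject everything up to that rank
    return reject

# Hypothetical p-values, for illustration only.
p_example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = fdr_bh(p_example, alpha=0.05)
```

The step-up form matters: all p-values ranked at or below the largest passing rank are rejected, even if some of them individually miss their own threshold.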
Dynamic FC Construction
In this section, the brain was parcellated into 116 ROIs using the Automated Anatomical Labeling (AAL) atlas, and a time series was extracted from each region. To capture temporal variability, the entire time course was split into multiple sub-segments with a sliding-window approach. The number of sub-segments is
$$K = \left\lfloor \frac{T - L}{s} \right\rfloor + 1,$$
where $T$ is the number of time points, $L$ is the length of the sliding window, $s$ is the step size and $K$ is the number of sub-segments. Pearson’s correlation coefficient (PCC) was used to construct each element of the functional connectivity matrix, $c_{ij}^{(l)} = \mathrm{corr}(x_i^{(l)}, x_j^{(l)})$, between the time series $x_i^{(l)}$ and $x_j^{(l)}$ of regions $i$ and $j$ in the $l$-th sliding window. We then applied Fisher’s $z$ transformation to normalize the $r$ values, obtaining a 116 × 116 symmetric matrix $C$ for each window.
Dynamic FC Features Extraction
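Before turning to feature extraction, the sliding-window construction can be illustrated with a short NumPy sketch. It is a sketch under stated assumptions rather than the authors' code: the function names, the random input and the total length of 100 time points are hypothetical, and zeroing the diagonal and clipping correlations before the Fisher transform are choices made here to keep the values finite.

```python
import numpy as np

def n_windows(T, L, s):
    """Number of sub-segments: K = floor((T - L) / s) + 1."""
    return (T - L) // s + 1

def dynamic_fc(ts, L, s):
    """Sliding-window functional connectivity.

    ts : array of shape (T, n_rois), one column per ROI time series.
    Returns an array of shape (K, n_rois, n_rois) holding the
    Fisher z-transformed Pearson correlation matrix of each window.
    """
    T, n = ts.shape
    K = n_windows(T, L, s)
    out = np.empty((K, n, n))
    for l in range(K):
        window = ts[l * s : l * s + L]          # L consecutive time points
        r = np.corrcoef(window, rowvar=False)   # Pearson correlation, (n, n)
        np.fill_diagonal(r, 0.0)                # assumption: ignore self-connections
        r = np.clip(r, -0.999999, 0.999999)     # assumption: keep arctanh finite
        out[l] = np.arctanh(r)                  # Fisher z transformation
    return out

# Hypothetical data: 100 time points, 6 ROIs; L = 39 and s = 5 as in the paper.
rng = np.random.default_rng(0)
ts_example = rng.standard_normal((100, 6))
dfc_example = dynamic_fc(ts_example, L=39, s=5)   # shape (13, 6, 6)
```

With the AAL atlas the second dimension of `ts` would be 116, giving one 116 × 116 matrix per window.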
In the current study, the original FC feature set includes a large number of features, so a thresholding operation was performed to simplify the feature selection stage. We assign the same threshold $\tau$ to every subject and obtain a new matrix $M_a^{(l)} = [m_{ij}^{(l)}]_{n \times n}$ by
$$m_{ij}^{(l)} = \begin{cases} |c_{ij}^{(l)}|, & |c_{ij}^{(l)}| > \tau \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
where $m_{ij}^{(l)}$ denotes the connection strength between the $i$-th and $j$-th ROIs of the $a$-th subject in the $l$-th sub-segment. Next, we binarize $M_a^{(l)}$ to obtain $A_a^{(l)}$, which is used to create the similarity adjacency matrix. We then discard the lower triangle and the main diagonal of $M$ and collapse the remaining 6670 upper-triangle values into a vector. The advantage of dynamic FC is that it captures properties localized in different time segments. The same feature is accumulated over the segments as follows:
$$m'_{ij} = \sum_{l=1}^{K} m_{ij}^{(l)}, \quad m'_{ij} \in M_a \tag{2}$$
Figure 1. The pipeline of GCN model for AD classification
Dynamic FC Features Selection
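The thresholding, binarization and accumulation steps of Eqs. (1) and (2), which produce the input to feature selection, can be sketched in NumPy as below. The function names are hypothetical, and the threshold value 0.3 is only a placeholder because the paper does not report $\tau$. Summing over windows before or after taking the upper triangle gives the same vector, so the sketch sums first.

```python
import numpy as np

def threshold_fc(C, tau):
    """Eq. (1): keep |c_ij| where it exceeds tau, set everything else to 0.

    C : array of shape (K, n, n), one connectivity matrix per window.
    """
    abs_c = np.abs(C)
    return np.where(abs_c > tau, abs_c, 0.0)

def binarize(M):
    """Binary adjacency matrices: 1 where a connection survived thresholding."""
    return (M > 0).astype(int)

def accumulate_upper(M):
    """Eq. (2) plus vectorization.

    Sums each connection over the K windows and returns the strict
    upper triangle as a vector of length n * (n - 1) / 2.
    """
    summed = M.sum(axis=0)
    rows, cols = np.triu_indices(summed.shape[0], k=1)  # k=1 skips the diagonal
    return summed[rows, cols]

# Hypothetical input: 3 windows over 116 ROIs, tau = 0.3 (placeholder value).
rng = np.random.default_rng(0)
C_example = rng.uniform(-1.0, 1.0, size=(3, 116, 116))
M_example = threshold_fc(C_example, tau=0.3)
A_example = binarize(M_example)
features = accumulate_upper(M_example)   # 116 * 115 / 2 = 6670 values
```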
After the previous operation, a two-step approach, which substantially improved speed and efficiency, was employed for feature selection. First, we apply a random forest and keep the 10 most important features. Second, we use recursive feature elimination with an SVM (RFE-SVM), a backward elimination method that starts from the full feature set and removes the least relevant features one by one. In the end, 9 features were retained for training the classifier. The feature subsets of ALFF, ReHo and dFC were concatenated into a feature matrix $X \in \mathbb{R}^{n \times m}$, in which each row holds the $m$ features of one subject.
Graph Convolution Network
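The two-step selection that produces the dFC part of $X$ might be sketched with scikit-learn as follows. This is an illustration under assumptions, not the authors' implementation: the paper does not give the forest size, the SVM kernel or any random seed, so `n_estimators=200`, a linear kernel and `seed=0` are placeholders, and the function name `select_features` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

def select_features(X, y, n_rf=10, n_final=9, seed=0):
    """Two-step selection: random-forest ranking, then RFE with a linear SVM.

    Returns the column indices of X that survive both steps.
    """
    # Step 1: rank features by random-forest importance and keep the top n_rf.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    top = np.argsort(forest.feature_importances_)[::-1][:n_rf]

    # Step 2: backward elimination, dropping one feature per iteration.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_final, step=1)
    rfe.fit(X[:, top], y)
    return top[rfe.support_]

# Synthetic data, for illustration only: 80 subjects, 30 candidate features.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((80, 30))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
kept = select_features(X_demo, y_demo)
```

A linear kernel is used because RFE needs per-feature weights from the estimator to decide which feature to drop.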
GCN is used to solve problems in non-Euclidean domains, such as event graphs, knowledge graphs and brain networks. We define an undirected graph $G(V, E, S)$ with $N$ nodes for subject-level classification: each subject is represented by a node $v_i \in V$, and $e_{ij}$ denotes the edge weight, i.e. the similarity between the individual matrices $M_i$ and $M_j$. The similarity adjacency matrix $S$, with elements $s_{ij}$ for subjects $i$ and $j$, is constructed as
$$s_{ij} = 1 - \frac{\sum |A_i - A_j|}{\sum |A_i|} \tag{3}$$
$$S(s_{ij}) = \begin{cases} 1, & s_{ij} < t \\ 0, & s_{ij} \ge t \end{cases} \tag{4}$$
where $\sum$ denotes the sum over all elements of a matrix; the adjacency matrices $A_i$ between ROIs are thus used to compute $S$. GCN takes the neighbors of each subject into account, so that information from related subjects contributes to the prediction. In this research, AD and NC are classified as follows:
$$H^{(l+1)} = f(H^{(l)}, S) \tag{5}$$
$$\hat{S} = S + I \tag{6}$$
$$f(H^{(l)}, S) = \sigma\left(\hat{D}^{-\frac{1}{2}} \hat{S} \hat{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right) \tag{7}$$
where $H^{(l)}$ is the feature matrix of the $l$-th layer with $H^{(0)} = X$, $S$ is the similarity adjacency matrix, $I$ is the identity matrix, $\hat{D}$ is the diagonal degree matrix of $\hat{S}$, $W^{(l)}$ is the weight matrix of the $l$-th layer and $\sigma$ is an activation function. The pipeline of the GCN model for AD classification is illustrated in Figure 1.
RESULTS AND DISCUSSION
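To make the model evaluated here concrete, Eqs. (3)-(7) can be sketched as a NumPy forward pass. This is a minimal sketch, not the trained model: the function names and random weights are hypothetical, dropout and training are omitted, and the diagonal of $S$ is zeroed because Eq. (6) adds the self-loops. The comparison direction in `similarity_adjacency` follows Eq. (4) literally as printed, although linking subjects whose similarity is above the threshold would be the more usual reading.

```python
import numpy as np

def subject_similarity(A):
    """Eq. (3): s_ij = 1 - sum|A_i - A_j| / sum|A_i|.

    A : array of shape (N, n, n), one binary ROI adjacency matrix per subject.
    Assumes every A_i has at least one non-zero entry.
    """
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    s = np.empty((N, N))
    for i in range(N):
        s[i] = 1.0 - np.abs(A - A[i]).sum(axis=(1, 2)) / np.abs(A[i]).sum()
    return s

def similarity_adjacency(s, t):
    """Eq. (4) as printed: edge where s_ij < t, no edge otherwise."""
    S = (s < t).astype(float)
    np.fill_diagonal(S, 0.0)   # self-loops are added later by Eq. (6)
    return S

def elu(x):
    """Exponential linear unit with alpha = 1."""
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

def softmax(x):
    """Row-wise softmax."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_layer(H, S, W, activation=elu):
    """Eqs. (5)-(7): sigma(D^-1/2 (S + I) D^-1/2 H W)."""
    S_hat = S + np.eye(S.shape[0])                 # Eq. (6)
    d_inv_sqrt = np.diag(S_hat.sum(axis=1) ** -0.5)
    return activation(d_inv_sqrt @ S_hat @ d_inv_sqrt @ H @ W)

def forward(X, S, W0, W1):
    """Two graph convolution layers, each followed by ELU, then softmax."""
    H1 = gcn_layer(X, S, W0)
    return softmax(gcn_layer(H1, S, W1))

# Hypothetical example: 3 subjects, 18 input features, 16 hidden units, 2 classes.
rng = np.random.default_rng(0)
S_demo = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
X_demo = rng.standard_normal((3, 18))
probs = forward(X_demo, S_demo,
                0.1 * rng.standard_normal((18, 16)),
                0.1 * rng.standard_normal((16, 2)))
```

Adding the identity before normalization means a subject with no neighbors still passes its own features through each layer.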
The model used in this paper consists of two graph convolution layers, each followed by an exponential linear unit (ELU), and a softmax output layer. The hyper-parameters for training are as follows: learning rate 0.06, dropout rate 0.5, weight decay 0.0005, 150 epochs, 18 input neurons and 16 hidden neurons. We adopt the Adam algorithm as the optimizer and cross-entropy as the loss function. The sliding window length $L$ is 39 time points with a step size $s$ of 5. Classification accuracy (ACC), precision (PRE), recall (REC), F1 score (F1) and the area under the receiver operating characteristic curve (AUC) were used to evaluate the classification performance.
Performance of the Different Feature Sets
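The threshold-based criteria used in the comparisons below (ACC, PRE, REC, F1) follow directly from confusion-matrix counts, as the plain-Python sketch here shows. AUC is left out because it needs continuous scores rather than hard labels. The function name is hypothetical, and the example labels are invented for illustration: the counts were chosen so that the metrics land close to the ReHo row of Table 2, but the paper does not report a confusion matrix.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = (tp + tn) / len(pairs)
    pre = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * pre * rec / (pre + rec) if (pre + rec) else 0.0
    return {"ACC": acc, "PRE": pre, "REC": rec, "F1": f1}

# Invented example: 9 AD (label 1) and 14 NC (label 0) test scans,
# with 6 true positives, 3 false negatives, 2 false positives, 12 true negatives.
y_true = [1] * 9 + [0] * 14
y_pred = [1] * 6 + [0] * 3 + [1] * 2 + [0] * 12
metrics = binary_metrics(y_true, y_pred)
```

For these counts precision is 6/8, recall is 6/9 and accuracy is 18/23.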
In this part, we test three different types of features, as well as different numbers of features of the same type, in order to find the optimal feature combination as input. As shown in Table 2, different feature combinations give different classification results. The 9 selected FC features achieved an improvement of 7% over the full set of 6670 FC features, which indicates that these 9 features contain discriminative information that plays an important role in recognizing AD. In addition, combined features generally perform better than a single feature type. Specifically, concatenating all three types of features yielded an accuracy of 91.3%, outperforming the other feature combinations. Because ALFF, ReHo and FC are heterogeneous, combining them provides complementary information. Finally, the selected ALFF and ReHo features involve the fusiform gyrus, right insula, right dorsolateral superior frontal gyrus, right inferior temporal gyrus, left medial orbitofrontal gyrus and cerebellum, which is consistent with the view that the cerebellum might be related to cognition [7]. These results suggest that the cerebellum can provide crucial information for the classification of AD.
Table 2.
Method (type)    Number    ACC (%)    PRE (%)    REC (%)    F1 (%)    AUC (%)
ALFF             4         82.6       77.8       77.8       77.8      84.9
ReHo             5         78.2       75.0       66.7       70.6      85.7
FC               9         87.0       80.0       …          …         …
ReHo+ALFF+FC     …         …          …          …          …         …
Comparison with State-of-the-Art Methods
To test the robustness of our method, we changed the ratio of the training, validation and test sets from 6:2:2 to 3:1:6, which differs from the conventional split. We believe that the method may therefore be suitable for settings with largely unlabeled data, which can considerably reduce the cost of labeling data.
In current studies, researchers have taken many different approaches to classifying AD and NC, seeking a trade-off between training speed and model performance. Wang et al. [8] introduced spatial-temporal information into a deep neural network that employs convolutional, recurrent and long short-term memory (LSTM) components to model the dependence between time and space. Their method achieved an accuracy of 90.28% and an AUC of 89.78%. Song et al. [9] achieved an accuracy of 88.7% using a graph neural network to classify 40 late mild cognitive impairment (MCI) subjects and 67 NC; their method constructs functional connectivity networks from static, dynamic and high-level FC, and the classifier performed well. SVM is also widely used in this field. Bi et al. [10] proposed a clustering evolutionary random forest for feature extraction and selection, reaching an accuracy of 86.2%.
The method proposed in this study has two advantages. On the one hand, compared with traditional deep learning models, our model has a simple architecture and a short training time. On the other hand, compared with traditional machine learning methods, our method considers the similarity between subjects and classifies the subjects jointly.
CONCLUSION
In this paper, we propose a novel method based on GCN and dynamic functional connectivity for AD classification. To construct the graph, three different types of features were fused to represent each vertex, and edges were used to capture the similarity of brain structure between individuals. The results demonstrate that considering the structural similarity between individual brains helps to improve classification accuracy. The method improves both training speed and model performance.
ACKNOWLEDGMENTS
The authors sincerely thank Xuanwu Hospital for the data provided to us, and all patients for their active cooperation. This work was supported in part by the National Key Research & Development Program of China (No.2017YFB1300302) and National Natural Science Foundation of China (No.81630051 and 61603269).
REFERENCES
[1] S. Rathore, M. Habes, M.A. Iftikhar, A. Shacklett, C. Davatzikos. 2017. A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer's disease and its prodromal stages. Neuroimage. 155, (2017), 530-548. DOI:https://doi.org/10.1016/j.neuroimage.2017.03.057.
[2] R. Brookmeyer, E. Johnson, K. Ziegler-Graham, H.M. Arrighi. 2007. Forecasting the global burden of Alzheimer's disease. Alzheimers Dement. 3, 3 (2007), 186-191. DOI:https://doi.org/10.1016/j.jalz.2007.04.381.
[3] R. Liegeois, J. Li, R. Kong, C. Orban, D. Van De Ville, T. Ge, M.R. Sabuncu, B.T.T. Yeo. 2019. Resting brain dynamics at different timescales capture distinct aspects of human behavior. Nat Commun. 10, 1 (2019), 2317. DOI:https://doi.org/10.1038/s41467-019-10317-7.
[4] X. Zhao, F. Zhou, L. Ou-Yang, T. Wang, B. Lei. 2019. Graph convolutional network analysis for mild cognitive impairment prediction. In 2019 IEEE 16th International Symposium on Biomedical Imaging. 1598-1601.
[5] X.G. Song, A. Elazab, Y.X. Zhang. 2020. Classification of Mild Cognitive Impairment Based on a Combined High-Order Network and Graph Convolutional Network. IEEE Access. 8, (2020), 42816-42827. DOI:https://doi.org/10.1109/access.2020.2974997.
[6] C.-G. Yan, X.-D. Wang, X.-N. Zuo, Y.-F. Zang. 2016. DPABI: Data Processing & Analysis for (Resting-State) Brain Imaging. Neuroinformatics. 14, 3 (2016), 339-351. DOI:https://doi.org/10.1007/s12021-016-9299-4.
[7] C.C. Guo, R. Tan, J.R. Hodges, X. Hu, S. Sami, M. Hornberger. 2016. Network-selective vulnerability of the human cerebellum to Alzheimer's disease and frontotemporal dementia. Brain. 139, Pt 5 (2016), 1527-1538. DOI:https://doi.org/10.1093/brain/aww003.
[8] M. Wang, C. Lian, D. Yao, D. Zhang, M. Liu, D. Shen. 2019. Spatial-Temporal Dependency Modeling and Network Hub Detection for Functional MRI Analysis via Convolutional-Recurrent Network. IEEE Trans Biomed Eng. (2019). DOI:https://doi.org/10.1109/TBME.2019.2957921.
[9] T.A. Song, S.R. Chowdhury, F. Yang, H. Jacobs, G. El Fakhri, Q.Z. Li, K. Johnson, J. Dutta. 2019. Graph convolutional neural networks for Alzheimer's disease classification. In 2019 IEEE 16th International Symposium on Biomedical Imaging. IEEE, New York, 414-417.
[10]