Kim Han Thung
University of North Carolina at Chapel Hill
Publications
Featured research published by Kim Han Thung.
NeuroImage | 2014
Kim Han Thung; Chong Yaw Wee; Pew Thian Yap; Dinggang Shen
In this work, we are interested in predicting the diagnostic statuses of potentially neurodegenerated patients using feature values derived from multi-modality neuroimaging data and biological data, which might be incomplete. Collecting the feature values into a matrix, with each row containing the feature vector of a sample, we propose a framework to predict the corresponding associated multiple target outputs (e.g., diagnosis label and clinical scores) from this feature matrix by performing matrix shrinkage followed by matrix completion. Specifically, we first combine the feature and target output matrices into a large matrix and then partition this large incomplete matrix into smaller submatrices, each consisting of samples with complete feature values (corresponding to a certain combination of modalities) and target outputs. Treating each target output as the outcome of a prediction task, we apply a 2-step multi-task learning algorithm to select the most discriminative features and samples in each submatrix. Features and samples that are not selected in any of the submatrices are discarded, resulting in a shrunk version of the original large matrix. The missing feature values and unknown target outputs of the shrunk matrix are then completed simultaneously. Experimental results using the ADNI dataset indicate that our proposed framework achieves higher classification accuracy at greater speed than conventional imputation-based classification methods and also yields competitive performance when compared with state-of-the-art methods.
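As a rough, hedged illustration of the completion step described above (not the authors' implementation), the sketch below completes an augmented [features | targets] matrix with a generic soft-impute-style nuclear-norm solver; the `lam` value, matrix sizes, and toy data are all placeholders.

```python
import numpy as np

def soft_impute(M, mask, lam=1.0, n_iters=100, tol=1e-4):
    """Low-rank matrix completion by iterative soft-thresholded SVD.

    M    : matrix with observed entries (missing entries can hold any value)
    mask : boolean array, True where an entry of M is observed
    lam  : singular-value shrinkage (nuclear-norm penalty) strength
    """
    Z = np.where(mask, M, 0.0)                        # start with zeros at missing entries
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt   # soft-threshold the singular values
        Z_new = np.where(mask, M, Z_new)              # keep observed entries fixed
        if np.linalg.norm(Z_new - Z) < tol * (np.linalg.norm(Z) + 1e-12):
            return Z_new
        Z = Z_new
    return Z

# Augment the (shrunk) feature matrix X with the target matrix Y and complete both at once;
# unknown targets are simply treated as unobserved entries.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
X[rng.random((50, 20)) < 0.3] = np.nan                # toy incomplete features
Y = np.full((50, 2), np.nan)
Y[:40] = rng.standard_normal((40, 2))                 # targets known for the first 40 samples only
M = np.hstack([X, Y])
mask = ~np.isnan(M)
completed = soft_impute(np.nan_to_num(M), mask, lam=0.5)
Y_pred = completed[:, X.shape[1]:]                    # imputed/predicted target columns
```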
IEEE Journal of Biomedical and Health Informatics | 2015
Feng Li; Loc Tran; Kim Han Thung; Shuiwang Ji; Dinggang Shen; Jiang Li
Accurate classification of Alzheimer's disease (AD) and its prodromal stage, mild cognitive impairment (MCI), plays a critical role in possibly preventing progression of memory impairment and improving quality of life for AD patients. Among many research tasks, it is of particular interest to identify noninvasive imaging biomarkers for AD diagnosis. In this paper, we present a robust deep learning system to identify different progression stages of AD patients based on MRI and PET scans. We utilized the dropout technique to improve classical deep learning by preventing its weight co-adaptation, which is a typical cause of overfitting in deep learning. In addition, we incorporated stability selection, an adaptive learning factor, and a multi-task learning strategy into the deep learning framework. We applied the proposed method to the ADNI dataset, and conducted experiments for AD and MCI conversion diagnosis. Experimental results showed that the dropout technique is very effective in AD diagnosis, improving the classification accuracies by 5.9% on average as compared to the classical deep learning methods.
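The dropout idea mentioned here can be sketched in a few lines; the following toy forward pass (plain NumPy, placeholder layer sizes, not the authors' network) shows how inverted dropout randomly silences hidden units during training to prevent co-adaptation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate, training=True):
    """Inverted dropout: randomly zero units and rescale so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def forward(x, weights, rate=0.5, training=True):
    """Forward pass of a small fully connected net with dropout after each hidden layer."""
    h = x
    for W, b in weights[:-1]:
        h = np.maximum(0.0, h @ W + b)                # ReLU hidden layer
        h = dropout(h, rate, training)                # discourages co-adaptation of hidden units
    W_out, b_out = weights[-1]
    return h @ W_out + b_out                          # class logits

# Placeholder dimensions: concatenated MRI+PET features in, 3 diagnostic classes out.
dims = [200, 128, 64, 3]
weights = [(0.01 * rng.standard_normal((m, n)), np.zeros(n)) for m, n in zip(dims[:-1], dims[1:])]
x = rng.standard_normal((8, dims[0]))                 # a toy mini-batch of 8 subjects
print(forward(x, weights, rate=0.5, training=True).shape)   # (8, 3)
```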
Human Brain Mapping | 2015
Yan Jin; Chong Yaw Wee; Feng Shi; Kim Han Thung; Dong Ni; Pew Thian Yap; Dinggang Shen
Autism spectrum disorder (ASD) refers to a wide range of developmental disabilities that cause life-long cognitive impairment and social, communication, and behavioral challenges. Early diagnosis and medical intervention are important for improving the quality of life of autistic patients. However, in current practice, diagnosis often has to be delayed until the behavioral symptoms become evident during childhood. In this study, we demonstrate the feasibility of using machine learning techniques to identify high-risk ASD infants as early as six months after birth. This is based on the observation that ASD-induced abnormalities in white matter (WM) tracts and whole-brain connectivity have already started to appear within 24 months after birth. In particular, we propose a novel multi-kernel support vector machine classification framework that uses connectivity features gathered from WM connectivity networks, which are generated via multiscale regions of interest (ROIs) and multiple diffusion statistics such as fractional anisotropy, mean diffusivity, and average fiber length. Our proposed framework achieves an accuracy of 76% and an area under the receiver operating characteristic curve (AUC) of 0.80, in comparison to the accuracy of 70% and the AUC of 0.70 provided by the best single-parameter, single-scale network. The improvement in accuracy is mainly due to the complementary information provided by multi-parameter, multiscale networks. In addition, our framework also provides potential imaging connectomic markers and an objective means for early ASD diagnosis. Hum Brain Mapp 36:4880–4896, 2015.
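The multi-kernel combination can be illustrated with a small, hedged sketch: one base kernel per connectivity feature set, a weighted sum, and a precomputed-kernel SVM. The feature dimensions, kernel weights, and labels below are toy stand-ins, not values from the study.

```python
import numpy as np
from sklearn.svm import SVC

def linear_kernel(X):
    return X @ X.T

def combine_kernels(kernels, betas):
    """Weighted sum of base kernels; betas are assumed non-negative and to sum to 1."""
    return sum(b * K for b, K in zip(betas, kernels))

# Toy stand-ins for connectivity features from different diffusion statistics / ROI scales.
rng = np.random.default_rng(0)
n = 40
feature_views = [rng.standard_normal((n, 90)) for _ in range(3)]   # e.g., FA-, MD-, fiber-length-based networks
y = rng.integers(0, 2, size=n)                                     # toy high-risk / low-risk labels

kernels = [linear_kernel(X) for X in feature_views]
betas = [1 / 3, 1 / 3, 1 / 3]                 # kernel weights; in practice tuned, e.g., by grid search
K = combine_kernels(kernels, betas)

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(K, y)
print(clf.predict(K[:5]))                     # rows of K between test and training subjects
```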
Neurobiology of Aging | 2016
Lei Huang; Yan Jin; Yaozong Gao; Kim Han Thung; Dinggang Shen
Alzheimer's disease (AD) is an irreversible neurodegenerative disease that affects a large population worldwide. Cognitive scores at multiple time points can be reliably used to evaluate the progression of the disease clinically. In recent studies, machine learning techniques have shown promising results for the prediction of AD clinical scores. However, the current models have multiple limitations, such as the linearity assumption and the exclusion of subjects with missing data. Here, we present a nonlinear supervised sparse regression-based random forest (RF) framework to predict a variety of longitudinal AD clinical scores. Furthermore, we propose a soft-split technique that assigns probabilistic paths to a test sample in the RF for more accurate predictions. To benefit from the longitudinal scores in the study, unlike previous studies that often removed subjects with missing scores, we first estimate those missing scores with our proposed soft-split sparse regression-based RF and then utilize the estimated longitudinal scores at all previous time points to predict the scores at the next time point. The experimental results demonstrate that our proposed method is superior to the traditional RF and outperforms other state-of-the-art regression models. Our method can also be extended into a general regression framework to predict other disease scores.
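A minimal sketch of the soft-split idea, assuming a sigmoid gate on the distance to each split threshold (one plausible choice, not necessarily the authors' exact formulation), is shown below using the internals of fitted scikit-learn trees; the data, forest size, and `sigma` are toy placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def soft_predict(tree, x, sigma=1.0):
    """Soft-split traversal of a fitted scikit-learn tree: at each split the sample follows
    both branches, weighted by a sigmoid of its distance to the split threshold."""
    t = tree.tree_
    def recurse(node, weight):
        if t.children_left[node] == -1:                       # leaf node
            return weight * t.value[node][0][0]
        p_left = 1.0 / (1.0 + np.exp((x[t.feature[node]] - t.threshold[node]) / sigma))
        return (recurse(t.children_left[node], weight * p_left)
                + recurse(t.children_right[node], weight * (1.0 - p_left)))
    return recurse(0, 1.0)

# Toy longitudinal-score regression: average soft predictions over a small bootstrap forest.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = 2.0 * X[:, 0] + rng.normal(0.0, 0.1, 100)
forest = []
for i in range(10):
    idx = rng.integers(0, len(X), len(X))                     # bootstrap sample
    forest.append(DecisionTreeRegressor(max_depth=4, random_state=i).fit(X[idx], y[idx]))

x_test = rng.standard_normal(10)
print(np.mean([soft_predict(t, x_test, sigma=0.5) for t in forest]))
```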
PLOS ONE | 2014
Guan Yu; Yufeng Liu; Kim Han Thung; Dinggang Shen
Accurately identifying mild cognitive impairment (MCI) individuals who will progress to Alzheimer's disease (AD) is very important for making early interventions. Many classification methods focus on integrating multiple imaging modalities such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET). However, the main challenge for MCI classification using multiple imaging modalities is the large amount of missing data: for example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study, almost half of the subjects do not have PET images. In this paper, we propose a new and flexible binary classification method, namely Multi-task Linear Programming Discriminant (MLPD) analysis, for incomplete multi-source feature learning. Specifically, we decompose the classification problem into different classification tasks, i.e., one for each combination of available data sources. To solve all the classification tasks jointly, our proposed MLPD method links them together by constraining them to achieve a similar estimated mean difference between the two classes (under classification) for the shared features. Compared with the state-of-the-art incomplete Multi-Source Feature (iMSF) learning method, which constrains the different classification tasks to choose a common feature subset for the shared features, MLPD can flexibly and adaptively choose different feature subsets for different classification tasks. Furthermore, our proposed MLPD method can be efficiently implemented by linear programming. To validate our MLPD method, we perform experiments on the ADNI baseline dataset with incomplete MRI and PET images from 167 progressive MCI (pMCI) subjects and 226 stable MCI (sMCI) subjects. We further compare our method with the iMSF method (using incomplete MRI and PET images) and with single-task classification methods (using only MRI, or only subjects with both MRI and PET images). Experimental results show very promising performance of our proposed MLPD method.
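MLPD couples one linear program per modality-combination task through constraints on the shared-feature mean differences; the hedged sketch below shows only the single-task linear-programming-discriminant building block, with made-up data and a made-up tuning parameter `lam`.

```python
import numpy as np
from scipy.optimize import linprog

def lpd(Sigma, delta, lam):
    """Single-task linear programming discriminant (one building block of MLPD):
    minimize ||w||_1  subject to  ||Sigma @ w - delta||_inf <= lam,
    with w split as w = w_pos - w_neg, w_pos, w_neg >= 0."""
    p = len(delta)
    c = np.ones(2 * p)                                 # sum(w_pos + w_neg) = ||w||_1
    A = np.hstack([Sigma, -Sigma])                     # Sigma @ w written in terms of (w_pos, w_neg)
    A_ub = np.vstack([A, -A])
    b_ub = np.concatenate([delta + lam, lam - delta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * p), method="highs")
    return res.x[:p] - res.x[p:]

# Toy two-class data: w gives the discriminant direction; classify by the sign of w @ (x - midpoint).
rng = np.random.default_rng(0)
X0 = rng.standard_normal((60, 15))
X1 = rng.standard_normal((60, 15)) + 0.8
Sigma = np.cov(np.vstack([X0 - X0.mean(0), X1 - X1.mean(0)]).T)
delta = X1.mean(0) - X0.mean(0)
w = lpd(Sigma, delta, lam=0.2)
mid = (X0.mean(0) + X1.mean(0)) / 2
print(np.mean((X1 - mid) @ w > 0))                     # fraction of class-1 samples on the correct side
```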
Medical Image Analysis | 2015
Gerard Sanroma; Guorong Wu; Yaozong Gao; Kim Han Thung; Yanrong Guo; Dinggang Shen
Recently, multi-atlas patch-based label fusion has received increasing interest in the medical image segmentation field. After warping the anatomical labels from the atlas images to the target image by registration, label fusion is the key step in determining the latent label for each target image point. Two popular types of patch-based label fusion approaches are (1) reconstruction-based approaches, which compute the target labels as a weighted average of atlas labels, where the weights are derived by reconstructing the target image patch using the atlas image patches; and (2) classification-based approaches, which determine the target label as a mapping of the target image patch, where the mapping function is often learned using the atlas image patches and their corresponding labels. Both approaches have their advantages and limitations. In this paper, we propose a novel patch-based label fusion method that combines the above two types of approaches via matrix completion (and hence, we call it transversal). As we will show, our method overcomes the individual limitations of both reconstruction-based and classification-based approaches. Since labeling confidence may vary across the target image points, we further propose a sequential labeling framework that first labels the highly confident points and then gradually labels more challenging points in an iterative manner, guided by the label information determined in previous iterations. We demonstrate the performance of our novel label fusion method in segmenting the hippocampus in the ADNI dataset, subcortical and limbic structures in the LONI dataset, and mid-brain structures in the SATA dataset. We achieve more accurate segmentation results than both reconstruction-based and classification-based approaches. Our label fusion method is also ranked 1st in the online SATA Multi-Atlas Segmentation Challenge.
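The paper combines reconstruction-based and classification-based fusion via matrix completion; the sketch below illustrates only the standard reconstruction-based half, using non-negative least squares on toy patches, to make the patch-weighting idea concrete (it is not the transversal method itself).

```python
import numpy as np
from scipy.optimize import nnls

def reconstruction_label_fusion(target_patch, atlas_patches, atlas_labels):
    """Reconstruction-based fusion: reconstruct the target patch as a non-negative
    combination of atlas patches, then reuse the weights to average the atlas labels."""
    A = np.stack([p.ravel() for p in atlas_patches], axis=1)   # columns = atlas patches
    w, _ = nnls(A, target_patch.ravel())                       # non-negative reconstruction weights
    w = w / (w.sum() + 1e-12)
    fused = np.asarray(atlas_labels, dtype=float) @ w          # weighted average of center-voxel labels
    return (1 if fused >= 0.5 else 0), fused                   # hard label and its confidence

# Toy example: 5x5 intensity patches from 10 atlases, each contributing a binary center label.
rng = np.random.default_rng(0)
atlas_patches = [rng.standard_normal((5, 5)) for _ in range(10)]
atlas_labels = rng.integers(0, 2, size=10)
target_patch = 0.6 * atlas_patches[0] + 0.4 * atlas_patches[1]
label, confidence = reconstruction_label_fusion(target_patch, atlas_patches, atlas_labels)
print(label, round(confidence, 3))
```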
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | 2014
Feng Li; Loc Tran; Kim Han Thung; Shuiwang Ji; Dinggang Shen; Jiang Li
Accurate classification of Alzheimer’s Disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), plays a critical role in preventing progression of memory impairment and improving quality of life for AD patients. Among many research tasks, it is of particular interest to identify noninvasive imaging biomarkers for AD diagnosis. In this paper, we present a robust deep learning system to identify different progression stages of AD patients based on MRI and PET scans. We utilized the dropout technique to improve classical deep learning by preventing its weight co-adaptation, which is a typical cause of over-fitting in deep learning. In addition, we incorporated stability selection, an adaptive learning factor and a multi-task learning strategy into the deep learning framework. We applied the proposed method to the ADNI data set and conducted experiments for AD and MCI conversion diagnosis. Experimental results showed that the dropout technique is very effective in AD diagnosis, improving the classification accuracies by 6.2% on average as compared to classical deep learning methods.
Medical Image Computing and Computer-Assisted Intervention | 2016
Kim Han Thung; Ehsan Adeli; Pew Thian Yap; Dinggang Shen
Effective utilization of heterogeneous multi-modal data for Alzheimer's Disease (AD) diagnosis and prognosis has always been hampered by incomplete data. One way to deal with this is low-rank matrix completion (LRMC), which simultaneously imputes missing feature values and target values of interest. Although LRMC yields reasonable results, it implicitly weights features from all the modalities equally, ignoring the differences in discriminative power of features from different modalities. In this paper, we propose stability-weighted LRMC (swLRMC), an improvement of LRMC that weights features and modalities according to their importance and reliability. We introduce a method, called stability weighting, that uses subsampling techniques and outcomes from a range of hyper-parameters of sparse feature learning to obtain a stable set of weights. Incorporating these weights into LRMC, swLRMC can better account for differences among features and modalities, thereby improving diagnosis. Experimental results confirm that the proposed method outperforms conventional LRMC, feature-selection-based LRMC, and other state-of-the-art methods.
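The swLRMC objective itself is not reproduced here; the hedged sketch below only illustrates how stability weighting could be computed in principle, using Lasso selection frequencies over random subsamples and a range of hyper-parameters (the alphas, subsample fraction, and threshold are assumptions, not the paper's settings).

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_weights(X, y, alphas=(0.01, 0.05, 0.1), n_subsamples=50, frac=0.5, seed=0):
    """Per-feature selection frequency across (subsample, alpha) runs of a sparse model,
    used here purely to illustrate the idea of stability weighting."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts, total = np.zeros(p), 0
    for alpha in alphas:
        for _ in range(n_subsamples):
            idx = rng.choice(n, size=int(frac * n), replace=False)
            coef = Lasso(alpha=alpha, max_iter=5000).fit(X[idx], y[idx]).coef_
            counts += np.abs(coef) > 1e-8
            total += 1
    return counts / total                       # selection frequencies in [0, 1]

# Toy data in which only the first three features matter; their weights should come out highest.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 20))
y = X[:, 0] + 0.8 * X[:, 1] - 0.5 * X[:, 2] + 0.1 * rng.standard_normal(120)
print(np.round(stability_weights(X, y)[:5], 2))
```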
International Conference on Machine Learning | 2015
Xiaofeng Zhu; Heung Il Suk; Yonghua Zhu; Kim Han Thung; Guorong Wu; Dinggang Shen
In this paper, we propose a multi-view learning method using Magnetic Resonance Imaging (MRI) data for Alzheimer's Disease (AD) diagnosis. Specifically, we extract both Region-Of-Interest (ROI) features and Histogram of Oriented Gradients (HOG) features from each MRI image, and then propose mapping the HOG features onto the space of the ROI features to make them comparable and to impose high intra-class similarity with low inter-class similarity. Finally, both the mapped HOG features and the original ROI features are fed into a support vector machine for AD diagnosis. The purpose of mapping the HOG features onto the space of the ROI features is to provide complementary information so that features from different views are not only comparable (i.e., homogeneous) but also interpretable. For example, ROI features are robust to noise but fail to reflect small or subtle changes, while HOG features are diverse but less robust to noise. The proposed multi-view learning method is designed to learn the transformation between the two spaces and to separate the classes under the supervision of class labels. The experimental results on MRI images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset show that the proposed multi-view method helps enhance disease status identification performance, outperforming both baseline methods and state-of-the-art methods.
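One plausible realization of "mapping HOG features onto the space of ROI features" is a learned linear regression from HOG space to ROI space, as sketched below with toy data; the paper additionally imposes intra-class and inter-class similarity constraints, which are omitted from this hedged illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

# Toy stand-ins: per-subject ROI features and higher-dimensional HOG features.
rng = np.random.default_rng(0)
n, d_roi, d_hog = 100, 90, 500
X_roi = rng.standard_normal((n, d_roi))
X_hog = rng.standard_normal((n, d_hog))
y = rng.integers(0, 2, size=n)                       # toy diagnosis labels

# Learn a linear mapping from HOG space to ROI space. (The paper additionally imposes
# intra-/inter-class similarity constraints on this mapping, which are omitted here.)
mapper = Ridge(alpha=1.0).fit(X_hog, X_roi)          # multi-output ridge regression
X_hog_mapped = mapper.predict(X_hog)                 # HOG features expressed in the ROI space

# Concatenate the mapped HOG features with the original ROI features and classify.
X_combined = np.hstack([X_roi, X_hog_mapped])
clf = SVC(kernel="linear", C=1.0).fit(X_combined, y)
print(clf.score(X_combined, y))                      # training accuracy on the toy data
```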
International Conference on Machine Learning | 2013
Kim Han Thung; Chong Yaw Wee; Pew Thian Yap; Dinggang Shen
Incomplete datasets due to missing values are ubiquitous in multi-modal neuroimaging data. Denoting an incomplete dataset as a feature matrix, where each row contains the feature values of the multi-modality data of a sample, we propose a framework to predict the corresponding interrelated multiple target outputs (e.g., diagnosis label and clinical scores) from this feature matrix. This is achieved by applying a matrix completion algorithm to a shrunk version of the feature matrix that is augmented with the corresponding target output matrix, to simultaneously predict the missing feature values and the unknown target outputs. We shrink the matrix by first partitioning the large incomplete feature matrix into smaller submatrices that contain complete feature data. Treating each target output prediction from a submatrix as a task, we perform multi-task learning-based feature and sample selection to select the most discriminative features and samples from each submatrix. Features and samples that are not selected from any of the submatrices are removed, resulting in a shrunk feature matrix, which is still incomplete. This shrunk matrix, together with its corresponding target matrix (of possibly unknown values), is finally completed using a low-rank matrix completion algorithm. Experimental results using the ADNI dataset indicate that our proposed framework yields better identification accuracy at higher speed compared with conventional imputation-based identification methods.
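The completion step here is the same kind of low-rank solver sketched after the 2014 NeuroImage abstract above; the hedged sketch below instead illustrates the earlier steps, i.e., partitioning by modality availability and multi-task (group-sparse) feature selection, with MultiTaskLasso standing in for the paper's selection algorithm and all sizes and penalties made up.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

def availability_patterns(X):
    """Group sample indices by which feature columns are observed (not NaN)."""
    patterns = {}
    for i, missing_row in enumerate(np.isnan(X)):
        patterns.setdefault(tuple(~missing_row), []).append(i)
    return patterns

# Toy data: two modality blocks (columns 0-5 = MRI-like, 6-11 = PET-like);
# roughly 40% of subjects are missing the second block.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 12))
X[rng.random(60) < 0.4, 6:] = np.nan
Y = rng.standard_normal((60, 2))                     # e.g., a diagnosis score and a clinical score

selected = set()
for key, rows in availability_patterns(X).items():
    cols = np.flatnonzero(key)
    if len(rows) < 5 or len(cols) == 0:              # skip tiny or empty submatrices
        continue
    sub_X, sub_Y = X[np.ix_(rows, cols)], Y[rows]
    coef = MultiTaskLasso(alpha=0.1, max_iter=5000).fit(sub_X, sub_Y).coef_   # shape (2, len(cols))
    selected |= set(cols[np.any(np.abs(coef) > 1e-8, axis=0)].tolist())

print(sorted(selected))   # union of features kept across submatrices; unselected features are discarded
```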