Xiaodan Fan
The Chinese University of Hong Kong
Publication
Featured research published by Xiaodan Fan.
Computers in Biology and Medicine | 2003
Ying-Fei Sun; Xiaodan Fan; Yan-Da Li
We introduce a new method for splice site prediction based on the theory of support vector machines (SVM). The SVM represents a new approach to supervised pattern classification and has been successfully applied to a wide range of pattern recognition problems. In splice site prediction, statistical information on RNA secondary structure in the vicinity of splice sites, e.g., donor and acceptor sites, is introduced in order to compare the recognition rates of true positives and true negatives. The comparison shows that adding structural information brought no significant benefit for the recognition of splice sites and even lowered the recognition rate. Our results suggest that, under three-fold cross-validation, the SVM method can achieve good performance for splice site identification.
IEEE Transactions on Neural Systems and Rehabilitation Engineering | 2013
Baojun Chen; Enhao Zheng; Xiaodan Fan; Tong Liang; Qining Wang; Kunlin Wei; Long Wang
Locomotion mode classification is one of the most important aspects for the control of powered lower-limb prostheses. We propose a wearable capacitive sensing system for recognizing locomotion modes as an alternative solution to popular electromyography (EMG)-based systems, aiming to overcome drawbacks of the latter. Eight able-bodied subjects and five transtibial amputees were recruited for automatic classification of six common locomotion modes. The system measured ten channels of capacitance signals from the shank, the thigh, or both. With a phase-dependent linear discriminant analysis classifier and selected time-domain features, the system can achieve a satisfactory classification accuracy of 93.6% ±0.9% and 93.4% ±0.8% for able-bodied subjects and amputee subjects, respectively. The classification accuracy is comparable with that of EMG-based systems. More importantly, we verify that neuro-mechanical delay inherent in capacitive sensing does not affect the timeliness of classification decisions as the system, similar to EMG-based systems, can make multiple judgments during a gait cycle. Experimental results also indicate that capacitance signals from the thigh alone are sufficient for mode classification for both able-bodied and transtibial subjects. Our investigations demonstrate that capacitive sensing is a promising alternative to myoelectric sensing for real-time control of powered lower-limb prostheses.
PLOS ONE | 2013
Claudia H. T. Tam; Janice S. K. Ho; Ying Wang; Vincent K. L. Lam; Heung Man Lee; Guozhi Jiang; Eric S.H. Lau; Alice P.S. Kong; Xiaodan Fan; Jean Woo; Stephen Kwok-Wing Tsui; Maggie C.Y. Ng; Wing Yee So; Juliana C.N. Chan; Ronald C.W. Ma
Background: Recent genome-wide association studies (GWAS) identified more than 70 novel loci for type 2 diabetes (T2D), some of which have been widely replicated in Asian populations. In this study, we investigated their individual and combined effects on T2D in a Chinese population. Methodology: We selected 14 single nucleotide polymorphisms (SNPs) in T2D genes relating to beta-cell function validated in Asian populations and genotyped them in 5882 Chinese T2D patients and 2569 healthy controls. A combined genetic score (CGS) was calculated by summing up the number of risk alleles or weighted by the effect size for each SNP under an additive genetic model. We tested for associations by either logistic or linear regression analysis for T2D and quantitative traits, respectively. The contribution of the CGS for predicting T2D risk was evaluated by receiver operating characteristic (ROC) analysis and net reclassification improvement (NRI). Results: We observed consistent and significant associations of IGF2BP2, WFS1, CDKAL1, SLC30A8, CDKN2A/B, HHEX, TCF7L2 and KCNQ1 (8.5×10⁻¹⁸ < P < 8.5×10⁻³), as well as nominal associations of NOTCH2, JAZF1, KCNJ11 and HNF1B (0.05 < P < 0.1) with T2D risk, which yielded odds ratios ranging from 1.07 to 2.09. The 8 significant SNPs exhibited joint effect on increasing T2D risk, fasting plasma glucose and use of insulin therapy as well as reducing HOMA-β, BMI, waist circumference and younger age of diagnosis of T2D. The addition of CGS marginally increased AUC (2%) but significantly improved the predictive ability on T2D risk by 11.2% and 11.3% for unweighted and weighted CGS, respectively using the NRI approach (P<0.001). Conclusion: In a Chinese population, the use of a CGS of 8 SNPs modestly but significantly improved its discriminative ability to predict T2D above and beyond that attributed to clinical risk factors (sex, age and BMI).
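The combined genetic score described in this abstract can be illustrated with a minimal Python sketch. It assumes each SNP is coded as a risk-allele count of 0, 1, or 2 (the additive model) and, for the weighted variant, uses the natural log of the odds ratio as the effect size; that weighting is a common convention and an assumption here, since the abstract does not state which effect size was used. The function name, genotype, and odds ratios are hypothetical.

```python
import math


def combined_genetic_score(risk_allele_counts, odds_ratios=None):
    """Sum risk-allele counts (0, 1, or 2 per SNP) under an additive model.

    If odds_ratios is given, each count is weighted by log(OR).
    Using log(OR) as the weight is an assumption, not taken from the abstract.
    """
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must carry 0, 1, or 2 risk alleles")
    if odds_ratios is None:
        return float(sum(risk_allele_counts))
    if len(odds_ratios) != len(risk_allele_counts):
        raise ValueError("need exactly one odds ratio per SNP")
    return sum(c * math.log(o) for c, o in zip(risk_allele_counts, odds_ratios))


# Hypothetical genotype for one person at three SNPs, with made-up odds ratios.
genotype = [2, 0, 1]
example_ors = [1.10, 1.25, 1.40]
unweighted = combined_genetic_score(genotype)
weighted = combined_genetic_score(genotype, example_ors)
```

In the unweighted case every risk allele counts equally, so the score is simply the number of risk alleles carried; the weighted case lets SNPs with larger effects contribute more.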
Nature Genetics | 2017
Qin Cao; Christine Anyansi; Xihao Hu; Liangliang Xu; Lei Xiong; Wenshu Tang; Myth T.S. Mok; Chao Cheng; Xiaodan Fan; Mark Gerstein; Alfred S.L. Cheng; Kevin Y. Yip
We propose a new method for determining the target genes of transcriptional enhancers in specific cells and tissues. It combines global trends across many samples and sample-specific information, and considers the joint effect of multiple enhancers. Our method outperforms existing methods when predicting the target genes of enhancers in unseen samples, as evaluated by independent experimental data. Requiring few types of input data, we are able to apply our method to reconstruct the enhancer–target networks in 935 samples of human primary cells, tissues and cell lines, which constitute by far the largest set of enhancer–target networks. The similarity of these networks from different samples closely follows their cell and tissue lineages. We discover three major co-regulation modes of enhancers and find defense-related genes often simultaneously regulated by multiple enhancers bound by different transcription factors. We also identify differentially methylated enhancers in hepatocellular carcinoma (HCC) and experimentally confirm their altered regulation of HCC-related genes.
BMC Bioinformatics | 2007
Xiaodan Fan; Jun Zhu; Eric E. Schadt; Jun S. Liu
Background: An important goal of comparative genomics is the identification of functional elements through conservation analysis. Phylo-HMM was recently introduced to detect conserved elements based on multiple genome alignments, but the method has not been rigorously evaluated. Results: We report here a simulation study to investigate the power of phylo-HMM. We show that the power of the phylo-HMM approach depends on many factors, the most important being the number of species-specific genomes used and evolutionary distances between pairs of species. This finding is consistent with results reported by other groups for simpler comparative genomics models. In addition, the conservation ratio of conserved elements and the expected length of the conserved elements are also major factors. In contrast, the topology and the nucleotide substitution model are relatively minor factors. Conclusion: Our results provide general guidelines on how to select the number of genomes and their evolutionary distance in comparative genomics studies, as well as the level of power we can expect under different parameter settings.
Kidney International | 2016
Guozhi Jiang; Cheng Hu; Claudia H. T. Tam; Eric S.H. Lau; Ying Wang; Andrea Luk; Xilin Yang; Alice P.S. Kong; Janice S. K. Ho; Vincent K. L. Lam; Heung Man Lee; Jie Wang; Rong Zhang; Stephen Kwok-Wing Tsui; Maggie C.Y. Ng; Cheuk-Chun Szeto; Weiping Jia; Xiaodan Fan; Wing Yee So; Juliana C.N. Chan; Ronald C.W. Ma
Type 2 diabetes and chronic kidney disease (CKD) may share common risk factors. Here we used a 3-stage procedure to discover novel predictors of CKD by repeatedly applying a stepwise selection based on the Akaike information criterion to subsamples of a prospective complete-case cohort of 2755 patients. This cohort encompassed 25 clinical variables and 36 genetic variants associated with type 2 diabetes, obesity, or fasting plasma glucose. We compared the performance of the clinical, genetic, and clinico-genomic models and used net reclassification improvement to evaluate the impact of top selected genetic variants to the clinico-genomic model. Associations of selected genetic variants with CKD were validated in 2 independent cohorts followed by meta-analyses. Among the top 6 single-nucleotide polymorphisms selected from clinico-genomic data, three (rs478333 of G6PC2, rs7754840 and rs7756992 of CDKAL1) contributed toward the improvement of prediction performance. The variant rs478333 was associated with rapid decline (over 4% per year) in estimated glomerular filtration rate. In a meta-analysis of 2 replication cohorts, the variants rs478333 and rs7754840 showed significant associations with CKD after adjustment for conventional risk factors. Thus, this novel 3-stage approach to a clinico-genomic data set identified 3 novel genetic predictors of CKD in type 2 diabetes. This method can be applied to similar data sets containing clinical and genetic variables to select predictors for clinical outcomes.
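Net reclassification improvement, used in this abstract to judge whether added genetic variants help, can be sketched in Python as follows. This is the standard categorical form of the statistic, shown under the assumption that each subject has been assigned an integer risk category by the old and the new model (higher means higher risk); the function name and the toy data are hypothetical and unrelated to the study's cohorts.

```python
def net_reclassification_improvement(events, nonevents):
    """Categorical NRI from (old_category, new_category) pairs.

    events:    pairs for subjects who developed the outcome
    nonevents: pairs for subjects who did not
    NRI = [P(up | event) - P(down | event)]
          - [P(up | nonevent) - P(down | nonevent)]
    """
    def up_minus_down(pairs):
        if not pairs:
            raise ValueError("each group needs at least one subject")
        up = sum(1 for old, new in pairs if new > old)
        down = sum(1 for old, new in pairs if new < old)
        return (up - down) / len(pairs)

    return up_minus_down(events) - up_minus_down(nonevents)


# Toy data: two events move up and one moves down; one non-event moves down.
toy_events = [(0, 1), (0, 1), (1, 1), (1, 0)]
toy_nonevents = [(1, 0), (1, 1), (0, 0), (0, 0)]
toy_nri = net_reclassification_improvement(toy_events, toy_nonevents)
```

A positive value means the new model moves subjects with the outcome toward higher risk categories and subjects without it toward lower ones, on balance.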
PLOS ONE | 2014
Shengtong Han; Raymond K. W. Wong; Thomas C. M. Lee; Linghao Shen; Shuo-Yen Robert Li; Xiaodan Fan
Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data.
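For readers unfamiliar with the model being inferred, the following Python sketch simulates a small Boolean network with synchronous updates and an optional noise term. It illustrates the model class only and is not the Bayesian inference procedure of the paper; the three-gene rules, the `step` function, and the bit-flip noise mechanism are all hypothetical choices made for this example.

```python
import random


def step(state, rules, noise=0.0, rng=random):
    """Advance every gene one time step using its Boolean rule.

    All genes read the same previous state (synchronous update).
    With probability `noise`, a gene's new value is flipped.
    """
    next_state = {}
    for gene, rule in rules.items():
        value = rule(state)
        if rng.random() < noise:
            value = not value
        next_state[gene] = value
    return next_state


# Hypothetical three-gene network: C represses A, A activates B,
# and C turns on only when both A and B are on.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}

trajectory = [{"A": True, "B": False, "C": False}]
for _ in range(4):
    trajectory.append(step(trajectory[-1], rules))  # noise=0.0: deterministic
```

Inference runs in the opposite direction: given noisy observed trajectories, the goal is to recover which genes feed into each rule and what logic each rule applies.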
Intelligent Systems in Molecular Biology | 2011
Qiwei Li; Xiaodan Fan; Tong Liang; Shuo-Yen Robert Li
MOTIVATION: Repeat detection problems are traditionally formulated as string matching or signal processing problems. Such formulations cannot readily handle gaps between repeat units and are incapable of detecting repeat patterns shared by multiple sequences. This study detects short adjacent repeats with interunit insertions from multiple sequences. For biological sequences, such studies can shed light on molecular structure, biological function and evolution. RESULTS: The task of detecting short adjacent repeats is formulated as a statistical inference problem by using a probabilistic generative model. A Markov chain Monte Carlo algorithm is proposed to infer the parameters in a de novo fashion. Its applications on synthetic and real biological data show that the new method not only has a competitive edge over existing methods, but also can provide a way to study the structure and the evolution of repeat-containing genes. AVAILABILITY: The related C++ source code and datasets are available at http://ihome.cuhk.edu.hk/%7Eb118998/share/BASARD.zip. CONTACT: [email protected]
The Annals of Applied Statistics | 2010
Xiaodan Fan; Saumyadipta Pyne; Jun S. Liu
The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression as well as the lack of a comprehensive model for integrating information across genes and experiments has impaired the effort for the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model to integrate multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model was used for data integration. In order to facilitate an efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis-Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, far exceeding the 10-15% of genes reported in the current literature. This finding calls for a reconsideration of the periodically expressed gene detection problem.
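As background for the sampling step mentioned in this abstract, here is a minimal Python sketch of a plain random-walk Metropolis-Hastings sampler. It is the textbook single-variable algorithm, not the group move developed in the paper, and the standard normal target, step size, and function names are assumptions chosen only to keep the example self-contained.

```python
import math
import random


def random_walk_metropolis(log_target, x0, n_samples, step_size=1.0, seed=0):
    """Draw samples from a 1-D density known up to a constant.

    The Gaussian proposal is symmetric, so the acceptance probability
    reduces to min(1, target(proposal) / target(current)).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)  # a rejected proposal repeats the current value
    return samples


def log_standard_normal(x):
    """Log density of N(0, 1), dropping the normalizing constant."""
    return -0.5 * x * x


draws = random_walk_metropolis(log_standard_normal, x0=0.0, n_samples=20000)
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
```

With enough draws, the sample mean and variance should approach 0 and 1 for this target. Specialized moves, such as the group move the abstract refers to, aim to make this kind of exploration more efficient in high-dimensional posteriors.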
2011 INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL MODELS FOR LIFE SCIENCES (CMLS-11) | 2011
Weiqiang Zhou; Hong Yan; Xiaodan Fan; Quan Hao
Protein‐protein interactions play important roles in many biological processes. Previous studies of protein‐protein interactions were mainly based on sequence analysis. As more 3D structural information can be obtained from protein‐protein complexes, structural analysis becomes feasible and useful. In this study, we used structural alignment to predict the protein‐binding site and applied 3D alpha shape modeling to analyze the interface characteristics. We have developed a method for protein‐protein interaction prediction. The results indicate good performance of our method in discriminating protein‐binding structures from non‐protein‐binding structures. Our method outperforms previous methods as measured by the Matthews correlation coefficient.
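The Matthews correlation coefficient used for the comparison can be computed from a binary confusion matrix, as in the Python sketch below. The formula is the standard definition; returning 0.0 when the denominator is zero is a common convention adopted here as an assumption, and the function name is illustrative.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return numerator / denominator


perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
```

Unlike plain accuracy, the coefficient stays informative when binding and non-binding structures are present in very different numbers, which is one reason it is often preferred for this kind of discrimination task.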