Ying-Chao Lin
Academia Sinica
Publication
Featured research published by Ying-Chao Lin.
Genomics | 2009
Chien-Hsing Lin; Ying-Chao Lin; Jer-Yuarn Wu; Wen-Harn Pan; Yuan-Tsong Chen; Cathy S.J. Fann
Copy number variation (CNV) is a form of DNA sequence variation in the human genome. CNVs can affect expression of nearby and distant genes, and some of them might cause certain phenotypic differences. CNVs vary slightly in location and frequency among different populations. Because currently available CNV information from Asian populations was limited to a few small-scale studies with only dozens of subjects, a high-resolution CNV survey was conducted using a large number of Han Chinese in this study. The Illumina HumanMap550K single-nucleotide polymorphism array was used to identify CNVs from 813 unrelated Han Chinese residing in Taiwan. A total of 365 CNV regions were identified in this population; the average size of the CNV regions was 235 kb (covering a total of 2.86% of the human genome), and 67 (18.4%) were newly discovered CNV regions. Two hundred and seventy-nine CNV regions (76%) were verified from 304 randomly selected samples by Affymetrix 500K GeneChip and qPCR experiments. These regions contain 1029 genes, some of which are associated with diseases. Consistent with previous studies, most CNVs were rare structural variations in the human genome, and only 64 regions (17.5%) had a CNV allele frequency greater than 1%. Our discovery of 67 new CNV regions indicates that previous CNV coverage of the human genome is incomplete and that there is diversity among different ethnic populations. Comprehensive knowledge of CNVs in the human genome is important and useful for further genetic studies.
PLOS ONE | 2012
Hui-Min Wang; Ching-Lin Hsiao; Ai-Ru Hsieh; Ying-Chao Lin; Cathy S.J. Fann
Complex diseases are typically caused by combinations of molecular disturbances that vary widely among different patients. Endophenotypes, a combination of genetic factors associated with a disease, offer a simplified approach to dissecting complex traits by reducing genetic heterogeneity. Because molecular dissimilarities often exist between patients with indistinguishable disease symptoms, these unique molecular features may reflect pathogenic heterogeneity. To detect molecular dissimilarities among patients and reduce the complexity of high-dimensional data, we have explored an endophenotype-identification analytical procedure that combines non-negative matrix factorization (NMF) and the adjusted Rand index (ARI), a measure of the similarity of two clusterings of a data set. To evaluate this procedure, we compared it with a commonly used method, principal component analysis with k-means clustering (PCA-K). A simulation study with a gene expression dataset and genotype information was conducted to examine the performance of our procedure and PCA-K. The results showed that NMF mostly outperformed PCA-K. Additionally, we applied our endophenotype-identification analytical procedure to a publicly available dataset containing data derived from patients with late-onset Alzheimer’s disease (LOAD). NMF distilled information associated with 1,116 transcripts into three metagenes and three molecular subtypes (MS) for patients in the LOAD dataset: MS1 (), MS2 (), and MS3 (). ARI was then used to determine the most representative transcripts for each metagene; 123, 89, and 71 metagene-specific transcripts were identified for MS1, MS2, and MS3, respectively. These metagene-specific transcripts were identified as the endophenotypes. Our results showed that 14, 38, 0, and 28 candidate susceptibility genes listed in the AlzGene database were identified using all patients, MS1, MS2, and MS3, respectively. Moreover, we found that MS2 might be a normal-like subtype. Our proposed procedure provides an alternative approach to investigating the pathogenic mechanism of disease and to better understanding the relationship between phenotype and genotype.
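The adjusted Rand index named in the abstract above compares two clusterings of the same samples while correcting for chance agreement. The following is a minimal Python sketch of the standard ARI formula computed from a contingency table. It is not the authors' implementation; the function name `adjusted_rand_index` and the example labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard adjusted Rand index between two clusterings (illustrative sketch)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same samples")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the clusterings trivially agree

    # Count sample pairs that fall together in contingency cells, rows, and columns.
    cell_pairs = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    a_pairs = sum(comb(c, 2) for c in Counter(labels_a).values())
    b_pairs = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = a_pairs * b_pairs / total_pairs  # agreement expected by chance
    maximum = (a_pairs + b_pairs) / 2           # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every sample in a single cluster
    return (cell_pairs - expected) / (maximum - expected)


# Hypothetical subtype labels for six patients from two clustering runs.
run_1 = ["MS1", "MS1", "MS2", "MS2", "MS3", "MS3"]
run_2 = ["x", "x", "y", "y", "z", "z"]
print(adjusted_rand_index(run_1, run_2))
```

The index depends only on which samples are grouped together, so relabeling the clusters does not change it.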
Journal of Biomedical Science | 2014
Ying-Chao Lin; Ai-Ru Hsieh; Ching-Lin Hsiao; Shang-Jung Wu; Hui-Min Wang; Ie-Bin Lian; Cathy S.J. Fann
Background Genome-wide association studies have been successful in identifying common genetic variants for human diseases. However, much of the heritable variation associated with diseases such as Parkinson’s disease remains unknown, suggesting that many more risk loci are yet to be identified. Rare variants have become important in disease association studies for explaining missing heritability. Methods for detecting this type of association require prior knowledge of candidate genes and combine variants within the region. These methods may suffer from power loss in situations with many neutral variants or causal variants with opposite effects. Results We propose a method capable of scanning genetic variants to identify the region most likely harbouring a disease gene with rare and/or common causal variants. Our method assigns a score to each individual variant based on our scoring system. It uses aggregate scores to identify the region with disease association. We evaluate performance by simulation based on 1000 Genomes sequencing data and compare with three commonly used methods. We use a Parkinson’s disease case–control dataset as a model to demonstrate the application of our method. Our method has better power than CMC and WSS and similar power to SKAT-O, with well-controlled type I error under simulation based on 1000 Genomes sequencing data. In real data analysis, we confirm the association of the α-synuclein gene (SNCA) with Parkinson’s disease (p = 0.005). We further identify association with hyaluronan synthase 2 (HAS2, p = 0.028) and kringle containing transmembrane protein 1 (KREMEN1, p = 0.006). KREMEN1 is associated with the Wnt signalling pathway, which has been shown to play an important role in neurodegeneration in Parkinson’s disease. Conclusions Our method is time efficient and less sensitive to the inclusion of neutral variants and to the direction of effect of causal variants. It can narrow down a genomic region or a chromosome to a disease-associated region. Using Parkinson’s disease as a model, our method not only confirms association for a known gene but also identifies two genes previously found by other studies. In spite of many existing methods, we conclude that our method serves as an efficient alternative for exploring genomic data containing both rare and common variants.
BMC Bioinformatics | 2008
Ie-Bin Lian; Yi-Hsien Lin; Ying-Chao Lin; Hsin-Chou Yang; Chee-Jang Chang; Cathy S.J. Fann
Background Association testing is a powerful tool for identifying disease susceptibility genes underlying complex diseases. Technological advances have yielded a dramatic increase in the density of available genetic markers, necessitating an increase in the number of association tests required for the analysis of disease susceptibility genes. As such, multiple-test corrections have become a critical issue. However, the conventional statistical corrections for locus-specific multiple tests usually result in lower power as the number of markers increases. Alternatively, we propose here the application of the longest significant run (LSR) method to estimate a region-specific p-value to provide an index for the most likely candidate region. Results An advantage of the LSR method relative to procedures based on genotypic data is that only p-value data are needed, and hence it can be applied extensively to different study designs. In this study the proposed LSR method was compared with commonly used methods such as Bonferroni’s method and the FDR controlling method. We found that while all methods provide good control over the false positive rate, LSR has much better power and false discovery rate. In the analysis of real psoriasis and asthma data, the LSR method successfully identified important candidate regions and replicated the results of previous association studies. Conclusion The proposed LSR method provides an efficient exploratory tool for the analysis of sequences of dense genetic markers. Our results show that the LSR method has better power and a lower false discovery rate compared with the locus-specific multiple tests.
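The idea of a longest significant run can be illustrated with a short Python sketch: scan an ordered sequence of locus-level p-values, find the longest stretch of consecutive markers below a threshold, and attach a region-level p-value to that stretch. The abstract does not state how the region-specific p-value is derived, so the permutation null used here (shuffling marker order, which assumes exchangeable markers) is an assumption for illustration only, as are the function names, the threshold `alpha`, and the example values.

```python
import random


def longest_significant_run(p_values, alpha=0.05):
    """Return (start_index, length) of the longest run of consecutive p < alpha."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, p in enumerate(p_values):
        if p < alpha:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def empirical_run_p_value(p_values, alpha=0.05, n_perm=2000, seed=0):
    """Region-level p-value from a permutation null (an assumed null model)."""
    rng = random.Random(seed)
    observed = longest_significant_run(p_values, alpha)[1]
    shuffled = list(p_values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if longest_significant_run(shuffled, alpha)[1] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical locus-level p-values along a chromosome segment.
example = [0.50, 0.01, 0.02, 0.03, 0.60, 0.04, 0.70]
print(longest_significant_run(example))
print(empirical_run_p_value(example))
```

Only the p-values enter the calculation, which matches the abstract's point that no genotype data are needed; real markers are correlated through linkage disequilibrium, so a shuffling null like this one would likely be too liberal in practice.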
Cephalalgia | 2018
Shih-Pin Chen; Jong-Ling Fuh; M.-Y. Chung; Ying-Chao Lin; Yi-Chu Liao; Yen-Feng Wang; Chia-Lin Hsu; Ueng-Cheng Yang; Ming-Wei Lin; Jen-Jie Chiou; Po-Jen Wang; Ping-Kun Chen; Pi-Chuan Fan; J.-Y. Wu; Yuan-Tsong Chen; Lung-Sen Kao; Cathy S.J. Fann; Shuu-Jiun Wang
Background Susceptibility genes for migraine, despite it being a highly prevalent and disabling neurological disorder, have not been analyzed in Asians by genome-wide association study (GWAS). Methods We conducted a two-stage case-control GWAS to identify susceptibility genes for migraine without aura in Han Chinese residing in Taiwan. In the discovery stage, we genotyped 1005 clinic-based Taiwanese migraine patients and 1053 population-based sex-matched controls using the Axiom Genome-Wide CHB Array. In the replication stage, we genotyped 27 single-nucleotide polymorphisms with p < 10⁻⁴ in 1120 clinic-based migraine patients and 604 sex-matched normal controls by using Sequenom. Variants at LRP1, TRPM8, and PRDM, which have been replicated in Caucasians, were also genotyped. Results We identified a novel susceptibility locus (rs655484 in DLG2) that reached GWAS significance level for migraine risk in Han Chinese (p = 1.45 × 10⁻¹², odds ratio [OR] = 2.42), and also another locus (rs3781545 in GFRA1) with suggestive significance (p = 1.27 × 10⁻⁷, OR = 1.38). In addition, we observed positive association signals with a similar trend to the associations identified in Caucasian GWASs for rs10166942 in TRPM8 (OR = 1.33, 95% confidence interval [CI] = 1.14–1.54, permutation P = 9.99 × 10⁻⁵; risk allele: T) and rs1172113 in LRP1 (OR = 1.23, 95% CI = 1.04–1.45, permutation P = 2.9 × 10⁻²; risk allele: T). Conclusion The present study is the first migraine GWAS conducted in Han Chinese and Asians. The newly identified susceptibility genes have potential implications in migraine pathogenesis. DLG2 is involved in glutamatergic neurotransmission, and GFRA1 encodes GDNF receptors that are abundant in CGRP-containing trigeminal neurons. Furthermore, positive association signals for TRPM8 and LRP1 suggest the possibility of common genetic contributions across ethnicities.
PLOS ONE | 2014
Ching-Lin Hsiao; Ai-Ru Hsieh; Ie-Bin Lian; Ying-Chao Lin; Hui-Min Wang; Cathy S.J. Fann
Advances in biotechnology have resulted in large-scale studies of DNA methylation. A differentially methylated region (DMR) is a genomic region with multiple adjacent CpG sites that exhibit different methylation statuses among multiple samples. Many so-called “supervised” methods have been established to identify DMRs between two or more comparison groups. Methods for the identification of DMRs without reference to phenotypic information are, however, less well studied. An alternative “unsupervised” approach was proposed, in which DMRs in the studied samples were identified by taking into account the natural dependence structure of methylation measurements between neighboring probes from tiling arrays. Through a simulation study, we investigated the effects of dependencies between neighboring probes on determining DMRs, where many spurious signals would be produced if the methylation data were analyzed independently for each probe. In contrast, our newly proposed method could successfully correct for this effect with a well-controlled false positive rate and comparable sensitivity. By applying it to two real datasets, we demonstrated that our method could provide a global picture of methylation variation in the studied samples. R source code to implement the proposed method is freely available at http://www.csjfann.ibms.sinica.edu.tw/eag/programlist/ICDMR/ICDMR.html.
Genetic Epidemiology | 2012
Ying-Chao Lin; Ching-Lin Hsiao; Ai-Ru Hsieh; Ie-Bin Lian; Cathy S.J. Fann
Genome-wide association studies (GWAS) have become the method of choice for identifying disease susceptibility genes in common disease genetics research. Despite successes in these studies, much of the heritability remains unexplained due to lack of power and low resolution. High-density genotyping arrays can now screen more than 5 million genetic markers. As a result, multiple comparison has become an important issue, especially in the era of next-generation sequencing. We propose to use a two-stage maximal segmental score procedure (MSS) which uses region-specific empirical P-values to identify genomic segments most likely harboring the disease gene. We develop scoring systems based on Fisher’s P-value combining method to convert locus-specific significance levels into region-specific scores. Through simulations, our results indicated that MSS increased the power to detect genetic association as compared with conventional methods, provided type I error was at 5%. We demonstrated the application of MSS on a publicly available case-control dataset of Parkinson’s disease and replicated the findings in the literature. MSS provides an efficient exploratory tool for high-density association data in the current era of next-generation sequencing. R source code to implement the MSS procedure is freely available at http://www.csjfann.ibms.sinica.edu.tw/EAG/program/programlist.htm.
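Fisher's combining method, which the abstract above names as the basis of its scoring systems, pools k independent p-values through the statistic X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null. The Python sketch below computes that combined p-value in closed form and then finds a highest-scoring contiguous segment. The segment scoring rule (`-ln p` minus a fixed `penalty`) is a hypothetical stand-in chosen for illustration and is not the scoring system of the MSS procedure; all names and example values are assumptions.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Fisher's method: combined p-value for independent p-values in (0, 1]."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    half_stat = -sum(log(p) for p in p_values)  # X / 2, with X = -2 * sum(ln p)
    # Chi-square survival function with 2k degrees of freedom (closed form):
    # exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return exp(-half_stat) * total


def maximal_segment(p_values, penalty=3.0):
    """Highest-scoring contiguous segment as (start, end_exclusive, score).

    Each locus scores -ln(p) - penalty (a hypothetical scoring rule).
    """
    best = (0, 0, 0.0)
    current_start, current = 0, 0.0
    for i, p in enumerate(p_values):
        current += -log(p) - penalty
        if current <= 0.0:
            current, current_start = 0.0, i + 1  # restart after a non-positive sum
        elif current > best[2]:
            best = (current_start, i + 1, current)
    return best


# Hypothetical locus-level p-values.
loci = [0.5, 0.001, 0.002, 0.9, 0.4]
start, end, score = maximal_segment(loci)
print(start, end, fisher_combined_p(loci[start:end]))
```

The combined p-value is valid only for independent tests; neighboring markers are usually correlated, which is presumably why the abstract relies on region-specific empirical P-values.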
Computational Biology and Chemistry | 2016
Amrita Sengupta Chattopadhyay; Ying-Chao Lin; Ai-Ru Hsieh; Chien-Ching Chang; Ie-Bin Lian; Cathy S.J. Fann
BACKGROUND The statistical tests for single locus disease association are mostly under-powered. If a disease-associated causal single nucleotide polymorphism (SNP) operates essentially through a complex mechanism that involves multiple SNPs or possible environmental factors, its effect might be missed if the causal SNP is studied in isolation without accounting for these unknown genetic influences. In this study, we attempt to address the issue of reduced power that is inherent in single point association studies by accounting for genetic influences that negatively impact the detection of the causal variant in single point association analysis. In our method we use a propensity score (PS) to adjust for the effect of SNPs that influence the marginal association of a candidate marker. These SNPs might be in linkage disequilibrium (LD) and/or epistatic with the target SNP and have a joint interactive influence on the disease under study. We therefore propose a propensity score adjustment method (PSAM) as a tool for dimension reduction to improve the power of single locus studies through an estimated PS that adjusts for influence from these SNPs while regressing disease status on the target genetic locus. The degrees of freedom of such a test are therefore always restricted to 1. RESULTS We assess PSAM under the null hypothesis of no disease association to affirm that it correctly controls the type I error rate (<0.05). PSAM displays reasonable power (>70%) and shows an average of 15% improvement in power as compared with the commonly used logistic regression method and PLINK under most simulated scenarios. Using the open-access multifactor dimensionality reduction dataset, PSAM displays improved significance for all disease loci. Through a whole genome study, PSAM was able to identify 21 SNPs from the GAW16 NARAC dataset by reducing their original trend-test p-values from between 0.001 and 0.05 to p-values less than 0.0009, among which 6 SNPs were further found to be associated with immunity and inflammation. CONCLUSIONS PSAM improves the significance of single-locus association of causal SNPs which have had marginal single point association by adjusting for influence from other SNPs in a dataset. This would explain part of the missing heritability without increasing the complexity of the model due to huge multiple testing scenarios. The newly reported SNPs from GAW16 data would provide evidence for further research to elucidate the etiology of rheumatoid arthritis. PSAM is proposed as an exploratory tool that would be complementary to other existing methods. A downloadable user-friendly program, PSAM, written in SAS, is available for public use.
Journal of Physics: Conference Series | 2011
Ying-Chao Lin; C R Wang; C L Dong; Min-Nan Ou; Y. Y. Chen
To study size effects on the mixed-valence state of CePd3, CePd3 nanoparticles with sizes ranging from 5.2 to 9.5 nm were prepared. The mixed valence increased from 3.3 to 3.5 as the particle size was reduced from bulk to 5.2 nm. This result is explained by the enhancement of valence fluctuations and of 4f electron hybridization with the conduction band through size reduction. Another interesting finding in the nanoparticles is that a fraction (~25%) of the sample becomes trivalent, which is evident from the increase of magnetic susceptibility at low temperatures.
Nature Communications | 2015
Pei-Lung Chen; Shyang-Rong Shih; Pei-Wen Wang; Ying-Chao Lin; Chen-Chung Chu; Jung-Hsin Lin; Szu-Chi Chen; Chang Cc; Tien-Shang Huang; Keh-Sung Tsai; Fen-Yu Tseng; Chih-Yuan Wang; Jin-Ying Lu; Wei-Yih Chiu; Chien-Ching Chang; Yu-Hsuan Chen; Yuan-Tsong Chen; Cathy S.J. Fann; Wei-Shiung Yang; Tien-Chun Chang