Chun-Chieh Fan
University of California, San Diego
Publication
Featured research published by Chun-Chieh Fan.
Nature Genetics | 2017
Min-Tzu Lo; David A. Hinds; Joyce Y. Tung; Carol E. Franz; Chun-Chieh Fan; Yunpeng Wang; Olav B. Smeland; Andrew J. Schork; Dominic Holland; Karolina Kauppi; Nilotpal Sanyal; Valentina Escott-Price; Daniel J. Smith; Michael Conlon O'Donovan; Hreinn Stefansson; Gyda Bjornsdottir; Thorgeir E. Thorgeirsson; Kari Stefansson; Linda K. McEvoy; Anders M. Dale; Ole A. Andreassen; Chi-Hua Chen
Personality is influenced by genetic and environmental factors and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci, significantly associated with personality traits in a meta-analysis of genome-wide association studies (N = 123,132–260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N = 5,422–18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit–hyperactivity disorder (ADHD) and between openness and schizophrenia and bipolar disorder. The second genetic dimension was closely aligned with extraversion–introversion and grouped neuroticism with internalizing psychopathology (e.g., depression or anxiety).
Scientific Reports | 2017
Chi-Hua Chen; Yunpeng Wang; Min-Tzu Lo; Andrew J. Schork; Chun-Chieh Fan; Dominic Holland; Karolina Kauppi; Olav B. Smeland; Srdjan Djurovic; Nilotpal Sanyal; Derrek P. Hibar; Paul M. Thompson; Wesley K. Thompson; Ole A. Andreassen; Anders M. Dale
Discovering genetic variants associated with human brain structures is an on-going effort. The ENIGMA consortium conducted genome-wide association studies (GWAS) with standard multi-study analytical methodology and identified several significant single nucleotide polymorphisms (SNPs). Here we employ a novel analytical approach that incorporates functional genome annotations (e.g., exon or 5′UTR), total linkage disequilibrium (LD) scores and heterozygosity to construct enrichment scores for improved identification of relevant SNPs. The method provides increased power to detect associated SNPs by estimating stratum-specific false discovery rate (FDR), where strata are classified according to enrichment scores. Applying this approach to the GWAS summary statistics of putamen volume in the ENIGMA cohort, a total of 15 independent significant SNPs were identified (conditional FDR < 0.05). In contrast, 4 SNPs were found based on standard GWAS analysis (P < 5 × 10⁻⁸). These 11 novel loci include GATAD2B, ASCC3, DSCAML1, and HELZ, which are previously implicated in various neural related phenotypes. The current findings demonstrate the boost in power with the annotation-informed FDR method, and provide insight into the genetic architecture of the putamen.
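The idea of controlling FDR separately within enrichment strata can be illustrated with a minimal sketch. It does not reproduce the method of the paper: it swaps in the standard Benjamini-Hochberg procedure, applied independently within each stratum, as a stand-in for the conditional FDR estimate, and the function names, stratum labels, and p-values are hypothetical.

```python
def bh_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure; returns one reject flag per p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value passes its BH threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def stratified_bh(p_values, strata, alpha=0.05):
    """Apply BH separately within each stratum (assumed stand-in for stratum-specific FDR)."""
    groups = {}
    for idx, label in enumerate(strata):
        groups.setdefault(label, []).append(idx)
    rejected = [False] * len(p_values)
    for indices in groups.values():
        flags = bh_reject([p_values[i] for i in indices], alpha)
        for i, flag in zip(indices, flags):
            rejected[i] = flag
    return rejected


# Hypothetical p-values for four SNPs and their enrichment strata.
p = [0.001, 0.04, 0.03, 0.5]
strata = ["high", "high", "low", "low"]
pooled = bh_reject(p)                 # all SNPs treated as one stratum
stratified = stratified_bh(p, strata)  # FDR controlled per stratum
```

In this toy input the pooled analysis rejects only the first SNP, while the stratified analysis also rejects the second, because the "high" stratum is enriched for small p-values.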
bioRxiv | 2017
Dominic Holland; Chun-Chieh Fan; Oleksandr Frei; Alexey A. Shadrin; Olav B. Smeland; V. S. Sundar; Enigma; Ole A. Andreassen; Anders M. Dale
Of signal interest in the genetics of traits are estimating the proportion, π₁, of causally associated single nucleotide polymorphisms (SNPs), and their effect size variance, σ_β², which are components of the mean heritabilities captured by the causal SNP. Here we present the first model, using detailed linkage disequilibrium structure, to estimate these quantities from genome-wide association studies (GWAS) summary statistics, assuming a Gaussian distribution of SNP effect sizes, β. We apply the model to three diverse phenotypes -- schizophrenia, putamen volume, and educational attainment -- and validate it with extensive simulations. We find that schizophrenia is highly polygenic, with ~5×10⁴ causal SNPs distributed with small effect size variance, σ_β² = 3.5×10⁻⁵ (in units where the phenotype variance is normalized to 1), requiring a GWAS study with more than 1/2-million samples in each arm for full discovery. In contrast, putamen volume involves only ~3×10² causal SNPs, but with σ_β² = 1.2×10⁻³, indicating a much larger proportion of the causal SNPs that are strongly associated. Educational attainment has similar polygenicity to schizophrenia, but with effects that are substantially weaker, σ_β² = 5×10⁻⁶, leading to much lower heritability. Thus the model is able to describe the broad genetic architecture of phenotypes where both polygenicity and effect size variance range over several orders of magnitude, shows why only small proportions of heritability have been explained for discovered SNPs, and provides a roadmap for future GWAS discoveries.
Estimating the polygenicity (proportion of causally associated single nucleotide polymorphisms (SNPs)) and discoverability (effect size variance) of causal SNPs for human traits is currently of considerable interest. SNP-heritability is proportional to the product of these quantities. We present a basic model, using detailed linkage disequilibrium structure from an extensive reference panel, to estimate these quantities from genome-wide association studies (GWAS) summary statistics. We apply the model to diverse phenotypes and validate the implementation with simulations. We find model polygenicities ranging from ≃ 2 × 10⁻⁵ to ≃ 4 × 10⁻³, with discoverabilities similarly ranging over two orders of magnitude. A power analysis allows us to estimate the proportions of phenotypic variance explained additively by causal SNPs reaching genome-wide significance at current sample sizes, and map out sample sizes required to explain larger portions of additive SNP heritability. The model also allows for estimating residual inflation (or deflation from over-correcting of z-scores), and assessing compatibility of replication and discovery GWAS summary statistics.
Author Summary: There are ∼10 million common variants in the genome of humans with European ancestry. For any particular phenotype a number of these variants will have some causal effect. It is of great interest to be able to quantify the number of these causal variants and the strength of their effect on the phenotype. Genome wide association studies (GWAS) produce very noisy summary statistics for the association between subsets of common variants and phenotypes. For any phenotype, these statistics collectively are difficult to interpret, but buried within them is the true landscape of causal effects. In this work, we posit a probability distribution for the causal effects, and assess its validity using simulations. Using a detailed reference panel of ∼11 million common variants – among which only a small fraction are likely to be causal, but allowing for non-causal variants to show an association with the phenotype due to correlation with causal variants – we implement an exact procedure for estimating the number of causal variants and their mean strength of association with the phenotype. We find that, across different phenotypes, both these quantities – whose product allows for lower bound estimates of heritability – vary by orders of magnitude.
bioRxiv | 2017
Dominic Holland; Chun-Chieh Fan; Oleksandr Frei; Alexey A. Shadrin; Olav B. Smeland; V. S. Sundar; Ole A. Andreassen; Anders M. Dale
Cryptic relatedness is inherently a feature of large genome-wide association studies (GWAS), and can give rise to considerable inflation in summary statistics for single nucleotide polymorphism (SNP) associations with phenotypes. It has proven difficult to disentangle these inflationary effects from true polygenic effects. Here we present results of a model that enables estimation of polygenicity, mean strength of association, and residual inflation in GWAS summary statistics. We show that there is substantial residual inflation in recent large GWAS of height and schizophrenia; correcting for this reduces the number of independent genome-wide significant loci from the reported values of 697 for height and 108 for schizophrenia to 368 and 61, respectively. In contrast, a larger GWAS of educational attainment shows no residual inflation. Additionally, we find that height has a relatively low polygenicity, with approximately 8k SNPs having causal association, more than an order of magnitude less than has been reported. The residual inflation in GWAS summary statistics can be corrected using the standard genomic control procedure with the estimated residual inflation factor.
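The genomic control correction mentioned in this abstract can be sketched in a few lines: chi-square statistics are divided by the inflation factor, which is equivalent to dividing z-scores by its square root. The sketch assumes the inflation factor has already been estimated elsewhere; the function names and the example values are hypothetical and are not taken from the paper.

```python
import math


def genomic_control(z_scores, inflation_factor):
    """Rescale z-scores so that chi-square statistics (z**2) are divided by the inflation factor."""
    if inflation_factor <= 0:
        raise ValueError("inflation factor must be positive")
    scale = math.sqrt(inflation_factor)
    return [z / scale for z in z_scores]


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical z-scores and a hypothetical estimated inflation factor of 4.
corrected = genomic_control([2.0, -4.0], 4.0)
p_before = two_sided_p(-4.0)
p_after = two_sided_p(corrected[1])
```

After correction the z-scores shrink toward zero, so fewer associations pass a fixed significance threshold, which matches the reduction in genome-wide significant loci described above.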
bioRxiv | 2018
Dominic Holland; Oleksandr Frei; Chun-Chieh Fan; Alexey A. Shadrin; Olav B. Smeland; V. S. Sundar; Paul M. Thompson; Ole A. Andreassen; Anders M. Dale
Of signal interest in the genetics of traits are estimating the proportion, π₁, of causally associated single nucleotide polymorphisms (SNPs), and their effect size variance, σ_β², which are components of the mean heritabilities captured by the causal SNP. Here we present the first model, using detailed linkage disequilibrium structure, to estimate these quantities from genome-wide association studies (GWAS) summary statistics, assuming a Gaussian distribution of SNP effect sizes, β. We apply the model to three diverse phenotypes -- schizophrenia, putamen volume, and educational attainment -- and validate it with extensive simulations. We find that schizophrenia is highly polygenic, with ~5×10⁴ causal SNPs distributed with small effect size variance, σ_β² = 3.5×10⁻⁵ (in units where the phenotype variance is normalized to 1), requiring a GWAS study with more than 1/2-million samples in each arm for full discovery. In contrast, putamen volume involves only ~3×10² causal SNPs, but with σ_β² = 1.2×10⁻³, indicating a much larger proportion of the causal SNPs that are strongly associated. Educational attainment has similar polygenicity to schizophrenia, but with effects that are substantially weaker, σ_β² = 5×10⁻⁶, leading to much lower heritability. Thus the model is able to describe the broad genetic architecture of phenotypes where both polygenicity and effect size variance range over several orders of magnitude, shows why only small proportions of heritability have been explained for discovered SNPs, and provides a roadmap for future GWAS discoveries.
Estimating the polygenicity (proportion of causally associated single nucleotide polymorphisms (SNPs)) and discoverability (effect size variance) of causal SNPs for human traits is currently of considerable interest. SNP-heritability is proportional to the product of these quantities. We present a basic model, using detailed linkage disequilibrium structure from an extensive reference panel, to estimate these quantities from genome-wide association studies (GWAS) summary statistics. We apply the model to diverse phenotypes and validate the implementation with simulations. We find model polygenicities ranging from ≃ 2 × 10⁻⁵ to ≃ 4 × 10⁻³, with discoverabilities similarly ranging over two orders of magnitude. A power analysis allows us to estimate the proportions of phenotypic variance explained additively by causal SNPs reaching genome-wide significance at current sample sizes, and map out sample sizes required to explain larger portions of additive SNP heritability. The model also allows for estimating residual inflation (or deflation from over-correcting of z-scores), and assessing compatibility of replication and discovery GWAS summary statistics.
Author Summary: There are ∼10 million common variants in the genome of humans with European ancestry. For any particular phenotype a number of these variants will have some causal effect. It is of great interest to be able to quantify the number of these causal variants and the strength of their effect on the phenotype. Genome wide association studies (GWAS) produce very noisy summary statistics for the association between subsets of common variants and phenotypes. For any phenotype, these statistics collectively are difficult to interpret, but buried within them is the true landscape of causal effects. In this work, we posit a probability distribution for the causal effects, and assess its validity using simulations. Using a detailed reference panel of ∼11 million common variants – among which only a small fraction are likely to be causal, but allowing for non-causal variants to show an association with the phenotype due to correlation with causal variants – we implement an exact procedure for estimating the number of causal variants and their mean strength of association with the phenotype. We find that, across different phenotypes, both these quantities – whose product allows for lower bound estimates of heritability – vary by orders of magnitude.
bioRxiv | 2018
V. S. Sundar; Chun-Chieh Fan; Dominic Holland; Anders M. Dale
Determining the genetic causal variants and estimating their effect sizes are considered to be correlated but independent problems. Fine-mapping studies often rely on the ability to integrate useful functional annotation information into genome wide association univariate/multivariate analysis. In the present study, by modeling the probability of a SNP being causal and its effect size as a set of correlated Gaussian/non-Gaussian random variables, we design an optimization routine for simultaneous fine-mapping and effect size estimation. The algorithm is released as an open source C package MODE. Availability and Implementation: http://sites.google.com/site/sundarvelkur/mode
Scientific Reports | 2018
Yi Li; Matthew J. Barkovich; Celeste M. Karch; Ryan M. Nillo; Chun-Chieh Fan; Iris Broce; Chin Hong Tan; Daniel Cuneo; Christopher P. Hess; William P. Dillon; Orit A. Glenn; Christine M. Glastonbury; Nicholas Olney; Jennifer S. Yokoyama; Luke W. Bonham; Bruce L. Miller; Aimee W. Kao; Nicholas J. Schmansky; Bruce Fischl; Ole A. Andreassen; Terry L. Jernigan; Anders M. Dale; A. James Barkovich; Rahul S. Desikan; Leo P. Sugrue
Tuberous sclerosis complex (TSC), a heritable neurodevelopmental disorder, is caused by mutations in the TSC1 or TSC2 genes. To date, there has been little work to elucidate regional TSC1 and TSC2 gene expression within the human brain, how it changes with age, and how it may influence disease. Using a publicly available microarray dataset, we found that TSC1 and TSC2 gene expression was highest within the adult neo-cerebellum and that this pattern of increased cerebellar expression was maintained throughout postnatal development. During mid-gestational fetal development, however, TSC1 and TSC2 expression was highest in the cortical plate. Using a bioinformatics approach to explore protein and genetic interactions, we confirmed extensive connections between TSC1/TSC2 and the other genes that comprise the mammalian target of rapamycin (mTOR) pathway, and show that the mTOR pathway genes with the highest connectivity are also selectively expressed within the cerebellum. Finally, compared to age-matched controls, we found increased cerebellar volumes in pediatric TSC patients without current exposure to antiepileptic drugs. Considered together, these findings suggest that the cerebellum may play a central role in TSC pathogenesis and may contribute to the cognitive impairment, including the high incidence of autism spectrum disorder, observed in the TSC population.
Frontiers in Genetics | 2018
V. S. Sundar; Chun-Chieh Fan; Dominic Holland; Anders M. Dale
With the availability of high-throughput sequencing data, identification of genetic causal variants accurately requires the efficient incorporation of function annotation data into the optimization routine. This motivates the need for development of novel methods for genome wide association studies with special focus on fine-mapping capabilities. A penalty function method that is simple to implement and capable of integrating functional annotation information into the estimation procedure, is proposed in this work. The idea is to use the prior distribution of the effect sizes explicitly as a penalty function. The estimates obtained are shown to be better correlated with the true effect sizes (in comparison with a few existing techniques). An increase in the positive and negative predictive value is demonstrated using Hapgen2 simulated data.
American Journal of Medical Genetics | 2018
Wen Li; Chun-Chieh Fan; Tuomo Mäki-Marttunen; Wesley K. Thompson; Andrew J. Schork; F. Bettella; Srdjan Djurovic; Anders M. Dale; Ole A. Andreassen; Yunpeng Wang; Swg Psychiat
Traditional genome‐wide association studies (GWAS) have successfully detected genetic variants associated with schizophrenia. However, only a small fraction of heritability can be explained. Gene‐set/pathway‐based methods can overcome limitations arising from single nucleotide polymorphism (SNP)‐based analysis, but most of them place constraints on size which may exclude highly specific and functional sets, like macromolecules. Voltage‐gated calcium (Cav) channels, belonging to macromolecules, are composed of several subunits whose encoding genes are located far away or even on different chromosomes. We combined information about such molecules with GWAS data to investigate how functional channels are associated with schizophrenia. We defined a biologically meaningful SNP‐set based on channel structure and performed an association study by using a validated method: SNP‐set (sequence) kernel association test. We identified eight subtypes of Cav channels significantly associated with schizophrenia from a subsample of published data (N = 56,605), including the L‐type channels (Cav1.1, Cav1.2, Cav1.3), P‐/Q‐type Cav2.1, N‐type Cav2.2, R‐type Cav2.3, T‐type Cav3.1, and Cav3.3. Only genes from Cav1.2 and Cav3.3 have been implicated by the largest GWAS (N = 82,315). Each subtype of Cav channels showed relatively high chip heritability, proportional to the size of its constituent gene regions. The results suggest that abnormalities of Cav channels may play an important role in the pathophysiology of schizophrenia and these channels may represent appropriate drug targets for therapeutics. Analyzing subunit‐encoding genes of a macromolecule in aggregate is a complementary way to identify more genetic variants of polygenic diseases. This study offers potential power for discovering the biological mechanisms of schizophrenia.