Yafang Li
Dartmouth College
Publication
Featured research published by Yafang Li.
Human Molecular Genetics | 2012
Brian D. Juran; Gideon M. Hirschfield; Pietro Invernizzi; Elizabeth J. Atkinson; Yafang Li; Gang Xie; Roman Kosoy; Michael Ransom; Ye Sun; Ilaria Bianchi; Erik M. Schlicht; Ana Lleo; Catalina Coltescu; Francesca Bernuzzi; Mauro Podda; Craig Lammert; Russell Shigeta; Landon L. Chan; Tobias Balschun; Maurizio Marconi; Daniele Cusi; E. Jenny Heathcote; Andrew L. Mason; Robert P. Myers; Piotr Milkiewicz; Joseph A. Odin; Velimir A. Luketic; Bruce R. Bacon; Henry C. Bodenheimer; Valentina Liakina
To further characterize the genetic basis of primary biliary cirrhosis (PBC), we genotyped 2426 PBC patients and 5731 unaffected controls from three independent cohorts using a single nucleotide polymorphism (SNP) array (Immunochip) enriched for autoimmune disease risk loci. Meta-analysis of the genotype data sets identified a novel disease-associated locus near the TNFSF11 gene at 13q14, provided evidence for association at six additional immune-related loci not previously implicated in PBC and confirmed associations at 19 of 22 established risk loci. Results of conditional analyses also provided evidence for multiple independent association signals at four risk loci, with haplotype analyses suggesting independent SNP effects at the 2q32 and 16p13 loci, but complex haplotype-driven effects at the 3q25 and 6p21 loci. By imputing classical HLA alleles from this data set, four class II alleles independently contributing to the association signal from this region were identified. Imputation of genotypes at the non-HLA loci also provided additional associations, but none with stronger effects than the genotyped variants. An epistatic interaction between the IL12RB2 risk locus at 1p31 and the IRF5 risk locus at 7q32 was also identified and suggests a complementary effect of these loci in predisposing to disease. These data expand the repertoire of genes with potential roles in PBC pathogenesis that need to be explored by follow-up biological studies.
Cancer Epidemiology, Biomarkers & Prevention | 2017
Christopher I. Amos; Joe Dennis; Zhaoming Wang; Jinyoung Byun; Fredrick R. Schumacher; Simon A. Gayther; Graham Casey; David J. Hunter; Thomas A. Sellers; Stephen B. Gruber; Alison M. Dunning; Kyriaki Michailidou; Laura Fachal; Kimberly F. Doheny; Amanda B. Spurdle; Yafang Li; Xiangjun Xiao; Jane Romm; Elizabeth W. Pugh; Gerhard A. Coetzee; Dennis J. Hazelett; Stig E. Bojesen; Charlisse F. Caga-anan; Christopher A. Haiman; Ahsan Kamal; Craig Luccarini; Daniel C. Tessier; Daniel Vincent; Francois Bacot; David Van Den Berg
Background: Common cancers develop through a multistep process often including inherited susceptibility. Collaboration among multiple institutions, and funding from multiple sources, has allowed the development of an inexpensive genotyping microarray, the OncoArray. The array includes a genome-wide backbone, comprising 230,000 SNPs tagging most common genetic variants, together with dense mapping of known susceptibility regions, rare variants from sequencing experiments, pharmacogenetic markers, and cancer-related traits. Methods: The OncoArray can be genotyped using a novel technology developed by Illumina to facilitate efficient genotyping. The consortium developed standard approaches for selecting SNPs for study, for quality control of markers, and for ancestry analysis. The array was genotyped at selected sites and with prespecified replicate samples to permit evaluation of genotyping accuracy among centers and by ethnic background. Results: The OncoArray consortium genotyped 447,705 samples. A total of 494,763 SNPs passed quality control steps, with a sample success rate of 97%. Participating sites performed ancestry analysis using a common set of markers and a scoring algorithm based on principal components analysis. Conclusions: Results from these analyses will enable researchers to identify new susceptibility loci, perform fine-mapping of new or known loci associated with either single or multiple cancers, assess the degree of overlap in cancer causation and pleiotropic effects of loci that have been identified for disease-specific risk, and jointly model genetic, environmental, and lifestyle-related exposures. Impact: Ongoing analyses will shed light on etiology and risk assessment for many types of cancer. Cancer Epidemiol Biomarkers Prev; 26(1); 126–35. ©2016 AACR.
Cancer Research | 2010
Christopher I. Amos; Susan M. Pinney; Yafang Li; Elena Kupert; Juwon Lee; Mariza de Andrade; Ping Yang; Ann G. Schwartz; Pam R. Fain; Adi F. Gazdar; John D. Minna; Jonathan S. Wiest; Dong Zeng; Henry Rothschild; Diptasri Mandal; Ming You; Teresa Coons; Colette Gaba; Joan E. Bailey-Wilson; Marshall W. Anderson
Cigarette smoking is the major cause of lung cancer, but genetic factors also affect susceptibility. We studied families that included multiple relatives affected by lung cancer. Results from linkage analysis showed strong evidence that a region of chromosome 6q affects lung cancer risk. To characterize the effects that this region of chromosome 6q has on lung cancer risk, we identified a haplotype that segregated with lung cancer. We then performed Cox regression analysis to estimate the differential effects that smoking behaviors have on lung cancer risk according to whether each individual carried a risk-associated haplotype or could not be classified and was assigned unknown haplotypic status. We divided smoking exposures into never smokers, light smokers (<20 pack-years), moderate smokers (20 to <40 pack-years), and heavy smokers (≥40 pack-years). When results were stratified by carrier status and compared with never smokers, carriers showed weakly increasing risk with increasing smoking exposure, with hazard ratios of 3.44, 4.91, and 5.18 for light, moderate, and heavy smokers, respectively, whereas among individuals from families without the risk haplotype, the risks associated with smoking increased strongly with exposure, with hazard ratios of 4.25, 9.17, and 11.89 for light, moderate, and heavy smokers, respectively. The never-smoking carriers had a 4.71-fold higher risk than the never-smoking individuals without known risk haplotypes. These results identify a region of chromosome 6q that increases risk for lung cancer and that confers particularly higher risks to never and light smokers.
American Journal of Human Genetics | 2015
Dong Hai Xiong; Yian Wang; Elena Kupert; Claire L. Simpson; Susan M. Pinney; Colette Gaba; Diptasri Mandal; Ann G. Schwartz; Ping Yang; Mariza de Andrade; Claudio W. Pikielny; Jinyoung Byun; Yafang Li; Dwight Stambolian; Margaret R. Spitz; Yanhong Liu; Christopher I. Amos; Joan E. Bailey-Wilson; Marshall W. Anderson; Ming You
PARK2, a gene associated with Parkinson disease, is a tumor suppressor in human malignancies. Here, we show that c.823C>T (p.Arg275Trp), a germline mutation in PARK2, is present in a family with eight cases of lung cancer. The resulting amino acid change, p.Arg275Trp, is located in the highly conserved RING finger 1 domain of PARK2, which encodes an E3 ubiquitin ligase. Upon further analysis, the c.823C>T mutation was detected in three additional families affected by lung cancer. The effect size for PARK2 c.823C>T (odds ratio = 5.24) in white individuals was larger than those reported for variants from lung cancer genome-wide association studies. These data implicate this PARK2 germline mutation as a genetic susceptibility factor for lung cancer. Our results provide a rationale for further investigations of this specific mutation and gene for evaluation of the possibility of developing targeted therapies against lung cancer in individuals with PARK2 variants by compensating for the loss-of-function effect caused by the associated variation.
Carcinogenesis | 2016
Linda Kachuri; Christopher I. Amos; James D. McKay; Mattias Johansson; Paolo Vineis; H. Bas Bueno-de-Mesquita; Marie Christine Boutron-Ruault; Mikael Johansson; J. Ramón Quirós; Sabina Sieri; Ruth C. Travis; Elisabete Weiderpass; Loic Le Marchand; Brian E. Henderson; Lynne R. Wilkens; Gary E. Goodman; Chu Chen; Jennifer A. Doherty; David C. Christiani; Yongyue Wei; Li Su; Shelley S. Tworoger; Xuehong Zhang; Peter Kraft; David Zaridze; John K. Field; Michael W. Marcus; Michael P.A. Davies; Russell Hyde; Neil E. Caporaso
Chromosome 5p15.33 has been identified as a lung cancer susceptibility locus; however, the underlying causal mechanisms have not been fully elucidated. Previous fine-mapping studies of this locus have relied on imputation or investigated a small number of known, common variants. This study represents a significant advance over previous research by investigating a large number of novel, rare variants, as well as their underlying mechanisms through telomere length. Variants for this fine-mapping study were identified through targeted deep sequencing (average depth of coverage greater than 4000×) of 576 individuals. Subsequently, 4652 SNPs, including 1108 novel SNPs, were genotyped in 5164 cases and 5716 controls of European ancestry. After adjusting for known risk loci, rs2736100 and rs401681, we identified a new, independent lung cancer susceptibility variant in LPCAT1: rs139852726 (OR = 0.46, P = 4.73×10^-9), and three new adenocarcinoma risk variants in TERT: rs61748181 (OR = 0.53, P = 2.64×10^-6), rs112290073 (OR = 1.85, P = 1.27×10^-5), rs138895564 (OR = 2.16, P = 2.06×10^-5; among young cases, OR = 3.77, P = 8.41×10^-4). In addition, we found that rs139852726 (P = 1.44×10^-3) was associated with telomere length in a sample of 922 healthy individuals. The gene-based SKAT-O analysis implicated TERT as the most relevant gene in the 5p15.33 region for adenocarcinoma (P = 7.84×10^-7) and lung cancer (P = 2.37×10^-5) risk. In this largest fine-mapping study to investigate a large number of rare and novel variants within 5p15.33, we identified novel lung and adenocarcinoma susceptibility loci with large effects and provided support for the role of telomere length as the potential underlying mechanism.
PLOS ONE | 2015
Yafang Li; Xiayu Rao; William Mattox; Christopher I. Amos; Bin Liu
Alternative splicing is an important biological process in the generation of multiple functional transcripts from the same genomic sequences. Differential analysis of splice junctions (SJs) and intron retentions (IRs) is helpful in the detection of alternative splicing events. In this study, we conducted differential analysis of SJs and IRs by use of DEXSeq, a Bioconductor package originally designed for differential exon usage analysis in RNA-seq data analysis. We set up an analysis pipeline including mapping of RNA-seq reads, the preparation of count tables of SJs and IRs as the input files, and the differential analysis in DEXSeq. We analyzed the public RNA-seq datasets generated from RNAi experiments on Drosophila melanogaster S2-DRSC cells to deplete RNA-binding proteins (GSE18508). The analysis confirmed previous findings on the alternative splicing of the trol and Ant2 (sesB) genes in the CG8144 (ps)-depletion experiment and identified some new alternative splicing events in other RNAi experiments. We also identified IRs that were confirmed in our SJ analysis. The proposed method used in our study can output the genomic coordinates of differentially used SJs and thus enable sequence motif search. Sequence motif search and gene function annotation analysis helped us infer the underlying mechanism in alternative splicing events. To further evaluate this method, we also applied the method to public RNA-seq data from human breast cancer (GSE45419) and the plant Arabidopsis (SRP008262). In conclusion, our study showed that DEXSeq can be adapted to differential analysis of SJs and IRs, which will facilitate the identification of alternative splicing events and provide insights into the molecular mechanisms of transcription processes and disease development.
Cancer Epidemiology, Biomarkers & Prevention | 2009
Chu-Ling Yu; Yafang Li; D M Freedman; Thomas R. Fears; R Kwok; Gabriel Chodick; Bruce H. Alexander; Michael G. Kimlin; Anne Kricker; Bruce K. Armstrong; Martha S. Linet
Few studies have evaluated the reliability of lifetime sun exposure estimated from inquiring about the number of hours people spent outdoors in a given period on a typical weekday or weekend day (the time-based approach). Some investigations have suggested that women have a particularly difficult task in estimating time outdoors in adulthood due to their family and occupational roles. We hypothesized that people might gain additional memory cues and estimate lifetime hours spent outdoors more reliably if asked about time spent outdoors according to specific activities (an activity-based approach). Using self-administered, mailed questionnaires, test-retest responses to time-based and to activity-based approaches were evaluated in 124 volunteer radiologic technologist participants from the United States: 64 females and 60 males 48 to 80 years of age. Intraclass correlation coefficients (ICC) were used to evaluate the test-retest reliability of average number of hours spent outdoors in the summer estimated for each approach. We tested the differences between the two ICCs, corresponding to each approach, using a t test with the variance of the difference estimated by the jackknife method. During childhood and adolescence, the two approaches gave similar ICCs for average numbers of hours spent outdoors in the summer. By contrast, compared with the time-based approach, the activity-based approach showed significantly higher ICCs during adult ages (0.69 versus 0.43, P = 0.003) and over the lifetime (0.69 versus 0.52, P = 0.05); the higher ICCs for the activity-based questionnaire were primarily derived from the results for females. Research is needed to further improve the activity-based questionnaire approach for long-term sun exposure assessment. (Cancer Epidemiol Biomarkers Prev 2009;18(2):464–71)
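The test-retest comparison above rests on intraclass correlation coefficients. As a rough illustration only, the sketch below computes a one-way random-effects ICC, often written ICC(1,1), from repeated measurements. The abstract does not say which ICC form the authors used, so this choice, the function name `icc_oneway`, and the sample numbers are all assumptions made for the example.

```python
from statistics import mean

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one row per subject, each row holding the
    k repeated measurements for that subject (for example, hours
    outdoors reported at test and at retest).
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # measurements per subject
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]

    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(ratings, row_means)
                    for v in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical test/retest answers (hours per day) for four respondents.
example = [[2.0, 2.5], [4.0, 3.5], [1.0, 1.0], [6.0, 5.0]]
print(round(icc_oneway(example), 3))
```

Values near 1 mean respondents gave nearly the same answer both times; values near 0 or below mean the retest disagreement is as large as the differences between respondents.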
PLOS ONE | 2012
Yafang Li; Jian Huang; Christopher I. Amos
Genetic researchers often collect disease-related quantitative traits in addition to disease status because they are interested in understanding the pathophysiology of disease processes. In genome-wide association (GWA) studies, these quantitative phenotypes may be relevant to disease development and serve as intermediate phenotypes or they could be behavioral or other risk factors that predict disease risk. Statistical tests combining both disease status and quantitative risk factors should be more powerful than case-control studies, as the former incorporates more information about the disease. In this paper, we proposed a modified inverse-variance weighted meta-analysis method to combine disease status and quantitative intermediate phenotype information. The simulation results showed that when an intermediate phenotype was available, the inverse-variance weighted method had more power than did a case-control study of complex diseases, especially in identifying susceptibility loci having minor effects. We further applied this modified meta-analysis to a study of imputed lung cancer genotypes with smoking data in 1154 cases and 1137 matched controls. The most significant SNPs came from the CHRNA3-CHRNA5-CHRNB4 region on chromosome 15q24–25.1, which has been replicated in many other studies. Our results confirm that this CHRNA region is associated with both lung cancer development and smoking behavior. We also detected three significant SNPs (rs1800469, rs1982072, and rs2241714) in the promoter region of the TGFB1 gene on chromosome 19 (p = 1.46×10^-5, 1.18×10^-5, and 6.57×10^-6, respectively). The SNP rs1800469 is reported to be associated with chronic obstructive pulmonary disease and lung cancer in cigarette smokers. The present study is the first GWA study to replicate this result. Signals in the 3q26 region were also identified in the meta-analysis.
We demonstrate that the intermediate phenotype can potentially enhance the power of complex disease association analysis and that the modified meta-analysis method is robust when incorporating an intermediate phenotype or other quantitative risk factor in the analysis.
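For readers unfamiliar with inverse-variance weighting, the sketch below shows the textbook fixed-effect form, in which each estimate is weighted by the reciprocal of its squared standard error. It is not the modified method proposed in the paper, whose details are not given in the abstract; the function name `inverse_variance_meta` and the input numbers are hypothetical.

```python
import math

def inverse_variance_meta(estimates, std_errors):
    """Textbook fixed-effect inverse-variance weighted combination.

    estimates:  per-analysis effect estimates on a common scale
    std_errors: their standard errors
    Returns (pooled estimate, pooled SE, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p

# Hypothetical inputs: one estimate from a disease-status analysis and one
# from a quantitative-trait analysis, assumed to be on a comparable scale.
pooled, pooled_se, z, p = inverse_variance_meta([0.2, 0.4], [0.1, 0.1])
print(pooled, pooled_se, z, p)
```

The pooled standard error is smaller than either input standard error, which is the source of the power gain when a second, informative phenotype is added.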
European Journal of Human Genetics | 2015
Yufei Wang; Yongyue Wei; Valerie Gaborieau; Jianxin Shi; Younghun Han; Maria Timofeeva; Li Su; Yafang Li; T. Eisen; Christopher I. Amos; Maria Teresa Landi; David C. Christiani; James D. McKay; Richard S. Houlston
Recent genome-wide association studies have identified common variants at multiple loci influencing lung cancer risk. To decipher the genetic basis of the association signals at 3q28, 5p15.33, 6p21.33, 9p21 and 12p13.33, we performed a meta-analysis of data from five genome-wide association studies in populations of European ancestry totalling 12 316 lung cancer cases and 16 831 controls using imputation to recover untyped genotypes. For four of the regions, it was possible to refine the association signal, identifying a smaller region of interest likely to harbour the functional variant. Our analysis did not provide evidence that any of the associations at these loci is a consequence of synthetic association rather than linkage disequilibrium with a common risk variant.
Carcinogenesis | 2015
Xuemei Ji; Jiang Gui; Younghun Han; Paul Brennan; Yafang Li; James D. McKay; Neil E. Caporaso; Pier Alberto Bertazzi; Maria Teresa Landi; Christopher I. Amos
The role of haplotypes and the interaction of haplotypes and smoking in lung cancer risk have not been well characterized. We analyzed data from an Italian population-based, case-control study with 1815 lung cancer patients and 1959 healthy controls in discovery, and validated the findings in a replication case-control study with 2983 lung cancer patients and 3553 healthy controls of European ancestry. Sliding-window haplotype analysis within chromosome 15, evaluating 4,722,250 haplotypes, and pair-wise haplotype analysis identified CHRNA5 rs588765-rs16969968 as the most significant haplotype associated with lung cancer risk (omnibus P = 8.35×10^-15 in discovery and 7.26×10^-14 in replication). This haplotype improved the prediction of case status over that provided by the individual SNPs rs16969968 or rs588765 (likelihood ratio test P = 0.006 for rs16969968 and 3.83×10^-14 for rs588765 in discovery, 0.009 for rs16969968 and 4.62×10^-13 for rs588765 in replication, compared with rs588765-rs16969968). Compared with the wild-type homozygous diplotype, the CA/CA homozygote exhibited an approximately 2-fold increased risk for lung cancer (OR = 2.12; 95% CI 1.46-3.07 in discovery, and OR = 2.01; 95% CI 1.51-2.67 in replication). Even among never-smokers, the CA/CA homozygote showed an increased risk of lung cancer with borderline significance in discovery (adjusted OR = 1.75, 95% CI 0.96-3.19) and statistical significance in replication (adjusted OR = 2.10, 95% CI 1.12-3.96), compared with combined genotypes (CG/CG + CG/TG). Accordingly, rs588765-rs16969968 may be a genetic marker for lung cancer risk, even among never-smokers.
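The odds ratios and 95% confidence intervals quoted above can be read more easily with the basic calculation in mind. The sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table. The study reports adjusted estimates from its own models, so this is only an illustration of the quantity, and the counts and the function name `odds_ratio_wald` are made up for the example.

```python
import math

def odds_ratio_wald(exp_cases, exp_controls, unexp_cases, unexp_controls,
                    z_crit=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    exp_*:   counts among carriers of the genotype of interest
    unexp_*: counts among the reference genotype group
    z_crit:  normal critical value (1.96 gives a 95% interval)
    """
    odds_ratio = (exp_cases * unexp_controls) / (exp_controls * unexp_cases)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / exp_cases + 1 / exp_controls
                       + 1 / unexp_cases + 1 / unexp_controls)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_wald(20, 10, 10, 20)
print(or_value, ci_low, ci_high)
```

An interval that excludes 1, as in the replication result for never-smokers, corresponds to statistical significance at the 5% level; an interval that includes 1, as in the discovery result, does not.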