Publication


Featured research published by Haiyi Lou.


American Journal of Human Genetics | 2009

Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies

Shuhua Xu; Xianyong Yin; Shilin Li; Wenfei Jin; Haiyi Lou; Ling Yang; Xiaohong Gong; Hongyan Wang; Yiping Shen; Xuedong Pan; Yungang He; Yajun Yang; Yi Wang; Wenqing Fu; Yu An; Jiucun Wang; Jingze Tan; Ji Qian; Xiaoli Chen; Xin Zhang; Yangfei Sun; Xuejun Zhang; Bai-Lin Wu; Li Jin

To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at approximately 160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (F_ST = 0.0002–0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (F_ST > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10⁻¹⁰¹). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.


PLOS ONE | 2011

A Map of Copy Number Variations in Chinese Populations

Haiyi Lou; Shilin Li; Yajun Yang; Longli Kang; Xin Zhang; Wenfei Jin; Bai-Lin Wu; Li Jin; Shuhua Xu

It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV studies of Chinese minorities, which harbor the majority of the genetic diversity of Chinese populations, have been underrepresented relative to the efforts made in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in the distributions of CNV regions between populations and substantial population structures were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population could not be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, which contains the CCL3L1 gene and was reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. Besides, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource in further evolutionary and medical studies.


Human Molecular Genetics | 2012

A systematic characterization of genes underlying both complex and Mendelian diseases

Wenfei Jin; Pengfei Qin; Haiyi Lou; Li Jin; Shuhua Xu

Traditionally, genetic disorders have been classified as either Mendelian diseases or complex diseases. This nosology has greatly benefited genetic counseling and the development of gene mapping strategies. However, based on two well-established databases, we identified that 54% (524 of 968) of the Mendelian disease genes were also involved in complex diseases, and genes of this kind have not been systematically analyzed. Here, we classified human genes into five categories: Mendelian and complex disease (MC) genes, Mendelian but not complex disease (MNC) genes, complex but not Mendelian disease (CNM) genes, essential genes and OTHER genes. First, we found that MC genes were associated with more diseases and phenotypes, and were involved in more complex protein-protein interaction networks than MNC or CNM genes on average. Secondly, MC genes encoded the longest proteins and had the highest transcript count among all gene categories. In particular, the tissue specificity of MC genes was much higher than that of any other gene category (P < 7.5 × 10⁻⁵), although their expression level was similar to that of essential genes. Thirdly, several lines of evidence supported that MC genes have been subjected to both purifying and positive selection. Interestingly, functions of some human disease genes might be different from those of their orthologous genes in non-primate mammals since they were even less conserved than OTHER genes. The significant over-representation of copy number variations (CNVs) in CNM genes suggested the important roles of CNVs in complex diseases. In brief, our study not only revealed the characteristics of MC genes, but also provided new insights into the other four gene categories.


American Journal of Human Genetics | 2015

A 3.4-kb Copy-Number Deletion near EPAS1 Is Significantly Enriched in High-Altitude Tibetans but Absent from the Denisovan Sequence

Haiyi Lou; Yan Lu; Dongsheng Lu; Ruiqing Fu; Xiaoji Wang; Qidi Feng; Sijie Wu; Yajun Yang; Shilin Li; Longli Kang; Yaqun Guan; Boon-Peng Hoh; Yeun-Jun Chung; Li Jin; Bing Su; Shuhua Xu

Tibetan high-altitude adaptation (HAA) has been studied extensively, and many candidate genes have been reported. Subsequent efforts targeting HAA functional variants, however, have not been that successful (e.g., no functional variant has been suggested for the top candidate HAA gene, EPAS1). With WinXPCNVer, a method developed in this study, we detected in microarray data a Tibetan-enriched deletion (TED) carried by 90% of Tibetans; 50% were homozygous for the deletion, whereas only 3% carried the TED and 0% carried the homozygous deletion in 2,792 worldwide samples (p < 10⁻¹⁵). We employed long PCR and Sanger sequencing technologies to determine the exact copy number and breakpoints of the TED in 70 additional Tibetan and 182 diverse samples. The TED had identical boundaries (chr2: 46,694,276-46,697,683; hg19) and was 80 kb downstream of EPAS1. Notably, the TED was in strong linkage disequilibrium (LD; r² = 0.8) with EPAS1 variants associated with reduced blood concentrations of hemoglobin. It was also in complete LD with the 5-SNP motif, which was suspected to be introgressed from Denisovans, but the deletion itself was absent from the Denisovan sequence. Correspondingly, we detected that footprints of positive selection for the TED occurred 12,803 (95% confidence interval = 12,075-14,725) years ago. We further whole-genome deep sequenced (>60×) seven Tibetans and verified the TED but failed to identify any other copy-number variations with comparable patterns, giving this TED top priority for further study. We speculate that the specific patterns of the TED resulted from its own functionality in HAA of Tibetans or LD with a functional variant of EPAS1.


European Journal of Human Genetics | 2014

A panel of ancestry informative markers to estimate and correct potential effects of population stratification in Han Chinese

Pengfei Qin; Zhiqiang Li; Wenfei Jin; Dongsheng Lu; Haiyi Lou; Jiawei Shen; Li Jin; Yongyong Shi; Shuhua Xu

Population stratification acts as a confounding factor in genetic association studies and may lead to false-positive or false-negative results. Previous studies have analyzed the genetic substructures in Han Chinese population, the largest ethnic group in the world comprising ∼20% of the global human population. In this study, we examined 5540 Han Chinese individuals with about 1 million single-nucleotide polymorphisms (SNPs) and screened a panel of ancestry informative markers (AIMs) to facilitate the discerning and controlling of population structure in future association studies on Han Chinese. Based on genome-wide data, we first confirmed our previous observation of the north–south differentiation in Han Chinese population. Second, we developed a panel of 150 validated SNP AIMs to determine the northern or southern origin of each Han Chinese individual. We further evaluated the performance of our AIMs panel in association studies in simulation analysis. Our results showed that this AIMs panel had sufficient power to discern and control population stratification in Han Chinese, which could significantly reduce false-positive rates in both genome-wide association studies (GWAS) and candidate gene association studies (CGAS). We suggest this AIMs panel be genotyped and used to control and correct population stratification in the study design or data analysis of future association studies, especially in CGAS which is the most popular approach to validate previous reports on genetic associations of diseases in post-GWAS era.


Scientific Reports | 2015

Quantitating and Dating Recent Gene Flow between European and East Asian Populations

Pengfei Qin; Ying Zhou; Haiyi Lou; Dongsheng Lu; Xiong Yang; Yuchen Wang; Li Jin; Yeun-Jun Chung; Shuhua Xu

Historical records indicate that extensive cultural, commercial and technological interaction occurred between European and Asian populations. What have been the biological consequences of these contacts in terms of gene flow? We systematically estimated gene flow between Eurasian groups using genome-wide polymorphisms from 34 populations representing Europeans, East Asians, and Central/South Asians. We identified recent gene flow between Europeans and Asians in most populations we studied, including East Asians and Northwestern Europeans, which are normally considered to be non-admixed populations. In addition we quantitatively estimated the extent of this gene flow using two statistical approaches, and dated admixture events based on admixture linkage disequilibrium. Our results indicate that most genetic admixtures occurred between 2,400 and 310 years ago and show the admixture proportions to be highly correlated with geographic locations, with the highest admixture proportions observed in Central Asia and the lowest in East Asia and Northwestern Europe. Interestingly, we observed a North-to-South decline of European gene flow in East Asians, suggesting a northern path of European gene flow diffusing into East Asian populations. Our findings contribute to an improved understanding of the history of human migration and the evolutionary mechanisms that have shaped the genetic structure of populations in Eurasia.


Journal of Medical Genetics | 2014

Genetic architectures of ADME genes in five Eurasian admixed populations and implications for drug safety and efficacy

Jing Li; Haiyi Lou; Xiong Yang; Dongsheng Lu; Shilin Li; Li Jin; Xinwei Pan; Wenjun Yang; Manshu Song; Dolikun Mamatyusupu; Shuhua Xu

Background: Drug absorption, distribution, metabolism and excretion (ADME) contribute to the high heterogeneity of drug responses in humans. However, the same standard for drug dosage has been applied to all populations in China although genetic differences in ADME genes are expected to exist in different ethnic groups. In particular, the ethnic minorities in northwestern China with substantial ancestry contribution from Western Eurasian people might violate such a single unified standard.
Methods: In this study, we used Affymetrix SNP Array 6.0 to investigate the genetic diversity of 282 ADME genes in five northwestern Chinese minority populations, namely, Tajik, Uyghur, Kazakh, Kirgiz and Hui, and attempted to identify the highly differential SNPs and haplotypes and further explore their clinical implications.
Results: We found that genetic diversity of many ADME genes in the five minority groups was substantially different from those in the Han Chinese population. For instance, we identified 10 functional SNPs with substantial allele frequency differences, 14 functional SNPs with highly different heterozygous states and eight genes with significant haplotype differences between these admixed minority populations and the Han Chinese population. We further confirmed that these differences mainly resulted from the European gene flow, that is, this gene flow increased the genetic diversity in the admixed populations.
Conclusions: These results suggest that the ADME genes vary substantially among different Chinese ethnic groups. We suggest it could cause potential clinical risk if the same dosage of substances (eg, antitumour drugs) is used without considering population stratification.


European Journal of Human Genetics | 2015

Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups

Haiyi Lou; Shilin Li; Wenfei Jin; Ruiqing Fu; Dongsheng Lu; Xinwei Pan; Huaigu Zhou; Yuan Ping; Li Jin; Shuhua Xu

Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies.


Molecular Biology and Evolution | 2017

Genetic History of Xinjiang’s Uyghurs Suggests Bronze Age Multiple-Way Contacts in Eurasia

Qidi Feng; Yan Lu; Xumin Ni; Kai Yuan; Yajun Yang; Xiong Yang; Chang Liu; Haiyi Lou; Zhilin Ning; Yuchen Wang; Dongsheng Lu; Chao Zhang; Ying Zhou; Meng Shi; Lei Tian; Xiaoji Wang; Xi Zhang; Jing Li; Asifullah Khan; Yaqun Guan; Kun Tang; Sijia Wang; Shuhua Xu

The Uyghur people residing in Xinjiang, a territory located in the far west of China and crossed by the Silk Road, are a key ethnic group for understanding the history of human dispersion in Eurasia. Here we assessed the genetic structure and ancestry of 951 Xinjiang's Uyghurs (XJU) representing 14 geographical subpopulations. We observed a southwest and northeast differentiation within XJU, which was likely shaped jointly by the Tianshan Mountains, which traverse from east to west as a natural barrier, and gene flow from both east and west directions. In XJU, we identified four major ancestral components that were potentially derived from two earlier admixed groups: one from the West, harboring European (25-37%) and South Asian ancestries (12-20%), and the other from the East, with Siberian (15-17%) and East Asian (29-47%) ancestries. By using a newly developed method, MultiWaver, the complex admixture history of XJU was modeled as a two-wave admixture. An ancient wave was dated back to ∼3,750 years ago (ya), which is much earlier than that estimated by previous studies, but fits within the range of dating of mummies that exhibited European features that were discovered in the Tarim basin, which is situated in southern Xinjiang (4,000-2,000 ya); a more recent wave occurred around 750 ya, which is in agreement with the estimate from a recent study using other methods. We unveiled a more complex scenario of ancestral origins and admixture history in XJU than previously reported, which further suggests Bronze Age massive migrations in Eurasia and East-West contacts across the Silk Road.


Journal of Medical Genetics | 2013

Identification of well-differentiated gene expressions between Han Chinese and Japanese using genome-wide microarray data analysis

Yuan Yuan; Ling Yang; Meng Shi; Dongsheng Lu; Haiyi Lou; Yi-Ping Phoebe Chen; Li Jin; Shuhua Xu

Background: Investigating variations in gene expression, which can be quantitatively measured on a genome-wide scale, is essential to understand and interpret phenotypic differences among human populations. Several previous studies have examined and compared variations in gene expression between continental populations. However, differences in gene expression variation between closely related populations have not been studied yet.
Method: We performed a genome-wide analysis and systematically compared expression profiles of Han Chinese with those of the Japanese population.
Results: We identified 768 genes (4.4% of 17 354 expressed genes) which were expressed differentially between the two populations, with 165 showing highly differential expression and enriched in genes involved in the spliceosome pathway, mRNA processing, mRNA metabolic process, RNA processing, RNA splicing and mitochondrial transport. We further identified cis- and trans-variants that regulated these differential gene expressions, and found that cis-variants shared in the two populations were centred within a range of 200 kb around transcription start site. Our analysis indicated that genetic differences in the cis-associated genes between the two populations could explain 7–43% of the identified expression divergence.
Conclusions: In summary, despite considerable heterogeneity, gene expression profiles between Han Chinese and Japanese did show an overall difference, with well-differentiated expressions regulated by genetic variants which have been reported associated with hematological and biochemical traits in Japanese populations. Our results supported that gene expression is regulated by genetic variants and there is a genetic basis for the phenotypic differences between Han Chinese and Japanese populations.

Collaboration


Dive into Haiyi Lou's collaborations.

Top Co-Authors

Shuhua Xu

CAS-MPG Partner Institute for Computational Biology

Dongsheng Lu

CAS-MPG Partner Institute for Computational Biology

Yan Lu

CAS-MPG Partner Institute for Computational Biology

Yaqun Guan

Xinjiang Medical University

Chao Zhang

CAS-MPG Partner Institute for Computational Biology

Qidi Feng

Chinese Academy of Sciences

Ruiqing Fu

Chinese Academy of Sciences
