Publication


Featured research published by Kai Yuan.


Scientific Reports | 2015

Differential Natural Selection of Human Zinc Transporter Genes between African and Non-African Populations

Chao Zhang; Jing Li; Lei Tian; Dongsheng Lu; Kai Yuan; Yuan Yuan; Shuhua Xu

Zinc transporters play important roles in all eukaryotes by maintaining the rational zinc concentration in cells. However, the diversity of zinc transporter genes (ZTGs) remains poorly studied. Here, we investigated the genetic diversity of 24 human ZTGs based on the 1000 Genomes data. Some ZTGs show small population differences, such as SLC30A6 with a weighted-average FST (WA-FST = 0.015), while other ZTGs exhibit considerably large population differences, such as SLC30A9 (WA-FST = 0.284). Overall, ZTGs harbor many more highly population-differentiated variants compared with random genes. Intriguingly, we found that SLC30A9 was under natural selection in both East Asians (EAS) and Africans (AFR) but in different directions. Notably, a non-synonymous variant (rs1047626) in SLC30A9 is almost fixed with 96.4% A in EAS and 92% G in AFR, respectively. Consequently, there are two different functional haplotypes exhibiting dominant abundance in AFR and EAS, respectively. Furthermore, a strong correlation was observed between the haplotype frequencies of SLC30A9 and distributions of zinc contents in soils or crops. We speculate that the genetic differentiation of ZTGs could directly contribute to population heterogeneity in zinc transporting capabilities and local adaptations of human populations in regard to the local zinc state or diets, which have both evolutionary and medical implications.
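As an illustration of the weighted-average FST mentioned in this abstract, the sketch below combines per-site FST values as a ratio of sums rather than a mean of ratios. It is a minimal sketch under stated assumptions: it uses a simple heterozygosity-based FST for two equally weighted populations, which is not necessarily the estimator used in the paper, and the function names are hypothetical.

```python
def fst_components(p1, p2):
    """Return (numerator, denominator) of a heterozygosity-based FST
    for one biallelic site, given allele frequencies in two populations.
    Assumption: both populations are weighted equally."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return h_total - h_within, h_total


def weighted_average_fst(freq_pairs):
    """Ratio-of-sums FST over many sites: sum(numerators) / sum(denominators)."""
    num = 0.0
    den = 0.0
    for p1, p2 in freq_pairs:
        n, d = fst_components(p1, p2)
        num += n
        den += d
    return num / den if den > 0 else 0.0


# Frequencies of the A allele at rs1047626 quoted in the abstract:
# 96.4% in EAS, and 8% in AFR (since G is at 92%).
single_site = weighted_average_fst([(0.964, 0.08)])
```

Summing numerators and denominators before dividing keeps low-diversity sites from dominating the gene-level value, which is the usual motivation for a weighted average.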


Molecular Biology and Evolution | 2017

Genetic History of Xinjiang’s Uyghurs Suggests Bronze Age Multiple-Way Contacts in Eurasia

Qidi Feng; Yan Lu; Xumin Ni; Kai Yuan; Yajun Yang; Xiong Yang; Chang Liu; Haiyi Lou; Zhilin Ning; Yuchen Wang; Dongsheng Lu; Chao Zhang; Ying Zhou; Meng Shi; Lei Tian; Xiaoji Wang; Xi Zhang; Jing Li; Asifullah Khan; Yaqun Guan; Kun Tang; Sijia Wang; Shuhua Xu

The Uyghur people residing in Xinjiang, a territory located in the far west of China and crossed by the Silk Road, are a key ethnic group for understanding the history of human dispersion in Eurasia. Here we assessed the genetic structure and ancestry of 951 Xinjiang’s Uyghurs (XJU) representing 14 geographical subpopulations. We observed a southwest and northeast differentiation within XJU, which was likely shaped jointly by the Tianshan Mountains, which traverse from east to west as a natural barrier, and gene flow from both east and west directions. In XJU, we identified four major ancestral components that were potentially derived from two earlier admixed groups: one from the West, harboring European (25-37%) and South Asian ancestries (12-20%), and the other from the East, with Siberian (15-17%) and East Asian (29-47%) ancestries. By using a newly developed method, MultiWaver, the complex admixture history of XJU was modeled as a two-wave admixture. An ancient wave was dated back to ∼3,750 years ago (ya), which is much earlier than that estimated by previous studies, but fits within the range of dating of mummies that exhibited European features that were discovered in the Tarim basin, which is situated in southern Xinjiang (4,000-2,000 ya); a more recent wave occurred around 750 ya, which is in agreement with the estimate from a recent study using other methods. We unveiled a more complex scenario of ancestral origins and admixture history in XJU than previously reported, which further suggests Bronze Age massive migrations in Eurasia and East-West contacts across the Silk Road.


Heredity | 2018

Inference of multiple-wave admixtures by length distribution of ancestral tracks

Xumin Ni; Kai Yuan; Xiong Yang; Qidi Feng; Wei Guo; Zhi-Ming Ma; Shuhua Xu

The ancestral tracks in admixed genomes are valuable for population history inference. While a few methods have been developed to infer admixture history based on ancestral tracks, these methods suffer the same flaw: only population admixture history under some specific models can be inferred. In addition, the inference of history might be biased or even unreliable if the specific model deviates from the real situation. To address this problem, we first proposed a general discrete admixture model to describe the admixture history with multiple ancestral populations and multiple-wave admixtures. We next deduced the length distribution of ancestral tracks under the general discrete admixture model. We further developed a new method, MultiWaver, to explore multiple-wave admixture histories. Our method could automatically determine an optimal admixture model based on the length distribution of ancestral tracks, and estimate the corresponding parameters under this optimal model. Specifically, we used a likelihood ratio test (LRT) to determine the number of admixture waves, and implemented an expectation–maximization (EM) algorithm to estimate parameters. We used simulation studies to validate the reliability and effectiveness of our method. Finally, good performance was observed when our method was applied to real data sets of African Americans and Mexicans, and new insights were gained into the admixture history of Uyghurs and Hazaras.
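The EM-plus-LRT idea in this abstract can be illustrated with a generic mixture of exponential distributions, a common simplification for ancestral track lengths. The sketch below is not MultiWaver's actual model or code: the mixture form, the function names, and the starting values are all assumptions made for illustration.

```python
import math
import random


def log_likelihood(lengths, rates, weights):
    """Log-likelihood of track lengths under a mixture of exponentials."""
    return sum(
        math.log(sum(w * r * math.exp(-r * x) for w, r in zip(weights, rates)))
        for x in lengths
    )


def em_exponential_mixture(lengths, rates, weights, n_iter=200):
    """Fit mixture weights and decay rates by EM, starting from the given
    initial values. Inputs are copied and not modified."""
    rates, weights = list(rates), list(weights)
    for _ in range(n_iter):
        # E-step: posterior probability that each track belongs to each component.
        resp = []
        for x in lengths:
            dens = [w * r * math.exp(-r * x) for w, r in zip(weights, rates)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: closed-form updates for exponential components.
        for k in range(len(rates)):
            nk = sum(row[k] for row in resp)
            weights[k] = nk / len(lengths)
            rates[k] = nk / sum(row[k] * x for row, x in zip(resp, lengths))
    return rates, weights


def lrt_statistic(ll_null, ll_alt):
    """Likelihood ratio statistic comparing a simpler and a richer model."""
    return 2.0 * (ll_alt - ll_null)


# Hypothetical usage on simulated track lengths (in Morgans) from two waves.
rng = random.Random(1)
tracks = [rng.expovariate(5) for _ in range(500)] + \
         [rng.expovariate(50) for _ in range(500)]
fitted_rates, fitted_weights = em_exponential_mixture(tracks, [1.0, 20.0], [0.5, 0.5])
```

A one-component fit has the closed-form rate `len(tracks) / sum(tracks)`, so `lrt_statistic` can compare it with the two-component fit. The null distribution of this statistic for mixture models is non-standard, so choosing a significance threshold needs care.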


bioRxiv | 2016

AdmixSim: A Forward-Time Simulator for Various and Complex Scenarios of Population Admixture

Xiong Yang; Xumin Ni; Ying Zhou; Wei Guo; Kai Yuan; Shuhua Xu

Background: Population admixture has been a common phenomenon in humans, animals, and plants, and plays a very important role in shaping individual genetic architecture and population genetic diversity. Inference of population admixture, however, is challenging and typically relies on in silico simulation. We are aware of the lack of a computer tool for such a purpose; in particular, no simulator is available for generating data under various and complex admixture scenarios. Results: Here we developed a forward-time simulator (AdmixSim) under the standard Wright-Fisher model, which can simulate admixed populations with: 1) multiple ancestral populations; 2) multiple waves of admixture events; 3) fluctuating population size; and 4) fluctuating admixture proportions. Analysis of the data simulated by AdmixSim shows that our simulator can quickly and accurately generate data that resemble real data. We included in AdmixSim all possible parameters, which allows users to modify and simulate any kind of admixture scenario easily, so that it is very flexible. AdmixSim records recombination break points and traces each chromosomal segment from different ancestral populations, with which users can easily do further analysis and comparative studies with empirical data. Conclusions: AdmixSim is expected to facilitate the study of population admixture by providing a simulation framework with flexible implementation of various admixture models and parameters.
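To make the forward-time idea concrete, here is a minimal sketch of a Wright-Fisher-style simulation that tracks ancestry labels along a haplotype. It is not AdmixSim's code or interface: it assumes a single admixture wave, a constant population size, and a uniform recombination probability between adjacent markers, and all names and parameters are hypothetical.

```python
import random


def recombine(hap_a, hap_b, rate, rng):
    """Build a child haplotype by copying from hap_a and switching to the
    other parent with probability `rate` between adjacent markers."""
    current, other = hap_a, hap_b
    child = []
    for i in range(len(hap_a)):
        if i > 0 and rng.random() < rate:
            current, other = other, current
        child.append(current[i])
    return child


def simulate_admixture(n_haps, n_markers, proportions, generations, rate, seed=0):
    """Simulate ancestry labels per marker after one admixture wave.

    Generation 0: each haplotype comes entirely from one ancestral
    population, drawn according to `proportions`. Each later generation
    resamples parents at random (constant population size)."""
    rng = random.Random(seed)
    labels = list(range(len(proportions)))
    population = [
        [rng.choices(labels, weights=proportions)[0]] * n_markers
        for _ in range(n_haps)
    ]
    for _ in range(generations):
        population = [
            recombine(rng.choice(population), rng.choice(population), rate, rng)
            for _ in range(n_haps)
        ]
    return population


# Hypothetical usage: 100 haplotypes, 200 markers, 70/30 admixture, 10 generations.
haplotypes = simulate_admixture(100, 200, [0.7, 0.3], 10, rate=0.01)
```

Runs of identical labels along each haplotype correspond to ancestral segments, and their boundaries are the recombination break points that a simulator of this kind would record.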


Heredity | 2017

Inference of multiple-wave population admixture by modeling decay of linkage disequilibrium with polynomial functions

Ying Zhou; Kai Yuan; Yaoliang Yu; Xumin Ni; Pengtao Xie; Eric P. Xing; Shuhua Xu

To infer the histories of population admixture, one important challenge with methods based on the admixture linkage disequilibrium (ALD) is to remove the effect of source LD (SLD), which is directly inherited from source populations. In previous methods, only the decay curve of weighted LD between pairs of sites whose genetic distances were larger than a certain starting distance was fitted by single or multiple exponential functions, for the inference of recent single- or multiple-wave admixture. However, the effect of SLD has not been well defined and no tool has been developed to estimate the effect of SLD on weighted LD decay. In this study, we defined the SLD in the formularized weighted LD statistic under the two-way admixture model and proposed a polynomial spectrum (p-spectrum) to study the weighted SLD and weighted LD. We also found that reference populations could be used to reduce the SLD in weighted LD statistics. We further developed a method, iMAAPs, to infer multiple-wave admixture by fitting ALD using a p-spectrum. We evaluated the performance of iMAAPs under various admixture models in simulated data and applied iMAAPs to the analysis of genome-wide single nucleotide polymorphism data from the Human Genome Diversity Project and the HapMap Project. We showed that iMAAPs is a considerable improvement over other current methods and further facilitates the inference of histories of complex population admixtures.


Scientific Reports | 2016

Genetic diversity and natural selection footprints of the glycine amidinotransferase gene in various human populations

Asifullah Khan; Lei Tian; Chao Zhang; Kai Yuan; Shuhua Xu

The glycine amidinotransferase gene (GATM) plays a vital role in energy metabolism in muscle tissues and is associated with multiple clinically important phenotypes. However, the genetic diversity of the GATM gene remains poorly understood within and between human populations. Here we analyzed the 1000 Genomes Project data through population genetics approaches and observed significant genetic diversity across the GATM gene among various continental human populations. We observed considerable variations in GATM allele frequencies and haplotype composition among different populations. Substantial genetic differences were observed between East Asian and European populations (FST = 0.56). In addition, the frequency of a distinct major GATM haplotype in these groups was congruent with population-wide diversity at this locus. Furthermore, we identified GATM as the top differentiated gene compared to the other statin drug response-associated genes. Composite multiple analyses identified signatures of positive selection at the GATM locus, which was estimated to have occurred around 850 generations ago in European populations. As GATM catalyzes the key step of creatine biosynthesis involved in energy metabolism, we speculate that the European prehistorical demographic transition from hunter-gatherer to farming cultures was the driving force of selection that fulfilled creatine-based metabolic requirements of the populations.


Quantitative Biology | 2017

Models, methods and tools for ancestry inference and admixture analysis

Kai Yuan; Ying Zhou; Xumin Ni; Yuchen Wang; Chang Liu; Shuhua Xu

Background: Genetic admixture refers to the process or consequence of interbreeding between two or more previously isolated populations within a species. Compared to many other evolutionary driving forces such as mutations, genetic drift, and natural selection, genetic admixture is a quick mechanism for shaping population genomic diversity. In particular, admixture results in “recombination” of genetic variants that have been fixed in different populations, which has many evolutionary and medical implications. Results: However, it is challenging to accurately reconstruct population admixture history and to understand population admixture dynamics. In this review, we provide an overview of models, methods, and tools for ancestry inference and admixture analysis. Conclusions: Many methods and tools used for admixture analysis were originally developed to analyze human data, but these methods can also be directly applied and/or slightly modified to study non-human species as well.


Human Molecular Genetics | 2018

Genome-wide comparison of allele-specific gene expression between African and European populations

Lei Tian; Asifullah Khan; Zhilin Ning; Kai Yuan; Chao Zhang; Haiyi Lou; Yuan Yuan; Shuhua Xu

Transcriptomic diversity across human populations reflects differential regulatory mechanisms. Allelic-imbalanced gene expression is a genetic regulatory mechanism that contributes to human phenotypic variation. To systematically investigate genome-wide allele-specific expression (ASE), we analyzed RNA-Seq data from European and African populations provided by the Geuvadis project. We identified 11 sites in 8 genes showing ASE in both Europeans and Africans, and 9 sites in 9 genes showing population-specific ASE, including both novel and known ASE signals. Notably, the top signal of differentiated ASE between inter-continental populations was observed in DNAJC15, of which the derived allele of rs12015, a single nucleotide polymorphism (SNP), showed significantly higher expression than did the ancestral allele specifically in European individuals. We identified a unique haplotype of DNAJC15, where a few SNPs highly differentiated between European and African populations were strongly linked to sites with high ASE. Among these, SNP rs17553284 affected the binding of several transcription factors as well as the genotype-dependent expression of DNAJC15. Therefore, we speculated that rs17553284 could be a regulatory causal variant that mediates the ASE of rs12015. We found several variations in ASE between intercontinental populations. The highly differentiated ASE genes identified here may be implicated in the phenotypic variations among populations that are both evolutionarily and medically important.


European Journal of Human Genetics | 2018

MultiWaver 2.0: modeling discrete and continuous gene flow to reconstruct complex population admixtures

Xumin Ni; Kai Yuan; Chang Liu; Qidi Feng; Lei Tian; Zhi-Ming Ma; Shuhua Xu

Our goal in developing the MultiWaver software series was to be able to infer population admixture history under various complex scenarios. The earlier version of MultiWaver considered only discrete admixture models. Here, we report a newly developed version, MultiWaver 2.0, that implements a more flexible framework and is capable of inferring multiple-wave admixture histories under both discrete and continuous admixture models. MultiWaver 2.0 can automatically select an optimal admixture model based on the length distribution of ancestral tracks of chromosomes, and the program can estimate the corresponding parameters under the selected model. Specifically, for discrete admixture models, we used a likelihood ratio test (LRT) to determine the optimal discrete model and an expectation–maximization algorithm to estimate the parameters. In addition, according to the principles of the Bayesian Information Criterion (BIC), we compared the optimal discrete model with several continuous admixture models. In MultiWaver 2.0, we also applied a bootstrapping technique to provide levels of support for the chosen model and the confidence interval (CI) of the estimations of admixture time. Simulation studies validated the reliability and effectiveness of our method. Finally, the program performed well when applied to real datasets of typical admixed populations, such as African Americans, Uyghurs, and Hazaras.


bioRxiv | 2015

Inference of multiple-wave population admixture by modeling decay of linkage disequilibrium with multiple exponential functions

Ying Zhou; Kai Yuan; Yaoliang Yu; Xumin Ni; Pengtao Xie; Eric P. Xing; Shuhua Xu

Admixture-introduced linkage disequilibrium (LD) has recently been introduced into the inference of the histories of complex admixtures. However, the influence of ancestral source populations on the LD pattern in admixed populations is not properly taken into consideration by currently available methods, which affects the estimation of several gene flow parameters from empirical data. We first illustrated the dynamic changes of LD in admixed populations and mathematically formulated the LD under a generalized admixture model with finite population size. We next developed a new method, MALDmef, by fitting LD with multiple exponential functions for inferring and dating multiple-wave admixtures. MALDmef takes into account the effects of source populations which substantially affect modeling LD in admixed populations, which renders it capable of efficiently detecting and dating multiple-wave admixture events. The performance of MALDmef was evaluated by simulation and it was shown to be more accurate than MALDER, a state-of-the-art method that was recently developed for similar purposes, under various admixture models. We further applied MALDmef to analyzing genome-wide data from the Human Genome Diversity Project (HGDP) and the HapMap Project. Interestingly, we were able to identify more than one admixture event in several populations, which have yet to be reported. For example, two major admixture events were identified in the Xinjiang Uyghur, occurring around 27–30 generations ago and 182–195 generations ago, respectively. In an African population (MKK), three recent major admixtures occurring 13–16, 50–67, and 107–139 generations ago were detected. Our method is a considerable improvement over other current methods and further facilitates the inference of the histories of complex population admixtures.
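The dating principle behind exponential LD-decay fitting can be sketched for the simplest case of a single admixture wave, where weighted LD is assumed to decay approximately as A·exp(−t·d) with genetic distance d (in Morgans) and t generations since admixture. The code below is only that simplified single-exponential fit by log-linear least squares; it is not MALDmef, which fits multiple exponentials and accounts for source-population LD, and the function name is hypothetical.

```python
import math


def fit_single_wave(distances, weighted_ld):
    """Fit weighted_ld ~ A * exp(-t * d) by ordinary least squares on
    ln(weighted_ld) versus d. Returns (A, t). Non-positive LD values
    are skipped because their logarithm is undefined."""
    points = [(d, math.log(z)) for d, z in zip(distances, weighted_ld) if z > 0]
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in points)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    return math.exp(intercept), -slope


# Hypothetical noise-free curve: amplitude 0.02, admixture 30 generations ago.
dist = [0.005 * i for i in range(1, 41)]
curve = [0.02 * math.exp(-30 * d) for d in dist]
amplitude, generations = fit_single_wave(dist, curve)
```

With several waves the curve becomes a sum of exponentials with different decay rates, which a single log-linear fit cannot separate; that is the case the multiple-exponential approach in the abstract is designed for.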

Collaboration


Dive into Kai Yuan's collaborations.

Top Co-Authors

Shuhua Xu

CAS-MPG Partner Institute for Computational Biology

Xumin Ni

Beijing Jiaotong University

Ying Zhou

CAS-MPG Partner Institute for Computational Biology

Chao Zhang

CAS-MPG Partner Institute for Computational Biology

Xiong Yang

CAS-MPG Partner Institute for Computational Biology

Lei Tian

CAS-MPG Partner Institute for Computational Biology

Qidi Feng

Chinese Academy of Sciences

Haiyi Lou

CAS-MPG Partner Institute for Computational Biology

Wei Guo

Chinese Academy of Sciences

Yuchen Wang

CAS-MPG Partner Institute for Computational Biology
