Hongzhi Cao
University of Copenhagen
Publication
Featured research published by Hongzhi Cao.
Science | 2010
Xin Yi; Yu Liang; Emilia Huerta-Sanchez; Xin Jin; Zha Xi Ping Cuo; John E. Pool; Xun Xu; Hui Jiang; Nicolas Vinckenbosch; Thorfinn Sand Korneliussen; Hancheng Zheng; Tao Liu; Weiming He; Kui Li; Ruibang Luo; Xifang Nie; Honglong Wu; Meiru Zhao; Hongzhi Cao; Jing Zou; Ying Shan; Shuzheng Li; Qi Yang; Asan; Peixiang Ni; Geng Tian; Junming Xu; Xiao Liu; Tao Jiang; Renhua Wu
No Genetic Vertigo. Peoples living in high altitudes have adapted to their situation (see the Perspective by Storz). To identify gene regions that might have contributed to high-altitude adaptation in Tibetans, Simonson et al. (p. 72, published online 13 May) conducted a genome scan of nucleotide polymorphism comparing Tibetans, Han Chinese, and Japanese, while Yi et al. (p. 75) performed comparable analyses on the coding regions of all genes—their exomes. Both studies converged on a gene, endothelial Per-Arnt-Sim domain protein 1 (also known as hypoxia-inducible factor 2α), which has been linked to the regulation of red blood cell production. Other genes identified that were potentially under selection included adult and fetal hemoglobin and two functional candidate loci that were correlated with low hemoglobin concentration in Tibetans. Future detailed functional studies will now be required to examine the mechanistic underpinnings of physiological adaptation to high altitudes. Sequencing coding regions identified genetic changes that were likely involved in adaptation to hypoxia. Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18× per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia. One single-nucleotide polymorphism (SNP) at EPAS1 shows a 78% frequency difference between Tibetan and Han samples, representing the fastest allele frequency change observed at any human gene to date. This SNP’s association with erythrocyte abundance supports the role of EPAS1 in adaptation to hypoxia. Thus, a population genomic survey has revealed a functionally important locus in genetic adaptation to high altitude.
Nature Genetics | 2014
Laurent C. Francioli; Androniki Menelaou; Sara L. Pulit; Freerk van Dijk; Pier Francesco Palamara; Clara C. Elbers; Pieter B. T. Neerincx; Kai Ye; Victor Guryev; Wigard P. Kloosterman; Patrick Deelen; Abdel Abdellaoui; Elisabeth M. van Leeuwen; Mannis van Oven; Martijn Vermaat; Mingkun Li; Jeroen F. J. Laros; Lennart C. Karssen; Alexandros Kanterakis; Najaf Amin; Jouke-Jan Hottenga; Eric-Wubbo Lameijer; Mathijs Kattenberg; Martijn Dijkstra; Heorhiy Byelas; Jessica van Setten; Barbera D. C. van Schaik; Jan Bot; Isaac J. Nijman; Ivo Renkens
Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring families and constructed a haplotype map of 20.4 million single-nucleotide variants and 1.2 million insertions and deletions. The intermediate coverage (∼13×) and trio design enabled extensive characterization of structural variation, including midsize events (30–500 bp) previously poorly catalogued and de novo mutations. We demonstrate that the quality of the haplotypes boosts imputation accuracy in independent samples, especially for lower frequency alleles. Population genetic analyses demonstrate fine-scale structure across the country and support multiple ancient migrations, consistent with historical changes in sea level and flooding. The GoNL Project illustrates how single-population whole-genome sequencing can provide detailed characterization of genetic variation and may guide the design of future population studies.
PLOS Biology | 2010
Yingrui Li; Jingde Zhu; Geng Tian; Ning Li; Qibin Li; Mingzhi Ye; Hancheng Zheng; Jian-Xin Yu; Honglong Wu; Jihua Sun; Hongyu Zhang; Quan Chen; Ruibang Luo; Minfeng Chen; Yinghua He; Xin Jin; Qinghui Zhang; Chang Yu; Guangyu Zhou; Jinfeng Sun; Yebo Huang; Huisong Zheng; Hongzhi Cao; Xiaoyu Zhou; Shicheng Guo; Xueda Hu; Xin Li; Karsten Kristiansen; Lars Bolund; Jiujin Xu
Analysis across the genome of patterns of DNA methylation reveals a rich landscape of allele-specific epigenetic modification and consequent effects on allele-specific gene expression.
Nature Genetics | 2010
Yingrui Li; Nicolas Vinckenbosch; Geng Tian; Emilia Huerta-Sanchez; Tao Jiang; Hui Jiang; Anders Albrechtsen; Gitte Andersen; Hongzhi Cao; Thorfinn Sand Korneliussen; Niels Grarup; Yiran Guo; Ines Hellman; Xin Jin; Qibin Li; Jiangtao Liu; Xiao Liu; Thomas Sparsø; Meifang Tang; Honglong Wu; Renhua Wu; Chang Yu; Hancheng Zheng; Arne Astrup; Lars Bolund; Johan Holmkvist; Torben Jørgensen; Karsten Kristiansen; Ole Schmitz; Thue W. Schwartz
Targeted capture combined with massively parallel exome sequencing is a promising approach to identify genetic variants implicated in human traits. We report exome sequencing of 200 individuals from Denmark with targeted capture of 18,654 coding genes and sequence coverage of each individual exome at an average depth of 12-fold. On average, about 95% of the target regions were covered by at least one read. We identified 121,870 SNPs in the sample population, including 53,081 coding SNPs (cSNPs). Using a statistical method for SNP calling and an estimation of allelic frequencies based on our population data, we derived the allele frequency spectrum of cSNPs with a minor allele frequency greater than 0.02. We identified a 1.8-fold excess of deleterious, non-synonymous cSNPs over synonymous cSNPs in the low-frequency range (minor allele frequencies between 2% and 5%). This excess was more pronounced for X-linked SNPs, suggesting that deleterious substitutions are primarily recessive.
Proceedings of the National Academy of Sciences of the United States of America | 2013
Yongping Cui; Chang Yu; Yong-Bin Yan; Duanzhuo Li; Yingrui Li; Thibaut Jombart; L. A. Weinert; Zuyun Wang; Zhaobiao Guo; Lizhi Xu; Yueyang Zhang; Huisong Zheng; Nan Qin; Xueshan Xiao; Mingzhu Wu; X.L. Wang; Dongsheng Zhou; Zhizhen Qi; Zongmin Du; Huilan Wu; Xukui Yang; Hongzhi Cao; Hongyang Wang; Jun Wang; S. Yao; A. Rakin; Daniel Falush; Francois Balloux; Mark Achtman; Yajun Song
The genetic diversity of Yersinia pestis, the etiologic agent of plague, is extremely limited because of its recent origin coupled with a slow clock rate. Here we identified 2,326 SNPs from 133 genomes of Y. pestis strains that were isolated in China and elsewhere. These SNPs define the genealogy of Y. pestis since its most recent common ancestor. All but 28 of these SNPs represented mutations that happened only once within the genealogy, and they were distributed essentially at random among individual genes. Only seven genes contained a significant excess of nonsynonymous SNPs, suggesting that the fixation of SNPs mainly arises via neutral processes, such as genetic drift, rather than Darwinian selection. However, the rate of fixation varies dramatically over the genealogy: the number of SNPs accumulated by different lineages was highly variable and the genealogy contains multiple polytomies, one of which resulted in four branches near the time of the Black Death. We suggest that demographic changes can affect the speed of evolution in epidemic pathogens even in the absence of natural selection, and hypothesize that neutral SNPs are fixed rapidly during intermittent epidemics and outbreaks.
European Journal of Human Genetics | 2014
Dorret I. Boomsma; Cisca Wijmenga; Eline Slagboom; Morris A. Swertz; Lennart C. Karssen; Abdel Abdellaoui; Kai Ye; Victor Guryev; Martijn Vermaat; Freerk van Dijk; Laurent C. Francioli; Jouke-Jan Hottenga; Jeroen F. J. Laros; Qibin Li; Yingrui Li; Hongzhi Cao; Ruoyan Chen; Yuanping Du; Ning Li; Sujie Cao; Jessica van Setten; Androniki Menelaou; Sara L. Pulit; Jayne Y. Hehir-Kwa; Marian Beekman; Clara C. Elbers; Heorhiy Byelas; Anton J. M. de Craen; Patrick Deelen; Martijn Dijkstra
Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
Nature Biotechnology | 2011
Yingrui Li; Hancheng Zheng; Ruibang Luo; Honglong Wu; Hongmei Zhu; Ruiqiang Li; Hongzhi Cao; Boxin Wu; Shujia Huang; Haojing Shao; Hanzhou Ma; Fan Zhang; Shuijian Feng; Wei Zhang; Hongli Du; Geng Tian; Jingxiang Li; Xiuqing Zhang; Songgang Li; Lars Bolund; Karsten Kristiansen; Adam J. de Smith; Alexandra I. F. Blakemore; Lachlan Coin; Huanming Yang; Jian Wang; Jun Wang
Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1–50 kb) including insertions, deletions, inversions and their precise breakpoints, and in contrast to other methods, can resolve complex rearrangements. In total, we identified 277,243 SVs ranging in length from 1–23 kb. Validation using computational and experimental methods suggests that we achieve overall <6% false-positive rate and <10% false-negative rate in genomic regions that can be assembled, which outperforms other methods. Analysis of the SVs in the genomes of 106 individuals sequenced as part of the 1000 Genomes Project suggests that SVs account for a greater fraction of the diversity between individuals than do single-nucleotide polymorphisms (SNPs). These findings demonstrate that whole-genome de novo assembly is a feasible approach to deriving more comprehensive maps of genetic variation.
Nature Genetics | 2016
Fusheng Zhou; Hongzhi Cao; Xianbo Zuo; Tao Zhang; Xiaoguang Zhang; Xiaomin Liu; Ricong Xu; Gang Chen; Yuanwei Zhang; Xin Jin; Jinping Gao; Junpu Mei; Yujun Sheng; Qibin Li; Bo Liang; Juan Shen; Changbing Shen; Hui Jiang; Caihong Zhu; Xing Fan; Fengping Xu; Min Yue; Xianyong Yin; Chen Ye; Cuicui Zhang; Xiao Liu; Liang Yu; Jinghua Wu; Mengyun Chen; Xuehan Zhuang
The human major histocompatibility complex (MHC) region has been shown to be associated with numerous diseases. However, it remains a challenge to pinpoint the causal variants for these associations because of the extreme complexity of the region. We thus sequenced the entire 5-Mb MHC region in 20,635 individuals of Han Chinese ancestry (10,689 controls and 9,946 patients with psoriasis) and constructed a Han-MHC database that includes both variants and HLA gene typing results of high accuracy. We further identified multiple independent new susceptibility loci in HLA-C, HLA-B, HLA-DPB1 and BTNL2 and an intergenic variant, rs118179173, associated with psoriasis and confirmed the well-established risk allele HLA-C*06:02. We anticipate that our Han-MHC reference panel built by deep sequencing of a large number of samples will serve as a useful tool for investigating the role of the MHC region in a variety of diseases and thus advance understanding of the pathogenesis of these disorders.
PLOS Genetics | 2013
Niels Grarup; Patrick Sulem; Camilla H. Sandholt; Gudmar Thorleifsson; Tarunveer S. Ahluwalia; Valgerdur Steinthorsdottir; Helgi Bjarnason; Daniel F. Gudbjartsson; Olafur T. Magnusson; Thomas Sparsø; Anders Albrechtsen; Augustine Kong; Gisli Masson; Geng Tian; Hongzhi Cao; Chao Nie; Karsten Kristiansen; Lise Lotte N. Husemoen; Betina H. Thuesen; Yingrui Li; Rasmus Nielsen; Allan Linneberg; Isleifur Olafsson; Gudmundur I. Eyjolfsson; Torben Jørgensen; Jun Wang; Torben Hansen; Unnur Thorsteinsdottir; Kari Stefansson; Oluf Pedersen
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies, we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease, although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations.
npj Genomic Medicine | 2016
Ryan Kc Yuen; Daniele Merico; Hongzhi Cao; Giovanna Pellecchia; Babak Alipanahi; Bhooma Thiruvahindrapuram; Xin Tong; Yuhui Sun; Dandan Cao; Tao Zhang; Xueli Wu; Xin Jin; Ze Zhou; Xiaomin Liu; Thomas Nalpathamkalam; Susan Walker; Jennifer L. Howe; Z. B. Wang; Jeffrey R. MacDonald; Ada Js Chan; Lia D’Abate; Eric Deneault; Michelle T. Siu; Kristiina Tammimies; Mohammed Uddin; Mehdi Zarrei; Mingbang Wang; Yingrui Li; Jun Wang; Jian Wang
De novo mutations (DNMs) are important in autism spectrum disorder (ASD), but so far analyses have mainly been on the ~1.5% of the genome encoding genes. Here, we performed whole-genome sequencing (WGS) of 200 ASD parent–child trios and characterised germline and somatic DNMs. We confirmed that the majority of germline DNMs (75.6%) originated from the father, and these increased significantly with paternal age only (P=4.2×10^−10). However, when clustered DNMs (those within 20 kb) were found in ASD, not only did they mostly originate from the mother (P=7.7×10^−13), but they could also be found adjacent to de novo copy number variations where the mutation rate was significantly elevated (P=2.4×10^−24). By comparing with DNMs detected in controls, we found a significant enrichment of predicted damaging DNMs in ASD cases (P=8.0×10^−9; odds ratio=1.84), of which 15.6% (P=4.3×10^−3) and 22.5% (P=7.0×10^−5) were non-coding or genic non-coding, respectively. The non-coding elements most enriched for DNM were untranslated regions of genes, regulatory sequences involved in exon-skipping and DNase I hypersensitive regions. Using microarrays and a novel outlier detection test, we also found aberrant methylation profiles in 2/185 (1.1%) of ASD cases. These same individuals carried independently identified DNMs in the ASD-risk and epigenetic genes DNMT3A and ADNP. Our data begin to characterize different genome-wide DNMs and highlight the contribution of non-coding variants to the aetiology of ASD.