Publication


Featured research published by Haojing Shao.


Science | 2009

Complete Resequencing of 40 Genomes Reveals Domestication Events and Genes in Silkworm (Bombyx)

Qingyou Xia; Yiran Guo; Ze Zhang; Dong Li; Zhaoling Xuan; Zhuo Li; Fangyin Dai; Yingrui Li; Daojun Cheng; Ruiqiang Li; Tingcai Cheng; Tao Jiang; Celine Becquet; Xun Xu; Chun Liu; Xingfu Zha; Wei Fan; Ying Lin; Yihong Shen; Lan Jiang; Jeffrey D. Jensen; Ines Hellmann; Si Tang; Ping Zhao; Hanfu Xu; Chang Yu; Guojie Zhang; Jun Li; Jianjun Cao; Shiping Liu

The Taming of the Silkworm

Silkworms, Bombyx mori, represent one of the few domesticated insects, having been domesticated over 10,000 years ago. Xia et al. (p. 433, published online 27 August) sequenced 29 domestic and 11 wild silkworm lines and identified genes that were most likely to be selected during domestication. These genes represent those that enhance silk production, reproduction, and growth. Furthermore, silkworms were probably only domesticated once from a large progenitor population, rather than on multiple occasions, as has been observed for other domesticated animals.

Silkworm genomes show signatures of selection associated with domestication. A single–base pair resolution silkworm genetic variation map was constructed from 40 domesticated and wild silkworms, each sequenced to approximately threefold coverage, representing 99.88% of the genome. We identified ~16 million single-nucleotide polymorphisms, many indels, and structural variations. We find that the domesticated silkworms are clearly genetically differentiated from the wild ones, but they have maintained large levels of genetic variability, suggesting a short domestication event involving a large number of individuals. We also identified signals of selection at 354 candidate genes that may have been important during domestication, some of which have enriched expression in the silk gland, midgut, and testis. These data add to our understanding of the domestication processes and may have applications in devising pest control strategies and advancing the use of silkworms as efficient bioreactors.


Nature Genetics | 2014

A large-scale screen for coding variants predisposing to psoriasis.

Huayang Tang; Xin Jin; Yang Li; Hui Jiang; Xianfa Tang; Xu Yang; Hui Cheng; Ying Qiu; Gang Chen; Junpu Mei; Fusheng Zhou; Renhua Wu; Xianbo Zuo; Yong Zhang; Qi Cai; Xianyong Yin; Cheng Quan; Haojing Shao; Yong Cui; Fangzhen Tian; Xia Zhao; Liu H; Feng-Li Xiao; Fengping Xu; Jian-Wen Han; Dongmei Shi; Anping Zhang; Cheng Zhou; Qibin Li; Xing Fan

To explore the contribution of functional coding variants to psoriasis, we analyzed nonsynonymous single-nucleotide variants (SNVs) across the genome by exome sequencing in 781 psoriasis cases and 676 controls and through follow-up validation in 1,326 candidate genes by targeted sequencing in 9,946 psoriasis cases and 9,906 controls from the Chinese population. We discovered two independent missense SNVs in IL23R and GJB2 of low frequency and five common missense SNVs in LCE3D, ERAP1, CARD14 and ZNF816A associated with psoriasis at genome-wide significance. Rare missense SNVs in FUT2 and TARBP1 were also observed with suggestive evidence of association. Single-variant and gene-based association analyses of nonsynonymous SNVs did not identify newly associated genes for psoriasis in the regions subjected to targeted resequencing. This suggests that coding variants in the 1,326 targeted genes contribute only a limited fraction of the overall genetic risk for psoriasis.


Nature Biotechnology | 2011

Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

Yingrui Li; Hancheng Zheng; Ruibang Luo; Honglong Wu; Hongmei Zhu; Ruiqiang Li; Hongzhi Cao; Boxin Wu; Shujia Huang; Haojing Shao; Hanzhou Ma; Fan Zhang; Shuijian Feng; Wei Zhang; Hongli Du; Geng Tian; Jingxiang Li; Xiuqing Zhang; Songgang Li; Lars Bolund; Karsten Kristiansen; Adam J. de Smith; Alexandra I. F. Blakemore; Lachlan Coin; Huanming Yang; Jian Wang; Jun Wang

Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1–50 kb) including insertions, deletions, inversions and their precise breakpoints, and in contrast to other methods, can resolve complex rearrangements. In total, we identified 277,243 SVs ranging in length from 1–23 kb. Validation using computational and experimental methods suggests that we achieve overall <6% false-positive rate and <10% false-negative rate in genomic regions that can be assembled, which outperforms other methods. Analysis of the SVs in the genomes of 106 individuals sequenced as part of the 1000 Genomes Project suggests that SVs account for a greater fraction of the diversity between individuals than do single-nucleotide polymorphisms (SNPs). These findings demonstrate that whole-genome de novo assembly is a feasible approach to deriving more comprehensive maps of genetic variation.


Nature Genetics | 2016

Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences

G. David Poznik; Yali Xue; Fernando L. Mendez; Thomas Willems; Andrea Massaia; Melissa A. Wilson Sayres; Qasim Ayub; Shane McCarthy; Apurva Narechania; Seva Kashin; Yuan Chen; Ruby Banerjee; Juan L. Rodriguez-Flores; Maria Cerezo; Haojing Shao; Melissa Gymrek; Ankit Malhotra; Sandra Louzada; Rob DeSalle; Graham R. S. Ritchie; Eliza Cerveira; Tomas Fitzgerald; Erik Garrison; Anthony Marcketta; David Mittelman; Mallory Romanovitch; Chengsheng Zhang; Xiangqun Zheng-Bradley; Gonçalo R. Abecasis; Steven A. McCarroll

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.


Nature Genetics | 2016

Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease

Fusheng Zhou; Hongzhi Cao; Xianbo Zuo; Tao Zhang; Xiaoguang Zhang; Xiaomin Liu; Ricong Xu; Gang Chen; Yuanwei Zhang; Xin Jin; Jinping Gao; Junpu Mei; Yujun Sheng; Qibin Li; Bo Liang; Juan Shen; Changbing Shen; Hui Jiang; Caihong Zhu; Xing Fan; Fengping Xu; Min Yue; Xianyong Yin; Chen Ye; Cuicui Zhang; Xiao Liu; Liang Yu; Jinghua Wu; Mengyun Chen; Xuehan Zhuang

The human major histocompatibility complex (MHC) region has been shown to be associated with numerous diseases. However, it remains a challenge to pinpoint the causal variants for these associations because of the extreme complexity of the region. We thus sequenced the entire 5-Mb MHC region in 20,635 individuals of Han Chinese ancestry (10,689 controls and 9,946 patients with psoriasis) and constructed a Han-MHC database that includes both variants and HLA gene typing results of high accuracy. We further identified multiple independent new susceptibility loci in HLA-C, HLA-B, HLA-DPB1 and BTNL2 and an intergenic variant, rs118179173, associated with psoriasis and confirmed the well-established risk allele HLA-C*06:02. We anticipate that our Han-MHC reference panel built by deep sequencing of a large number of samples will serve as a useful tool for investigating the role of the MHC region in a variety of diseases and thus advance understanding of the pathogenesis of these disorders.


BMC Evolutionary Biology | 2010

Genetic diversity, molecular phylogeny and selection evidence of the silkworm mitochondria implicated by complete resequencing of 41 genomes

Dong Li; Yiran Guo; Haojing Shao; Laurent C. A. M. Tellier; Jun Wang; Zhonghuai Xiang; Qingyou Xia

Background: Mitochondria are a valuable resource for studying the evolutionary process and deducing phylogeny. A few mitochondrial genomes have been sequenced, but a comprehensive picture of the domestication event for silkworm mitochondria remains to be established. In this study, we integrate the extant data, and perform a whole genome resequencing of Japanese wild silkworm to obtain breakthrough results in silkworm mitochondrial (mt) population, and finally use these to deduce a more comprehensive phylogeny of the Bombycidae.

Results: We identified 347 single nucleotide polymorphisms (SNPs) in the mt genome, but found no past recombination event to have occurred in the silkworm progenitor. A phylogeny inferred from these whole genome SNPs resulted in a well-classified tree, confirming that the domesticated silkworm, Bombyx mori, most recently diverged from the Chinese wild silkworm, rather than from the Japanese wild silkworm. We showed that the population sizes of the domesticated and Chinese wild silkworms have both experienced neither expansion nor contraction. We also discovered that one mt gene, named cytochrome b, shows a strong signal of positive selection in the domesticated clade. This gene is related to energy metabolism, and may have played an important role during silkworm domestication.

Conclusions: We present a comparative analysis on 41 mt genomes of B. mori and B. mandarina from China and Japan. With these, we obtain a much clearer picture of the evolutionary history of the silkworm. The data and analyses presented here aid our understanding of the silkworm in general, and provide a crucial insight into silkworm phylogeny.


Nucleic Acids Research | 2013

A population model for genotyping indels from next-generation sequence data

Haojing Shao; Evangelos Bellos; Hanjiudai Yin; Xiao Liu; Jing Zou; Yingrui Li; Jun Wang; Lachlan Coin

Insertion and deletion polymorphisms (indels) are an important source of genomic variation in plant and animal genomes, but accurate genotyping from low-coverage and exome next-generation sequence data remains challenging. We introduce an efficient population clustering algorithm for diploids and polyploids which was tested on a dataset of 2000 exomes. Compared with existing methods, we report a 4-fold reduction in overall indel genotype error rates with a 9-fold reduction in low coverage regions.


BMC Bioinformatics | 2018

npInv: accurate detection and genotyping of inversions using long read sub-alignment

Haojing Shao; Devika Ganesamoorthy; Tania P. S. Duarte; Minh Duc Cao; Clive J. Hoggart; Lachlan Coin

Background: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored.

Results: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We benchmark npInv with other tools in both simulation and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats.

Conclusion: The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion.


bioRxiv | 2017

Ongoing human chromosome end extension driven by a primate ancestral genomic region revealed by analysis of BioNano genomics data

Haojing Shao; Chenxi Zhou; Minh Duc Cao; Lachlan Coin

The majority of human chromosome ends remain incomplete due to their highly repetitive structure. In this study, we use BioNano data to anchor and extend chromosome ends from two European trios as well as two unrelated Asian genomes. Two thirds of BioNano assembled chromosome ends are structurally divergent from the reference genome, including both deletions and extensions. The majority of extensions are homologous to sequences on chromosome 1p, 5q and 19p. These extensions are heritable and in some cases divergent between Asian and European samples. We identified two sequence families in these sequences which have undergone substantial duplication in multiple primate lineages, leading to the formation of new fusion genes. We show that these sequence families have arisen from progenitor interstitial sequence on the ancestral primate chromosome 7. Comparison of chromosome end sequences from 15 species revealed that chromosome end divergence matches the corresponding phylogenetic relationship and revealed a rate of chromosome extension since the primate divergence of 80-440 kbp per million years.


bioRxiv | 2017

npInv: accurate detection and genotyping of inversions mediated by non-allelic homologous recombination using long read sub-alignment

Haojing Shao; Devika Ganesamoorthy; Tania Duarte; Minh Duc Cao; Clive J. Hoggart; Lachlan Coin

Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel NAHR inversions. We show that there is a near linear relationship between the length of flanking IR and the size of the NAHR inversion.

Collaboration


Dive into Haojing Shao's collaborations.

Top Co-Authors

Lachlan Coin

University of Queensland

Minh Duc Cao

University of Queensland

Jun Wang

Chinese Academy of Sciences

Yingrui Li

Chinese Academy of Sciences

Dong Li

Southwest University

Fengping Xu

Beijing Institute of Genomics

Fusheng Zhou

Anhui Medical University

Gang Chen

Huazhong University of Science and Technology

Hui Jiang

Chinese Center for Disease Control and Prevention

Junpu Mei

Beijing Institute of Genomics
