
Publication


Featured research published by Xiaowen Sun.


PLOS ONE | 2013

SLAF-seq: An Efficient Method of Large-Scale De Novo SNP Discovery and Genotyping Using High-Throughput Sequencing

Xiaowen Sun; Dongyuan Liu; Xiaofeng Zhang; Wenbin Li; Hui Liu; Weiguo Hong; Chuanbei Jiang; Ning Guan; Chouxian Ma; Huaping Zeng; Chunhua Xu; Jun Song; Long Huang; Chunmei Wang; Junjie Shi; Rui Wang; Xianhu Zheng; Cuiyun Lu; Xiaowu Wang; Hongkun Zheng

Large-scale genotyping plays an important role in genetic association studies. It has provided new opportunities for gene discovery, especially when combined with high-throughput sequencing technologies. Here, we report an efficient solution for large-scale genotyping. We call it specific-locus amplified fragment sequencing (SLAF-seq). SLAF-seq technology has several distinguishing characteristics: i) deep sequencing to ensure genotyping accuracy; ii) reduced representation strategy to reduce sequencing costs; iii) pre-designed reduced representation scheme to optimize marker efficiency; and iv) double barcode system for large populations. In this study, we tested the efficiency of SLAF-seq on rice and soybean data. Both sets of results showed strong consistency between predicted and practical SLAFs and considerable genotyping accuracy. We also report the highest density genetic map yet created for any organism without a reference genome sequence, common carp in this case, using SLAF-seq data. We detected 50,530 high-quality SLAFs with 13,291 SNPs genotyped in 211 individual carp. The genetic map contained 5,885 markers with 0.68 cM intervals on average. A comparative genomics study between common carp genetic map and zebrafish genome sequence map showed high-quality SLAF-seq genotyping results. SLAF-seq provides a high-resolution strategy for large-scale genotyping and can be generally applicable to various species and populations.


Nature Genetics | 2014

Genome sequence and genetic diversity of the common carp, Cyprinus carpio

Peng Xu; Xiaofeng Zhang; Xumin Wang; Jiong-Tang Li; Guiming Liu; Youyi Kuang; Jian Xu; Xianhu Zheng; Lufeng Ren; Guoliang Wang; Yan Zhang; Linhe Huo; Zixia Zhao; Dingchen Cao; Cuiyun Lu; Chao Li; Yi Zhou; Zhanjiang Liu; Zhonghua Fan; Guangle Shan; Xingang Li; Shuangxiu Wu; Lipu Song; Guangyuan Hou; Yanliang Jiang; Zsigmond Jeney; Dan Yu; Wang L; Changjun Shao; Lai Song

The common carp, Cyprinus carpio, is one of the most important cyprinid species and globally accounts for 10% of freshwater aquaculture production. Here we present a draft genome of domesticated C. carpio (strain Songpu), whose current assembly contains 52,610 protein-coding genes and approximately 92.3% coverage of its paleotetraploidized genome (2n = 100). The latest round of whole-genome duplication has been estimated to have occurred approximately 8.2 million years ago. Genome resequencing of 33 representative individuals from worldwide populations demonstrates a single origin for C. carpio in 2 subspecies (C. carpio haematopterus and C. carpio carpio). Integrative genomic and transcriptomic analyses were used to identify loci potentially associated with traits including scaling patterns and skin color. In combination with the high-resolution genetic map, the draft genome paves the way for better molecular studies and improved genome-assisted breeding of C. carpio and other closely related species.


PLOS ONE | 2012

Characterization of Common Carp Transcriptome: Sequencing, De Novo Assembly, Annotation and Comparative Genomics

Peifeng Ji; Guiming Liu; Jian Xu; Xumin Wang; Jiong-Tang Li; Zixia Zhao; Xiaofeng Zhang; Yan Zhang; Peng Xu; Xiaowen Sun

Background: Common carp (Cyprinus carpio) is one of the most important aquaculture species of Cyprinidae, with an annual global production of 3.4 million tons, accounting for nearly 14% of the world's freshwater aquaculture production. Because of the economic and ecological importance of common carp, genomic data are urgently needed for genetic improvement. However, sufficient transcriptome data are still not available. The objective of the project is to sequence the transcriptome deeply and provide well-assembled transcriptome sequences to the common carp research community. Results: Transcriptome sequencing of common carp was performed using the Roche 454 platform. A total of 1,418,591 clean ESTs were collected and assembled into 36,811 cDNA contigs, with an average length of 888 bp and an N50 length of 1,002 bp. Annotation identified a total of 19,165 unique proteins from the assembled contigs. Gene Ontology and KEGG analyses were performed to classify all contigs into functional categories for understanding gene functions and regulation pathways. Open reading frames (ORFs) were detected in 29,869 (81.1%) contigs, with an average ORF length of 763 bp. From these contigs, 9,625 full-length cDNAs were identified, with sequence lengths from 201 bp to 9,956 bp. Comparative analysis revealed that 27,693 (75.2%) contigs have significant similarity to zebrafish RefSeq proteins, and 24,371 (66.2%), 24,501 (66.5%) and 25,025 (70.0%) to Tetraodon, medaka and three-spined stickleback RefSeq proteins, respectively. A total of 2,064 microsatellites were initially identified from 1,730 contigs, and 1,639 unique sequences had sufficient flanking sequences on both sides for primer design. Conclusion: The transcriptome of common carp has been deeply sequenced, de novo assembled and characterized, providing a valuable resource for better understanding of the common carp genome. The transcriptome data will facilitate future functional studies on the common carp genome and can gradually be applied in breeding programs of common carp and other closely related cyprinids.
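The abstract above reports both an average contig length (888 bp) and an N50 length (1,002 bp). As a reminder of how the two statistics differ, here is a minimal Python sketch of the usual N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembled bases. The function name `n50` and the toy lengths are illustrative assumptions; they are not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length                     # contig that crosses the halfway point


# Toy contig lengths, invented purely for illustration.
toy_contigs = [2000, 1500, 1000, 800, 500, 200]
toy_mean = sum(toy_contigs) / len(toy_contigs)
print(f"mean = {toy_mean:.0f} bp, N50 = {n50(toy_contigs)} bp")
```

Because N50 weights long contigs more heavily than the arithmetic mean does, an N50 above the mean, as in the abstract, is the usual pattern for an assembly.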


PLOS ONE | 2014

Construction and Analysis of High-Density Linkage Map Using High-Throughput Sequencing Data

Dongyuan Liu; Chouxian Ma; Weiguo Hong; Long Huang; Min Liu; Hui Liu; Huaping Zeng; Dejing Deng; Huaigen Xin; Jun Song; Chunhua Xu; Xiaowen Sun; Xilin Hou; Xiaowu Wang; Hongkun Zheng

Linkage maps enable the study of important biological questions. The construction of high-density linkage maps appears more feasible since the advent of next-generation sequencing (NGS), which eases SNP discovery and high-throughput genotyping of large populations. However, the marker number explosion and genotyping errors from NGS data challenge the computational efficiency and linkage map quality of linkage study methods. Here we report the HighMap method for constructing high-density linkage maps from NGS data. HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. A simulation study shows that HighMap can create a linkage map with three times as many markers as ordering-only methods while offering more accurate marker orders and stable genetic distances. Using HighMap, we constructed a common carp linkage map with 10,004 markers. The singleton rate was less than one-ninth of that generated by JoinMap4.1. Its total map distance was 5,908 cM, consistent with reports on low-density maps. HighMap is an efficient method for constructing high-density, high-quality linkage maps from high-throughput population NGS data. It will facilitate genome assembly, comparative genomic analysis, and QTL studies. HighMap is available at http://highmap.biomarker.com.cn/.


BMC Genomics | 2012

Transcriptome analysis reveals the time of the fourth round of genome duplication in common carp (Cyprinus carpio)

Jintu Wang; Jiong-Tang Li; Xiaofeng Zhang; Xiaowen Sun

Background: Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results: We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions: The assembled contigs helped us to estimate the time of the fourth round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future.
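Dating a divergence or duplication from synonymous substitutions, as the abstract above describes, commonly relies on the molecular-clock relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between two sequences and r is the substitution rate per site per year along one lineage. The sketch below shows only that arithmetic. The Ks value and the rate are hypothetical numbers chosen so the result lands near the 120 MYA figure quoted in the abstract; the abstract does not state the values the authors used, and the function name is an assumption.

```python
def divergence_time_mya(ks, subs_per_site_per_year):
    """Molecular-clock estimate T = Ks / (2r), returned in millions of years.

    ks: synonymous substitutions per synonymous site between two sequences.
    subs_per_site_per_year: assumed rate r along a single lineage.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * subs_per_site_per_year)  # factor 2: both lineages accumulate changes
    return years / 1e6


# Hypothetical inputs for illustration only (not taken from the paper).
example_ks = 0.84
example_rate = 3.5e-9
print(f"{divergence_time_mya(example_ks, example_rate):.1f} MYA")
```

The same relation applies to paralogous pairs, which is how a peak in the Ks distribution of paralogs can be converted into an approximate age for a genome duplication.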


PLOS ONE | 2012

Identification and profiling of microRNAs from skeletal muscle of the common carp.

Xuechun Yan; Lei Ding; Yunchao Li; Xiaofeng Zhang; Yang Liang; Xiaowen Sun; Chun-Bo Teng

The common carp is one of the most important cultivated species in the world of freshwater aquaculture. The cultivation of this species is particularly productive due to its high skeletal muscle mass; however, the molecular mechanisms of skeletal muscle development in the common carp remain unknown. It has been shown that a class of non-coding ∼22 nucleotide RNAs called microRNAs (miRNAs) play important roles in vertebrate development. They regulate gene expression through sequence-specific interactions with the 3′ untranslated regions (UTRs) of target mRNAs and thereby cause translational repression or mRNA destabilization. Intriguingly, the role of miRNAs in the skeletal muscle development of the common carp remains unknown. In this study, a small-RNA cDNA library was constructed from the skeletal muscle of the common carp, and Solexa sequencing technology was used to perform high-throughput sequencing of the library. Subsequent bioinformatics analysis identified 188 conserved miRNAs and 7 novel miRNAs in the carp skeletal muscle. The miRNA expression profiling showed that miR-1, miR-133a-3p, and miR-206 were specifically expressed in muscle-containing organs, and that miR-1, miR-21, miR-26a, miR-27a, miR-133a-3p, miR-206, miR-214 and miR-222 were differentially expressed in the process of skeletal muscle development of the common carp. This study provides a first identification and profiling of miRNAs related to the muscle biology of the common carp. Their identification could provide clues leading towards a better understanding of the molecular mechanisms of carp skeletal muscle development.


BMC Genomics | 2013

L_RNA_scaffolder: scaffolding genomes with transcripts

Wei Xue; Jiong-Tang Li; Yaping Zhu; Guang-Yuan Hou; Xiang-Fei Kong; You-Yi Kuang; Xiaowen Sun

Background: Generation of large mate-pair libraries is necessary for de novo genome assembly but the procedure is complex and time-consuming. Furthermore, in some complex genomes, it is hard to increase the N50 length even with large mate-pair libraries, which leads to low transcript coverage. Thus, it is necessary to develop other simple scaffolding approaches, to at least solve the elongation of transcribed fragments. Results: We describe L_RNA_scaffolder, a novel genome scaffolding method that uses long transcriptome reads to order, orient and combine genomic fragments into larger sequences. To demonstrate the accuracy of the method, the zebrafish genome was scaffolded. With expanded human transcriptome data, the N50 of human genome was doubled and L_RNA_scaffolder out-performed most scaffolding results by existing scaffolders which employ mate-pair libraries. In these two examples, the transcript coverage was almost complete, especially for long transcripts. We applied L_RNA_scaffolder to the highly polymorphic pearl oyster draft genome and the gene model length significantly increased. Conclusions: The simplicity and high-throughput of RNA-seq data makes this approach suitable for genome scaffolding. L_RNA_scaffolder is available at http://www.fishbrowser.org/software/L_RNA_scaffolder.
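The core idea in the abstract above, using a long transcript that aligns to several genomic fragments to order and orient them, can be illustrated with a toy Python sketch. This is not L_RNA_scaffolder's implementation: the `Hit` record, the `chain_contigs` function and the example alignments are all hypothetical, and a real tool would also filter alignments by identity and coverage and resolve conflicts between transcripts.

```python
from collections import namedtuple

# One alignment block of a transcript on a contig (hypothetical record layout).
Hit = namedtuple("Hit", "contig transcript_start transcript_end strand")


def chain_contigs(hits):
    """Order and orient contigs along a single transcript.

    Contigs are sorted by where their alignment starts on the transcript.
    The strand of each hit says whether the contig must be reverse-complemented
    ('-') to match the transcript's direction.
    """
    chain = []
    for hit in sorted(hits, key=lambda h: h.transcript_start):
        if chain and chain[-1][0] == hit.contig:
            continue  # consecutive exons on the same contig add no new link
        chain.append((hit.contig, hit.strand))
    return chain


# Invented alignments of one transcript across three contigs.
example_hits = [
    Hit("ctgB", 900, 1800, "-"),
    Hit("ctgA", 1, 600, "+"),
    Hit("ctgA", 601, 899, "+"),
    Hit("ctgC", 1801, 2400, "+"),
]
print(chain_contigs(example_hits))
```

In this toy case the transcript implies the scaffold ctgA, then ctgB reverse-complemented, then ctgC.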


PLOS ONE | 2012

Genome-Wide SNP Discovery from Transcriptome of Four Common Carp Strains

Jian Xu; Peifeng Ji; Zixia Zhao; Yan Zhang; Jianxin Feng; Jian Wang; Jiong-Tang Li; Xiaofeng Zhang; Lan Zhao; Guangzan Liu; Peng Xu; Xiaowen Sun

Background: Single nucleotide polymorphisms (SNPs) have been used as genetic markers for genome-wide association studies in many species. Gene-associated SNPs could offer sufficient coverage in trait-related research and, furthermore, could themselves be causative SNPs for traits. Common carp (Cyprinus carpio) is one of the most important aquaculture species in the world, accounting for nearly 14% of freshwater aquaculture production. There are various strains of common carp with different economic traits; however, the genetic mechanisms underlying the different traits have not yet been elucidated. In this project, we identified a large number of gene-associated SNPs from four strains of common carp using next-generation sequencing. Results: Transcriptome sequencing of four strains of common carp (mirror carp, purse red carp, Xingguo red carp, Yellow River carp) was performed with the Solexa HiSeq2000 platform. The de novo assembled transcriptome was used as the reference for alignments, and SNP calling was done with BWA and SAMtools. A total of 712,042 intra-strain SNPs were discovered in the four strains: 483,276 in mirror carp, 486,629 in purse red carp, 478,028 in Xingguo red carp and 488,281 in Yellow River carp. In addition, 53,893 inter-strain SNPs were identified. The numbers of strain-specific SNPs were 53,938, 53,866, 48,701 and 40,131 in mirror carp, purse red carp, Xingguo red carp and Yellow River carp, respectively. GO and KEGG pathway analyses were done to reveal strain-specific genes affected by strain-specific non-synonymous SNPs. Validation of selected SNPs confirmed 48% (12 of 25) as true SNPs. Conclusions: Transcriptome analysis of common carp using RNA-Seq is a cost-effective way of generating numerous reads for SNP discovery. After validation of identified SNPs, these data will provide a solid base for SNP array design and genome-wide association studies.


BMC Genomics | 2011

Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

Peng Xu; Jiong-Tang Li; Yan Li; Runzi Cui; Jintu Wang; Jian Wang; Yan Zhang; Zixia Zhao; Xiaowen Sun

Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Results: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on the zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes and indicated the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusions: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. Comparative analysis mapped around 40,000 BES to the zebrafish genome and established over 3,100 microsyntenies, covering over 50% of the zebrafish genome. BES of common carp are tremendous tools for comparative mapping between the two closely related species, zebrafish and common carp, which should facilitate both structural and functional genome analysis in common carp.
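The sequencing figures in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers quoted in the abstract (40,224 clones, 65,720 clean BES, 42,522,168 bp, 2.5% of the genome); the variable names are illustrative, and the implied genome size is a back-of-the-envelope consequence of the rounded 2.5% figure rather than a value stated by the authors.

```python
# Figures quoted in the abstract.
bac_clones = 40_224
clean_bes = 65_720
total_bp = 42_522_168
genome_fraction = 0.025          # "2.5% of the common carp genome"

# Each clone was sequenced from both ends, so at most two reads per clone.
attempted_reads = 2 * bac_clones
retained_fraction = clean_bes / attempted_reads

mean_read_length = total_bp / clean_bes
implied_genome_bp = total_bp / genome_fraction

print(f"reads retained after cleaning: {retained_fraction:.1%}")
print(f"mean clean read length: {mean_read_length:.0f} bp")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
```

The mean read length recovered this way agrees with the 647 bp stated in the abstract, and the implied genome size is about 1.7 Gb.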


PLOS ONE | 2013

Transcriptome Analysis of Crucian Carp (Carassius auratus), an Important Aquaculture and Hypoxia-Tolerant Species

Xiaolin Liao; Lei Cheng; Peng Xu; Guoqing Lu; Michael Wachholtz; Xiaowen Sun; Songlin Chen

The crucian carp is an important aquaculture species and a potential model to study genome evolution and physiological adaptation. However, so far the genomics and transcriptomics data available for this species are still scarce. We performed de novo transcriptome sequencing of four cDNA libraries representing brain, muscle, liver and kidney tissues respectively, each with six specimens. The removal of low quality reads resulted in 2.62 million raw reads, which were assembled as 127,711 unigenes, including 84,867 isotigs and 42,844 singletons. A total of 22,273 unigenes were found with significant matches to 14,449 unique proteins. Around 14,398 unigenes were assigned with at least one Gene Ontology (GO) category in 84,876 total assignments, and 6,382 unigenes were found in 237 predicted KEGG pathways. The gene expression analysis revealed more genes expressed in brain, more up-regulated genes in muscle and more down-regulated genes in liver as compared with gene expression profiles of other tissues. In addition, 23 enzymes in the glycolysis/gluconeogenesis pathway were recovered. Importantly, we identified 5,784 high-quality putative SNP and 11,295 microsatellite markers which include 5,364 microsatellites with flanking sequences ≥50 bp. This study produced the most comprehensive genomic resources that have been derived from crucian carp, including thousands of genetic markers, which will not only lay a foundation for further studies on polyploidy origin and anoxic survival but will also facilitate selective breeding of this important aquaculture species.

Collaboration


Dive into Xiaowen Sun's collaborations.

Top Co-Authors

Yan Zhang, Chinese Academy of Fishery Sciences
Jiong-Tang Li, Chinese Academy of Fishery Sciences
Dingchen Cao, Chinese Academy of Fishery Sciences
Xiaofeng Zhang, Chinese Academy of Fishery Sciences
Cuiyun Lu, Chinese Academy of Fishery Sciences
Youyi Kuang, Chinese Academy of Fishery Sciences
Zixia Zhao, Chinese Academy of Fishery Sciences
Xianhu Zheng, Chinese Academy of Fishery Sciences
Jian Xu, Chinese Academy of Fishery Sciences