Publication


Featured research published by Wenwei Zhang.


Nature | 2012

A metagenome-wide association study of gut microbiota in type 2 diabetes

Junjie Qin; Yingrui Li; Zhiming Cai; Shenghui Li; Jianfeng Zhu; Fan Zhang; Suisha Liang; Wenwei Zhang; Yuanlin Guan; Dongqian Shen; Yangqing Peng; Dongya Zhang; Zhuye Jie; Wenxian Wu; Youwen Qin; Wenbin Xue; Junhua Li; Lingchuan Han; Donghui Lu; Peixian Wu; Yali Dai; Xiaojuan Sun; Zesong Li; Aifa Tang; Shilong Zhong; Xiaoping Li; Weineng Chen; Ran Xu; Mingbang Wang; Qiang Feng

Assessment and characterization of gut microbiota has become a major research area in human disease, including type 2 diabetes, the most prevalent endocrine disease worldwide. To carry out analysis on gut microbial content in patients with type 2 diabetes, we developed a protocol for a metagenome-wide association study (MGWAS) and undertook a two-stage MGWAS based on deep shotgun sequencing of the gut microbial DNA from 345 Chinese individuals. We identified and validated approximately 60,000 type-2-diabetes-associated markers and established the concept of a metagenomic linkage group, enabling taxonomic species-level analyses. MGWAS analysis showed that patients with type 2 diabetes were characterized by a moderate degree of gut microbial dysbiosis, a decrease in the abundance of some universal butyrate-producing bacteria and an increase in various opportunistic pathogens, as well as an enrichment of other microbial functions conferring sulphate reduction and oxidative stress resistance. An analysis of 23 additional individuals demonstrated that these gut microbial markers might be useful for classifying type 2 diabetes.


Nature Biotechnology | 2014

An integrated catalog of reference genes in the human gut microbiome

Junhua Li; Huijue Jia; Xianghang Cai; Huanzi Zhong; Qiang Feng; Shinichi Sunagawa; Manimozhiyan Arumugam; Jens Roat Kultima; Edi Prifti; Trine Nielsen; Agnieszka Sierakowska Juncker; Chaysavanh Manichanh; Bing Chen; Wenwei Zhang; Florence Levenez; Juan Wang; Xun Xu; Liang Xiao; Suisha Liang; Dongya Zhang; Zhaoxi Zhang; Weineng Chen; Hailong Zhao; Jumana Y. Al-Aama; Sherif Edris; Huanming Yang; Jian Wang; Torben Hansen; Henrik Bjørn Nielsen; Søren Brunak

Many analyses of the human gut microbiome depend on a catalog of reference genes. Existing catalogs for the human gut microbiome are based on samples from single cohorts or on reference genomes or protein sequences, which limits coverage of global microbiome diversity. Here we combined 249 newly sequenced samples of the Metagenomics of the Human Intestinal Tract (MetaHit) project with 1,018 previously sequenced samples to create a cohort from three continents that is at least threefold larger than cohorts used for previous gene catalogs. From this we established the integrated gene catalog (IGC) comprising 9,879,896 genes. The catalog includes close-to-complete sets of genes for most gut microbes, which are also of considerably higher quality than in previous catalogs. Analyses of a group of samples from Chinese and Danish individuals using the catalog revealed country-specific gut microbial signatures. This expanded catalog should facilitate quantitative characterization of metagenomic, metatranscriptomic and metaproteomic data from the gut microbiome to understand its variation across populations in human health and disease.


PubMed | 2009

Genetic Loci associated with C-reactive protein levels and risk of coronary heart disease.

Perry M. Elliott; John Chambers; Wenwei Zhang; Robert Clarke; Jemma C. Hopewell; John F. Peden; J. Erdmann; P. S. Braund; Jc Engert; David A. Bennett; Lachlan Coin; Deborah Ashby; Ioanna Tzoulaki; Ian J. Brown; Shahrul Mt-Isa; Mark McCarthy; Leena Peltonen; Nelson B. Freimer; Martin Farrall; Aimo Ruokonen; Anders Hamsten; Noha Lim; Philippe Froguel; Dawn M. Waterworth; Peter Vollenweider; G. Waeber; Jarvelin; Mooser; James Scott; A. S. Hall

CONTEXT Plasma levels of C-reactive protein (CRP) are independently associated with risk of coronary heart disease, but whether CRP is causally associated with coronary heart disease or merely a marker of underlying atherosclerosis is uncertain. OBJECTIVE To investigate association of genetic loci with CRP levels and risk of coronary heart disease. DESIGN, SETTING, AND PARTICIPANTS We first carried out a genome-wide association (n = 17,967) and replication study (n = 13,615) to identify genetic loci associated with plasma CRP concentrations. Data collection took place between 1989 and 2008 and genotyping between 2003 and 2008. We carried out a mendelian randomization study of the most closely associated single-nucleotide polymorphism (SNP) in the CRP locus and published data on other CRP variants involving a total of 28,112 cases and 100,823 controls, to investigate the association of CRP variants with coronary heart disease. We compared our finding with that predicted from meta-analysis of observational studies of CRP levels and risk of coronary heart disease. For the other loci associated with CRP levels, we selected the most closely associated SNP for testing against coronary heart disease among 14,365 cases and 32,069 controls. MAIN OUTCOME MEASURE Risk of coronary heart disease. RESULTS Polymorphisms in 5 genetic loci were strongly associated with CRP levels (% difference per minor allele): SNP rs6700896 in LEPR (-14.8%; 95% confidence interval [CI], -17.6% to -12.0%; P = 6.2 × 10^-22), rs4537545 in IL6R (-11.5%; 95% CI, -14.4% to -8.5%; P = 1.3 × 10^-12), rs7553007 in the CRP locus (-20.7%; 95% CI, -23.4% to -17.9%; P = 1.3 × 10^-38), rs1183910 in HNF1A (-13.8%; 95% CI, -16.6% to -10.9%; P = 1.9 × 10^-18), and rs4420638 in APOE-CI-CII (-21.8%; 95% CI, -25.3% to -18.1%; P = 8.1 × 10^-26). Association of SNP rs7553007 in the CRP locus with coronary heart disease gave an odds ratio (OR) of 0.98 (95% CI, 0.94 to 1.01) per 20% lower CRP level.
Our mendelian randomization study of variants in the CRP locus showed no association with coronary heart disease: OR, 1.00; 95% CI, 0.97 to 1.02; per 20% lower CRP level, compared with OR, 0.94; 95% CI, 0.94 to 0.95; predicted from meta-analysis of the observational studies of CRP levels and coronary heart disease (z score, -3.45; P < .001). SNPs rs6700896 in LEPR (OR, 1.06; 95% CI, 1.02 to 1.09; per minor allele), rs4537545 in IL6R (OR, 0.94; 95% CI, 0.91 to 0.97), and rs4420638 in the APOE-CI-CII cluster (OR, 1.16; 95% CI, 1.12 to 1.21) were all associated with risk of coronary heart disease. CONCLUSION The lack of concordance between the effect on coronary heart disease risk of CRP genotypes and CRP levels argues against a causal association of CRP with coronary heart disease.


Nature Biotechnology | 2012

Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome

Zhiyu Peng; Yanbing Cheng; Bertrand Chin-Ming Tan; Lin Kang; Zhijian Tian; Yuankun Zhu; Wenwei Zhang; Yu Liang; Xueda Hu; Xuemei Tan; Jing Guo; Zirui Dong; Yan Liang; Li Bao; Jun Wang

RNA editing is a post-transcriptional event that recodes hereditary information. Here we describe a comprehensive profile of the RNA editome of a male Han Chinese individual based on analysis of ∼767 million sequencing reads from poly(A)+, poly(A)− and small RNA samples. We developed a computational pipeline that carefully controls for false positives while calling RNA editing events from genome and whole-transcriptome data of the same individual. We identified 22,688 RNA editing events in noncoding genes and introns, untranslated regions and coding sequences of protein-coding genes. Most changes (∼93%) converted A to I(G), consistent with known editing mechanisms based on adenosine deaminase acting on RNA (ADAR). We also found evidence of other types of nucleotide changes; however, these were validated at lower rates. We found 44 editing sites in microRNAs (miRNAs), suggesting a potential link between RNA editing and miRNA-mediated regulation. Our approach facilitates large-scale studies to profile and compare editomes across a wide range of samples.


Genome Biology | 2015

Comparison of RNA-seq and microarray-based models for clinical endpoint prediction

Wenqian Zhang; Falk Hertwig; Jean Thierry-Mieg; Wenwei Zhang; Danielle Thierry-Mieg; Jian Wang; Cesare Furlanello; Viswanath Devanarayan; Jie Cheng; Youping Deng; Barbara Hero; Huixiao Hong; Meiwen Jia; Li Li; Simon Lin; Yuri Nikolsky; André Oberthuer; Tao Qing; Zhenqiang Su; Ruth Volland; Charles Wang; May D. Wang; Junmei Ai; Davide Albanese; Shahab Asgharzadeh; Smadar Avigad; Wenjun Bao; Marina Bessarabova; Murray H. Brilliant; Benedikt Brors

Background: Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. Results: We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. Conclusions: We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.


Human Mutation | 2013

A Short-Read Multiplex Sequencing Method for Reliable, Cost-Effective and High-Throughput Genotyping in Large-Scale Studies

Hongzhi Cao; Yu Wang; Wei Zhang; Xianghua Chai; Xiandong Zhang; Shiping Chen; Fan Yang; Caifen Zhang; Yulai Guo; Ying Liu; Zhoubiao Tang; Caifen Chen; Yaxin Xue; Hefu Zhen; Yinyin Xu; Bin Rao; Tao Liu; Meiru Zhao; Wenwei Zhang; Yingrui Li; Xiuqing Zhang; Laurent C. A. M. Tellier; Anders Krogh; Karsten Kristiansen; Jun Wang; Jian Li

Accurate genotyping is important for genetic testing. Sanger sequencing‐based typing is the gold standard for genotyping, but it has been underused due to its high cost and low throughput. In contrast, short‐read sequencing provides inexpensive and high‐throughput sequencing, holding great promise for reaching the goal of cost‐effective and high‐throughput genotyping. However, the short read length and the paucity of appropriate genotyping methods pose a major challenge. Here, we present RCHSBT, a reliable, cost‐effective and high‐throughput sequence‐based typing pipeline, which takes short sequence reads as input but, using a unique variant calling and haploid sequence assembly algorithm, can accurately genotype with greater effective length per amplicon than even Sanger sequencing reads. The RCHSBT method was tested for the human MHC loci HLA‐A, HLA‐B, HLA‐C, HLA‐DQB1, and HLA‐DRB1 on 96 samples using Illumina PE 150 reads. Amplicons as long as 950 bp were readily genotyped, achieving 100% typing concordance between RCHSBT‐called genotypes and genotypes previously called by Sanger sequencing. Genotyping throughput was increased over 10 times, and cost was reduced over five times, for RCHSBT as compared with Sanger sequencing genotyping. We thus demonstrate RCHSBT to be a genotyping method comparable to Sanger sequencing‐based typing in quality, while being more cost‐effective and higher in throughput.


GigaScience | 2018

Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing

Chao Fang; Huanzi Zhong; Yuxiang Lin; Bing Chen; Mo Han; Huahui Ren; Haorong Lu; Jacob M. Luber; Min Xia; Wangsheng Li; Shayna Stein; Xun Xu; Wenwei Zhang; Radoje Drmanac; Jian Wang; Huanming Yang; Lennart Hammarström; Aleksandar D. Kostic; Karsten Kristiansen; Junhua Li

Background: More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of 2 Illumina platforms. Findings: Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform vs the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the 2 Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high-quality reads (96.06% of raw reads) per sample, with 90.56% of bases scoring Q30 and above, was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%–3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias toward genes with higher GC content being enriched on the HiSeq platforms. Conclusions: Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.
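The abstract above quotes "Q30" and a retained-read percentage without unpacking them. As a rough illustration only, the sketch below uses the standard Phred convention (Q = -10 · log10 of the per-base error probability) and simple arithmetic on the two figures given in the abstract. The function names are hypothetical, and nothing here reproduces the quality control method the authors developed.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to a per-base error probability.

    Uses the standard Phred definition Q = -10 * log10(p).
    """
    return 10 ** (-q / 10)


def raw_reads_from_retained(retained_millions, retained_fraction):
    """Back-calculate raw reads from retained reads and the retained fraction.

    Hypothetical helper: assumes the quoted percentage is retained / raw.
    """
    return retained_millions / retained_fraction


# Q30 corresponds to an error probability of about 1 in 1,000.
print(phred_to_error_probability(30))

# 82.45 million high-quality reads at 96.06% of raw reads implies
# roughly this many million raw reads per sample on average.
print(round(raw_reads_from_retained(82.45, 0.9606), 2))
```

Read this way, "90.56% of bases scoring Q30 and above" means that about nine in ten bases carry an estimated error probability of 0.001 or lower.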


Insect Systematics & Evolution | 2016

A miniaturized beetle larva in Cretaceous Burmese amber: reinterpretation of a fossil “strepsipteran triungulin”

Rolf G. Beutel; Wenwei Zhang; Hans Pohl; Torsten Wappler; Ming Bai

A wingless and eyeless tiny fossil embedded in Cretaceous amber from Myanmar is described and interpreted phylogenetically as a beetle larva, very likely belonging to a cucujiform group of Coleoptera with parasitic habits, probably the family Ripiphoridae. Features supporting this are the lobe-like terminal elements of the legs and the pattern of setae on the abdomen. However, the larva displays specialized features differing from immatures of extant ripiphorid species, such as the absence of stemmata and the presence of ventral transverse rows of spines. An earlier tentative assignment of a similar larva embedded in Cretaceous amber from Manitoba (Canada) to Strepsiptera is not followed here. We suggest that this larva is closely related to the beetle larva described here.


Clinical Chemistry | 2018

Advanced Whole-Genome Sequencing and Analysis of Fetal Genomes from Amniotic Fluid

Qing Mao; Robert Chin; Weiwei Xie; Yuqing Deng; Wenwei Zhang; Huixin Xu; Rebecca Yu Zhang; Quan Shi; Erin E. Peters; Natali Gulbahce; Zhenyu Li; Fang Chen; Radoje Drmanac; Brock A. Peters

BACKGROUND Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. METHODS cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. RESULTS Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 (CHD8) and LDL receptor-related protein 1 (LRP1), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. CONCLUSIONS We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures.


bioRxiv | 2018

Single tube bead-based DNA co-barcoding for cost effective and accurate sequencing, haplotyping, and assembly

Ou Wang; Robert Chin; Xiaofang Cheng; Michelle Wu; Qing Mao; Jingbo Tang; Yuhui Sun; Han Lam; Dan Chen; Yujun Zhou; Linying Wang; Fei Fan; Yan Zou; Ellis Anderson; Yinlong Xie; Rebecca Yu Zhang; Snezana Drmanac; Darlene Nguyen; Chongjun Xu; Christian Villarosa; Scott Gablenz; Nina Barua; Staci Nguyen; Wenlan Tian; Jia Liu; Jingwan Wang; Xiao Liu; Xiaojuan Qi; Ao Chen; He Wang

Obtaining accurate sequences from long DNA molecules is very important for genome assembly and other applications. Here we describe single tube long fragment read (stLFR), a technology that enables this at low cost. It is based on adding the same barcode sequence to sub-fragments of the original long DNA molecule (DNA co-barcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically non-redundant co-barcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique co-barcoding of over 8 million 20-300 kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high quality variant calling and phasing into contigs up to N50 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries and their construction did not significantly add to the time or cost of whole genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.

Single tube long fragment read (stLFR) technology enables efficient WGS, haplotyping, and contig scaffolding. It is based on adding the same barcode sequence to sub-fragments of the original DNA molecule (DNA co-barcoding). To achieve this, stLFR uses the surface of microbeads to create millions of miniaturized compartments in a single tube. Using a combinatorial process, over 1.8 billion unique barcode sequences were generated on beads, enabling practically non-redundant co-barcoding in reactions with 50 million barcodes. Using stLFR, we demonstrate efficient unique co-barcoding of over 8 million 20-300 kb genomic DNA fragments with near perfect variant calling and phasing of the genome of NA12878 into contigs up to N50 23.4 Mb. stLFR represents a low-cost single library solution that can enable long sequence data.
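The abstract reports phasing results as an N50 value without defining the statistic. The sketch below shows the conventional definition, assuming N50 here carries its usual meaning: the length of the shortest block in the smallest set of longest blocks that together cover at least half of the total length. The function name and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or phase block lengths.

    Blocks are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the summed length; the length
    of the block that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy block lengths in Mb (hypothetical, for illustration only).
toy_lengths_mb = [2, 3, 4, 5, 6]
print(n50(toy_lengths_mb))  # 5
```

Because a few very long blocks can dominate the total, N50 is a summary of contiguity rather than a typical block length.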

Collaboration


Dive into Wenwei Zhang's collaborations.

Top Co-Authors

Radoje Drmanac
Argonne National Laboratory

Jian Wang
Guangzhou Medical University

Xun Xu
Beijing Institute of Genomics

Huanming Yang
Chinese Academy of Sciences

Hui Jiang
Chinese Center for Disease Control and Prevention

Junhua Li
South China University of Technology

Brock A. Peters
Howard Hughes Medical Institute

Snezana Drmanac
Argonne National Laboratory