Shujun Luo | Researchain

Publication

Featured researches published by Shujun Luo.

Nature | 2008

Alternative isoform regulation in human tissue transcriptomes

Eric T. Wang; Rickard Sandberg; Shujun Luo; Irina Khrebtukova; Lu Zhang; Christine Mayr; Stephen F. Kingsmore; Gary P. Schroth; Christopher B. Burge

Through alternative processing of pre-messenger RNAs, individual mammalian genes often produce multiple mRNA and protein isoforms that may have related, distinct or even opposing functions. Here we report an in-depth analysis of 15 diverse human tissue and cell line transcriptomes on the basis of deep sequencing of complementary DNA fragments, yielding a digital inventory of gene and mRNA isoform expression. Analyses in which sequence reads are mapped to exon–exon junctions indicated that 92–94% of human genes undergo alternative splicing, ∼86% with a minor isoform frequency of 15% or more. Differences in isoform-specific read densities indicated that most alternative splicing and alternative cleavage and polyadenylation events vary between tissues, whereas variation between individuals was approximately twofold to threefold less common. Extreme or ‘switch-like’ regulation of splicing between tissues was associated with increased sequence conservation in regulatory regions and with generation of full-length open reading frames. Patterns of alternative splicing and alternative cleavage and polyadenylation were strongly correlated across tissues, suggesting coordinated regulation of these processes, and sequence conservation of a subset of known regulatory motifs in both alternative introns and 3′ untranslated regions suggested common involvement of specific factors in tissue-level regulation of both splicing and polyadenylation.
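The "minor isoform frequency of 15% or more" criterion in the abstract above can be illustrated with a minimal Python sketch for a single alternative splicing event. The function names, the isoform labels, the example counts, and the simple read-proportion estimator are all assumptions made for illustration; the paper itself works from isoform-specific read densities, and this is not its method.

```python
def minor_isoform_frequency(junction_reads):
    """Fraction of junction reads supporting the second most abundant isoform.

    junction_reads: dict mapping a (hypothetical) isoform label to the number
    of reads mapped to exon-exon junctions specific to that isoform.
    Taking the second most abundant isoform as the "minor" one is an
    assumption of this sketch.
    """
    total = sum(junction_reads.values())
    if total == 0 or len(junction_reads) < 2:
        # No reads, or only one isoform observed: no minor isoform to report.
        return 0.0
    ranked = sorted(junction_reads.values(), reverse=True)
    return ranked[1] / total


def passes_threshold(junction_reads, threshold=0.15):
    """True if the minor isoform frequency reaches the given threshold."""
    return minor_isoform_frequency(junction_reads) >= threshold


# Hypothetical counts for one exon-skipping event.
example = {"inclusion": 80, "skipping": 20}
print(minor_isoform_frequency(example))
print(passes_threshold(example))
```

With these made-up counts, 20 of 100 junction reads support the less abundant isoform, so the event would count toward the 15%-or-more category under this simplified estimator.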

Science | 2011

Metagenomic discovery of biomass-degrading genes and genomes from cow rumen

Matthias Hess; Alexander Sczyrba; Rob Egan; Tae Wan Kim; Harshal A. Chokhawala; Gary P. Schroth; Shujun Luo; Douglas S. Clark; Feng Chen; Tao Zhang; Roderick I. Mackie; Len A. Pennacchio; Susannah G. Tringe; Axel Visel; Tanja Woyke; Zhong Wang; Edward M. Rubin

Metagenomic sequencing of biomass-degrading microbes from cow rumen reveals new carbohydrate-active enzymes. The paucity of enzymes that efficiently deconstruct plant polysaccharides represents a major bottleneck for industrial-scale conversion of cellulosic biomass into biofuels. Cow rumen microbes specialize in degradation of cellulosic plant material, but most members of this complex community resist cultivation. To characterize biomass-degrading genes and genomes, we sequenced and analyzed 268 gigabases of metagenomic DNA from microbes adherent to plant fiber incubated in cow rumen. From these data, we identified 27,755 putative carbohydrate-active genes and expressed 90 candidate proteins, of which 57% were enzymatically active against cellulosic substrates. We also assembled 15 uncultured microbial genomes, which were validated by complementary methods including single-cell genome sequencing. These data sets provide a substantially expanded catalog of genes and genomes participating in the deconstruction of cellulosic biomass.

Nature Biotechnology | 2012

Full-Length mRNA-Seq from single cell levels of RNA and individual circulating tumor cells

Daniel Ramsköld; Shujun Luo; Yu-Chieh Wang; Robin Li; Qiaolin Deng; Omid R Faridani; Gregory A. Daniels; Irina Khrebtukova; Jeanne F. Loring; Louise C. Laurent; Gary P. Schroth; Rickard Sandberg

Genome-wide transcriptome analyses are routinely used to monitor tissue-, disease- and cell type–specific gene expression, but it has been technically challenging to generate expression profiles from single cells. Here we describe a robust mRNA-Seq protocol (Smart-Seq) that is applicable down to single cell levels. Compared with existing methods, Smart-Seq has improved read coverage across transcripts, which enhances detailed analyses of alternative transcript isoforms and identification of single-nucleotide polymorphisms. We determined the sensitivity and quantitative accuracy of Smart-Seq for single-cell transcriptomics by evaluating it on total RNA dilution series. We found that although gene expression estimates from single cells have increased noise, hundreds of differentially expressed genes could be identified using few cells per cell type. Applying Smart-Seq to circulating tumor cells from melanomas, we identified distinct gene expression patterns, including candidate biomarkers for melanoma circulating tumor cells. Our protocol will be useful for addressing fundamental biological problems requiring genome-wide transcriptome profiling in rare cells.

Nature Biotechnology | 2008

Global identification of microRNA-target RNA pairs by parallel analysis of RNA ends

Marcelo A German; Manoj Pillay; Dong-Hoon Jeong; Amit Hetawal; Shujun Luo; Prakash Janardhanan; Vimal Kannan; Linda A. Rymarquis; Kan Nobuta; Rana German; Emanuele De Paoli; Cheng Lu; Gary P. Schroth; Blake C. Meyers; Pamela J. Green

MicroRNAs (miRNAs) are important regulatory molecules in most eukaryotes and identification of their target mRNAs is essential for their functional analysis. Whereas conventional methods rely on computational prediction and subsequent experimental validation of target RNAs, we directly sequenced >28,000,000 signatures from the 5′ ends of polyadenylated products of miRNA-mediated mRNA decay, isolated from inflorescence tissue of Arabidopsis thaliana, to discover novel miRNA–target RNA pairs. Within the set of ∼27,000 transcripts included in the 8,000,000 nonredundant signatures, several previously predicted but nonvalidated targets of miRNAs were found. Like validated targets, most showed a single abundant signature at the miRNA cleavage site, particularly in libraries from a mutant deficient in the 5′-to-3′ exonuclease AtXRN4. Although miRNAs in Arabidopsis have been extensively investigated, working in reverse from the cleaved targets resulted in the identification and validation of novel miRNAs. This versatile approach will affect the study of other aspects of RNA processing beyond miRNA–target RNA pairs.

Science | 2010

High Resolution Analysis of Parent-of-Origin Allelic Expression in the Mouse Brain

Christopher Gregg; Jiangwen Zhang; Brandon Weissbourd; Shujun Luo; Gary P. Schroth; David Haig; Catherine Dulac

Parental Influences. Genomic imprinting results in the preferential expression of either the paternally or the maternally inherited allele of certain genes. Two papers by Gregg et al. (p. 643, published online 8 July; and p. 682, published online 8 July; see the Perspective by Wilkinson) use a genome-wide approach to characterize the repertoire of genes with parent-of-origin allelic effects in the mouse embryonic and adult brain. The studies uncovered over 1300 loci with maternal or paternal allelic bias. Comparison of the parent-of-origin allelic expression bias in the adult hypothalamus and cortex, and in the developing brain, revealed spatiotemporal, sex-specific, and isoform-specific regulation. Parent-of-origin effects thus represent a major and dynamic mode of epigenetic regulation of gene expression in the brain.

A large repertoire of genes shows preferential expression of the paternally or maternally inherited allele.

Genomic imprinting results in preferential expression of the paternal or maternal allele of certain genes. We have performed a genome-wide characterization of imprinting in the mouse embryonic and adult brain. This approach uncovered parent-of-origin allelic effects of more than 1300 loci. We identified parental bias in the expression of individual genes and of specific transcript isoforms, with differences between brain regions. Many imprinted genes are expressed in neural systems associated with feeding and motivated behaviors, and parental biases preferentially target genetic pathways governing metabolism and cell adhesion. We observed a preferential maternal contribution to gene expression in the developing brain and a major paternal contribution in the adult brain. Thus, parental expression bias emerges as a major mode of epigenetic regulation in the brain.

Nature | 2010

Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis

Sergio E. Baranzini; Joann Mudge; Jennifer C. van Velkinburgh; Pouya Khankhanian; Irina Khrebtukova; Neil Miller; Lu Zhang; Andrew D. Farmer; Callum J. Bell; Ryan W. Kim; Gregory D. May; Jimmy E. Woodward; Stacy J. Caillier; Joseph P. McElroy; Refujia Gomez; Marcelo J. Pando; Leonda E. Clendenen; Elena E. Ganusova; Faye D. Schilkey; Thiruvarangan Ramaraj; Omar Khan; Jim J. Huntley; Shujun Luo; Pui-Yan Kwok; Thomas D. Wu; Gary P. Schroth; Jorge R. Oksenberg; Stephen L. Hauser; Stephen F. Kingsmore

Monozygotic or ‘identical’ twins have been widely studied to dissect the relative contributions of genetics and environment in human diseases. In multiple sclerosis (MS), an autoimmune demyelinating disease and common cause of neurodegeneration and disability in young adults, disease discordance in monozygotic twins has been interpreted to indicate environmental importance in its pathogenesis. However, genetic and epigenetic differences between monozygotic twins have been described, challenging the accepted experimental model in disambiguating the effects of nature and nurture. Here we report the genome sequences of one MS-discordant monozygotic twin pair, and messenger RNA transcriptome and epigenome sequences of CD4+ lymphocytes from three MS-discordant, monozygotic twin pairs. No reproducible differences were detected between co-twins among ∼3.6 million single nucleotide polymorphisms (SNPs) or ∼0.2 million insertion-deletion polymorphisms. Nor were any reproducible differences observed between siblings of the three twin pairs in HLA haplotypes, confirmed MS-susceptibility SNPs, copy number variations, mRNA and genomic SNP and insertion-deletion genotypes, or the expression of ∼19,000 genes in CD4+ T cells. Only 2 to 176 differences in the methylation of ∼2 million CpG dinucleotides were detected between siblings of the three twin pairs, in contrast to ∼800 methylation differences between T cells of unrelated individuals and several thousand differences between tissues or between normal and cancerous tissues. In the first systematic effort to estimate sequence variation among monozygotic co-twins, we did not find evidence for genetic, epigenetic or transcriptome differences that explained disease discordance. These are the first, to our knowledge, female, twin and autoimmune disease individual genome sequences reported.

Molecular Cell | 2008

PRG-1 and 21U-RNAs Interact to Form the piRNA Complex Required for Fertility in C. elegans

Pedro J. Batista; J. Graham Ruby; Julie M. Claycomb; H. Rosaria Chiang; Noah Fahlgren; Kristin D. Kasschau; Daniel A. Chaves; Weifeng Gu; Jessica J. Vasale; Shenghua Duan; Darryl Conte; Shujun Luo; Gary P. Schroth; James C. Carrington; David P. Bartel; Craig C. Mello

In metazoans, Piwi-related Argonaute proteins have been linked to germline maintenance, and to a class of germline-enriched small RNAs termed piRNAs. Here we show that an abundant class of 21 nucleotide small RNAs (21U-RNAs) are expressed in the C. elegans germline, interact with the C. elegans Piwi family member PRG-1, and depend on PRG-1 activity for their accumulation. The PRG-1 protein is expressed throughout development and localizes to nuage-like structures called P granules. Although 21U-RNA loci share a conserved upstream sequence motif, the mature 21U-RNAs are not conserved and, with few exceptions, fail to exhibit complementarity or evidence for direct regulation of other expressed sequences. Our findings demonstrate that 21U-RNAs are the piRNAs of C. elegans and link this class of small RNAs and their associated Piwi Argonaute to the maintenance of temperature-dependent fertility.

Proceedings of the National Academy of Sciences of the United States of America | 2009

Chimeric transcript discovery by paired-end transcriptome sequencing

Christopher A. Maher; Nallasivam Palanisamy; John C. Brenner; Xuhong Cao; Shanker Kalyana-Sundaram; Shujun Luo; Irina Khrebtukova; Terrence R. Barrette; Catherine S. Grasso; Jindan Yu; Robert J. Lonigro; Gary P. Schroth; Chandan Kumar-Sinha; Arul M. Chinnaiyan

Recurrent gene fusions are a prevalent class of mutations arising from the juxtaposition of 2 distinct regions, which can generate novel functional transcripts that could serve as valuable therapeutic targets in cancer. Therefore, we aim to establish a sensitive, high-throughput methodology to comprehensively catalog functional gene fusions in cancer by evaluating a paired-end transcriptome sequencing strategy. Not only did a paired-end approach provide a greater dynamic range in comparison with single read based approaches, but it clearly distinguished the high-level “driving” gene fusions, such as BCR-ABL1 and TMPRSS2-ERG, from potential lower level “passenger” gene fusions. Also, the comprehensiveness of a paired-end approach enabled the discovery of 12 previously undescribed gene fusions in 4 commonly used cell lines that eluded previous approaches. Using the paired-end transcriptome sequencing approach, we observed read-through mRNA chimeras, tissue-type restricted chimeras, converging transcripts, diverging transcripts, and overlapping mRNA transcripts. Last, we successfully used paired-end transcriptome sequencing to detect previously undescribed ETS gene fusions in prostate tumors. Together, this study establishes a highly specific and sensitive approach for accurately and comprehensively cataloguing chimeras within a sample using paired-end transcriptome sequencing.

Proceedings of the National Academy of Sciences of the United States of America | 2011

Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq)

Julius B. Lucks; Stefanie A. Mortimer; Cole Trapnell; Shujun Luo; Sharon Aviran; Gary P. Schroth; Lior Pachter; Jennifer A. Doudna; Adam P. Arkin

New regulatory roles continue to emerge for both natural and engineered noncoding RNAs, many of which have specific secondary and tertiary structures essential to their function. Thus there is a growing need to develop technologies that enable rapid characterization of structural features within complex RNA populations. We have developed a high-throughput technique, SHAPE-Seq, that can simultaneously measure quantitative, single nucleotide-resolution secondary and tertiary structural information for hundreds of RNA molecules of arbitrary sequence. SHAPE-Seq combines selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry with multiplexed paired-end deep sequencing of primer extension products. This generates millions of sequencing reads, which are then analyzed using a fully automated data analysis pipeline, based on a rigorous maximum likelihood model of the SHAPE-Seq experiment. We demonstrate the ability of SHAPE-Seq to accurately infer secondary and tertiary structural information, detect subtle conformational changes due to single nucleotide point mutations, and simultaneously measure the structures of a complex pool of different RNA molecules. SHAPE-Seq thus represents a powerful step toward making the study of RNA secondary and tertiary structures high throughput and accessible to a wide array of scientific pursuits, from fundamental biological investigations to engineering RNA for synthetic biological systems.

Blood | 2012

Monitoring chronic lymphocytic leukemia progression by whole genome sequencing reveals heterogeneous clonal evolution patterns

Anna Schuh; Jennifer Becq; Sean Humphray; Adrian Alexa; Adam Burns; Ruth Clifford; Stephan M. Feller; Russell Grocock; Shirley Henderson; Irina Khrebtukova; Zoya Kingsbury; Shujun Luo; David McBride; Lisa Murray; Toshi Menju; Adele Timbs; Mark T. Ross; Jenny C. Taylor; David R. Bentley

Chronic lymphocytic leukemia is characterized by relapse after treatment and chemotherapy resistance. As in other malignancies, leukemia cells accumulate mutations during growth, forming heterogeneous cell populations that are subject to Darwinian selection and may respond differentially to treatment. There is therefore a clinical need to monitor changes in the subclonal composition of cancers during disease progression. Here, we use whole-genome sequencing to track subclonal heterogeneity in 3 chronic lymphocytic leukemia patients subjected to repeated cycles of therapy. We reveal different somatic mutation profiles in each patient and use these to establish probable hierarchical patterns of subclonal evolution, to identify subclones that decline or expand over time, and to detect founder mutations. We show that clonal evolution patterns are heterogeneous in individual patients. We conclude that genome sequencing is a powerful and sensitive approach to monitor disease progression repeatedly at the molecular level. If applied to future clinical trials, this approach might eventually influence treatment strategies as a tool to individualize and direct cancer treatment.
