Publication


Featured research published by Leo J. Lee.


Nature Genetics | 2008

Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing

Qun Pan; Ofer Shai; Leo J. Lee; Brendan J. Frey; Benjamin J. Blencowe

We carried out the first analysis of alternative splicing complexity in human tissues using mRNA-Seq data. New splice junctions were detected in ∼20% of multiexon genes, many of which are tissue specific. By combining mRNA-Seq and EST-cDNA sequence data, we estimate that transcripts from ∼95% of multiexon genes undergo alternative splicing and that there are ∼100,000 intermediate- to high-abundance alternative splicing events in major human tissues. From a comparison with quantitative alternative splicing microarray profiling data, we also show that mRNA-Seq data provide reliable measurements for exon inclusion levels.


Science | 2012

The Evolutionary Landscape of Alternative Splicing in Vertebrate Species

Nuno L. Barbosa-Morais; Manuel Irimia; Qun Pan; Hui Yuan Xiong; Serge Gueroussov; Leo J. Lee; Valentina Slobodeniuc; Claudia Kutter; Stephen Watt; Recep Colak; Tae-Hyung Kim; Christine M. Misquitta-Ali; Michael D. Wilson; Philip M. Kim; Duncan T. Odom; Brendan J. Frey; Benjamin J. Blencowe

Whence Species Variation? Vertebrates have widely varying phenotypes that are at odds with their much more limited protein-coding genotypes and conserved messenger RNA expression patterns. Genes with multiple exons and introns can undergo alternative splicing, potentially resulting in multiple protein isoforms (see the Perspective by Papasaikas and Valcárcel). Barbosa-Morais et al. (p. 1587) and Merkin et al. (p. 1593) analyzed alternative splicing across the genomes of a variety of vertebrates, including human, primates, rodents, opossum, platypus, chicken, lizard, and frog. The findings suggest that the evolution of alternative splicing has for the most part been very rapid and that alternative splicing patterns of most organs more strongly reflect the identity of the species rather than the organ type. Species-classifying alternative splicing can affect key regulators, often in disordered regions of proteins that may influence protein-protein interactions, or in regions involved in protein phosphorylation.

The patterns and complexity of messenger RNA splicing across vertebrates cluster by species rather than by organ.

How species with similar repertoires of protein-coding genes differ so markedly at the phenotypic level is poorly understood. By comparing organ transcriptomes from vertebrate species spanning ~350 million years of evolution, we observed significant differences in alternative splicing complexity between vertebrate lineages, with the highest complexity in primates. Within 6 million years, the splicing profiles of physiologically equivalent organs diverged such that they are more strongly related to the identity of a species than they are to organ type. Most vertebrate species-specific splicing patterns are cis-directed. However, a subset of pronounced splicing changes are predicted to remodel protein interactions involving trans-acting regulators. These events likely further contributed to the diversification of splicing and other transcriptomic changes that underlie phenotypic differences among vertebrate species.


Science | 2015

The human splicing code reveals new insights into the genetic determinants of disease

Hui Y. Xiong; Babak Alipanahi; Leo J. Lee; Hannes Bretschneider; Daniele Merico; Ryan K. C. Yuen; Yimin Hua; Serge Gueroussov; Hamed Shateri Najafabadi; Timothy R. Hughes; Quaid Morris; Yoseph Barash; Adrian R. Krainer; Nebojsa Jojic; Stephen W. Scherer; Benjamin J. Blencowe; Brendan J. Frey

Predicting defects in RNA splicing

Most eukaryotic messenger RNAs (mRNAs) are spliced to remove introns. Splicing generates uninterrupted open reading frames that can be translated into proteins. Splicing is often highly regulated, generating alternative spliced forms that code for variant proteins in different tissues. RNA-binding proteins that bind specific sequences in the mRNA regulate splicing. Xiong et al. develop a computational model that predicts splicing regulation for any mRNA sequence (see the Perspective by Guigó and Valcárcel). They use this to analyze more than half a million mRNA splicing sequence variants in the human genome. They are able to identify thousands of known disease-causing mutations, as well as many new disease candidates, including 17 new autism-linked genes. Science, this issue 10.1126/science.1254806; see also p. 124.

A model predicts how thousands of disease-linked nucleotide variants affect messenger RNA splicing. [Also see Perspective by Guigó and Valcárcel]

INTRODUCTION: Advancing whole-genome precision medicine requires understanding how gene expression is altered by genetic variants, especially those that are far outside of protein-coding regions. We developed a computational technique that scores how strongly genetic variants affect RNA splicing, a critical step in gene expression whose disruption contributes to many diseases, including cancers and neurological disorders. A genome-wide analysis reveals tens of thousands of variants that alter splicing and are enriched with a wide range of known diseases. Our results provide insight into the genetic basis of spinal muscular atrophy, hereditary nonpolyposis colorectal cancer, and autism spectrum disorder.

RATIONALE: We used “deep learning” computer algorithms to derive a computational model that takes as input DNA sequences and applies general rules to predict splicing in human tissues. Given a test variant, which may be up to 300 nucleotides into an intron, our model can be used to compute a score for how much the variant alters splicing. The model is not biased by existing disease annotations or population data and was derived in such a way that it can be used to study diverse diseases and disorders and to determine the consequences of common, rare, and even spontaneous variants.

RESULTS: Our technique is able to accurately classify disease-causing variants and provides insights into the role of aberrant splicing in disease. We scored more than 650,000 DNA variants and found that disease-causing variants have higher scores than common variants and even those associated with disease in genome-wide association studies (GWAS). Our model predicts substantial and unexpected aberrant splicing due to variants within introns and exons, including those far from the splice site. For example, among intronic variants that are more than 30 nucleotides away from any splice site, known disease variants alter splicing nine times as often as common variants; among missense exonic disease variants, those that least affect protein function are more than five times as likely as other variants to alter splicing. Autism has been associated with disrupted splicing in brain regions, so we used our method to score variants detected using whole-genome sequencing data from individuals with and without autism. Genes with high-scoring variants include many that have previously been linked with autism, as well as new genes with known neurodevelopmental phenotypes. Most of the high-scoring variants are intronic and cannot be detected by exome analysis techniques. When we scored clinical variants in spinal muscular atrophy and colorectal cancer genes, up to 94% of variants found to alter splicing using minigene reporters were correctly classified.

CONCLUSION: In the context of precision medicine, causal support for variants independent of existing whole-genome variant studies is greatly needed. Our computational model was trained to predict splicing from DNA sequence alone, without using disease annotations or population data. Consequently, its predictions are independent of and complementary to population data, GWAS, expression-based quantitative trait loci (QTL), and functional annotations of the genome. As such, our technique greatly expands the opportunities for understanding the genetic determinants of disease.

“Deep learning” reveals the genetic origins of disease. A computational system mimics the biology of RNA splicing by correlating DNA elements with splicing levels in healthy human tissues. The system can scan DNA and identify damaging genetic variants, including those deep within introns. This procedure has led to insights into the genetics of autism, cancers, and spinal muscular atrophy.

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.


Genes & Development | 2009

Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes.

Benjamin J. Blencowe; Sidrah Ahmad; Leo J. Lee

Recent papers have described the first application of high-throughput sequencing (HTS) technologies to the characterization of transcriptomes. These studies emphasize the tremendous power of this new technology, in terms of both profiling coverage and quantitative accuracy. Initial discoveries include the detection of substantial new transcript complexity, the elucidation of binding maps and regulatory properties of RNA-binding proteins, and new insights into the links between different steps in pre-mRNA processing. We review these findings, focusing on results from profiling mammalian transcriptomes. The strengths and limitations of HTS relative to microarray profiling are discussed. We also consider how future advances in HTS technology are likely to transform our understanding of integrated cellular networks operating at the RNA level.


Bioinformatics | 2014

Deep learning of the tissue-regulated splicing code

Michael K. K. Leung; Hui Yuan Xiong; Leo J. Lee; Brendan J. Frey

Motivation: Alternative splicing (AS) is a regulated process that directs the generation of different transcripts from single genes. A computational model that can accurately predict splicing patterns based on genomic features and cellular context is highly desirable, both in understanding this widespread phenomenon, and in exploring the effects of genetic variations on AS. Methods: Using a deep neural network, we developed a model inferred from mouse RNA-Seq data that can predict splicing patterns in individual tissues and differences in splicing patterns across tissues. Our architecture uses hidden variables that jointly represent features in genomic sequences and tissue types when making predictions. A graphics processing unit was used to greatly reduce the training time of our models with millions of parameters. Results: We show that the deep architecture surpasses the performance of the previous Bayesian method for predicting AS patterns. With the proper optimization procedure and selection of hyperparameters, we demonstrate that deep architectures can be beneficial, even with a moderately sparse dataset. An analysis of what the model has learned in terms of the genomic features is presented. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Genome Research | 2011

Genome-wide analysis of alternative splicing in Caenorhabditis elegans

Arun K. Ramani; John A. Calarco; Qun Pan; Sepand Mavandadi; Ying Wang; Andrew C. Nelson; Leo J. Lee; Quaid Morris; Benjamin J. Blencowe; Mei Zhen; Andrew G. Fraser

Alternative splicing (AS) plays a crucial role in the diversification of gene function and regulation. Consequently, the systematic identification and characterization of temporally regulated splice variants is of critical importance to understanding animal development. We have used high-throughput RNA sequencing and microarray profiling to analyze AS in C. elegans across various stages of development. This analysis identified thousands of novel splicing events, including hundreds of developmentally regulated AS events. To make these data easily accessible and informative, we constructed the C. elegans Splice Browser, a web resource in which researchers can mine AS events of interest and retrieve information about their relative levels and regulation across development. The data presented in this study, along with the Splice Browser, provide the most comprehensive set of annotated splice variants in C. elegans to date, and are therefore expected to facilitate focused, high resolution in vivo functional assays of AS function.


Genome Biology | 2007

Functional coordination of alternative splicing in the mammalian central nervous system

Matthew M. Fagnani; Yoseph Barash; Joanna Y. Ip; Christine M. Misquitta; Qun Pan; Arneet L. Saltzman; Ofer Shai; Leo J. Lee; Aviad Rozenhek; Naveed Mohammad; Sandrine Willaime-Morawek; Tomas Babak; Wen Zhang; Timothy R. Hughes; Derek van der Kooy; Brendan J. Frey; Benjamin J. Blencowe

Background: Alternative splicing (AS) functions to expand proteomic complexity and plays numerous important roles in gene regulation. However, the extent to which AS coordinates functions in a cell and tissue type specific manner is not known. Moreover, the sequence code that underlies cell and tissue type specific regulation of AS is poorly understood. Results: Using quantitative AS microarray profiling, we have identified a large number of widely expressed mouse genes that contain single or coordinated pairs of alternative exons that are spliced in a tissue regulated fashion. The majority of these AS events display differential regulation in central nervous system (CNS) tissues. Approximately half of the corresponding genes have neural specific functions and operate in common processes and interconnected pathways. Differential regulation of AS in the CNS tissues correlates strongly with a set of mostly new motifs that are predominantly located in the intron and constitutive exon sequences neighboring CNS-regulated alternative exons. Different subsets of these motifs are correlated with either increased inclusion or increased exclusion of alternative exons in CNS tissues, relative to the other profiled tissues. Conclusion: Our findings provide new evidence that specific cellular processes in the mammalian CNS are coordinated at the level of AS, and that a complex splicing code underlies CNS specific AS regulation. This code appears to comprise many new motifs, some of which are located in the constitutive exons neighboring regulated alternative exons. These data provide a basis for understanding the molecular mechanisms by which the tissue specific functions of widely expressed genes are coordinated at the level of AS.


Nature Genetics | 2013

Mutations in STAMBP, encoding a deubiquitinating enzyme, cause microcephaly-capillary malformation syndrome

Laura M McDonell; Ghayda M. Mirzaa; Diana Alcantara; Jeremy Schwartzentruber; Melissa T. Carter; Leo J. Lee; Carol L. Clericuzio; John M. Graham; Deborah J. Morris-Rosendahl; Tilman Polster; Gyula Acsadi; Sharron Townshend; Simon Williams; Anne Halbert; Bertrand Isidor; Albert David; Christopher D. Smyser; Alex R. Paciorkowski; Marcia C. Willing; John Woulfe; Soma Das; Chandree L. Beaulieu; Janet Marcadier; Michael T. Geraghty; Brendan J. Frey; Jacek Majewski; Dennis E. Bulman; William B. Dobyns; Mark O'Driscoll; Kym M. Boycott

Microcephaly–capillary malformation (MIC-CAP) syndrome is characterized by severe microcephaly with progressive cortical atrophy, intractable epilepsy, profound developmental delay and multiple small capillary malformations on the skin. We used whole-exome sequencing of five patients with MIC-CAP syndrome and identified recessive mutations in STAMBP, a gene encoding the deubiquitinating (DUB) isopeptidase STAMBP (STAM-binding protein, also known as AMSH, associated molecule with the SH3 domain of STAM) that has a key role in cell surface receptor–mediated endocytosis and sorting. Patient cell lines showed reduced STAMBP expression associated with accumulation of ubiquitin-conjugated protein aggregates, elevated apoptosis and insensitive activation of the RAS-MAPK and PI3K-AKT-mTOR pathways. The latter cellular phenotype is notable considering the established connection between these pathways and their association with vascular and capillary malformations. Furthermore, our findings of a congenital human disorder caused by a defective DUB protein that functions in endocytosis implicates ubiquitin-conjugate aggregation and elevated apoptosis as factors potentially influencing the progressive neuronal loss underlying MIC-CAP syndrome.


BMC Bioinformatics | 2012

Challenges in estimating percent inclusion of alternatively spliced junctions from RNA-seq data

Boyko Kakaradov; Hui Yuan Xiong; Leo J. Lee; Nebojsa Jojic; Brendan J. Frey

Transcript quantification is a long-standing problem in genomics and estimating the relative abundance of alternatively-spliced isoforms from the same transcript is an important special case. Both problems have recently been illuminated by high-throughput RNA sequencing experiments which are quickly generating large amounts of data. However, much of the signal present in this data is corrupted or obscured by biases resulting in non-uniform and non-proportional representation of sequences from different transcripts. Many existing analyses attempt to deal with these and other biases with various task-specific approaches, which makes direct comparison between them difficult. However, two popular tools for isoform quantification, MISO and Cufflinks, have adopted a general probabilistic framework to model and mitigate these biases in a more general fashion. These advances motivate the need to investigate the effects of RNA-seq biases on the accuracy of different approaches for isoform quantification. We conduct the investigation by building models of increasing sophistication to account for noise introduced by the biases and compare their accuracy to the established approaches. We focus on methods that estimate the expression of alternatively-spliced isoforms with the percent-spliced-in (PSI) metric for each exon skipping event. To improve their estimates, many methods use evidence from RNA-seq reads that align to exon bodies. However, the methods we propose focus on reads that span only exon-exon junctions. As a result, our approaches are simpler and less sensitive to exon definitions than existing methods, which enables us to distinguish their strengths and weaknesses more easily. We present several probabilistic models of position-specific read counts with increasing complexity and compare them to each other and to the current state-of-the-art methods in isoform quantification, MISO and Cufflinks. On a validation set with RT-PCR measurements for 26 cassette events, some of our methods are more accurate and some are significantly more consistent than these two popular tools. This comparison demonstrates the challenges in estimating the percent inclusion of alternatively spliced junctions and illuminates the tradeoffs between different approaches.
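
As a rough illustration of the percent-spliced-in (PSI) metric mentioned in this abstract, the sketch below computes a naive PSI estimate for a single cassette exon from junction-spanning read counts only. It is not one of the probabilistic models from the paper, nor the MISO or Cufflinks method; the function name `junction_psi`, its parameters, and the choice to average the two inclusion junctions are assumptions made for this example.

```python
def junction_psi(upstream_inc: int, downstream_inc: int, skipping: int) -> float:
    """Naive PSI (in percent) for one cassette exon from junction reads.

    Hypothetical helper, not taken from the paper:
      upstream_inc   - reads spanning the upstream-exon/cassette-exon junction
      downstream_inc - reads spanning the cassette-exon/downstream-exon junction
      skipping       - reads spanning the junction that skips the cassette exon
    """
    if min(upstream_inc, downstream_inc, skipping) < 0:
        raise ValueError("read counts must be non-negative")

    # Inclusion is supported by two junctions and skipping by one,
    # so average the two inclusion counts to keep the evidence comparable.
    inclusion = (upstream_inc + downstream_inc) / 2.0
    total = inclusion + skipping
    if total == 0:
        raise ValueError("no junction reads; PSI is undefined")

    return 100.0 * inclusion / total


if __name__ == "__main__":
    # 30 and 50 inclusion-junction reads, 10 skipping-junction reads.
    print(junction_psi(30, 50, 10))
```

A point estimate like this ignores the position-specific biases and low-coverage uncertainty that the abstract identifies as the main difficulty, which is presumably why the authors turn to probabilistic models of read counts instead.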


Genome Biology | 2013

AVISPA: a web tool for the prediction and analysis of alternative splicing

Yoseph Barash; Jorge Vaquero-Garcia; Juan González-Vallinas; Hui Yuan Xiong; Weijun Gao; Leo J. Lee; Brendan J. Frey

Transcriptome complexity and its relation to numerous diseases underpin the need to predict in silico splice variants and the regulatory elements that affect them. Building upon our recently described splicing code, we developed AVISPA, a Galaxy-based web tool for splicing prediction and analysis. Given an exon and its proximal sequence, the tool predicts whether the exon is alternatively spliced, displays tissue-dependent splicing patterns, and whether it has associated regulatory elements. We assess AVISPA's accuracy on an independent dataset of tissue-dependent exons, and illustrate how the tool can be applied to analyze a gene of interest. AVISPA is available at http://avispa.biociphers.org.

Collaboration


Dive into Leo J. Lee's collaborations.

Top Co-Authors

Qun Pan

University of Toronto

Andrew Delong

University of Western Ontario

Ofer Shai

University of Toronto

Yoseph Barash

University of Pennsylvania
