Brendan J. Frey
University of Toronto
Publication
Featured research published by Brendan J. Frey.
Nature Genetics | 2008
Qun Pan; Ofer Shai; Leo J. Lee; Brendan J. Frey; Benjamin J. Blencowe
We carried out the first analysis of alternative splicing complexity in human tissues using mRNA-Seq data. New splice junctions were detected in ∼20% of multiexon genes, many of which are tissue specific. By combining mRNA-Seq and EST-cDNA sequence data, we estimate that transcripts from ∼95% of multiexon genes undergo alternative splicing and that there are ∼100,000 intermediate- to high-abundance alternative splicing events in major human tissues. From a comparison with quantitative alternative splicing microarray profiling data, we also show that mRNA-Seq data provide reliable measurements for exon inclusion levels.
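The exon inclusion level mentioned above can be illustrated with a minimal sketch. It assumes a single cassette exon and uses one common convention (averaging the two inclusion-junction read counts and comparing them with the skipping-junction count); the function name and this estimator are illustrative assumptions, not necessarily the procedure used in the paper.

```python
def exon_inclusion_level(upstream_reads, downstream_reads, skipping_reads):
    """Estimate the fraction of transcripts that include a cassette exon.

    upstream_reads / downstream_reads: reads spanning the two junctions that
    join the cassette exon to its flanking exons (inclusion evidence).
    skipping_reads: reads spanning the junction that joins the two flanking
    exons directly (exclusion evidence).
    Returns None when there are no junction reads at all.
    """
    # Two junctions support inclusion, one supports skipping, so the
    # inclusion counts are averaged to keep the two sides comparable.
    inclusion = (upstream_reads + downstream_reads) / 2.0
    total = inclusion + skipping_reads
    if total == 0:
        return None
    return inclusion / total
```

For example, 30 and 50 inclusion-junction reads against 10 skipping reads give an estimated inclusion level of 0.8.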
Nature | 2013
Debashish Ray; Hilal Kazan; Kate B. Cook; Matthew T. Weirauch; Hamed Shateri Najafabadi; Xiao Li; Serge Gueroussov; Mihai Albu; Hong Zheng; Ally Yang; Hong Na; Manuel Irimia; Leah H. Matzat; Ryan K. Dale; Sarah A. Smith; Christopher A. Yarosh; Seth M. Kelly; Behnam Nabet; D. Mecenas; Weimin Li; Rakesh S. Laishram; Mei Qiao; Howard D. Lipshitz; Fabio Piano; Anita H. Corbett; Russ P. Carstens; Brendan J. Frey; Richard A. Anderson; Kristen W. Lynch; Luiz O. F. Penalva
RNA-binding proteins are key regulators of gene expression, yet only a small fraction have been functionally characterized. Here we report a systematic analysis of the RNA motifs recognized by RNA-binding proteins, encompassing 205 distinct genes from 24 diverse eukaryotes. The sequence specificities of RNA-binding proteins display deep evolutionary conservation, and the recognition preferences for a large fraction of metazoan RNA-binding proteins can thus be inferred from their RNA-binding domain sequence. The motifs that we identify in vitro correlate well with in vivo RNA-binding data. Moreover, we can associate them with distinct functional roles in diverse types of post-transcriptional regulation, enabling new insights into the functions of RNA-binding proteins both in normal physiology and in human disease. These data provide an unprecedented overview of RNA-binding proteins and their targets, and constitute an invaluable resource for determining post-transcriptional regulatory mechanisms in eukaryotes.
Nature | 2010
Yoseph Barash; John A. Calarco; Weijun Gao; Qun Pan; Xinchen Wang; Ofer Shai; Benjamin J. Blencowe; Brendan J. Frey
Alternative splicing has a crucial role in the generation of biological complexity, and its misregulation is often involved in human disease. Here we describe the assembly of a ‘splicing code’, which uses combinations of hundreds of RNA features to predict tissue-dependent changes in alternative splicing for thousands of exons. The code determines new classes of splicing patterns, identifies distinct regulatory programs in different tissues, and identifies mutation-verified regulatory sequences. Widespread regulatory strategies are revealed, including the use of unexpectedly large combinations of features, the establishment of low exon inclusion levels that are overcome by features in specific tissues, the appearance of features deeper into introns than previously appreciated, and the modulation of splice variant levels by transcript structure characteristics. The code detected a class of exons whose inclusion silences expression in adult tissues by activating nonsense-mediated messenger RNA decay, but whose exclusion promotes expression during embryogenesis. The code facilitates the discovery and detailed characterization of regulated alternative splicing events on a genome-wide scale.
Science | 2012
Nuno L. Barbosa-Morais; Manuel Irimia; Qun Pan; Hui Yuan Xiong; Serge Gueroussov; Leo J. Lee; Valentina Slobodeniuc; Claudia Kutter; Stephen Watt; Recep Colak; Tae-Hyung Kim; C. M. Misquitta-Ali; Wilson; Philip M. Kim; Duncan T. Odom; Brendan J. Frey; Benjamin J. Blencowe
Whence Species Variation? Vertebrates have widely varying phenotypes that are at odds with their much more limited protein-coding genotypes and conserved messenger RNA expression patterns. Genes with multiple exons and introns can undergo alternative splicing, potentially resulting in multiple protein isoforms (see the Perspective by Papasaikas and Valcárcel). Barbosa-Morais et al. (p. 1587) and Merkin et al. (p. 1593) analyzed alternative splicing across the genomes of a variety of vertebrates, including human, primates, rodents, opossum, platypus, chicken, lizard, and frog. The findings suggest that the evolution of alternative splicing has for the most part been very rapid and that alternative splicing patterns of most organs more strongly reflect the identity of the species rather than the organ type. Species-classifying alternative splicing can affect key regulators, often in disordered regions of proteins that may influence protein-protein interactions, or in regions involved in protein phosphorylation.
The patterns and complexity of messenger RNA splicing across vertebrates cluster by species rather than by organ.
How species with similar repertoires of protein-coding genes differ so markedly at the phenotypic level is poorly understood. By comparing organ transcriptomes from vertebrate species spanning ~350 million years of evolution, we observed significant differences in alternative splicing complexity between vertebrate lineages, with the highest complexity in primates. Within 6 million years, the splicing profiles of physiologically equivalent organs diverged such that they are more strongly related to the identity of a species than they are to organ type. Most vertebrate species-specific splicing patterns are cis-directed. However, a subset of pronounced splicing changes are predicted to remodel protein interactions involving trans-acting regulators. These events likely further contributed to the diversification of splicing and other transcriptomic changes that underlie phenotypic differences among vertebrate species.
Cell | 2006
Thomas Kislinger; Brian Cox; Anitha Kannan; Clement Chung; Pingzhao Hu; Alexandr Ignatchenko; Michelle S. Scott; Anthony O. Gramolini; Quaid Morris; Michael Hallett; Janet Rossant; Timothy R. Hughes; Brendan J. Frey; Andrew Emili
Organs and organelles represent core biological systems in mammals, but the diversity in protein composition remains unclear. Here, we combine subcellular fractionation with exhaustive tandem mass spectrometry-based shotgun sequencing to examine the protein content of four major organellar compartments (cytosol, membranes [microsomes], mitochondria, and nuclei) in six organs (brain, heart, kidney, liver, lung, and placenta) of the laboratory mouse, Mus musculus. Using rigorous statistical filtering and machine-learning methods, the subcellular localization of 3274 of the 4768 proteins identified was determined with high confidence, including 1503 previously uncharacterized factors, while tissue selectivity was evaluated by comparison to previously reported mRNA expression patterns. This molecular compendium, fully accessible via a searchable web-browser interface, serves as a reliable reference of the expressed tissue and organelle proteomes of a leading model mammal.
Science | 2015
Hui Y. Xiong; Babak Alipanahi; Leo J. Lee; Hannes Bretschneider; Daniele Merico; Ryan K. C. Yuen; Yimin Hua; Serge Gueroussov; Hamed Shateri Najafabadi; Timothy R. Hughes; Quaid Morris; Yoseph Barash; Adrian R. Krainer; Nebojsa Jojic; Stephen W. Scherer; Benjamin J. Blencowe; Brendan J. Frey
Predicting defects in RNA splicing
Most eukaryotic messenger RNAs (mRNAs) are spliced to remove introns. Splicing generates uninterrupted open reading frames that can be translated into proteins. Splicing is often highly regulated, generating alternative spliced forms that code for variant proteins in different tissues. RNA-binding proteins that bind specific sequences in the mRNA regulate splicing. Xiong et al. develop a computational model that predicts splicing regulation for any mRNA sequence (see the Perspective by Guigó and Valcárcel). They use this to analyze more than half a million mRNA splicing sequence variants in the human genome. They are able to identify thousands of known disease-causing mutations, as well as many new disease candidates, including 17 new autism-linked genes. Science, this issue 10.1126/science.1254806; see also p. 124
A model predicts how thousands of disease-linked nucleotide variants affect messenger RNA splicing.
INTRODUCTION Advancing whole-genome precision medicine requires understanding how gene expression is altered by genetic variants, especially those that are far outside of protein-coding regions. We developed a computational technique that scores how strongly genetic variants affect RNA splicing, a critical step in gene expression whose disruption contributes to many diseases, including cancers and neurological disorders. A genome-wide analysis reveals tens of thousands of variants that alter splicing and are enriched with a wide range of known diseases. Our results provide insight into the genetic basis of spinal muscular atrophy, hereditary nonpolyposis colorectal cancer, and autism spectrum disorder.
RATIONALE We used “deep learning” computer algorithms to derive a computational model that takes as input DNA sequences and applies general rules to predict splicing in human tissues. Given a test variant, which may be up to 300 nucleotides into an intron, our model can be used to compute a score for how much the variant alters splicing. The model is not biased by existing disease annotations or population data and was derived in such a way that it can be used to study diverse diseases and disorders and to determine the consequences of common, rare, and even spontaneous variants.
RESULTS Our technique is able to accurately classify disease-causing variants and provides insights into the role of aberrant splicing in disease. We scored more than 650,000 DNA variants and found that disease-causing variants have higher scores than common variants and even those associated with disease in genome-wide association studies (GWAS). Our model predicts substantial and unexpected aberrant splicing due to variants within introns and exons, including those far from the splice site. For example, among intronic variants that are more than 30 nucleotides away from any splice site, known disease variants alter splicing nine times as often as common variants; among missense exonic disease variants, those that least affect protein function are more than five times as likely as other variants to alter splicing. Autism has been associated with disrupted splicing in brain regions, so we used our method to score variants detected using whole-genome sequencing data from individuals with and without autism. Genes with high-scoring variants include many that have previously been linked with autism, as well as new genes with known neurodevelopmental phenotypes. Most of the high-scoring variants are intronic and cannot be detected by exome analysis techniques. When we scored clinical variants in spinal muscular atrophy and colorectal cancer genes, up to 94% of variants found to alter splicing using minigene reporters were correctly classified.
CONCLUSION In the context of precision medicine, causal support for variants independent of existing whole-genome variant studies is greatly needed. Our computational model was trained to predict splicing from DNA sequence alone, without using disease annotations or population data. Consequently, its predictions are independent of and complementary to population data, GWAS, expression-based quantitative trait loci (QTL), and functional annotations of the genome. As such, our technique greatly expands the opportunities for understanding the genetic determinants of disease.
“Deep learning” reveals the genetic origins of disease. A computational system mimics the biology of RNA splicing by correlating DNA elements with splicing levels in healthy human tissues. The system can scan DNA and identify damaging genetic variants, including those deep within introns. This procedure has led to insights into the genetics of autism, cancers, and spinal muscular atrophy.
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.
Nature Methods | 2007
Jim C. Huang; Tomas Babak; Timothy W. Corson; Gordon Chua; Sofia Khan; Brenda L. Gallie; Timothy R. Hughes; Benjamin J. Blencowe; Brendan J. Frey; Quaid Morris
We demonstrate that paired expression profiles of microRNAs (miRNAs) and mRNAs can be used to identify functional miRNA-target relationships with high precision. We used a Bayesian data analysis algorithm, GenMiR++, to identify a network of 1,597 high-confidence target predictions for 104 human miRNAs, which was supported by RNA expression data across 88 tissues and cell types, sequence complementarity and comparative genomics data. We experimentally verified our predictions by investigating the result of let-7b downregulation in retinoblastoma using quantitative reverse transcriptase (RT)-PCR and microarray profiling: some of our verified let-7b targets include CDC25A and BCL7A. Compared to sequence-based predictions, our high-scoring GenMiR++ predictions had much more consistent Gene Ontology annotations and were more accurate predictors of which mRNA levels respond to changes in let-7b levels.
Cell | 2003
Wen Tao Peng; Mark D. Robinson; Sanie Mnaimneh; Nevan J. Krogan; Gerard Cagney; Quaid Morris; Armaity P. Davierwala; Jörg Grigull; Xueqi Yang; Wen Zhang; Nicholas Mitsakakis; Owen Ryan; Nira Datta; Vladimir Jojic; Chris Pal; Veronica Canadien; Dawn Richards; Bryan Beattie; Lani F. Wu; Steven J. Altschuler; Sam T. Roweis; Brendan J. Frey; Andrew Emili; Jack Greenblatt; Timothy R. Hughes
Predictive analysis using publicly available yeast functional genomics and proteomics data suggests that many more proteins may be involved in biogenesis of ribonucleoproteins than are currently known. Using a microarray that monitors abundance and processing of noncoding RNAs, we analyzed 468 yeast strains carrying mutations in protein-coding genes, most of which have not previously been associated with RNA or RNP synthesis. Many strains mutated in uncharacterized genes displayed aberrant noncoding RNA profiles. Ten factors involved in noncoding RNA biogenesis were verified by further experimentation, including a protein required for 20S pre-rRNA processing (Tsr2p), a protein associated with the nuclear exosome (Lrp1p), and a factor required for box C/D snoRNA accumulation (Bcd1p). These data present a global view of yeast noncoding RNA processing and confirm that many currently uncharacterized yeast proteins are involved in biogenesis of noncoding RNA.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003
Brendan J. Frey; Nebojsa Jojic
Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
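The idea of approximating the transformation manifold by a discrete set of points and running EM over clusters and transformations jointly can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: inputs are 1-D signals, the discrete transformation set is all circular shifts, transformations have a uniform prior, and a single isotropic noise variance is shared by all clusters. The function names are hypothetical.

```python
import numpy as np

def shifted_sq_dists(X, mu):
    """Squared distance between every input and every circularly shifted mean.

    Returns an array of shape (N, C, T), where T == D is the number of shifts.
    """
    D = X.shape[1]
    # shifted[c, t] is cluster mean c circularly shifted by t positions.
    shifted = np.stack([np.roll(mu, t, axis=1) for t in range(D)], axis=1)
    return ((X[:, None, None, :] - shifted[None]) ** 2).sum(axis=-1)

def fit_shift_invariant_mixture(X, n_clusters, n_iters=30, seed=0):
    """EM for a mixture of Gaussians with a latent circular shift per input."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialise the means from randomly chosen inputs.
    mu = X[rng.choice(N, size=n_clusters, replace=False)]
    pi = np.full(n_clusters, 1.0 / n_clusters)
    var = X.var() + 1e-6
    # unshifted[n, t] is input n with shift t undone (shifted back by t).
    unshifted = np.stack([np.roll(X, -t, axis=1) for t in range(D)], axis=1)
    for _ in range(n_iters):
        # E-step: joint posterior over (cluster, shift) for each input,
        # computed in the log domain for numerical stability.
        log_r = np.log(pi)[None, :, None] - 0.5 * shifted_sq_dists(X, mu) / var
        log_r -= log_r.max(axis=(1, 2), keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=(1, 2), keepdims=True)
        # M-step: each mean is the responsibility-weighted average of the
        # inputs after undoing the inferred shift.
        weight = r.sum(axis=(0, 2)) + 1e-12  # guard against empty clusters
        mu = np.einsum("nct,ntd->cd", r, unshifted) / weight[:, None]
        pi = weight / weight.sum()
        # Noise variance is re-estimated against the updated means.
        var = (r * shifted_sq_dists(X, mu)).sum() / (N * D) + 1e-6
    return mu, pi, r
```

Calling `fit_shift_invariant_mixture(X, n_clusters=2)` on an `(N, D)` array returns the learned means, the mixing proportions, and the responsibilities of shape `(N, C, D)`, whose largest entry per input gives the inferred cluster and shift. Like any EM procedure, this sketch can converge to a local optimum that depends on initialization.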
Nature | 2013
Hong Han; Manuel Irimia; P. Joel Ross; Hoon-Ki Sung; Babak Alipanahi; Laurent David; Azadeh Golipour; Mathieu Gabut; Iacovos P. Michael; Emil N. Nachman; Eric T. Wang; Dan Trcka; Tadeo Thompson; Dave O’Hanlon; Valentina Slobodeniuc; Nuno L. Barbosa-Morais; Christopher B. Burge; Jason Moffat; Brendan J. Frey; Andras Nagy; James Ellis; Jeffrey L. Wrana; Benjamin J. Blencowe
Previous investigations of the core gene regulatory circuitry that controls the pluripotency of embryonic stem (ES) cells have largely focused on the roles of transcription, chromatin and non-coding RNA regulators. Alternative splicing represents a widely acting mode of gene regulation, yet its role in regulating ES-cell pluripotency and differentiation is poorly understood. Here we identify the muscleblind-like RNA binding proteins, MBNL1 and MBNL2, as conserved and direct negative regulators of a large program of cassette exon alternative splicing events that are differentially regulated between ES cells and other cell types. Knockdown of MBNL proteins in differentiated cells causes switching to an ES-cell-like alternative splicing pattern for approximately half of these events, whereas overexpression of MBNL proteins in ES cells promotes differentiated-cell-like alternative splicing patterns. Among the MBNL-regulated events is an ES-cell-specific alternative splicing switch in the forkhead family transcription factor FOXP1 that controls pluripotency. Consistent with a central and negative regulatory role for MBNL proteins in pluripotency, their knockdown significantly enhances the expression of key pluripotency genes and the formation of induced pluripotent stem cells during somatic cell reprogramming.