Michael Sammeth
Federal University of Rio de Janeiro
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Michael Sammeth.
Nature | 2012
Sarah Djebali; Carrie A. Davis; Angelika Merkel; Alexander Dobin; Timo Lassmann; Ali Mortazavi; Andrea Tanzer; Julien Lagarde; Wei Lin; Felix Schlesinger; Chenghai Xue; Georgi K. Marinov; Jainab Khatun; Brian A. Williams; Chris Zaleski; Joel Rozowsky; Maik Röder; Felix Kokocinski; Rehab F. Abdelhamid; Tyler Alioto; Igor Antoshechkin; Michael T. Baer; Nadav S. Bar; Philippe Batut; Kimberly Bell; Ian Bell; Sudipto Chakrabortty; Xian Chen; Jacqueline Chrast; Joao Curado
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.
Nature | 2013
Tuuli Lappalainen; Michael Sammeth; Marc R. Friedländer; Peter A. C. 't Hoen; Jean Monlong; Manuel A. Rivas; Mar Gonzàlez-Porta; Natalja Kurbatova; Thasso Griebel; Pedro G. Ferreira; Matthias Barann; Thomas Wieland; Liliana Greger; M. van Iterson; Jonas Carlsson Almlöf; Paolo Ribeca; Irina Pulyakhina; Daniela Esser; Thomas Giger; Andrew Tikhonov; Marc Sultan; G. Bertier; Daniel G. MacArthur; Monkol Lek; Esther Lizano; Henk P. J. Buermans; Ismael Padioleau; Thomas Schwarzmayr; Olof Karlberg; Halit Ongen
Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project—the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
Science | 2015
Marta Melé; Pedro G. Ferreira; Ferran Reverter; David S. DeLuca; Jean Monlong; Michael Sammeth; Taylor R. Young; Jakob M. Goldmann; Dmitri D. Pervouchine; Timothy J. Sullivan; Rory Johnson; Ayellet V. Segrè; Sarah Djebali; Anastasia Niarchou; Fred A. Wright; Tuuli Lappalainen; Miquel Calvo; Gad Getz; Emmanouil T. Dermitzakis; Kristin Ardlie; Roderic Guigó
Expression, genetic variation, and tissues Human genomes show extensive genetic variation across individuals, but we have only just started documenting the effects of this variation on the regulation of gene expression. Furthermore, only a few tissues have been examined per genetic variant. In order to examine how genetic expression varies among tissues within individuals, the Genotype-Tissue Expression (GTEx) Consortium collected 1641 postmortem samples covering 54 body sites from 175 individuals. They identified quantitative genetic traits that affect gene expression and determined which of these exhibit tissue-specific expression patterns. Melé et al. measured how transcription varies among tissues, and Rivas et al. looked at how truncated protein variants affect expression across tissues. Science, this issue p. 648, p. 660, p. 666; see also p. 640 RNA expression documents patterns of human transcriptome variation across individuals and tissues. [Also see Perspective by Gibson] Transcriptional regulation and posttranscriptional processing underlie many cellular and organismal phenotypes. We used RNA sequence data generated by Genotype-Tissue Expression (GTEx) project to investigate the patterns of transcriptome variation across individuals and tissues. Tissues exhibit characteristic transcriptional signatures that show stability in postmortem samples. These signatures are dominated by a relatively small number of genes—which is most clearly seen in blood—though few are exclusive to a particular tissue and vary more across tissues than individuals. Genes exhibiting high interindividual expression variation include disease candidates associated with sex, ethnicity, and age. Primary transcription is the major driver of cellular specificity, with splicing playing mostly a complementary role; except for the brain, which exhibits a more divergent splicing program. Variation in splicing, despite its stochasticity, may play in contrast a comparatively greater role in defining individual phenotypes.
Nature Structural & Molecular Biology | 2009
Hagen Tilgner; Christoforos Nikolaou; Sonja Althammer; Michael Sammeth; Miguel Beato; Juan Valcárcel; Roderic Guigó
Chromatin structure influences transcription, but its role in subsequent RNA processing is unclear. Here we present analyses of high-throughput data that imply a relationship between nucleosome positioning and exon definition. First, we have found stable nucleosome occupancy within human and Caenorhabditis elegans exons that is stronger in exons with weak splice sites. Conversely, we have found that pseudoexons—intronic sequences that are not included in mRNAs but are flanked by strong splice sites—show nucleosome depletion. Second, the ratio between nucleosome occupancy within and upstream from the exons correlates with exon-inclusion levels. Third, nucleosomes are positioned central to exons rather than proximal to splice sites. These exonic nucleosomal patterns are also observed in non-expressed genes, suggesting that nucleosome marking of exons exists in the absence of transcription. Our analysis provides a framework that contributes to the understanding of splicing on the basis of chromatin architecture.
Nature Methods | 2012
Santiago Marco-Sola; Michael Sammeth; Roderic Guigó; Paolo Ribeca
Because of ever-increasing throughput requirements of sequencing data, most existing short-read aligners have been designed to focus on speed at the expense of accuracy. The Genome Multitool (GEM) mapper can leverage string matching by filtration to search the alignment space more efficiently, simultaneously delivering precision (performing fully tunable exhaustive searches that return all existing matches, including gapped ones) and speed (being several times faster than comparable state-of-the-art tools).
PLOS Genetics | 2012
Decio L. Eizirik; Michael Sammeth; Thomas Bouckenooghe; Guy Bottu; Giorgia Sisino; Mariana Igoillo-Esteve; Fernanda Ortis; Izortze Santin; Maikel L Colli; Jenny Barthson; Luc Bouwens; Linda Hughes; Lorna Gregory; Gerton Lunter; Lorella Marselli; Piero Marchetti; Mark I. McCarthy; Miriam Cnop
Type 1 diabetes (T1D) is an autoimmune disease in which pancreatic beta cells are killed by infiltrating immune cells and by cytokines released by these cells. Signaling events occurring in the pancreatic beta cells are decisive for their survival or death in diabetes. We have used RNA sequencing (RNA–seq) to identify transcripts, including splice variants, expressed in human islets of Langerhans under control conditions or following exposure to the pro-inflammatory cytokines interleukin-1β (IL-1β) and interferon-γ (IFN-γ). Based on this unique dataset, we examined whether putative candidate genes for T1D, previously identified by GWAS, are expressed in human islets. A total of 29,776 transcripts were identified as expressed in human islets. Expression of around 20% of these transcripts was modified by pro-inflammatory cytokines, including apoptosis- and inflammation-related genes. Chemokines were among the transcripts most modified by cytokines, a finding confirmed at the protein level by ELISA. Interestingly, 35% of the genes expressed in human islets undergo alternative splicing as annotated in RefSeq, and cytokines caused substantial changes in spliced transcripts. Nova1, previously considered a brain-specific regulator of mRNA splicing, is expressed in islets and its knockdown modified splicing. 25/41 of the candidate genes for T1D are expressed in islets, and cytokines modified expression of several of these transcripts. The present study doubles the number of known genes expressed in human islets and shows that cytokines modify alternative splicing in human islet cells. Importantly, it indicates that more than half of the known T1D candidate genes are expressed in human islets. This, and the production of a large number of chemokines and cytokines by cytokine-exposed islets, reinforces the concept of a dialog between pancreatic islets and the immune system in T1D. This dialog is modulated by candidate genes for the disease at both the immune system and beta cell level.
Nucleic Acids Research | 2012
Thasso Griebel; Benedikt Zacher; Paolo Ribeca; Emanuele Raineri; Vincent Lacroix; Roderic Guigó; Michael Sammeth
High-throughput sequencing of cDNA libraries constructed from cellular RNA complements (RNA-Seq) naturally provides a digital quantitative measurement for every expressed RNA molecule. Nature, impact and mutual interference of biases in different experimental setups are, however, still poorly understood—mostly due to the lack of data from intermediate protocol steps. We analysed multiple RNA-Seq experiments, involving different sample preparation protocols and sequencing platforms: we broke them down into their common—and currently indispensable—technical components (reverse transcription, fragmentation, adapter ligation, PCR amplification, gel segregation and sequencing), investigating how such different steps influence abundance and distribution of the sequenced reads. For each of those steps, we developed universally applicable models, which can be parameterised by empirical attributes of any experimental protocol. Our models are implemented in a computer simulation pipeline called the Flux Simulator, and we show that read distributions generated by different combinations of these models reproduce well corresponding evidence obtained from the corresponding experimental setups. We further demonstrate that our in silico RNA-Seq provides insights about hidden precursors that determine the final configuration of reads along gene bodies; enhancing or compensatory effects that explain apparently controversial observations can be observed. Moreover, our simulations identify hitherto unreported sources of systematic bias from RNA hydrolysis, a fragmentation technique currently employed by most RNA-Seq protocols.
Nature Biotechnology | 2013
Peter A. C. 't Hoen; Marc R. Friedländer; Jonas Carlsson Almlöf; Michael Sammeth; Irina Pulyakhina; Seyed Yahya Anvar; Jeroen F. J. Laros; Henk P. J. Buermans; Olof Karlberg; Mathias Brännvall; Johan T. den Dunnen; Gert-Jan B. van Ommen; Ivo Gut; Roderic Guigó; Xavier Estivill; Ann-Christine Syvänen; Emmanouil T. Dermitzakis; Tuuli Lappalainen
RNA sequencing is an increasingly popular technology for genome-wide analysis of transcript sequence and abundance. However, understanding of the sources of technical and interlaboratory variation is still limited. To address this, the GEUVADIS consortium sequenced mRNAs and small RNAs of lymphoblastoid cell lines of 465 individuals in seven sequencing centers, with a large number of replicates. The variation between laboratories appeared to be considerably smaller than the already limited biological variation. Laboratory effects were mainly seen in differences in insert size and GC content and could be adequately corrected for. In small-RNA sequencing, the microRNA (miRNA) content differed widely between samples owing to competitive sequencing of rRNA fragments. This did not affect relative quantification of miRNAs. We conclude that distributing RNA sequencing among different laboratories is feasible, given proper standardization and randomization procedures. We provide a set of quality measures and guidelines for assessing technical biases in RNA-seq data.
PLOS Computational Biology | 2008
Michael Sammeth; Sylvain Foissac; Roderic Guigó
Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part—in human more than a quarter—of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS.
Nucleic Acids Research | 2007
Sylvain Foissac; Michael Sammeth
In the process of establishing more and more complete annotations of eukaryotic genomes, a constantly growing number of alternative splicing (AS) events has been reported over the last decade. Consequently, the increasing transcript coverage also revealed the real complexity of some variations in the exon–intron structure between transcript variants and the need for computational tools to address ‘complex’ AS events. ASTALAVISTA (alternative splicing transcriptional landscape visualization tool) employs an intuitive and complete notation system to univocally identify such events. The method extracts AS events dynamically from custom gene annotations, classifies them into groups of common types and visualizes a comprehensive picture of the resulting AS landscape. Thus, ASTALAVISTA can characterize AS for whole transcriptome data from reference annotations (GENCODE, REFSEQ, ENSEMBL) as well as for genes selected by the user according to common functional/structural attributes of interest: http://genome.imim.es/astalavista