Publication


Featured research published by Leonardo Collado-Torres.


Nucleic Acids Research | 2011

RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units)

Socorro Gama-Castro; Heladia Salgado; Martín Peralta-Gil; Alberto Santos-Zavaleta; Luis Muñiz-Rascado; Hilda Solano-Lira; Verónica Jiménez-Jacinto; Verena Weiss; Jair Santiago García-Sotelo; Alejandra López-Fuentes; Liliana Porrón-Sotelo; Shirley Alquicira-Hernández; Alejandra Medina-Rivera; Irma Martínez-Flores; Kevin Alquicira-Hernández; Ruth Martínez-Adame; César Bonavides-Martínez; Juan Miranda-Ríos; Araceli M. Huerta; Alfredo Mendoza-Vargas; Leonardo Collado-Torres; Blanca Taboada; Leticia Vega-Alvarado; Maricela Olvera; Leticia Olvera; Ricardo Grande; Julio Collado-Vides

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database of the best-known regulatory network of any free-living organism, that of Escherichia coli K-12. The major conceptual change since 3 years ago is an expanded biological context so that transcriptional regulation is now part of a unit that initiates with the signal and continues with the signal transduction to the core of regulation, modifying expression of the affected target genes responsible for the response. We call these genetic sensory response units, or Gensor Units. We have initiated their high-level curation, with graphic maps and superreactions with links to other databases. Additional connectivity uses expandable submaps. RegulonDB has summaries for every transcription factor (TF) and TF-binding sites with internal symmetry. Several DNA-binding motifs and their sizes have been redefined and relocated. In addition to data from the literature, we have incorporated our own information on transcription start sites (TSSs) and transcriptional units (TUs), obtained by using high-throughput whole-genome sequencing technologies. A new portable drawing tool for genomic features is also now available, as well as new ways to download the data, including web services, files for several relational database manager systems and text files including BioPAX format.


Nature Neuroscience | 2015

Developmental regulation of human cortex transcription and its clinical relevance at single base resolution

Andrew E. Jaffe; J H Shin; Leonardo Collado-Torres; Jeffrey T. Leek; Ran Tao; Chao Li; Yuan Gao; Yankai Jia; Brady J. Maher; Thomas M. Hyde; Joel E. Kleinman; Daniel R. Weinberger

Transcriptome analysis of human brain provides fundamental insight into development and disease, but it largely relies on existing annotation. We sequenced transcriptomes of 72 prefrontal cortex samples across six life stages and identified 50,650 differentially expressed regions (DERs) associated with development and aging, agnostic of annotation. While many DERs annotated to non-exonic sequence (41.1%), most were similarly regulated in cytosolic mRNA extracted from independent samples. The DERs were developmentally conserved across 16 brain regions and in the developing mouse cortex, and were expressed in diverse cell and tissue types. The DERs were further enriched for active chromatin marks and clinical risk for neurodevelopmental disorders such as schizophrenia. Lastly, we demonstrate quantitatively that these DERs associate with a changing neuronal phenotype related to differentiation and maturation. These data show conserved molecular signatures of transcriptional dynamics across brain development, have potential clinical relevance and highlight the incomplete annotation of the human brain transcriptome.


Nature Biotechnology | 2017

Reproducible RNA-seq analysis using recount2

Leonardo Collado-Torres; Abhinav Nellore; Kai Kammers; Shannon Ellis; Margaret A. Taub; Kasper D. Hansen; Andrew E. Jaffe; Ben Langmead; Jeffrey T. Leek


Genome Biology | 2016

Human splicing diversity and the extent of unannotated splice junctions across human RNA-seq samples on the Sequence Read Archive

Abhinav Nellore; Andrew E. Jaffe; Jean Philippe Fortin; José Alquicira-Hernández; Leonardo Collado-Torres; Siruo Wang; Robert A. Phillips; Nishika Karbhari; Kasper D. Hansen; Ben Langmead; Jeffrey T. Leek

Background: Gene annotations, such as those in GENCODE, are derived primarily from alignments of spliced cDNA sequences and protein sequences. The impact of RNA-seq data on annotation has been confined to major projects like ENCODE and Illumina Body Map 2.0. Results: We aligned 21,504 Illumina-sequenced human RNA-seq samples from the Sequence Read Archive (SRA) to the human genome and compared detected exon-exon junctions with junctions in several recent gene annotations. We found 56,861 junctions (18.6%) in at least 1000 samples that were not annotated, and their expression associated with tissue type. Junctions well expressed in individual samples tended to be annotated. Newer samples contributed few novel well-supported junctions, with the vast majority of detected junctions present in samples before 2013. We compiled junction data into a resource called intropolis available at http://intropolis.rail.bio. We used this resource to search for a recently validated isoform of the ALK gene and characterized the potential functional implications of unannotated junctions with publicly available TRAP-seq data. Conclusions: Considering only the variation contained in annotation may suffice if an investigator is interested only in well-expressed transcript isoforms. However, genes that are not generally well expressed and nonetheless present in a small but significant number of samples in the SRA are likelier to be incompletely annotated. The rate at which evidence for novel junctions has been added to the SRA has tapered dramatically, even to the point of an asymptote. Now is perhaps an appropriate time to update incomplete annotations to include splicing present in the now-stable snapshot provided by the SRA.


Bioinformatics | 2016

Rail-RNA: Scalable analysis of RNA-seq splicing and coverage

Abhinav Nellore; Leonardo Collado-Torres; Andrew E. Jaffe; José Alquicira-Hernández; Christopher Wilks; Jacob Pritt; James T. Morton; Jeffrey T. Leek; Ben Langmead

Motivation: RNA sequencing (RNA‐seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it requires extra work to obtain analysis products that incorporate data from across samples. Results: We describe Rail‐RNA, a cloud‐enabled spliced aligner that analyzes many samples at once. Rail‐RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail‐RNA is more accurate than annotation‐assisted aligners. We use Rail‐RNA to align 667 RNA‐seq samples from the GEUVADIS project on Amazon Web Services in under 16 h for US$0.91 per sample. Rail‐RNA outputs alignments in SAM/BAM format; but it also outputs (i) base‐level coverage bigWigs for each sample; (ii) coverage bigWigs encoding normalized mean and median coverages at each base across samples analyzed; and (iii) exon‐exon splice junctions and indels (features) in columnar formats that juxtapose coverages in samples in which a given feature is found. Supplementary outputs are ready for use with downstream packages for reproducible statistical analysis. We use Rail‐RNA to identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounding variables. Availability and Implementation: Rail‐RNA is open‐source software available at http://rail.bio. Contacts: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Nucleic Acids Research | 2017

Flexible expressed region analysis for RNA-seq with derfinder

Leonardo Collado-Torres; Abhinav Nellore; Christopher Wilks; Michael I. Love; Ben Langmead; Rafael A. Irizarry; Jeffrey T. Leek; Andrew E. Jaffe

Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base-resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.


Translational Psychiatry | 2017

Altered expression of histamine signaling genes in autism spectrum disorder

C Wright; J H Shin; Anandita Rajpurohit; Amy Deep-Soboslay; Leonardo Collado-Torres; Nicholas J. Brandon; Thomas M. Hyde; Joel E. Kleinman; Andrew E. Jaffe; Alan J. Cross; D.R. Weinberger

The histaminergic system (HS) has a critical role in cognition, sleep and other behaviors. Although not well studied in autism spectrum disorder (ASD), the HS is implicated in many neurological disorders, some of which share comorbidity with ASD, including Tourette syndrome (TS). Preliminary studies suggest that antagonism of histamine receptors 1–3 reduces symptoms and specific behaviors in ASD patients and relevant animal models. In addition, the HS mediates neuroinflammation, which may be heightened in ASD. Together, this suggests that the HS may also be altered in ASD. Using RNA sequencing (RNA-seq), we investigated genome-wide expression, as well as a focused gene set analysis of key HS genes (HDC, HNMT, HRH1, HRH2, HRH3 and HRH4) in postmortem dorsolateral prefrontal cortex (DLPFC) initially in 13 subjects with ASD and 39 matched controls. At the genome level, eight transcripts were differentially expressed (false discovery rate <0.05), six of which were small nucleolar RNAs (snoRNAs). There was no significant diagnosis effect on any of the individual HS genes but expression of the gene set of HNMT, HRH1, HRH2 and HRH3 was significantly altered. Curated HS gene sets were also significantly differentially expressed. Differential expression analysis of these gene sets in an independent RNA-seq ASD data set from DLPFC of 47 additional subjects confirmed these findings. Understanding the physiological relevance of an altered HS may suggest new therapeutic options for the treatment of ASD.


Nature Neuroscience | 2018

Developmental and genetic regulation of the human cortex transcriptome illuminate schizophrenia pathogenesis

Andrew E. Jaffe; Richard E. Straub; Joo Heon Shin; Ran Tao; Yuan Gao; Leonardo Collado-Torres; Tony Kam-Thong; Hualin S. Xi; Jie Quan; Qiang Chen; Carlo Colantuoni; William S Ulrich; Brady J. Maher; Amy Deep-Soboslay; Alan J. Cross; Nicholas J. Brandon; Jeffrey T. Leek; Thomas M. Hyde; Joel E. Kleinman; Daniel R. Weinberger

Genome-wide association studies have identified 108 schizophrenia risk loci, but biological mechanisms for individual loci are largely unknown. Using developmental, genetic and illness-based RNA sequencing expression analysis in human brain, we characterized the human brain transcriptome around these loci and found enrichment for developmentally regulated genes with novel examples of shifting isoform usage across pre- and postnatal life. We found widespread expression quantitative trait loci (eQTLs), including many with transcript specificity and previously unannotated sequence that were independently replicated. We leveraged this general eQTL database to show that 48.1% of risk variants for schizophrenia associate with nearby expression. We lastly found 237 genes significantly differentially expressed between patients and controls, which replicated in an independent dataset, implicated synaptic processes, and were strongly regulated in early development. These findings together offer genetics- and diagnosis-related targets for better modeling of schizophrenia risk. This resource is publicly available at http://eqtl.brainseq.org/phase1. The authors surveyed gene expression across cortical development and in individuals with schizophrenia. Three-fold more risk variants influenced expression than known. Risk genes showed developmental regulation, while diagnosis changes implicated largely treatment effects.


Nucleic Acids Research | 2018

Improving the value of public RNA-seq expression data by phenotype prediction

Shannon Ellis; Leonardo Collado-Torres; Andrew E. Jaffe; Jeffrey T. Leek

Publicly available genomic data are a valuable resource for studying normal human variation and disease, but these data are often not well labeled or annotated. The lack of phenotype information for public genomic data severely limits their utility for addressing targeted biological questions. We develop an in silico phenotyping approach for predicting critical missing annotation directly from genomic measurements using well-annotated genomic and phenotypic data produced by consortia like TCGA and GTEx as training data. We apply in silico phenotyping to a set of 70,000 RNA-seq samples we recently processed on a common pipeline as part of the recount2 project. We use gene expression data to build and evaluate predictors for both biological phenotypes (sex, tissue, sample source) and experimental conditions (sequencing strategy). We demonstrate how these predictions can be used to study cross-sample properties of public genomic data, select genomic projects with specific characteristics, and perform downstream analyses using predicted phenotypes. The methods to perform phenotype prediction are available in the phenopredict R package and the predictions for recount2 are available from the recount R package. With data and phenotype information available for 70,000 human samples, expression data is available for use on a scale that was not previously feasible.


bioRxiv | 2016

recount: A large-scale resource of analysis-ready RNA-seq expression data

Leonardo Collado-Torres; Abhinav Nellore; Kai Kammers; Shannon Ellis; Margaret A. Taub; Kasper D. Hansen; Andrew E. Jaffe; Ben Langmead; Jeffrey T. Leek

Collaboration


Dive into Leonardo Collado-Torres's collaborations.

Top Co-Authors

Ben Langmead

Johns Hopkins University

Joo Heon Shin

Johns Hopkins University

Ran Tao

Johns Hopkins University

Thomas M. Hyde

Johns Hopkins University

Amy Deep-Soboslay

National Institutes of Health
