Publication


Featured research published by Tim Hubbard.


Journal of Molecular Biology | 1995

SCOP : A structural classification of proteins database for the investigation of sequences and structures

Alexey G. Murzin; Steven E. Brenner; Tim Hubbard; Cyrus Chothia

To facilitate understanding of, and access to, the information available for protein structures, we have constructed the Structural Classification of Proteins (scop) database. This database provides a detailed and comprehensive description of the structural and evolutionary relationships of the proteins of known structure. It also provides for each entry links to co-ordinates, images of the structure, interactive viewers, sequence data and literature references. Two search facilities are available. The homology search permits users to enter a sequence and obtain a list of any structures to which it has significant levels of sequence similarity. The key word search finds, for a word entered by the user, matches from both the text of the scop database and the headers of Brookhaven Protein Databank structure files. The database is freely accessible on World Wide Web (WWW) with an entry point to URL http://scop.mrc-lmb.cam.ac.uk/scop/.


Nature | 2012

Landscape of transcription in human cells

Sarah Djebali; Carrie A. Davis; Angelika Merkel; Alexander Dobin; Timo Lassmann; Ali Mortazavi; Andrea Tanzer; Julien Lagarde; Wei Lin; Felix Schlesinger; Chenghai Xue; Georgi K. Marinov; Jainab Khatun; Brian A. Williams; Chris Zaleski; Joel Rozowsky; Maik Röder; Felix Kokocinski; Rehab F. Abdelhamid; Tyler Alioto; Igor Antoshechkin; Michael T. Baer; Nadav S. Bar; Philippe Batut; Kimberly Bell; Ian Bell; Sudipto Chakrabortty; Xian Chen; Jacqueline Chrast; Joao Curado

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


Genome Research | 2012

The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression

Thomas Derrien; Rory Johnson; Giovanni Bussotti; Andrea Tanzer; Sarah Djebali; Hagen Tilgner; Gregory Guernec; David Martin; Angelika Merkel; David G. Knowles; Julien Lagarde; Lavanya Veeravalli; Xiaoan Ruan; Yijun Ruan; Timo Lassmann; Piero Carninci; James B. Brown; Leonard Lipovich; José Manuel Rodríguez González; Mark G. Thomas; Carrie A. Davis; Ramin Shiekhattar; Thomas R. Gingeras; Tim Hubbard; Cedric Notredame; Jennifer Harrow; Roderic Guigó

The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to that of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally lower expressed than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.


Genome Research | 2012

GENCODE: The reference human genome annotation for The ENCODE Project

Jennifer Harrow; Adam Frankish; José Manuel Rodríguez González; Electra Tapanari; Mark Diekhans; Felix Kokocinski; Bronwen Aken; Daniel Barrell; Amonida Zadissa; Stephen M. J. Searle; I. Barnes; Alexandra Bignell; Veronika Boychenko; Toby Hunt; Mike Kay; Gaurab Mukherjee; Jeena Rajan; Gloria Despacio-Reyes; Gary Saunders; Charles A. Steward; Rachel A. Harte; Mike Lin; Cédric Howald; Andrea Tanzer; Thomas Derrien; Jacqueline Chrast; Nathalie Walters; Suganthi Balasubramanian; Baikang Pei; Michael L. Tress

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.


Nucleic Acids Research | 2002

The Ensembl genome database project

Tim Hubbard; Darren Barker; Ewan Birney; Graham Cameron; Yuan Chen; L. Clark; Tony Cox; James Cuff; V. Curwen; Thomas A. Down; Richard Durbin; E. Eyras; James Gilbert; Martin Hammond; L. Huminiecki; Arek Kasprzyk; Heikki Lehväslaiho; Philip Lijnzaad; Craig Melsopp; Emmanuel Mongin; R. Pettett; M. Pocock; Simon Potter; A. Rust; Esther Schmidt; Stephen M. J. Searle; Guy Slater; J. Smith; W. Spooner; A. Stabenau

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.


Nucleic Acids Research | 1997

SCOP: a Structural Classification of Proteins database

Tim Hubbard; Bart Ailey; Steven E. Brenner; Alexey G. Murzin; Cyrus Chothia

The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of all known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database, so far. SCOP also provides for each structure links to atomic co-ordinates, images of the structures, interactive viewers, sequence data, data on any conformational changes related to function and literature references. The database is freely accessible on the World Wide Web (WWW) with an entry point at URL http://scop.mrc-lmb.cam.ac.uk/scop/


Nucleic Acids Research | 2007

Data growth and its impact on the SCOP database: new developments

Antonina Andreeva; Dave Howorth; John-Marc Chandonia; Steven E. Brenner; Tim Hubbard; Cyrus Chothia; Alexey G. Murzin

The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update protocol supports batch classification of new protein structures by their detected relationships at Family and Superfamily levels in contrast to our previous sequential handling of new structural data by release date. We introduce pre-SCOP, a preview of the SCOP developmental version that enables earlier access to the information on new relationships. We also discuss the impact of worldwide Structural Genomics initiatives, which are producing new protein structures at an increasing rate, on the rates of discovery and growth of protein families and superfamilies. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.


Nature Biotechnology | 2008

A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis

Thomas A. Down; Vardhman K. Rakyan; Daniel J. Turner; Paul Flicek; Heng Li; Eugene Kulesha; Stefan Gräf; Nathan Johnson; Javier Herrero; Eleni M. Tomazou; Natalie P. Thorne; Liselotte Bäckdahl; Marlis Herberth; Kevin L. Howe; David K. Jackson; Marcos M Miretti; John C. Marioni; Ewan Birney; Tim Hubbard; Richard Durbin; Simon Tavaré; Stephan Beck

DNA methylation is an indispensable epigenetic modification required for regulating the expression of mammalian genomes. Immunoprecipitation-based methods for DNA methylome analysis are rapidly shifting the bottleneck in this field from data generation to data analysis, necessitating the development of better analytical tools. In particular, an inability to estimate absolute methylation levels remains a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling. To address this issue, we developed a cross-platform algorithm, Bayesian tool for methylation analysis (Batman), for analyzing methylated DNA immunoprecipitation (MeDIP) profiles generated using oligonucleotide arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). We developed the latter approach to provide a high-resolution whole-genome DNA methylation profile (DNA methylome) of a mammalian genome. Strong correlation of our data, obtained using mature human spermatozoa, with those obtained using bisulfite sequencing suggests that combining MeDIP-seq or MeDIP-chip with Batman provides a robust, quantitative and cost-effective functional genomic strategy for elucidating the function of DNA methylation.


Nucleic Acids Research | 2002

SCOP database in 2002: refinements accommodate structural genomics

Loredana Lo Conte; Steven E. Brenner; Tim Hubbard; Cyrus Chothia; Alexey G. Murzin

The SCOP (Structural Classification of Proteins) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are grouped into species and hierarchically classified into families, superfamilies, folds and classes. Recently, we introduced a new set of features with the aim of standardizing access to the database, and providing a solid basis to manage the increasing number of experimental structures expected from structural genomics projects. These features include: a new set of identifiers, which uniquely identify each entry in the hierarchy; a compact representation of protein domain classification; a new set of parseable files, which fully describe all domains in SCOP and the hierarchy itself. These new features are reflected in the ASTRAL compendium. The SCOP search engine has also been updated, and a set of links to external resources added at the level of domain entries. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.


Genome Research | 2009

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

Kim D. Pruitt; Jennifer Harrow; Rachel A. Harte; Craig Wallin; Mark Diekhans; Donna Maglott; Steve Searle; Catherine M. Farrell; Jane Loveland; Barbara J. Ruef; Elizabeth Hart; Marie-Marthe Suner; Melissa J. Landrum; Bronwen Aken; Sarah Ayling; Robert Baertsch; Julio Fernandez-Banet; Joshua L. Cherry; Val Curwen; Michael DiCuccio; Manolis Kellis; Jennifer M. Lee; Michael F. Lin; Michael Schuster; Andrew Shkeda; Clara Amid; Garth Brown; Oksana Dukhanina; Adam Frankish; Jennifer Hart

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.

Collaboration


Dive into Tim Hubbard's collaborations.

Top Co-Authors

Jennifer Harrow

Wellcome Trust Sanger Institute

Cyrus Chothia

Laboratory of Molecular Biology

Alexey G. Murzin

Laboratory of Molecular Biology

Ewan Birney

European Bioinformatics Institute

Adam Frankish

Wellcome Trust Sanger Institute

Felix Kokocinski

Wellcome Trust Sanger Institute

James Gilbert

Wellcome Trust Sanger Institute
