Publication


Featured research published by Annabel E. Todd.


Nucleic Acids Research | 2004

The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis

Frances M. G. Pearl; Annabel E. Todd; Ian Sillitoe; Mark Dibley; Oliver Redfern; Tony E. Lewis; Christopher G. Bennett; Russell L. Marsden; Alastair Grant; David A. Lee; Adrian Akpor; Michael Maibaum; Andrew P. Harrison; Timothy Dallman; Gabrielle A. Reeves; Ilhem Diboun; Sarah Addou; Stefano Lise; Caroline E. Johnston; Antonio Sillero; Janet M. Thornton; Christine A. Orengo

The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 43 229 domains classified into 1467 superfamilies and 5107 sequence families. Each structural family is expanded with sequence relatives from GenBank and completed genomes, using a variety of efficient sequence search protocols and reliable thresholds. This extended CATH protein family database contains 616 470 domain sequences classified into 23 876 sequence families. This results in the significant expansion of the CATH HMM model library to include models built from the CATH sequence relatives, giving a 10% increase in coverage for detecting remote homologues. An improved Dictionary of Homologous superfamilies (DHS) (http://www.biochem.ucl.ac.uk/bsm/dhs/) containing specific sequence, structural and functional information for each superfamily in CATH considerably assists manual validation of homologues. Information on sequence relatives in CATH superfamilies, GenBank and completed genomes is presented in the CATH associated DHS and Gene3D resources. Domain partnership information can be obtained from Gene3D (http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/). A new CATH server has been implemented (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl) providing automatic classification of newly determined sequences and structures using a suite of rapid sequence and structure comparison methods. The statistical significance of matches is assessed and links are provided to the putative superfamily or fold group to which the query sequence or structure is assigned.


Nucleic Acids Research | 2000

Assigning genomic sequences to CATH.

Frances M. G. Pearl; David A. Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and 35 distinct architectures. Recent developments of the database include the generation of 3D templates for recognising structural relatives in each fold group, which has led to significant improvements in the speed and accuracy of updating the database and also means that less manual validation is required. We also report the establishment of the CATH-PFDB (Protein Family Database), which associates 1D sequences with the 3D homologous superfamilies. Sequences showing identifiable homology to entries in CATH have been extracted from GenBank using PSI-BLAST. A CATH-PSIBLAST server has been established, which allows you to scan a new sequence against the database. The CATH Dictionary of Homologous Superfamilies (DHS), which contains validated multiple structural alignments annotated with consensus functional information for evolutionary protein superfamilies, has been updated to include annotations associated with sequence relatives identified in GenBank. The DHS is a powerful tool for considering the variation of functional properties within a given CATH superfamily and in deciding what functional properties may be reliably inherited by a newly identified relative.


Nature Structural & Molecular Biology | 2000

From structure to function: approaches and limitations.

Janet M. Thornton; Annabel E. Todd; Duncan Milburn; Neera Borkakoti; Christine A. Orengo

This review presents a summary of current approaches to extract functional information from structural data on proteins and their complexes. While structural homologs may reveal possible biochemical functions (which may be hidden at the sequence level), elucidating the exact biological role of a protein in vivo will only be possible by including other results, such as data on expression and localization.


Current Opinion in Structural Biology | 1999

From protein structure to function

Christine A. Orengo; Annabel E. Todd; Janet M. Thornton

Several databases of protein structural families now exist, organised according to both evolutionary relationships and common folding arrangements. Although these lag behind sequence databases in size, the prospect of structural genomics initiatives means that they may soon include representatives of many of the sequence families. To some extent, functional information can be derived from structural similarity. For some structural families, their function is highly conserved, whereas, for others, it can only be inherited or derived on the basis of additional information (e.g. sequence patterns, common residue clusters and characteristic surface properties).


Trends in Biochemical Sciences | 2002

Plasticity of enzyme active sites

Annabel E. Todd; Christine A. Orengo; Janet M. Thornton

The expectation is that any similarity in reaction chemistry shared by enzyme homologues is mediated by common functional groups conserved through evolution. However, detailed enzyme studies have revealed the flexibility of many active sites, in that different functional groups, unconserved with respect to position in the primary sequence, mediate the same mechanistic role. Nevertheless, the catalytic atoms might be spatially equivalent. More rarely, the active sites have completely different locations in the protein scaffold. This variability could result from: (1) the hopping of functional groups from one position to another to optimize catalysis; (2) the independent specialization of a low-activity primordial enzyme in different phylogenetic lineages; (3) functional convergence after evolutionary divergence; or (4) circular permutation events.


Structure | 2002

Sequence and Structural Differences between Enzyme and Nonenzyme Homologs

Annabel E. Todd; Christine A. Orengo; Janet M. Thornton

To improve our understanding of the evolution of novel functions, we performed a sequence, structural, and functional analysis of homologous enzymes and nonenzymes of known three-dimensional structure. In most examples identified, the nonenzyme is derived from an ancestral catalytic precursor (as opposed to the reverse evolutionary scenario, nonenzyme to enzyme), and the active site pocket has been disrupted in some way, owing to the substitution of critical catalytic residues and/or steric interactions that impede substrate binding and catalysis. Pairwise sequence identity is typically insignificant, and almost one-half of the enzyme and nonenzyme pairs do not share any similarity in function. Heterooligomeric enzymes comprising homologous subunits in which one chain is catalytically inactive and enzyme polypeptides that contain internal catalytic and noncatalytic duplications of an ancient enzyme domain are also discussed.


Proteomics | 2002

The CATH protein family database: a resource for structural and functional annotation of genomes.

Christine A. Orengo; James E. Bray; Daniel W. A. Buchan; Andrew P. Harrison; David A. Lee; Frances M. G. Pearl; Ian Sillitoe; Annabel E. Todd; Janet M. Thornton

Over the last decade, there have been huge increases in the numbers of protein sequences and structures determined. In parallel, many methods have been developed for recognising similarities between these proteins, arising from their common evolutionary background, and for clustering such relatives into protein families. Here we review some of the protein family resources available to the biologist and describe how these can be used to provide structural and functional annotations for newly determined sequences. In particular we describe recent developments to the CATH domain database of protein structural families which have facilitated genome annotation and which have also revealed important caveats that must be considered when transferring functional data between homologous proteins.


Nucleic Acids Research | 2010

Comprehensive reanalysis of transcription factor knockout expression data in Saccharomyces cerevisiae reveals many new targets

Jüri Reimand; Juan M. Vaquerizas; Annabel E. Todd; Jaak Vilo; Nicholas M. Luscombe

Transcription factor (TF) perturbation experiments give valuable insights into gene regulation. Genome-scale evidence from microarray measurements may be used to identify regulatory interactions between TFs and targets. Recently, Hu and colleagues published a comprehensive study covering 269 TF knockout mutants for the yeast Saccharomyces cerevisiae. However, the information that can be extracted from this valuable dataset is limited by the method employed to process the microarray data. Here, we present a reanalysis of the original data using improved statistical techniques freely available from the BioConductor project. We identify over 100 000 differentially expressed genes—nine times the total reported by Hu et al. We validate the biological significance of these genes by assessing their functions, the occurrence of upstream TF-binding sites, and the prevalence of protein–protein interactions. The reanalysed dataset outperforms the original across all measures, indicating that we have uncovered a vastly expanded list of relevant targets. In summary, this work presents a high-quality reanalysis that maximizes the information contained in the Hu et al. compendium. The dataset is available from ArrayExpress (accession: E-MTAB-109) and it will be invaluable to any scientist interested in the yeast transcriptional regulatory system.


Nucleic Acids Research | 2001

A rapid classification protocol for the CATH Domain Database to support structural genomics

Frances M. G. Pearl; Nigel J. Martin; James E. Bray; Daniel W. A. Buchan; Andrew P. Harrison; David A. Lee; Gabrielle A. Reeves; Adrian J. Shepherd; Ian Sillitoe; Annabel E. Todd; Janet M. Thornton; Christine A. Orengo

In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25,320 structural domains and a further 160,000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153-165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.


Structure | 2009

The CATH Hierarchy Revisited—Structural Divergence in Domain Superfamilies and the Continuity of Fold Space

Alison L. Cuff; Oliver Redfern; Lesley H. Greene; Ian Sillitoe; Tony E. Lewis; Mark Dibley; Adam J. Reid; Frances M. G. Pearl; Tim Dallman; Annabel E. Todd; Richard C. Garratt; Janet M. Thornton; Christine A. Orengo

This paper explores the structural continuum in CATH and the extent to which superfamilies adopt distinct folds. Although most superfamilies are structurally conserved, in some of the most highly populated superfamilies (4% of all superfamilies) there is considerable structural divergence. While relatives share a similar fold in the evolutionary conserved core, diverse elaborations to this core can result in significant differences in the global structures. Applying similar protocols to examine the extent to which structural overlaps occur between different fold groups, it appears this effect is confined to just a few architectures and is largely due to small, recurring super-secondary motifs (e.g., αβ-motifs, α-hairpins). Although 24% of superfamilies overlap with superfamilies having different folds, only 14% of nonredundant structures in CATH are involved in overlaps. Nevertheless, the existence of these overlaps suggests that, in some regions of structure space, the fold universe should be seen as more continuous.

Collaboration


Dive into Annabel E. Todd's collaborations.

Top Co-Authors

Janet M. Thornton

European Bioinformatics Institute


Ian Sillitoe

University College London


David A. Lee

Queen Mary University of London


Gabrielle A. Reeves

European Bioinformatics Institute
