Mark B. Swindells
Ontario Institute for Cancer Research
Publication
Featured research published by Mark B. Swindells.
Structure | 1997
Christine A. Orengo; Alex D. Michie; Susan Jones; David Jones; Mark B. Swindells; Janet M. Thornton
BACKGROUND Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures and, in optimal cases, function can also be assigned. The ever increasing number of known protein structures is too large to classify all proteins manually; therefore, automatic methods are needed for fast evaluation of protein structures. RESULTS We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily. CONCLUSIONS Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.
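The four-level hierarchy described in this abstract can be pictured as a dotted code whose prefixes define progressively finer groups. The following is only a minimal sketch under that assumption: the `CathCode` type, the helper names, and the domain names and numeric codes are all made up for illustration and are not taken from the CATH database or its software.

```python
from collections import defaultdict
from typing import NamedTuple


class CathCode(NamedTuple):
    """Hypothetical four-level code: class, architecture, topology, superfamily."""
    c: int
    a: int
    t: int
    h: int

    def prefix(self, depth: int) -> tuple:
        """Return the first `depth` levels (1 = C only, 4 = full C.A.T.H)."""
        return tuple(self)[:depth]


def parse_code(text: str) -> CathCode:
    """Parse a dotted string such as '1.10.20.1' into a CathCode."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {len(parts)}: {text!r}")
    return CathCode(*parts)


def group_by_level(domains: dict, depth: int) -> dict:
    """Group domain names by the first `depth` levels of their code."""
    groups = defaultdict(list)
    for name, code in domains.items():
        groups[code.prefix(depth)].append(name)
    return dict(groups)


# Illustrative, invented domains and codes (not real CATH entries).
domains = {
    "domA": parse_code("1.10.20.1"),
    "domB": parse_code("1.10.20.2"),
    "domC": parse_code("1.10.30.1"),
    "domD": parse_code("2.40.50.1"),
}

# Domains sharing a topology (T-level) may still sit in different superfamilies.
same_topology = group_by_level(domains, depth=3)
```

Grouping at depth 3 puts `domA` and `domB` together even though their H-levels differ, which mirrors the abstract's point that one T-level can contain several homologous superfamilies.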
Nature | 1998
Toshiyuki Tanaka; Soumitra K. Saha; Chieri Tomomori; Rieko Ishima; Dingjiang Liu; Kit I. Tong; Heiyoung Park; Rinku Dutta; Ling Qin; Mark B. Swindells; Toshimasa Yamazaki; Akira Ono; Masatsune Kainosho; Masayori Inouye; Mitsuhiko Ikura
Bacteria live in capricious environments, in which they must continuously sense external conditions in order to adjust their shape, motility and physiology. The histidine–aspartate phosphorelay signal-transduction system (also known as the two-component system) is important in cellular adaptation to environmental changes in both prokaryotes and lower eukaryotes. In this system, protein histidine kinases function as sensors and signal transducers. The Escherichia coli osmosensor, EnvZ, is a transmembrane protein with histidine kinase activity in its cytoplasmic region. The cytoplasmic region contains two functional domains: domain A (residues 223–289) contains the conserved histidine residue (H243), a site of autophosphorylation as well as transphosphorylation to the conserved D55 residue of response regulator OmpR, whereas domain B (residues 290–450) encloses several highly conserved regions (G1, G2, F and N boxes) and is able to phosphorylate H243. Here we present the solution structure of domain B, the catalytic core of EnvZ. This core has a novel protein kinase structure, distinct from the serine/threonine/tyrosine kinase fold, with unanticipated similarities to both heat-shock protein 90 and DNA gyrase B.
Proteins | 1999
Kyoko L. Yap; James B. Ames; Mark B. Swindells; Mitsuhiko Ikura
The EF‐hand motif, which assumes a helix‐loop‐helix structure normally responsible for Ca2+ binding, is found in a large number of functionally diverse Ca2+ binding proteins collectively known as the EF‐hand protein superfamily. In many superfamily members, Ca2+ binding induces a conformational change in the EF‐hand motif, leading to the activation or inactivation of target proteins. In calmodulin and troponin C, this is described as a change from the closed conformational state in the absence of Ca2+ to the open conformational state in its presence. It is now clear from structures of other EF‐hand proteins that this “closed‐to‐open” conformational transition is not the sole model for EF‐hand protein structural response to Ca2+. More complex modes of conformational change are observed in EF‐hand proteins that interact with a covalently attached acyl group (e.g., recoverin) and in those that dimerize (e.g., S100B, calpain). In fact, EF‐hand proteins display a multitude of unique conformational states, together constituting a conformational continuum. Using a quantitative 3D approach termed vector geometry mapping (VGM), we discuss this tertiary structural diversity of EF‐hand proteins and its correlation with target recognition. Proteins 1999;37:499–507. ©1999 Wiley‐Liss, Inc.
Nature Structural & Molecular Biology | 1999
Masanori Osawa; Hiroshi Tokumitsu; Mark B. Swindells; Hiroyuki Kurihara; Masaya Orita; Tadao Shibanuma; Toshio Furuya; Mitsuhiko Ikura
The structure of calcium-bound calmodulin (Ca2+/CaM) complexed with a 26-residue peptide, corresponding to the CaM-binding domain of rat Ca2+/CaM-dependent protein kinase kinase (CaMKK), has been determined by NMR spectroscopy. In this complex, the CaMKK peptide forms a fold comprising an α-helix and a hairpin-like loop whose C-terminus folds back on itself. The binding orientation of this CaMKK peptide by the two CaM domains is opposite to that observed in all other CaM–target complexes determined so far. The N- and C-terminal hydrophobic pockets of Ca2+/CaM anchor Trp 444 and Phe 459 of the CaMKK peptide, respectively. This 14-residue separation between two key hydrophobic groups is also unique among previously determined CaM complexes. The present structure represents a new and distinct class of Ca2+/CaM target recognition that may be shared by other Ca2+/CaM-stimulated proteins.
Bioinformatics | 1998
Asaf Salamov; Tetsuo Nishikawa; Mark B. Swindells
MOTIVATION In cDNA sequencing projects, it is vital to know whether the protein coding region of a sequence is complete, or whether errors have occurred during library construction. Here we present a linear discriminant approach that predicts this completeness by estimating the probability of each ATG being the initiation codon. RESULTS Because of the current shortage of full-length cDNA data on which to base this work, tests were performed on a non-redundant set of 660 initiation codon-containing DNA sequences that had been conceptually spliced into mRNA/cDNA. We also used an edited set of the same sequences that only contained the region following the initiation codon as a negative control. Using the criterion that only a single prediction is allowed for each sequence, a cut-off was selected at which discrimination of both positive and negative sets was equal. At this cut-off, 67% of each set could be correctly distinguished, with the correct ATG codon also being identified in the positive set. Reliability could be increased further by raising the cut-off or including homologues, the relative merits of which are discussed. AVAILABILITY The prediction program, called ATGpr, and other data are available at http://www.hri.co.jp/atgpr CONTACT [email protected]
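The cut-off criterion in this abstract (one prediction per sequence, with the threshold chosen where the positive and negative sets are discriminated equally well) can be illustrated as below. This is a sketch under stated assumptions, not ATGpr's actual procedure: `balanced_cutoff` is a hypothetical helper, each score is assumed to be the single best ATG score for one sequence, and the numbers are invented.

```python
def balanced_cutoff(pos_scores, neg_scores):
    """Pick the cut-off where the two sets are classified equally well.

    pos_scores: one score per sequence that does contain the initiation codon.
    neg_scores: one score per sequence that does not (the negative control).
    A sequence is called 'complete' when its score is >= the cut-off.
    Returns (cutoff, fraction of positives correct, fraction of negatives correct).
    """
    candidates = sorted(set(pos_scores) | set(neg_scores))
    best = None
    for cut in candidates:
        pos_ok = sum(s >= cut for s in pos_scores) / len(pos_scores)
        neg_ok = sum(s < cut for s in neg_scores) / len(neg_scores)
        gap = abs(pos_ok - neg_ok)
        # Keep the first cut-off with the smallest gap between the two rates.
        if best is None or gap < best[0]:
            best = (gap, cut, pos_ok, neg_ok)
    _, cut, pos_ok, neg_ok = best
    return cut, pos_ok, neg_ok


# Invented scores for illustration only.
positives = [0.9, 0.8, 0.6, 0.3]
negatives = [0.7, 0.4, 0.2, 0.1]
result = balanced_cutoff(positives, negatives)
```

Raising the cut-off above the balanced point trades fewer false positives for more missed positives, which is the reliability trade-off the abstract mentions.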
Trends in Biochemical Sciences | 2002
David Jones; Mark B. Swindells
Most biologists now conduct sequence searches as a matter of course. But how do we know that a relationship predicted by a homology search is a true, rather than false, hit with the same score? Many biologists design their own experiments with exquisite care yet still assume that results from programs with more than 20 adjustable parameters are 100% reliable. This article explains some of the key steps in getting the most from PSI-Blast, one of the most popular and powerful homology search programs currently available.
PLOS Computational Biology | 2005
Anna E. Lobley; Mark B. Swindells; Christine A. Orengo; David Jones
Natively unstructured regions are a common feature of eukaryotic proteomes. Between 30% and 60% of proteins are predicted to contain long stretches of disordered residues, and not only have many of these regions been confirmed experimentally, but they have also been found to be essential for protein function. In this study, we directly address the potential contribution of protein disorder in predicting protein function using standard Gene Ontology (GO) categories. Initially we analyse the occurrence of protein disorder in the human proteome and report ontology categories that are enriched in disordered proteins. Pattern analysis of the distributions of disordered regions in human sequences demonstrated that the functions of intrinsically disordered proteins are both length- and position-dependent. These dependencies were then encoded in feature vectors to quantify the contribution of disorder in human protein function prediction using Support Vector Machine classifiers. The prediction accuracies of 26 GO categories relating to signalling and molecular recognition are improved using the disorder features. The most significant improvements were observed for kinase, phosphorylation, growth factor, and helicase categories. Furthermore, we provide predicted GO term assignments using these classifiers for a set of unannotated and orphan human proteins. In this study, the importance of capturing protein disorder information and its value in function prediction is demonstrated. The GO category classifiers generated can be used to provide more reliable predictions and further insights into the behaviour of orphan and unannotated proteins.
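The idea of encoding length- and position-dependence of disorder in a feature vector can be sketched as follows. The particular features here (overall disordered fraction, longest disordered run, and disordered fraction per sequence segment) are assumptions chosen for illustration; the abstract does not specify the features actually used, and `disorder_features` is a hypothetical helper.

```python
from itertools import groupby


def disorder_features(mask, n_bins=3):
    """Turn a per-residue disorder mask into a small numeric feature vector.

    mask: sequence of booleans, True where a residue is predicted disordered.
    n_bins: number of equal segments (N-terminal to C-terminal) to summarise.
    Returns [overall fraction, longest disordered run, fraction per segment...].
    """
    n = len(mask)
    if n == 0:
        raise ValueError("mask must not be empty")

    # Length dependence: lengths of consecutive disordered stretches.
    runs = [len(list(group)) for flag, group in groupby(mask) if flag]
    longest = max(runs, default=0)

    # Position dependence: disordered fraction within each segment.
    per_bin = []
    for i in range(n_bins):
        segment = mask[i * n // n_bins:(i + 1) * n // n_bins]
        per_bin.append(sum(segment) / len(segment) if segment else 0.0)

    return [sum(mask) / n, float(longest)] + per_bin


# Invented 12-residue example: disordered N-terminus and a short C-terminal tail.
example = [True] * 4 + [False] * 6 + [True] * 2
features = disorder_features(example)
```

Vectors of this kind could then be passed to any standard classifier; the study itself reports using Support Vector Machines.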
BioEssays | 1998
Mark B. Swindells; Christine A. Orengo; David Jones; E. Gail Hutchinson; Janet M. Thornton
In a similar manner to sequence database searching, it is also possible to compare three‐dimensional protein structures. Such methods can be extremely useful because a structural similarity may represent a distant evolutionary relationship that is undetectable by sequence analysis. In this review, we summarise the most popular structure comparison methods, show how they can be used for database searching, and then describe some of the most advanced attempts to develop comprehensive protein structure classifications. With such data, it is possible to identify distant evolutionary relationships, provide libraries of unique folds for structure prediction, estimate the total number of folds that exist, and investigate the preference for certain types of structures over others. BioEssays 20:884–891, 1998.
Intelligent Systems in Molecular Biology | 2004
Richard A. George; Ruth V. Spriggs; Janet M. Thornton; Bissan Al-Lazikani; Mark B. Swindells
MOTIVATION Domains are the units of protein structure, function and evolution. It is therefore essential to utilize knowledge of domains when studying the evolution of function, or when assigning function to genome sequence data. For this purpose, we have developed a database of catalytic domains, SCOPEC, by combining structural domain information from SCOP, full-length sequence information from Swiss-Prot, and verified functional information from the Enzyme Classification (EC) database. Two major problems need to be overcome to create a database of domain-function relationships: (1) for sequences, EC numbers are typically assigned to whole sequences rather than the functional unit, and (2) Protein Data Bank (PDB) structures elucidated from a larger multi-domain protein will often have EC annotation although the relevant catalytic domain may lie elsewhere. RESULTS SCOPEC entries have high-quality enzyme assignments, having passed both computational and manual checks. SCOPEC currently contains entries for 75% of all EC annotations in the PDB. Overall, EC number is fairly well conserved within a superfamily, even when the proteins are distantly related. Initial analysis is encouraging, suggesting that there is a 50:50 chance of conserved function in distant homologues first detected by a third iteration PSI-BLAST search. Therefore, we envisage that a knowledge-based approach to function assignment using the domain-EC relationships in SCOPEC will gain a marked improvement over this base line. AVAILABILITY The SCOPEC database is a valuable resource in the analysis and prediction of protein structure and function. It can be obtained or queried at our website http://www.enzome.com
Current Opinion in Structural Biology | 1991
Mark B. Swindells; Janet M. Thornton
Modelling on the basis of a homologous structure is the only reliable method available to predict the three-dimensional structure of a protein from its sequence. The past year has seen considerable advances in both the development of automated procedures and their application to proteins of outstanding biological interest.
Collaboration
National Institute of Advanced Industrial Science and Technology