Publication


Featured research published by Michele Clamp.


Bioinformatics | 2004

The Jalview Java alignment editor

Michele Clamp; James Cuff; Stephen M. J. Searle; Geoffrey J. Barton

Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can often be large and difficult to view efficiently. The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments.


Nucleic Acids Research | 2002

The Ensembl genome database project

Tim Hubbard; Darren Barker; Ewan Birney; Graham Cameron; Yuan Chen; L. Clark; Tony Cox; James Cuff; V. Curwen; Thomas A. Down; Richard Durbin; E. Eyras; James Gilbert; Martin Hammond; L. Huminiecki; Arek Kasprzyk; Heikki Lehväslaiho; Philip Lijnzaad; Craig Melsopp; Emmanuel Mongin; R. Pettett; M. Pocock; Simon Potter; A. Rust; Esther Schmidt; Stephen M. J. Searle; Guy Slater; J. Smith; W. Spooner; A. Stabenau

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.


Bioinformatics | 1998

JPred: a consensus secondary structure prediction server.

James Cuff; Michele Clamp; Asim S. Siddiqui; M. Finlay; Geoffrey J. Barton

An interactive protein secondary structure prediction Internet server is presented. The server allows a single sequence or multiple alignment to be submitted, and returns predictions from six secondary structure prediction algorithms that exploit evolutionary information from multiple sequences. A consensus prediction is also returned which improves the average Q3 accuracy of prediction by 1% to 72.9%. The server simplifies the use of current prediction algorithms and allows conservation patterns important to structure and function to be identified. Availability: http://barton.ebi.ac.uk/servers/jpred.html Contact: [email protected]
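
As a rough illustration of the two ideas in this abstract, the sketch below combines several per-residue predictions by simple majority vote and scores a prediction with Q3 (the percentage of residues whose three-state assignment is correct). The abstract does not say how JPred forms its consensus, so the majority vote, the tie-break rule, the function names `consensus` and `q3`, and the H/E/C state letters are all assumptions made for the example.

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length strings of H/E/C states.

    Hypothetical helper: a plain majority vote, not necessarily JPred's rule.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break (assumption): among the most frequent states, keep the
        # one that appears first in the order the methods were supplied.
        winner = next(state for state in column if counts[state] == top)
        result.append(winner)
    return "".join(result)


def q3(predicted, observed):
    """Percentage of residues whose three-state assignment matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    methods = ["HHHECC", "HHEECC", "HHHCCC"]
    combined = consensus(methods)
    print(combined, round(q3(combined, "HHHEEC"), 1))
```

With the toy inputs above, the vote yields one state per residue and Q3 is computed against an invented reference string; real use would compare against experimentally derived secondary structure.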


Nature | 2011

A high-resolution map of human evolutionary constraint using 29 mammals

Kerstin Lindblad-Toh; Manuel Garber; Or Zuk; Michael F. Lin; Brian J. Parker; Stefan Washietl; Pouya Kheradpour; Jason Ernst; Gregory Jordan; Evan Mauceli; Lucas D. Ward; Craig B. Lowe; Alisha K. Holloway; Michele Clamp; Sante Gnerre; Jessica Alföldi; Kathryn Beal; Jean Chang; Hiram Clawson; James Cuff; Federica Di Palma; Stephen Fitzgerald; Paul Flicek; Mitchell Guttman; Melissa J. Hubisz; David B. Jaffe; Irwin Jungreis; W. James Kent; Dennis Kostka; Marcia Lara

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.


Proceedings of the National Academy of Sciences of the United States of America | 2007

Distinguishing protein-coding and noncoding genes in the human genome

Michele Clamp; Ben Fry; Mike Kamal; Xiaohui Xie; James Cuff; Michael F. Lin; Manolis Kellis; Kerstin Lindblad-Toh; Eric S. Lander

Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of ≈24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs—specifically, their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to ≈20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.


Journal of Heredity | 2009

Genome 10K: A Proposal to Obtain Whole-Genome Sequence for 10 000 Vertebrate Species

David Haussler; Stephen J. O'Brien; Oliver A. Ryder; F. Keith Barker; Michele Clamp; Andrew J. Crawford; Robert Hanner; Olivier Hanotte; Warren E. Johnson; Jimmy A. McGuire; Webb Miller; Robert W. Murphy; William J. Murphy; Frederick H. Sheldon; Barry Sinervo; Byrappa Venkatesh; E. O. Wiley; Fred W. Allendorf; George Amato; C. Scott Baker; Aaron M. Bauer; Albano Beja-Pereira; Eldredge Bermingham; Giacomo Bernardi; Cibele R. Bonvicino; Sydney Brenner; Terry Burke; Joel Cracraft; Mark Diekhans; Scott V. Edwards

The human genome project has been recently complemented by whole-genome assessment sequence of 32 mammals and 24 nonmammalian vertebrate species suitable for comparative genomic analyses. Here we anticipate a precipitous drop in costs and increase in sequencing efficiency, with concomitant development of improved annotation technology and, therefore, propose to create a collection of tissue and DNA specimens for 10,000 vertebrate species specifically designated for whole-genome sequencing in the very near future. For this purpose, we, the Genome 10K Community of Scientists (G10KCOS), will assemble and allocate a biospecimen collection of some 16,203 representative vertebrate species spanning evolutionary diversity across living mammals, birds, nonavian reptiles, amphibians, and fishes (ca. 60,000 living species). In this proposal, we present precise counts for these 16,203 individual species with specimens presently tagged and stipulated for DNA sequencing by the G10KCOS. DNA sequencing has ushered in a new era of investigation in the biological sciences, allowing us to embark for the first time on a truly comprehensive study of vertebrate evolution, the results of which will touch nearly every aspect of vertebrate biological enquiry.


Genome Biology | 2002

Apollo: a sequence annotation editor

Suzanna E. Lewis; Smj Searle; Nomi L. Harris; M Gibson; Vivek Iyer; John Richter; C Wiel; Leyla Bayraktaroglu; Ewan Birney; Madeline A. Crosby; Joshua S Kaminker; Beverley B. Matthews; Se Prochnik; Christopher D. Smith; Jl Tupy; Gerald M. Rubin; S Misra; Christopher J. Mungall; Michele Clamp

The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects.


Nucleic Acids Research | 2003

Ensembl 2002: accommodating comparative genomics

Michele Clamp; D. Andrews; Darren Barker; Paul Bevan; Graham Cameron; Yuting Chen; Louise Clark; Tony Cox; James Cuff; Val Curwen; Thomas A. Down; Richard Durbin; Eduardo Eyras; James Gilbert; Martin Hammond; Tim Hubbard; Arek Kasprzyk; Damian Keefe; Heikki Lehväslaiho; Vishwanath R. Iyer; Craig Melsopp; Emmanuel Mongin; Roger Pettett; Simon Potter; Alistair G. Rust; Esther Schmidt; Steve Searle; Guy Slater; James Smith; William Spooner

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.


Bioinformatics | 2009

Identifying novel constrained elements by exploiting biased substitution patterns

Manuel Garber; Mitchell Guttman; Michele Clamp; Michael C. Zody; Nir Friedman; Xiaohui Xie

Motivation: Comparing the genomes from closely related species provides a powerful tool to identify functional elements in a reference genome. Many methods have been developed to identify conserved sequences across species; however, existing methods only model conservation as a decrease in the rate of mutation and have ignored selection acting on the pattern of mutations. Results: We present a new approach that takes advantage of deeply sequenced clades to identify evolutionary selection by uncovering not only signatures of rate-based conservation but also substitution patterns characteristic of sequence undergoing natural selection. We describe a new statistical method for modeling biased nucleotide substitutions, a learning algorithm for inferring site-specific substitution biases directly from sequence alignments and a hidden Markov model for detecting constrained elements characterized by biased substitutions. We show that the new approach can identify significantly more degenerate constrained sequences than rate-based methods. Applying it to the ENCODE regions, we find that as much as 10.2% of these regions are under selection. Availability: The algorithms are implemented in a Java software package, called SiPhy, freely available at http://www.broadinstitute.org/science/software/. Supplementary information: Supplementary data are available at Bioinformatics online.


Nature | 2001

Mining the draft human genome

Ewan Birney; Alex Bateman; Michele Clamp; Tim Hubbard

Now that the draft human genome sequence is available, everyone wants to be able to use it. However, we have perhaps become complacent about our ability to turn new genomes into lists of genes. The higher volume of data associated with a larger genome is accompanied by a much greater increase in complexity. We need to appreciate both the scale of the challenge of vertebrate genome analysis and the limitations of current gene prediction methods and understanding.

Collaboration


Dive into Michele Clamp's collaborations.

Top Co-Authors

Ewan Birney

European Bioinformatics Institute

Emmanuel Mongin

European Bioinformatics Institute

Val Curwen

Wellcome Trust Sanger Institute

Craig Melsopp

European Bioinformatics Institute

James Gilbert

Wellcome Trust Sanger Institute

Laura Clarke

European Bioinformatics Institute

Simon Potter

Wellcome Trust Sanger Institute

Stephen M. J. Searle

Wellcome Trust Sanger Institute
