Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Alex L. Mitchell is active.

Publication


Featured researches published by Alex L. Mitchell.


Nucleic Acids Research | 2016

The Pfam protein families database: towards a more sustainable future

Robert D. Finn; Penelope Coggill; Ruth Y. Eberhardt; Sean R. Eddy; Jaina Mistry; Alex L. Mitchell; Simon Potter; Marco Punta; Matloob Qureshi; Amaia Sangrador-Vegas; Gustavo A. Salazar; John G. Tate; Alex Bateman

In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool.


Nucleic Acids Research | 2009

InterPro: the integrative protein signature database

Sarah Hunter; Rolf Apweiler; Teresa K. Attwood; Amos Marc Bairoch; Alex Bateman; David Binns; Peer Bork; Ujjwal Das; Louise Daugherty; Lauranne Duquenne; Robert D. Finn; Julian Gough; Daniel H. Haft; Nicolas Hulo; Daniel Kahn; Elizabeth Kelly; Aurélie Laugraud; Ivica Letunic; David M. Lonsdale; Rodrigo Lopez; John Maslen; Craig McAnulla; Jennifer McDowall; Jaina Mistry; Alex L. Mitchell; Nicola Mulder; Darren A. Natale; Christine A. Orengo; Antony F. Quinn; Jeremy D. Selengut

The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or ‘signatures’ representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total ∼58 000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein–protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/).


Bioinformatics | 2014

InterProScan 5: genome-scale protein function classification

Philip Jones; David Binns; Hsin-Yu Chang; Matthew Fraser; Weizhong Li; Craig McAnulla; Hamish McWilliam; John Maslen; Alex L. Mitchell; Gift Nuka; Sebastien Pesseat; Antony F. Quinn; Amaia Sangrador-Vegas; Maxim Scheremetjew; Siew-Yit Yong; Rodrigo Lopez; Sarah Hunter

Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBl-EBI FTP site and the open source code is hosted at Google Code. Availability and implementation: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/. Contact: http://www.ebi.ac.uk/support or [email protected] or [email protected]


Nucleic Acids Research | 2012

InterPro in 2011: new developments in the family and domain prediction database

Sarah Hunter; P. D. Jones; Alex L. Mitchell; Rolf Apweiler; Teresa K. Attwood; Alex Bateman; Thomas Bernard; David Binns; Peer Bork; Sarah W. Burge; Edouard de Castro; Penny Coggill; Matthew Corbett; Ujjwal Das; Louise Daugherty; Lauranne Duquenne; Robert D. Finn; Matthew Fraser; Julian Gough; Daniel H. Haft; Nicolas Hulo; Daniel Kahn; Elizabeth Kelly; Ivica Letunic; David M. Lonsdale; Rodrigo Lopez; John Maslen; Craig McAnulla; Jennifer McDowall; Conor McMenamin

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Nucleic Acids Research | 2015

The InterPro protein families database: the classification resource after 15 years

Alex L. Mitchell; Hsin-Yu Chang; Louise Daugherty; Matthew Fraser; Sarah Hunter; Rodrigo Lopez; Craig McAnulla; Conor McMenamin; Gift Nuka; Sebastien Pesseat; Amaia Sangrador-Vegas; Maxim Scheremetjew; Claudia Rato; Siew-Yit Yong; Alex Bateman; Marco Punta; Teresa K. Attwood; Christian J. A. Sigrist; Nicole Redaschi; Catherine Rivoire; Ioannis Xenarios; Daniel Kahn; Dominique Guyot; Peer Bork; Ivica Letunic; Julian Gough; Matt E. Oates; Daniel H. Haft; Hongzhan Huang; Darren A. Natale

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.


Nucleic Acids Research | 2004

InterPro, progress and status in 2005

Nicola Mulder; Rolf Apweiler; Teresa K. Attwood; Amos Marc Bairoch; Alex Bateman; David Binns; Paul Bradley; Peer Bork; Phillip Bucher; Lorenzo Cerutti; Richard R. Copley; Emmanuel Courcelle; Ujjwal Das; Richard Durbin; Wolfgang Fleischmann; Julian Gough; Daniel H. Haft; Nicola Harte; Nicolas Hulo; Daniel Kahn; Alexander Kanapin; Maria Krestyaninova; David M. Lonsdale; Rodrigo Lopez; Ivica Letunic; John Maslen; Jennifer McDowall; Alex L. Mitchell; Anastasia N. Nikolskaya; Sandra Orchard

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).


Nucleic Acids Research | 2007

New developments in the InterPro database

Nicola Mulder; Rolf Apweiler; Teresa K. Attwood; Amos Marc Bairoch; Alex Bateman; David Binns; Peer Bork; Virginie Buillard; Lorenzo Cerutti; Richard R. Copley; Emmanuel Courcelle; Ujjwal Das; Louise Daugherty; Mark Dibley; Robert D. Finn; Wolfgang Fleischmann; Julian Gough; Daniel H. Haft; Nicolas Hulo; Sarah Hunter; Daniel Kahn; Alexander Kanapin; Anish Kejariwal; Alberto Labarga; Petra S. Langendijk-Genevaux; David M. Lonsdale; Rodrigo Lopez; Ivica Letunic; John Maslen; Craig McAnulla

InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on the website and in the match XML file on the FTP site. Additional matches to proteins in UniParc (UniProt archive) are now available for download in the new match XML files only. The latest InterPro release (13.0) contains more than 13 000 entries, covering over 78% of all proteins in UniProtKB. The database is available for text- and sequence-based searches via a webserver (), and for download by anonymous FTP (). The InterProScan search tool is now also available via a web service at .


Nucleic Acids Research | 2017

InterPro in 2017—beyond protein family and domain annotations

Robert D. Finn; Teresa K. Attwood; Patricia C. Babbitt; Alex Bateman; Peer Bork; Alan Bridge; Hsin Yu Chang; Zsuzsanna Dosztányi; Sara El-Gebali; Matthew Fraser; Julian Gough; David R Haft; Gemma L. Holliday; Hongzhan Huang; Xiaosong Huang; Ivica Letunic; Rodrigo Lopez; Shennan Lu; Huaiyu Mi; Jaina Mistry; Darren A. Natale; Marco Necci; Gift Nuka; Christine A. Orengo; Youngmi Park; Sebastien Pesseat; Damiano Piovesan; Simon Potter; Neil D. Rawlings; Nicole Redaschi

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPros predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.


Nucleic Acids Research | 2002

PRINTS and PRINTS-S shed light on protein ancestry

Terri K. Attwood; Martin J. Blythe; Darren R. Flower; Anna Gaulton; J. E. Mabey; Neil Maudling; L. McGregor; Alex L. Mitchell; G. Moulton; Kelly Paine; Philip Scordis

The PRINTS database houses a collection of protein fingerprints. These may be used to make family and tentative functional assignments for uncharacterised sequences. The September 2001 release (version 32.0) includes 1600 fingerprints, encoding approximately 10 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here its use as a source of annotation in the InterPro resource, and the use of its relational cousin, PRINTS-S, to model relationships between families, including those beyond the reach of conventional sequence analysis approaches. The database is accessible for BLAST, fingerprint and text searches at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/.


Nucleic Acids Research | 2014

EBI metagenomics—a new resource for the analysis and archiving of metagenomic data

Sarah Hunter; Matthew Corbett; Hubert Denise; Matthew Fraser; Alejandra Gonzalez-Beltran; Chris Hunter; Philip Jones; Rasko Leinonen; Craig McAnulla; Eamonn Maguire; John Maslen; Alex L. Mitchell; Gift Nuka; Arnaud Oisel; Sebastien Pesseat; Rajesh Radhakrishnan; Philippe Rocca-Serra; Maxim Scheremetjew; Peter Sterk; Daniel Vaughan; Guy Cochrane; Dawn Field; Susanna-Assunta Sansone

Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

Collaboration


Dive into the Alex L. Mitchell's collaboration.

Top Co-Authors

Avatar

Robert D. Finn

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Sarah Hunter

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Maxim Scheremetjew

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Craig McAnulla

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Matthew Fraser

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Sebastien Pesseat

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar

Alex Bateman

European Bioinformatics Institute

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Siew-Yit Yong

European Bioinformatics Institute

View shared research outputs
Researchain Logo
Decentralizing Knowledge