Publication


Featured research published by Catherine Rivoire.


Nucleic Acids Research | 2012

InterPro in 2011: new developments in the family and domain prediction database

Sarah Hunter; P. D. Jones; Alex L. Mitchell; Rolf Apweiler; Teresa K. Attwood; Alex Bateman; Thomas Bernard; David Binns; Peer Bork; Sarah W. Burge; Edouard de Castro; Penny Coggill; Matthew Corbett; Ujjwal Das; Louise Daugherty; Lauranne Duquenne; Robert D. Finn; Matthew Fraser; Julian Gough; Daniel H. Haft; Nicolas Hulo; Daniel Kahn; Elizabeth Kelly; Ivica Letunic; David M. Lonsdale; Rodrigo Lopez; John Maslen; Craig McAnulla; Jennifer McDowall; Conor McMenamin

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.


Nucleic Acids Research | 2015

The InterPro protein families database: the classification resource after 15 years

Alex L. Mitchell; Hsin-Yu Chang; Louise Daugherty; Matthew Fraser; Sarah Hunter; Rodrigo Lopez; Craig McAnulla; Conor McMenamin; Gift Nuka; Sebastien Pesseat; Amaia Sangrador-Vegas; Maxim Scheremetjew; Claudia Rato; Siew-Yit Yong; Alex Bateman; Marco Punta; Teresa K. Attwood; Christian J. A. Sigrist; Nicole Redaschi; Catherine Rivoire; Ioannis Xenarios; Daniel Kahn; Dominique Guyot; Peer Bork; Ivica Letunic; Julian Gough; Matt E. Oates; Daniel H. Haft; Hongzhan Huang; Darren A. Natale

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.


Nucleic Acids Research | 2017

InterPro in 2017—beyond protein family and domain annotations

Robert D. Finn; Teresa K. Attwood; Patricia C. Babbitt; Alex Bateman; Peer Bork; Alan Bridge; Hsin Yu Chang; Zsuzsanna Dosztányi; Sara El-Gebali; Matthew Fraser; Julian Gough; David R Haft; Gemma L. Holliday; Hongzhan Huang; Xiaosong Huang; Ivica Letunic; Rodrigo Lopez; Shennan Lu; Huaiyu Mi; Jaina Mistry; Darren A. Natale; Marco Necci; Gift Nuka; Christine A. Orengo; Youngmi Park; Sebastien Pesseat; Damiano Piovesan; Simon Potter; Neil D. Rawlings; Nicole Redaschi

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.


Nucleic Acids Research | 2012

The UniProt-GO Annotation database in 2011

Emily Dimmer; Rachael P. Huntley; Yasmin Alam-Faruque; Tony Sawford; Claire O'Donovan; María Martín; Benoit Bely; Paul Browne; Wei Mun Chan; Ruth Eberhardt; Michael Gardner; Kati Laiho; D Legge; Michele Magrane; Klemens Pichler; Diego Poggioli; Harminder Sehra; Andrea H. Auchincloss; Kristian B. Axelsen; Marie-Claude Blatter; Emmanuel Boutet; Silvia Braconi-Quintaje; Lionel Breuza; Alan Bridge; Elizabeth Coudert; Anne Estreicher; L Famiglietti; Serenella Ferro-Rojas; Marc Feuermann; Arnaud Gos

The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidenced-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360 000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency as well as now supplying a greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.


Nucleic Acids Research | 2009

HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot

Tania Lima; Andrea H. Auchincloss; Elisabeth Coudert; Guillaume Keller; Karine Michoud; Catherine Rivoire; Virginie Bulliard; Edouard de Castro; Corinne Lachaize; Delphine Baratin; Isabelle Phan; Lydie Bougueleret; Amos Marc Bairoch

The growth in the number of completely sequenced microbial genomes (bacterial and archaeal) has generated a need for a procedure that provides UniProtKB/Swiss-Prot-quality annotation to as many protein sequences as possible. We have devised a semi-automated system, HAMAP (High-quality Automated and Manual Annotation of microbial Proteomes), that uses manually built annotation templates for protein families to propagate annotation to all members of manually defined protein families, using very strict criteria. The HAMAP system is composed of two databases, the proteome database and the family database, and of an automatic annotation pipeline. The proteome database comprises biological and sequence information for each completely sequenced microbial proteome, and it offers several tools for CDS searches, BLAST options and retrieval of specific sets of proteins. The family database currently comprises more than 1500 manually curated protein families and their annotation templates that are used to annotate proteins that belong to one of the HAMAP families. On the HAMAP website, individual sequences as well as whole genomes can be scanned against all HAMAP families. The system provides warnings for the absence of conserved amino acid residues, unusual sequence length, etc. Thanks to the implementation of HAMAP, more than 200 000 microbial proteins have been fully annotated in UniProtKB/Swiss-Prot (HAMAP website: http://www.expasy.org/sprot/hamap).


Nucleic Acids Research | 2013

HAMAP in 2013, new developments in the protein family classification and annotation system

Ivo Pedruzzi; Catherine Rivoire; Andrea H. Auchincloss; Elisabeth Coudert; Guillaume Keller; Edouard de Castro; Delphine Baratin; Béatrice A. Cuche; Lydie Bougueleret; Sylvain Poux; Nicole Redaschi; Ioannis Xenarios; Alan Bridge

HAMAP (High-quality Automated and Manual Annotation of Proteins—available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.


Nucleic Acids Research | 2015

HAMAP in 2015: updates to the protein family classification and annotation system

Ivo Pedruzzi; Catherine Rivoire; Andrea H. Auchincloss; Elisabeth Coudert; Guillaume Keller; Edouard de Castro; Delphine Baratin; Béatrice A. Cuche; Lydie Bougueleret; Sylvain Poux; Nicole Redaschi; Ioannis Xenarios; Alan Bridge

HAMAP (High-quality Automated and Manual Annotation of Proteins—available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.


Database | 2009

Collaborative annotation of genes and proteins between UniProtKB/Swiss-Prot and dictyBase

Pascale Gaudet; Lydie Lane; Petra Fey; Alan Bridge; Sylvain Poux; Andrea H. Auchincloss; Kristian B. Axelsen; S. Braconi Quintaje; Emmanuel Boutet; P. Brown; Elisabeth Coudert; Ruchira S. Datta; W.C. de Lima; T. de Oliveira Lima; Séverine Duvaud; N. Farriol-Mathis; S. Ferro Rojas; Marc Feuermann; Alain Gateau; Ursula Hinz; Chantal Hulo; J. James; S. Jimenez; Florence Jungo; Guillaume Keller; P Lemercier; Damien Lieberherr; M. Moinat; A. Nikolskaya; I. Pedruzzi

UniProtKB/Swiss-Prot, a curated protein database, and dictyBase, the Model Organism Database for Dictyostelium discoideum, have established a collaboration to improve data sharing. One of the major steps in this effort was the ‘Dicty annotation marathon’, a week-long exercise with 30 annotators aimed at achieving a major increase in the number of D. discoideum proteins represented in UniProtKB/Swiss-Prot. The marathon led to the annotation of over 1000 D. discoideum proteins in UniProtKB/Swiss-Prot. Concomitantly, there were a large number of updates in dictyBase concerning gene symbols, protein names and gene models. This exercise demonstrates how UniProtKB/Swiss-Prot can work in very close cooperation with model organism databases and how the annotation of proteins can be accelerated through those collaborations.


Database | 2016

Minimizing proteome redundancy in the UniProt Knowledgebase

Borisas Bursteinas; Ramona Britto; Benoit Bely; Andrea H. Auchincloss; Catherine Rivoire; Nicole Redaschi; Claire O'Donovan; María Martín

Advances in high-throughput sequencing have led to an unprecedented growth in genome sequences being submitted to biological databases. In particular, the sequencing of large numbers of nearly identical bacterial genomes during infection outbreaks and for other large-scale studies has resulted in a high level of redundancy in nucleotide databases and consequently in the UniProt Knowledgebase (UniProtKB). Redundancy negatively impacts on database searches by causing slower searches, an increase in statistical bias and cumbersome result analysis. The redundancy combined with the large data volume increases the computational costs for most reuses of UniProtKB data. All of this poses challenges for effective discovery in this wealth of data. With the continuing development of sequencing technologies, it is clear that finding ways to minimize redundancy is crucial to maintaining UniProt's essential contribution to data interpretation by our users. We have developed a methodology to identify and remove highly redundant proteomes from UniProtKB. The procedure identifies redundant proteomes by performing pairwise alignments of sets of sequences for pairs of proteomes and subsequently applies graph theory to find dominating sets that provide a set of non-redundant proteomes with a minimal loss of information. This method was implemented for bacteria in mid-2015, resulting in a removal of 50 million proteins in UniProtKB. With every new release, this procedure is used to filter new incoming proteomes, resulting in a more scalable and scientifically valuable growth of UniProtKB. Database URL: http://www.uniprot.org/proteomes/
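
The dominating-set idea mentioned in this abstract can be illustrated with a small sketch. This is not the UniProt procedure itself: it assumes a hypothetical redundancy graph in which an edge joins two proteomes whose pairwise comparison exceeded some similarity threshold, and it uses a simple greedy heuristic (chosen here for illustration only) to pick representatives. The function name and the proteome labels are invented for the example.

```python
from typing import Dict, List, Set


def greedy_dominating_set(graph: Dict[str, Set[str]]) -> List[str]:
    """Greedy heuristic: pick nodes until every node is either chosen
    or adjacent to a chosen node. Not guaranteed to be minimum."""
    uncovered = set(graph)
    chosen: List[str] = []
    while uncovered:
        # Pick the node that covers the most still-uncovered nodes
        # (itself plus its neighbours); ties go to the smallest name.
        best = max(
            sorted(graph),
            key=lambda n: len(({n} | graph[n]) & uncovered),
        )
        chosen.append(best)
        uncovered -= {best} | graph[best]
    return chosen


# Hypothetical redundancy graph: A, B and C are mutually redundant,
# D and E are redundant with each other, F has no redundant partner.
redundancy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"E"},
    "E": {"D"},
    "F": set(),
}

representatives = greedy_dominating_set(redundancy)
print(representatives)
```

In this toy graph one proteome is kept from each redundant group, and the unconnected proteome is kept as well, so every proteome is either retained or redundant with a retained one.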


Computational Biology and Chemistry | 2003

Automated annotation of microbial proteomes in SWISS-PROT

Alexandre Gattiker; Karine Michoud; Catherine Rivoire; Andrea H. Auchincloss; Elisabeth Coudert; Tania Lima; Paul J. Kersey; Marco Pagni; Christian J. A. Sigrist; Corinne Lachaize; Anne-Lise Veuthey; Elisabeth Gasteiger; Amos Marc Bairoch

Collaboration


Dive into Catherine Rivoire's collaborations.

Top Co-Authors

Andrea H. Auchincloss

Swiss Institute of Bioinformatics

Alan Bridge

Swiss Institute of Bioinformatics

Elisabeth Coudert

Swiss Institute of Bioinformatics

Nicole Redaschi

Swiss Institute of Bioinformatics

Edouard de Castro

Swiss Institute of Bioinformatics

Guillaume Keller

Swiss Institute of Bioinformatics

Alex Bateman

European Bioinformatics Institute

Ivica Letunic

European Bioinformatics Institute

Matthew Fraser

European Bioinformatics Institute
