Publication


Featured research published by Benoit Bely.


Nucleic Acids Research | 2012

The UniProt-GO Annotation database in 2011

Emily Dimmer; Rachael P. Huntley; Yasmin Alam-Faruque; Tony Sawford; Claire O'Donovan; María Martín; Benoit Bely; Paul Browne; Wei Mun Chan; Ruth Eberhardt; Michael Gardner; Kati Laiho; D Legge; Michele Magrane; Klemens Pichler; Diego Poggioli; Harminder Sehra; Andrea H. Auchincloss; Kristian B. Axelsen; Marie-Claude Blatter; Emmanuel Boutet; Silvia Braconi-Quintaje; Lionel Breuza; Alan Bridge; Elizabeth Coudert; Anne Estreicher; L Famiglietti; Serenella Ferro-Rojas; Marc Feuermann; Arnaud Gos

The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360 000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency. It also now supplies greater information content, enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.


Proteomics | 2015

Analysis of the tryptic search space in UniProt databases

Emanuele Alpi; Johannes Griss; Alan Wilter Sousa da Silva; Benoit Bely; R Antunes; Hermann Zellner; Daniel Ríos; Claire O'Donovan; Juan Antonio Vizcaíno; María Martín

In this article, we provide a comprehensive study of the content of the Universal Protein Resource (UniProt) protein data sets for human and mouse. The tryptic search spaces of the UniProtKB (UniProt knowledgebase) complete proteome sets were compared with other data sets from UniProtKB and with the corresponding International Protein Index, reference sequence, Ensembl, and UniRef100 (where UniRef is UniProt reference clusters) organism‐specific data sets. All protein forms annotated in UniProtKB (both the canonical sequences and isoforms) were evaluated in this study. In addition, natural and disease‐associated amino acid variants annotated in UniProtKB were included in the evaluation. The peptide unicity was also evaluated for each data set. Furthermore, the peptide information in the UniProtKB data sets was compared against the available peptide‐level identifications in the main MS‐based proteomics repositories. The peptides observed in these repositories are an important source of information for protein databases, as they provide supporting evidence for the existence of otherwise predicted proteins. Likewise, the repositories could use the information available in UniProtKB to direct reprocessing efforts on specific sets of peptides/proteins of interest. In summary, we provide comprehensive information about the different organism‐specific sequence data sets available from UniProt, together with the pros and cons of each, in terms of search space for MS‐based bottom‐up proteomics workflows. The aim of the analysis is to provide a clear view of the tryptic search space of UniProt and other protein databases to enable scientists to select those most appropriate for their purposes.
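
To make the notions of "tryptic search space" and "peptide unicity" concrete, the following is a minimal Python sketch of an in silico tryptic digest. It is not the method used in the article, whose exact digestion settings are not given in the abstract. It assumes the commonly used rule that trypsin cleaves after K or R unless the next residue is P, and the function names (`cleave`, `tryptic_peptides`, `unique_peptides`), the `missed_cleavages` and `min_length` parameters, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def cleave(sequence):
    """Split a protein sequence after K or R, except when followed by P (assumed rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(sequence, missed_cleavages=0, min_length=1):
    """Return the set of peptides allowing up to `missed_cleavages` skipped sites."""
    fragments = cleave(sequence)
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


def unique_peptides(proteins, **digest_options):
    """Map each peptide that occurs in exactly one protein to that protein's accession."""
    owners = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, **digest_options):
            owners[peptide].add(accession)
    return {p: next(iter(acc)) for p, acc in owners.items() if len(acc) == 1}


# Toy data set with made-up accessions and sequences.
toy_proteins = {"P1": "AAKGGR", "P2": "AAKCCR"}
print(unique_peptides(toy_proteins))
```

In this toy set the peptide `AAK` is shared by both entries, so only `GGR` and `CCR` count as unique. The size of the union of all peptides is one simple measure of a data set's search space.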


Nucleic Acids Research | 2018

The eukaryotic linear motif resource - 2018 update.

Marc Gouw; Sushama Michael; Hugo Sámano‐Sánchez; Manjeet Kumar; András Zeke; Benjamin Lang; Benoit Bely; Lucía B. Chemes; Norman E. Davey; Ziqi Deng; Francesca Diella; Clara-Marie Gürth; Ann-Kathrin Huber; Stefan Kleinsorg; Lara S. Schlegel; Nicolas Palopoli; Kim Van Roey; Brigitte Altenberg; Attila Reményi; Holger Dinkel; Toby J. Gibson

Short linear motifs (SLiMs) are protein binding modules that play major roles in almost all cellular processes. SLiMs are short, often highly degenerate, difficult to characterize and hard to detect. The eukaryotic linear motif (ELM) resource (elm.eu.org) is dedicated to SLiMs, consisting of a manually curated database of over 275 motif classes and over 3000 motif instances, and a pipeline to discover candidate SLiMs in protein sequences. For 15 years, ELM has been one of the major resources for motif research. In this database update, we present the latest additions to the database including 32 new motif classes, and new features including UniProt and Reactome integration. Finally, to help provide cellular context, we present some biological insights about SLiMs in the cell cycle, as targets for bacterial pathogenicity and their functionality in the human kinome.


Database | 2016

From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF.

Sefa Kılıç; Dinara M. Sagitova; Shoshannah Wolfish; Benoit Bely; Mélanie Courtot; Stacy Ciufo; Tatiana Tatusova; Claire O’Donovan; Marcus C. Chibucos; María Martín; Ivan Erill

Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges for their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has also progressively become a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data. Database URL: http://www.collectf.org/


Database | 2016

Minimizing proteome redundancy in the UniProt Knowledgebase.

Borisas Bursteinas; Ramona Britto; Benoit Bely; Andrea H. Auchincloss; Catherine Rivoire; Nicole Redaschi; Claire O'Donovan; María Martín

Advances in high-throughput sequencing have led to an unprecedented growth in genome sequences being submitted to biological databases. In particular, the sequencing of large numbers of nearly identical bacterial genomes during infection outbreaks and for other large-scale studies has resulted in a high level of redundancy in nucleotide databases and consequently in the UniProt Knowledgebase (UniProtKB). Redundancy negatively impacts database searches by causing slower searches, an increase in statistical bias and cumbersome result analysis. The redundancy combined with the large data volume increases the computational costs for most reuses of UniProtKB data. All of this poses challenges for effective discovery in this wealth of data. With the continuing development of sequencing technologies, it is clear that finding ways to minimize redundancy is crucial to maintaining UniProt's essential contribution to data interpretation by our users. We have developed a methodology to identify and remove highly redundant proteomes from UniProtKB. The procedure identifies redundant proteomes by performing pairwise alignments of sets of sequences for pairs of proteomes and subsequently applies graph theory to find dominating sets that provide a set of non-redundant proteomes with a minimal loss of information. This method was implemented for bacteria in mid-2015, resulting in the removal of 50 million proteins from UniProtKB. With every new release, this procedure is used to filter new incoming proteomes, resulting in a more scalable and scientifically valuable growth of UniProtKB. Database URL: http://www.uniprot.org/proteomes/
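
The dominating-set idea in this abstract can be illustrated with a short Python sketch. It is not the UniProt procedure itself: the abstract does not say which algorithm or similarity measure is used, so this sketch assumes precomputed pairwise proteome similarity scores, a hypothetical cut-off of 0.9, and a standard greedy approximation. The names `redundancy_edges` and `greedy_dominating_set` and the toy proteome labels are made up for illustration.

```python
def redundancy_edges(similarity, threshold=0.9):
    """Keep proteome pairs whose (assumed, precomputed) similarity meets the cut-off."""
    return [pair for pair, score in similarity.items() if score >= threshold]


def greedy_dominating_set(nodes, edges):
    """Greedy approximation of a small dominating set.

    Every node ends up either chosen or adjacent to a chosen node, so each
    discarded proteome is represented by a redundant one that is kept.
    """
    # Closed neighbourhoods: each node also covers itself.
    neighbours = {node: {node} for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    uncovered = set(nodes)
    chosen = []
    while uncovered:
        # Pick the node covering the most uncovered nodes (ties broken by name).
        best = max(sorted(neighbours), key=lambda n: len(neighbours[n] & uncovered))
        chosen.append(best)
        uncovered -= neighbours[best]
    return chosen


# Toy example with made-up proteome labels and similarity scores.
proteomes = ["A", "B", "C", "D"]
similarity = {("A", "B"): 0.99, ("A", "C"): 0.95, ("C", "D"): 0.50}
kept = greedy_dominating_set(proteomes, redundancy_edges(similarity))
print(kept)
```

Here proteomes B and C are each highly similar to A, so keeping A represents all three, while D has no redundant partner and is kept on its own. A greedy choice does not guarantee the smallest possible set, but it is a common, simple approximation.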


F1000Research | 2010

Source of annotations in the UniProt Knowledgebase

Benoit Bely; María Martín; Rolf Apweiler

Collaboration


Dive into Benoit Bely's collaborations.

Top Co-Authors

María Martín

European Bioinformatics Institute

Claire O'Donovan

European Bioinformatics Institute

Andrea H. Auchincloss

Swiss Institute of Bioinformatics

Alan Wilter Sousa da Silva

European Bioinformatics Institute

Benjamin Lang

Laboratory of Molecular Biology

Borisas Bursteinas

European Bioinformatics Institute

Daniel Ríos

European Bioinformatics Institute

Diego Poggioli

European Bioinformatics Institute

Emanuele Alpi

European Bioinformatics Institute
