Publication


Featured research published by Adriano Barbosa-Silva.


Nucleic Acids Research | 2009

MedlineRanker: flexible ranking of biomedical literature

Jean-Fred Fontaine; Adriano Barbosa-Silva; Martin H. Schaefer; Matthew R. Huska; Enrique M. Muro; Miguel A. Andrade-Navarro

The biomedical literature is represented by millions of abstracts available in the Medline database. These abstracts can be queried with the PubMed interface, which provides a keyword-based Boolean search engine. This approach is limited in retrieving abstracts related to very specific topics, as it is difficult for a non-expert user to find all of the most relevant keywords related to a biomedical topic. Additionally, when searching for more general topics, the same approach may return hundreds of unranked references. To address these issues, text mining tools have been developed to help scientists focus on relevant abstracts. We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program automatically deduces the words that best discriminate them from a random selection of abstracts. These words are used to score other abstracts, including those from recent publications that are not yet annotated, which can then be ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time. MedlineRanker is free to use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
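The idea of deducing discriminative words from a topic set and using them to score other abstracts can be illustrated with a minimal sketch. It assumes a smoothed log-odds weight per word based on document frequencies; this is an illustrative stand-in, not the scoring scheme published for MedlineRanker, and the function names `tokenize`, `word_weights`, `score` and `rank` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_weights(topic_abstracts, background_abstracts):
    """Weight each word by a smoothed log-odds of appearing in topic vs. background abstracts.

    Assumption: add-one smoothing on document frequencies (illustrative only).
    """
    topic_df = Counter(w for a in topic_abstracts for w in tokenize(a))
    back_df = Counter(w for a in background_abstracts for w in tokenize(a))
    n_topic, n_back = len(topic_abstracts), len(background_abstracts)
    weights = {}
    for w in set(topic_df) | set(back_df):
        p_topic = (topic_df[w] + 1) / (n_topic + 2)
        p_back = (back_df[w] + 1) / (n_back + 2)
        weights[w] = math.log(p_topic / p_back)
    return weights


def score(abstract, weights):
    """Sum the weights of the distinct words in an abstract; unseen words count as 0."""
    return sum(weights.get(w, 0.0) for w in tokenize(abstract))


def rank(candidates, weights):
    """Sort candidate abstracts from most to least topic-like."""
    return sorted(candidates, key=lambda a: score(a, weights), reverse=True)
```

Words that occur mostly in the topic set receive positive weights and words typical of the background receive negative ones, so candidates can be ranked without the user choosing keywords by hand.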


Nucleic Acids Research | 2011

Génie: literature-based gene prioritization at multi genomic scale.

Jean-Fred Fontaine; Florian Priller; Adriano Barbosa-Silva; Miguel A. Andrade-Navarro

Biomedical literature is traditionally used as a way to inform scientists of the relevance of genes in relation to a research topic. However, many genes, especially from poorly studied organisms, are not discussed in the literature. Moreover, a manual and comprehensive summarization of the literature attached to the genes of an organism is generally impossible due to the high number of genes and abstracts involved. We introduce the novel Génie algorithm, which overcomes these problems by evaluating the literature attached to all genes in a genome and to their orthologs according to a selected topic. Génie showed high precision (up to 100%) and the best performance in comparison to other algorithms in most of the benchmarks, especially when high sensitivity was required. Moreover, the prioritization of zebrafish genes involved in heart development, using human and mouse orthologs, showed high enrichment in differentially expressed genes from microarray experiments. The Génie web server supports hundreds of species and millions of genes, and offers novel functionalities. Typical run times below a minute, even when analyzing the human genome with hundreds of thousands of literature records, allow the use of Génie in routine lab work. Availability: http://cbdm.mdc-berlin.de/tools/genie/.


Biodata Mining | 2010

A reference guide for tree analysis and visualization

Georgios A. Pavlopoulos; Theodoros G. Soldatos; Adriano Barbosa-Silva; Reinhard Schneider

The quantities of data obtained by new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and by large-scale OMICS approaches, such as genomics, proteomics and transcriptomics, are becoming vast. As sequencing technologies become cheaper and easier to use, large-scale evolutionary studies of the origins and evolution of all species are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized are expanding rapidly. Clustering analysis is becoming increasingly difficult to apply to very large amounts of data, since the results of these algorithms cannot be visualized efficiently. Most of the available visualization tools that can represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, current phylogenetic tree visualization tools cannot display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review currently available tools for the visualization and analysis of biological trees, mainly developed during the last decade. We describe the uniform and standard computer-readable formats for representing tree hierarchies, and we comment on the functionality and limitations of these tools. We also discuss how these tools can be developed further and how they should become integrated with various data sources. Here we focus on freely available software that offers users various tree-representation methodologies for biological data analysis.


Nucleic Acids Research | 2014

uORFdb—a comprehensive literature database on eukaryotic uORF biology

Klaus Wethmar; Adriano Barbosa-Silva; Miguel A. Andrade-Navarro; Achim Leutz

Approximately half of all human transcripts contain at least one upstream translational initiation site that precedes the main coding sequence (CDS) and gives rise to an upstream open reading frame (uORF). We generated uORFdb, publicly available at http://cbdm.mdc-berlin.de/tools/uorfdb, to serve as a comprehensive literature database on eukaryotic uORF biology. Upstream ORFs affect downstream translation by interfering with the unrestrained progression of ribosomes across the transcript leader sequence. Although the first uORF-related translational activity was observed >30 years ago, and an increasing number of studies link defective uORF-mediated translational control to the development of human diseases, the features that determine uORF-mediated regulation of downstream translation are not well understood. The uORFdb was manually curated from all uORF-related literature listed at the PubMed database. It categorizes individual publications by a variety of denominators including taxon, gene and type of study. Furthermore, the database can be filtered for multiple structural and functional uORF-related properties to allow convenient and targeted access to the complex field of eukaryotic uORF biology.
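The definition of a uORF given above (an initiation site upstream of the main CDS that opens a reading frame) can be made concrete with a small sketch. It assumes a DNA-alphabet transcript string, considers only ATG start codons, and takes the main CDS start position as given; the function name `find_uorfs` and its return format are hypothetical and are not part of uORFdb.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, cds_start):
    """Return (start, end) index pairs of ORFs whose ATG lies upstream of the main CDS.

    `cds_start` is the 0-based index of the main start codon. `end` is the index
    just past the in-frame stop codon, so transcript[start:end] is the uORF.
    Upstream ATGs without an in-frame stop codon in the transcript are skipped.
    """
    seq = transcript.upper()
    uorfs = []
    # Only consider start codons that lie entirely within the leader sequence.
    for start in range(max(0, cds_start - 2)):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this upstream ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs
```

For example, `find_uorfs("GGATGAAATAGCCCATGGCCTAA", 14)` returns `[(2, 11)]`: the upstream ATG at index 2 opens the frame `ATGAAATAG`, which ends before the main start codon at index 14. A uORF found this way may also extend past `cds_start` and overlap the main CDS.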


BMC Bioinformatics | 2011

PESCADOR, a web-based tool to assist text-mining of biointeractions extracted from PubMed queries

Adriano Barbosa-Silva; Jean-Fred Fontaine; Elisa Donnard; Fernanda Stussi; J. Miguel Ortega; Miguel A. Andrade-Navarro

Background: Biological function is greatly dependent on the interactions of proteins with other proteins and genes. Abstracts from the biomedical literature stored in NCBI's PubMed database can be used to derive interactions between genes and proteins by identifying the co-occurrences of their terms. Often, the number of interactions obtained through such an approach is large and may mix processes occurring in different contexts. Current tools do not allow studying these data with a focus on concepts of relevance to a user, for example, interactions related to a disease or to a biological mechanism such as protein aggregation. Results: To help the concept-oriented exploration of such data we developed PESCADOR, a web tool that extracts a network of interactions from a set of PubMed abstracts given by a user, and allows filtering the interaction network according to user-defined concepts. We illustrate its use in exploring protein aggregation in neurodegenerative disease and in the expansion of pathways associated with colon cancer. Conclusions: PESCADOR is a platform-independent web resource available at: http://cbdm.mdc-berlin.de/tools/pescador/
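Deriving an interaction network from term co-occurrences and then filtering it by a user-defined concept can be sketched as follows. This is a simplified illustration, not PESCADOR's implementation: it assumes exact token matching against a user-supplied gene list, sentence-level co-occurrence, and a concept filter applied to whole abstracts. The function name `cooccurrence_network` is hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_network(abstracts, genes, concept=None):
    """Count gene pairs that co-occur in the same sentence.

    If `concept` is given, only abstracts mentioning it (case-insensitive
    substring match) contribute edges. Returns a Counter keyed by
    alphabetically sorted gene pairs.
    """
    edges = Counter()
    for abstract in abstracts:
        if concept and concept.lower() not in abstract.lower():
            continue
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
            found = sorted(g for g in genes if g in tokens)
            for pair in combinations(found, 2):
                edges[pair] += 1
    return edges
```

Calling the function with and without `concept` on the same abstracts shows how a concept narrows a mixed network down to the interactions reported in the context of interest.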


BMC Bioinformatics | 2010

LAITOR--Literature Assistant for Identification of Terms co-Occurrences and Relationships.

Adriano Barbosa-Silva; Theodoros G. Soldatos; Ivan L. F. Magalhaes; Georgios A. Pavlopoulos; Jean-Fred Fontaine; Miguel A. Andrade-Navarro; Reinhard Schneider; J. Miguel Ortega

Background: Biological knowledge is represented in scientific literature that often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Therefore, the study of the current literature about a selected topic deposited in public databases facilitates the generation of novel hypotheses associating a set of bioentities with a common context. Results: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. Conclusions: Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.


Nucleic Acids Research | 2010

Martini: using literature keywords to compare gene sets

Theodoros G. Soldatos; Seán I. O'Donoghue; Venkata P. Satagopam; Lars Juhl Jensen; Nigel P. Brown; Adriano Barbosa-Silva; Reinhard Schneider

Life scientists are often interested in comparing two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based not on GO but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave the best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.


Big Data | 2016

Integration and Visualization of Translational Medicine Data for Better Understanding of Human Diseases

Venkata P. Satagopam; Wei Gu; Serge Eifes; Piotr Gawron; Marek Ostaszewski; Stephan Gebel; Adriano Barbosa-Silva; Rudi Balling; Reinhard Schneider

Translational medicine is a domain turning results of basic life science research into new tools and methods in a clinical environment, for example, as new diagnostics or therapies. Nowadays, the process of translation is supported by large amounts of heterogeneous data ranging from medical data to a whole range of -omics data. It is not only a great opportunity but also a great challenge, as translational medicine big data is difficult to integrate and analyze, and requires the involvement of biomedical experts for the data processing. We show here that visualization and interoperable workflows, combining multiple complex steps, can address at least parts of the challenge. In this article, we present an integrated workflow for exploration, analysis, and interpretation of translational medicine data in the context of human health. Three Web services (tranSMART, a Galaxy Server, and a MINERVA platform) are combined into one big data pipeline. Native visualization capabilities enable the biomedical experts to get a comprehensive overview and control over separate steps of the workflow. The capabilities of tranSMART enable a flexible filtering of multidimensional integrated data sets to create subsets suitable for downstream processing. A Galaxy Server offers visually aided construction of analytical pipelines, with the use of existing or custom components. A MINERVA platform supports the exploration of health and disease-related mechanisms in a contextualized analytical visualization system. We demonstrate the utility of our workflow by illustrating its successive steps using an existing data set, for which we propose a filtering scheme, an analytical pipeline, and a corresponding visualization of analytical results. The workflow is available as a sandbox environment, where readers can work with the described setup themselves. Overall, our work shows how visualization and interfacing of big data processing services facilitate exploration, analysis, and interpretation of translational medicine data.


Bioinformatics | 2017

SmartR: an open-source platform for interactive visual analytics for translational research data

Sascha Herzinger; Wei Gu; Venkata P. Satagopam; Serge Eifes; Kavita Rege; Adriano Barbosa-Silva; Reinhard Schneider

Summary: In translational research, efficient knowledge exchange between the different fields of expertise is crucial. An open platform that is capable of storing a multitude of data types such as clinical, pre‐clinical or OMICS data combined with strong visual analytical capabilities will significantly accelerate the scientific progress by making data more accessible and hypothesis generation easier. The open data warehouse tranSMART is capable of storing a variety of data types and has a growing user community including both academic institutions and pharmaceutical companies. tranSMART, however, currently lacks interactive and dynamic visual analytics and does not permit any post‐processing interaction or exploration. For this reason, we developed SmartR, a plugin for tranSMART, that equips the platform not only with several dynamic visual analytical workflows, but also provides its own framework for the addition of new custom workflows. Modern web technologies such as D3.js or AngularJS were used to build a set of standard visualizations that were heavily improved with dynamic elements. Availability and Implementation: The source code is licensed under the Apache 2.0 License and is freely available on GitHub: https://github.com/transmart/SmartR. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Genomics | 2011

Preimplantation development regulatory pathway construction through a text-mining approach

Elisa Donnard; Adriano Barbosa-Silva; Rafael Lm Guedes; Gabriel da Rocha Fernandes; Henrique Velloso; Matthew J. Kohn; Miguel A. Andrade-Navarro; J. Miguel Ortega

Background: The integration of sequencing and gene interaction data and subsequent generation of pathways and networks contained in databases such as KEGG Pathway is essential for the comprehension of complex biological processes. We noticed the absence of a chart or pathway describing the well-studied preimplantation development stages; furthermore, not all genes involved in the process have entries in KEGG Orthology, which is important information for applying this knowledge to other organisms. Results: In this work we sought to develop the regulatory pathway for the preimplantation development stage using text-mining tools such as Medline Ranker and PESCADOR to reveal biointeractions among the genes involved in this process. The genes present in the resulting pathway were also used as seeds for software developed by our group called SeedServer to create clusters of homologous genes. These homologues allowed the determination of the last common ancestor for each gene and revealed that the preimplantation development pathway consists of a conserved ancient core of genes with the addition of modern elements. Conclusions: The generation of regulatory pathways through text-mining tools allows the integration of data generated by several studies for a more complete visualization of complex biological processes. Using the genes in this pathway as “seeds” for the generation of clusters of homologues, the pathway can be visualized for other organisms. The clustering of homologous genes together with determination of the ancestry leads to a better understanding of the evolution of such processes.

Collaboration


Dive into Adriano Barbosa-Silva's collaborations.

Top Co-Authors

J. Miguel Ortega

Universidade Federal de Minas Gerais

Jean-Fred Fontaine

Max Delbrück Center for Molecular Medicine

Elisa Donnard

Universidade Federal de Minas Gerais

Theodoros G. Soldatos

European Bioinformatics Institute

Henrique Velloso

Universidade Federal de Minas Gerais

José Miguel Ortega

Universidade Federal de Minas Gerais
