Publication


Featured research published by Joel P. Arrais.


PLOS ONE | 2007

Large Scale Comparative Codon-Pair Context Analysis Unveils General Rules that Fine-Tune Evolution of mRNA Primary Structure

Gabriela R. Moura; Miguel Pinheiro; Joel P. Arrais; Ana C. Gomes; Laura Carreto; Adelaide Freitas; José Luís Oliveira; Manuel A. S. Santos

Background: Codon usage and codon-pair context are important gene primary structure features that influence mRNA decoding fidelity. In order to identify general rules that shape codon-pair context and minimize mRNA decoding error, we have carried out a large-scale comparative codon-pair context analysis of 119 fully sequenced genomes. Methodologies/Principal Findings: We have developed mathematical and software tools for large-scale comparative codon-pair context analysis. These methodologies unveiled general and species-specific codon-pair context rules that govern evolution of mRNAs in the 3 domains of life. We show that evolution of bacterial and archaeal mRNA primary structure is mainly dependent on constraints imposed by the translational machinery, while in eukaryotes DNA methylation and tri-nucleotide repeats impose strong biases on codon-pair context. Conclusions: The data highlight fundamental differences between prokaryotic and eukaryotic mRNA decoding rules, which are partially independent of codon usage.
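As a rough illustration of what a codon-pair context tally involves, the sketch below counts adjacent codon pairs in a single in-frame sequence and compares each count with what independent codon frequencies would predict. It is a minimal sketch only: the function names and the observed/expected ratio are assumptions chosen for illustration, and the paper's own statistical tools are not reproduced here.

```python
from collections import Counter


def split_codons(orf: str) -> list:
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_pair_counts(orf: str) -> Counter:
    """Count each adjacent (codon, next codon) pair in the sequence."""
    codons = split_codons(orf)
    return Counter(zip(codons, codons[1:]))


def observed_over_expected(orf: str) -> dict:
    """Ratio of observed pair counts to counts expected if codons paired independently.

    Hypothetical illustration: a ratio above 1 suggests a favoured pair,
    below 1 an avoided pair. This is not the statistic used in the paper.
    """
    codons = split_codons(orf)
    pairs = Counter(zip(codons, codons[1:]))
    n_codons = len(codons)
    n_pairs = sum(pairs.values())
    freq = Counter(codons)
    ratios = {}
    for (first, second), observed in pairs.items():
        expected = (freq[first] / n_codons) * (freq[second] / n_codons) * n_pairs
        ratios[(first, second)] = observed / expected
    return ratios


if __name__ == "__main__":
    toy_orf = "ATGGCTGCTGCTTAA"  # toy sequence, not real data
    print(codon_pair_counts(toy_orf))
    print(observed_over_expected(toy_orf))
```

A genome-scale analysis would aggregate such counts over every annotated open reading frame before testing for bias; the single-sequence version here only shows the bookkeeping.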


BMC Genomics | 2009

Parallel DNA pyrosequencing unveils new zebrafish microRNAs

Ana R. Soares; Patrícia Pereira; Bruno Santos; Conceição Egas; Ana C. Gomes; Joel P. Arrais; José Luís Oliveira; Gabriela R. Moura; Manuel A. S. Santos

Background: MicroRNAs (miRNAs) are a new class of small RNAs of approximately 22 nucleotides in length that control eukaryotic gene expression by fine-tuning mRNA translation. They regulate a wide variety of biological processes, namely developmental timing, cell differentiation, cell proliferation, immune response and infection. For this reason, their identification is essential to understand eukaryotic biology. Their small size, low abundance and high instability complicated early identification; however, cloning/Sanger sequencing and new generation genome sequencing approaches overcame most technical hurdles and are being used for rapid miRNA identification in many eukaryotes. Results: We have applied 454 DNA pyrosequencing technology to miRNA discovery in zebrafish (Danio rerio). For this, a series of cDNA libraries were prepared from miRNAs isolated at different embryonic time points and from fully developed organs. Each cDNA library was tagged with specific sequences and was sequenced using the Roche FLX genome sequencer. This approach retrieved 90% of the 192 miRNAs previously identified by cloning/Sanger sequencing and bioinformatics. Twenty-five novel miRNAs were predicted, and 107 miRNA star sequences and 41 candidate miRNA targets were identified. A miRNA expression profile built on the basis of pyrosequencing read numbers showed high expression of most miRNAs throughout zebrafish development and identified tissue-specific miRNAs. Conclusion: This study increases the number of zebrafish miRNAs from 192 to 217 and demonstrates that a single DNA mini-chip pyrosequencing run is effective in miRNA identification in zebrafish. This methodology also produced sufficient information to elucidate miRNA expression patterns during development and in differentiated organs. Moreover, some zebrafish miRNA star sequences were more abundant than their corresponding miRNAs, suggesting a functional role for the former in gene expression control in this vertebrate model organism.


BMC Bioinformatics | 2010

Concept-based query expansion for retrieving gene related publications from MEDLINE.

Sérgio Matos; Joel P. Arrais; João Maia-Rodrigues; José Luís Oliveira

Background: Advances in biotechnology and in high-throughput methods for gene analysis have contributed to an exponential increase in the number of scientific publications in these fields of study. While much of the data and results described in these articles are entered and annotated in the various existing biomedical databases, the scientific literature is still the major source of information. There is, therefore, a growing need for text mining and information retrieval tools to help researchers find the relevant articles for their study. To tackle this, several tools have been proposed to provide alternative solutions for specific user requests. Results: This paper presents QuExT, a new PubMed-based document retrieval and prioritization tool that, from a given list of genes, searches for the most relevant results from the literature. QuExT follows a concept-oriented query expansion methodology to find documents containing concepts related to the genes in the user input, such as protein and pathway names. The retrieved documents are ranked according to user-definable weights assigned to each concept class. By changing these weights, users can modify the ranking of the results in order to focus on documents dealing with a specific concept. The method's performance was evaluated using data from the 2004 TREC Genomics Track, producing a mean average precision of 0.425, with an average of 4.8 and 31.3 relevant documents within the top 10 and 100 retrieved abstracts, respectively. Conclusions: QuExT implements a concept-based query expansion scheme that leverages gene-related information available on a variety of biological resources. The main advantage of the system is to give the user control over the ranking of the results by means of a simple weighting scheme. Using this approach, researchers can effortlessly explore the literature regarding a group of genes and focus on the different aspects relating to these genes.
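The ranking idea in the abstract above (documents scored by user-definable weights per concept class, then evaluated by average precision) can be sketched as follows. This is a hedged illustration, not QuExT's implementation: the data layout, the linear scoring rule, and the function names `rank_documents` and `average_precision` are all assumptions.

```python
def rank_documents(doc_concepts: dict, weights: dict) -> list:
    """Rank documents by a weighted sum of concept-class match counts.

    doc_concepts maps a document id to {concept_class: number_of_matches}.
    weights maps a concept class to its user-chosen weight (missing classes count as 0).
    Ties are broken by document id so the ordering is deterministic.
    """
    scores = {
        doc: sum(weights.get(cls, 0.0) * count for cls, count in counts.items())
        for doc, counts in doc_concepts.items()
    }
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


def average_precision(ranked: list, relevant: set) -> float:
    """Average precision of one ranked list against a set of relevant document ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    docs = {
        "d1": {"gene": 2, "pathway": 0},
        "d2": {"gene": 0, "pathway": 3},
        "d3": {"gene": 1, "pathway": 1},
    }
    print(rank_documents(docs, {"gene": 1.0, "pathway": 0.5}))
    print(rank_documents(docs, {"gene": 0.2, "pathway": 1.0}))
```

Changing the weights reorders the same retrieved set, which is the user control the abstract describes. Mean average precision is the mean of `average_precision` over a set of queries.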


BMC Systems Biology | 2014

Computational prediction of the human-microbial oral interactome

Edgar D. Coelho; Joel P. Arrais; Sérgio Matos; Carlos Pereira; Nuno Rosa; Maria José Correia; Marlene Barros; José Luís Oliveira

Background: The oral cavity is a complex ecosystem where human chemical compounds coexist with a particular microbiota. However, shifts in the normal composition of this microbiota may result in the onset of oral ailments, such as periodontitis and dental caries. In addition, it is known that the microbial colonization of the oral cavity is mediated by protein-protein interactions (PPIs) between the host and microorganisms. Nevertheless, PPIs of this kind are still largely undisclosed. To elucidate these interactions, we have created a computational prediction method that allows us to obtain a first model of the Human-Microbial oral interactome. Results: We collected high-quality experimental PPIs from five major human databases. The obtained PPIs were used to create our positive dataset and, indirectly, our negative dataset. The positive and negative datasets were merged and used for training and validation of a naïve Bayes classifier. For the final prediction model, we used an ensemble methodology combining five distinct PPI prediction techniques, namely: literature mining, primary protein sequences, orthologous profiles, biological process similarity, and domain interactions. Performance evaluation of our method revealed an area under the ROC curve (AUC) value greater than 0.926, supporting our primary hypothesis, as no single set of features reached an AUC greater than 0.877. After subjecting our dataset to the prediction model, the classified result was filtered for very high confidence PPIs (probability ≥ 1 − 10^−7), leading to a set of 46,579 PPIs to be further explored. Conclusions: We believe this dataset holds not only important pathways involved in the onset of infectious oral diseases, but also potential drug-targets and biomarkers. The dataset used for training and validation, the predictions obtained and the final network are available at http://bioinformatics.ua.pt/software/oralint.
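To make the high-confidence filtering step in the abstract above concrete, the sketch below combines several independent evidence sources in naïve Bayes fashion and keeps only pairs whose posterior probability reaches 1 − 10^−7. The way evidence is combined here (a prior multiplied by per-source likelihood ratios) is an assumption for illustration; the abstract does not specify how the five techniques are merged, and the names `posterior_probability` and `high_confidence` are hypothetical.

```python
import math

# Threshold taken from the abstract: probability >= 1 - 10^-7.
THRESHOLD = 1 - 1e-7


def posterior_probability(prior: float, likelihood_ratios: list) -> float:
    """Naive Bayes posterior from a prior and independent likelihood ratios.

    Each ratio is P(evidence | interacting) / P(evidence | not interacting).
    Working in log-odds avoids multiplying many large or small numbers directly.
    """
    log_odds = math.log(prior / (1 - prior))
    log_odds += sum(math.log(ratio) for ratio in likelihood_ratios)
    return 1 / (1 + math.exp(-log_odds))


def high_confidence(candidates: dict, prior: float) -> list:
    """Return the candidate pairs whose posterior meets the threshold, sorted by name."""
    return sorted(
        pair
        for pair, ratios in candidates.items()
        if posterior_probability(prior, ratios) >= THRESHOLD
    )


if __name__ == "__main__":
    # Toy likelihood ratios, invented for illustration only.
    toy = {
        ("humanA", "microbeX"): [1e3, 1e3, 1e3],
        ("humanB", "microbeY"): [1e3, 1e3],
    }
    print(high_confidence(toy, prior=0.5))
```

With a threshold this strict, only pairs supported strongly by several sources at once survive, which is consistent with the abstract's description of a "very high confidence" subset.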


Archives of Oral Biology | 2013

OralCard: A bioinformatic tool for the study of oral proteome

Joel P. Arrais; Nuno Rosa; José Melo; Edgar D. Coelho; Diana Amaral; Maria José Correia; Marlene Barros; José Luís Oliveira

Objectives: The molecular complexity of the human oral cavity can only be clarified through identification of the components that participate within it. However, current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. Results: We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome of both human (OralOme) and microbial origin (MicroOralOme), structured by protein, disease and organism. Conclusions: This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with type 2 diabetes mellitus. OralCard is available at http://bioinformatics.ua.pt/oralcard.


Journal of Integrative Bioinformatics | 2007

GeneBrowser: an approach for integration and functional classification of genomic data

Joel P. Arrais; Bruno Santos; João Fernandes; Laura Carreto; Manuel A. S. Santos; José Luís Oliveira

Summary: The achievements coming from genome analysis depend greatly on the quality of computational and processing methods. Tools for functional mRNA profiling and for gene information integration have become essential to this task. We have developed GeneBrowser as a novel approach that combines the advantages of mRNA profiling tools, at genome-scale experiments, with the features provided by data integration systems. For a given set of genes, GeneBrowser integrates bibliography information with functional annotations, using Gene Ontology, Entrez Gene, KEGG Orthology and KEGG Pathways. The result is a comprehensive and easy-to-use web application that helps researchers to extract knowledge from large data sets and to speed up the discovery process. Availability: GeneBrowser is freely available at http://bioinformatics.ua.pt/genebrowser


PLOS ONE | 2012

Enchytraeus albidus Microarray: Enrichment, Design, Annotation and Database (EnchyBASE)

Sara C. Novais; Joel P. Arrais; Pedro Lopes; Tine Vandenbrouck; Wim De Coen; Dick Roelofs; Amadeu M.V.M. Soares; Mónica J.B. Amorim

Enchytraeus albidus (Oligochaeta) is an ecologically relevant species used as a standard test organism for risk assessment. Effects of stressors in this species are commonly determined at the population level using reproduction and survival as endpoints. The assessment of transcriptomic responses can be very useful, e.g. to understand underlying mechanisms of toxicity with gene expression fingerprinting. The present paper addresses the following: 1) development of suppressive subtractive hybridization (SSH) libraries enriched for differentially expressed genes after metal and pesticide exposures; 2) sequencing and characterization of all generated cDNA inserts; 3) development of a publicly available genomic database on E. albidus. A total of 2100 Expressed Sequence Tags (ESTs) were isolated, sequenced and assembled into 1124 clusters (947 singletons and 177 contigs). From these sequences, 41% matched known proteins in GenBank (BLASTX, e-value ≤ 10^-5) and 37% had at least one Gene Ontology (GO) term assigned. In total, 5.5% of the sequences were assigned to a metabolic pathway, based on KEGG. With this new sequencing information, an Agilent custom oligonucleotide microarray was designed, representing a potential tool for transcriptomic studies. EnchyBASE (http://bioinformatics.ua.pt/enchybase/) was developed as a freely available web database containing genomic information on E. albidus and will be further extended in the near future for other enchytraeid species. The database so far includes all ESTs generated for E. albidus from three cDNA libraries. This information can be downloaded and applied in functional genomics and transcription studies.


PLOS Computational Biology | 2016

Computational Discovery of Putative Leads for Drug Repositioning through Drug-Target Interaction Prediction.

Edgar D. Coelho; Joel P. Arrais; José Luís Oliveira

De novo experimental drug discovery is an expensive and time-consuming task. It requires the identification of drug-target interactions (DTIs) towards targets of biological interest, either to inhibit or enhance a specific molecular function. Dedicated computational models for protein simulation and DTI prediction are crucial to speed up DTI identification and reduce its costs. In this paper we present a computational pipeline that enables the discovery of putative leads for drug repositioning and that can be applied to any microbial proteome, as long as the interactome of interest is at least partially known. Network metrics calculated for the interactome of the bacterial organism of interest were used to identify putative drug-targets. Then, a random forest classification model for DTI prediction was constructed using known DTI data from publicly available databases, resulting in an area under the ROC curve of 0.91 for classification of out-of-sample data. A drug-target network was created by combining 3,081 unique ligands and the expected ten best drug targets. This network was used to predict new DTIs and to calculate the probability of the positive class, allowing the scoring of the predicted instances. Molecular docking experiments were performed on the best scoring DTI pairs and the results were compared with those of the same ligands with their original targets. The results obtained suggest that the proposed pipeline can be used in the identification of new leads for drug repositioning. The proposed classification model is available at http://bioinformatics.ua.pt/software/dtipred/.
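The abstract above states that network metrics over the interactome were used to pick putative drug targets, without naming the metrics. As one possible reading, the sketch below ranks proteins by degree centrality and keeps the top k. Degree centrality is an assumption chosen for simplicity, and `degree_centrality` and `top_targets` are hypothetical names, not part of the published pipeline.

```python
from collections import defaultdict


def degree_centrality(edges: list) -> dict:
    """Degree centrality for an undirected interaction network given as (a, b) pairs.

    Self-interactions are ignored and duplicate edges are counted once.
    The value is the number of distinct partners divided by (n - 1).
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(partners) / (n - 1) for node, partners in neighbours.items()}


def top_targets(edges: list, k: int) -> list:
    """Return the k most central proteins; ties are broken alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda protein: (-scores[protein], protein))[:k]


if __name__ == "__main__":
    # Toy interactome, invented for illustration only.
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
    print(degree_centrality(toy_edges))
    print(top_targets(toy_edges, k=2))
```

Highly connected proteins are a common heuristic for essential nodes in an interactome; a real pipeline would likely combine several metrics, and the ten best targets mentioned in the abstract would correspond to `k=10` under this simplified scheme.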


Journal of Bioinformatics and Computational Biology | 2015

Computational methodology for predicting the landscape of the human-microbial interactome region level influence.

Edgar D. Coelho; André M. Santiago; Joel P. Arrais; José Luís Oliveira

Microbial communities thrive in close association among themselves and with the host, establishing protein-protein interactions (PPIs) with the latter, and thus being able to benefit (positively impact) or disturb (negatively impact) biological events in the host. Despite major collaborative efforts to sequence the Human microbiome, its impact is still poorly understood. We propose a computational methodology to predict the impact of microbial proteins in human biological events, taking into account the abundance of each microbial protein and its relation to all other microbial and human proteins. This alternative methodology is centered on an improved impact estimation algorithm that integrates PPIs between human and microbial proteins with Reactome pathway data. This methodology was applied to study the impact of 24 microbial phyla over different cellular events, within 10 different human microbiomes. The results obtained confirm findings already described in the literature and explore new ones. We believe the Human microbiome can no longer be ignored, as there is enough evidence not only correlating microbiome alterations with disease states, but also showing the return to healthy states once these alterations are reversed.


Open Access Bioinformatics | 2011

Using biomedical networks to prioritize gene–disease associations

Joel P. Arrais; José Luís Oliveira

Correspondence: Joel Perdiz Arrais, University of Aveiro, 3810-193 Aveiro, Portugal. Tel +351 234370500; Fax +351 234370545.

Abstract: Understanding the genetic foundations of genetic diseases, such as cancer, Alzheimer disease, or Huntington’s disease, is critical to the development of new diagnostics and treatments. Several computational methods have been used to speed up the discovery process, e.g., by selecting the molecular targets for a given disease. However, despite the achievements obtained over recent years, better solutions are still required. This paper presents an innovative computational method that addresses the problem of using dispersed biomedical knowledge to select the best candidate genes associated with a disease. The method uses a network representation of current biomedical knowledge that includes biomolecular concepts such as genes, diseases, pathways, and biological processes. It also applies information extraction techniques to enrich the network with more dynamic and updated data. A biologically inspired algorithm is applied to this network in order to identify association levels between genes and diseases. The solution proposed here overcomes many limitations of previous methods, such as the need for training data. The validation demonstrates that the proposed method achieves the best overall results compared with state-of-the-art methods and performs especially well at the critical top-rank positions. We believe this method represents a major advance over previous work and that it will be a key tool for future gene–disease association studies.

Collaboration


Top Co-Authors

Nuno Rosa

Catholic University of Portugal

Maria José Correia

The Catholic University of America
