Edward de Oliveira Ribeiro

Publication

Featured research published by Edward de Oliveira Ribeiro.

Archive | 2012

Towards a Hybrid Federated Cloud Platform to Efficiently Execute Bioinformatics Workflows

Hugo Saldanha; Edward de Oliveira Ribeiro; Carlos Borges; Aletéia Patrícia Favacho de Araújo; Ricardo Gallon; Maristela Holanda; Maria Emilia Telles Walter; Roberto C. Togawa; João Carlos Setubal

The current generation of high-throughput DNA sequencing machines [1, 35, 66] can generate large amounts of DNA sequence data. For example, the Illumina HiSeq 2000, a current workhorse of genome centers, is capable of generating 600 gigabase pairs of sequence in a single run [35]. The Human Microbiome project (https://commonfund.nih.gov/hmp) and the 1000 Genomes project (http://www.1000genomes.org) are two examples of projects that are generating terabyte-scale amounts of DNA sequence.

bioinformatics and biomedicine | 2013

ACOsched: A scheduling algorithm in a federated cloud infrastructure for bioinformatics applications

Gabriel S. S. de Oliveira; Edward de Oliveira Ribeiro; Diogo A. Ferreira; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Maria Emilia Telles Walter

Task scheduling in a federated cloud environment is a complex problem, since the several cloud providers involved offer distinct memory and storage capacities that must be taken into account. This article focuses on the task scheduling problem in BioNimbuZ, a federated cloud infrastructure for executing bioinformatics applications, which was previously proposed by our group. We present a scheduling algorithm based on Load Balancing Ant Colony (LBACO), called ACOsched, which distributes tasks efficiently by finding the best cloud in the federation to execute them. We ran experiments using real biological data, executing the Bowtie mapping tool on one instance of BioNimbuZ composed of two cloud providers: Amazon EC2 and a bioinformatics laboratory at the University of Brasilia, Brazil. The results show that ACOsched led to a significant improvement in the makespan of Bowtie executing in BioNimbuZ when compared to the simple round robin algorithm called DynamicAHP, previously developed in this federated cloud infrastructure.
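
The ant colony idea behind this kind of scheduler can be sketched in a few lines. The block below is a generic ACO selection rule, not the ACOsched algorithm itself: the function names, the `alpha`, `beta` and `rho` parameters, and the cloud labels are all assumptions made for illustration.

```python
import random


def selection_probabilities(pheromone, desirability, alpha=1.0, beta=2.0):
    """Generic ACO rule: p(c) is proportional to pheromone^alpha * desirability^beta.

    Both dicts map a cloud name to a positive number. `desirability` stands in
    for any load-balancing heuristic (e.g. free capacity); this is an assumption.
    """
    weights = {
        cloud: (pheromone[cloud] ** alpha) * (desirability[cloud] ** beta)
        for cloud in pheromone
    }
    total = sum(weights.values())
    return {cloud: weight / total for cloud, weight in weights.items()}


def choose_cloud(probabilities, rng):
    """Draw one cloud at random according to the given probabilities."""
    clouds = list(probabilities)
    return rng.choices(clouds, weights=[probabilities[c] for c in clouds], k=1)[0]


def update_pheromone(pheromone, chosen, rho=0.1, deposit=1.0):
    """Evaporate every trail by a factor (1 - rho), then reinforce the chosen cloud."""
    return {
        cloud: (1.0 - rho) * tau + (deposit if cloud == chosen else 0.0)
        for cloud, tau in pheromone.items()
    }


# Hypothetical two-cloud federation with equal initial pheromone.
pheromone = {"ec2": 1.0, "lab": 1.0}
desirability = {"ec2": 2.0, "lab": 1.0}
probabilities = selection_probabilities(pheromone, desirability)
chosen = choose_cloud(probabilities, random.Random(0))
pheromone_after = update_pheromone(pheromone, "ec2")
```

With equal pheromone, the cloud with the higher desirability is picked more often; repeated reinforcement then shifts later choices toward clouds that performed well.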

cluster computing and the grid | 2014

A Storage Policy for a Hybrid Federated Cloud platform: A Case Study for Bioinformatics

Deric Lima; Breno Moura; Gabriel S. S. de Oliveira; Edward de Oliveira Ribeiro; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Roberto C. Togawa; Maria Emilia Telles Walter

Bioinformatics tools require large-scale processing, mainly because they work with very large databases that reach gigabytes in size. In federated cloud environments, although services and resources may be shared, storage is particularly difficult due to the distinct computational capabilities and data management policies of the separate clouds. In this work, we propose a storage policy for BioNimbuZ, a hybrid federated cloud platform designed to execute bioinformatics applications. Our storage policy, BioClouZ, aims to make efficient choices when distributing and replicating files to the best available cloud resources in the federation, in order to reduce computational time. BioClouZ uses four parameters (latency, uptime, free size and cost), weighted according to ad hoc tests, to model their influence on data storage and recovery. Experiments were performed with real biological data, executing a commonly used tool for mapping short reads to a reference genome in BioNimbuZ, composed of clouds running in Amazon EC2, Azure and the University of Brasilia. The results showed that, when compared to the greedy algorithm first used in BioNimbuZ, the BioClouZ policy significantly improved the total execution time due to more efficient choices of the clouds in which to store the files. Other bioinformatics applications can be used with BioClouZ in BioNimbuZ as well, since the platform was designed independently of particular tools and databases.
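
A weighted score over the four parameters named above could look like the following sketch. The weights, the normalization and every identifier here are assumptions for illustration; the abstract does not give the actual BioClouZ formula or weight values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudNode:
    name: str
    latency_ms: float   # lower is better
    uptime: float       # fraction in [0, 1], higher is better
    free_gb: float      # higher is better
    cost_per_gb: float  # lower is better


# Hypothetical weights; the abstract only says they came from ad hoc tests.
WEIGHTS = {"latency": 0.4, "uptime": 0.2, "free": 0.2, "cost": 0.2}


def storage_score(node, nodes):
    """Return a score in [0, 1]; higher means a better storage target."""
    max_latency = max(n.latency_ms for n in nodes)
    max_free = max(n.free_gb for n in nodes)
    max_cost = max(n.cost_per_gb for n in nodes)
    # Normalize each parameter against the worst or best value in the federation.
    latency_term = 1.0 - node.latency_ms / max_latency if max_latency > 0 else 1.0
    free_term = node.free_gb / max_free if max_free > 0 else 0.0
    cost_term = 1.0 - node.cost_per_gb / max_cost if max_cost > 0 else 1.0
    return (
        WEIGHTS["latency"] * latency_term
        + WEIGHTS["uptime"] * node.uptime
        + WEIGHTS["free"] * free_term
        + WEIGHTS["cost"] * cost_term
    )


def rank_nodes(nodes, replicas):
    """Pick the `replicas` best-scoring nodes as targets for a file and its copies."""
    ordered = sorted(nodes, key=lambda n: storage_score(n, nodes), reverse=True)
    return ordered[:replicas]


# Hypothetical federation members and measurements.
nodes = [
    CloudNode("cloud-a", latency_ms=10.0, uptime=0.99, free_gb=500.0, cost_per_gb=0.02),
    CloudNode("cloud-b", latency_ms=100.0, uptime=0.90, free_gb=100.0, cost_per_gb=0.05),
]
best = rank_nodes(nodes, replicas=1)
```

Ranking by a single score keeps replica placement simple: the first node receives the file and the next ones receive the copies.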

bioinformatics and biomedicine | 2016

BioNimbuZ: A federated cloud platform for bioinformatics applications

Michel Rosa; Breno Moura; Guilherme Vergara; Lucas Santos; Edward de Oliveira Ribeiro; Maristela Holanda; Maria Emilia Telles Walter; Aletéia Patrícia Favacho de Araújo

Challenges in bioinformatics include the need for tools that handle large-scale processing, mainly due to the large volumes of data generated by high-throughput sequencing machines. Moreover, many of these tools are not user friendly and do not distribute their workloads properly. In federated cloud environments, even though services and resources are shared and available online, the processes of a workflow execution are largely not automated, and the majority of these processes do not balance their workloads efficiently. This paper presents a federated cloud platform called BioNimbuZ, a hybrid platform designed to execute bioinformatics applications easily and efficiently, with good workload balance. Our tests were performed using a real bioinformatics workflow, with fragments generated by the Illumina sequencer, and achieved good performance in practice.

high performance computing and communications | 2008

p2pBIOFOCO: Proposing a Peer-to-Peer System for Distributed BLAST Execution

Edward de Oliveira Ribeiro; Maria Emilia Telles Walter; Marcos Mota do Carmo Costa; Roberto C. Togawa; Georgios J. Pappas

The peer-to-peer (P2P) model is currently being used for applications that demand a great amount of time and space resources, taking advantage of idle and geographically distributed machines. Bioinformatics is a field that requires large amounts of computational resources. In this context, we propose a P2P model that executes a particular bioinformatics application, named BLAST. BLAST is a method for comparing two biological sequences at a time, but it requires a great amount of computation, since it compares a large file containing thousands of sequences under investigation against a very large database containing millions of biological sequences. We describe in detail the architecture of our framework and present experiments using real biological data, executed on machines available at three institutions in Brazil: University of Brasilia, Catholic University of Brasilia and Embrapa/Genetic Resources and Biotechnology. The results, which compare the execution of BLAST in this P2P framework with its execution on the best machine of the three institutions, show that our P2P model can efficiently run other bioinformatics applications.
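
One way to spread such a comparison over several machines is to split the query file among the peers, so that each peer runs BLAST on its own share. The sketch below assumes FASTA input and a round-robin split; the abstract does not say how p2pBIOFOCO partitions the work, and the function names are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(lines)))
            header, lines = line[1:], []
        elif header is not None:
            lines.append(line)
    if header is not None:
        records.append((header, "".join(lines)))
    return records


def split_among_peers(records, n_peers):
    """Assign records to peers round-robin; each chunk would be one peer's query set."""
    chunks = [[] for _ in range(n_peers)]
    for index, record in enumerate(records):
        chunks[index % n_peers].append(record)
    return chunks


# Tiny made-up query file with three sequences.
query = ">s1\nACGT\nAC\n>s2\nGG\n>s3\nTT\n"
records = parse_fasta(query)
chunks = split_among_peers(records, n_peers=2)
```

Round-robin assignment ignores sequence length and peer speed; a real system would likely weigh both when handing out work.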

world conference on information systems and technologies | 2018

An Evaluation of Data Model for NoSQL Document-Based Databases

Debora G. Reis; Fabio S. Gasparoni; Maristela Holanda; Marcio Victorino; Marcelo Ladeira; Edward de Oliveira Ribeiro

NoSQL databases offer flexibility in the data model. Document-based databases may have some data models built with embedded documents and others built with referenced documents. The challenge lies in choosing the structure of the data. This paper proposes a study to analyze whether different data models can have an impact on the performance of database queries. To this end, we created three data models: embedded, referenced, and hybrid. We ran experiments on each data model in a MongoDB cluster, comparing the response time of three different queries in each model. Results showed a disparity in performance between the data models. We also evaluated the use of indexes in each data model. Results showed that, depending on the type of query and the field searched, some types of indexes performed better than others. Additionally, we carried out an analysis of the space occupied on the storage disk. This analysis shows that the choice of model also affects the disk space needed for storing data and indexes.
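
The three data models can be pictured with plain Python dictionaries standing in for documents. The order/customer example, field names and helper function are invented for illustration and are not taken from the paper; no MongoDB driver is used.

```python
# A separate "customers" collection, keyed by _id (hypothetical data).
customers = {"c1": {"_id": "c1", "name": "Ana", "city": "Brasilia"}}

# Embedded: the related data lives inside the parent document.
embedded_order = {
    "_id": "o1",
    "total": 42.0,
    "customer": {"name": "Ana", "city": "Brasilia"},
}

# Referenced: the parent stores only the key of the related document.
referenced_order = {"_id": "o1", "total": 42.0, "customer_id": "c1"}

# Hybrid: a reference plus a copy of the fields that are read most often.
hybrid_order = {
    "_id": "o1",
    "total": 42.0,
    "customer_id": "c1",
    "customer_name": "Ana",
}


def customer_name(order, customers):
    """Return (name, extra_lookups) for an order in any of the three models."""
    if "customer" in order:
        return order["customer"]["name"], 0
    if "customer_name" in order:
        return order["customer_name"], 0
    # Only the referenced model needs a second read to resolve the name.
    return customers[order["customer_id"]]["name"], 1


results = [
    customer_name(order, customers)
    for order in (embedded_order, referenced_order, hybrid_order)
]
```

The extra lookup in the referenced model and the duplicated fields in the embedded and hybrid models are the kind of trade-off between query time and disk space that the study measures.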

international conference on cloud computing and services science | 2011

A Cloud Architecture for Bioinformatics Workflows

Hugo Saldanha; Edward de Oliveira Ribeiro; Maristela Holanda; Aletéia Patrícia Favacho de Araújo; Genaína Nunes Rodrigues; Maria Emilia Telles Walter; João C. Setubal; Alberto M. R. Dávila

international conference on cloud computing and services science | 2018

Task Scheduling in a Federated Cloud Infrastructure for Bioinformatics Applications

C. A. L. Borges; Hugo Saldanha; Edward de Oliveira Ribeiro; Maristela Holanda; Aletéia Patrícia Favacho de Araújo; Maria Emilia Telles Walter

international conference on cloud computing and services science | 2018

A Hadoop Open Source Backup Solution

Heitor Faria; Rodrigo Hagstrom; Marco Antonio Sousa Reis; Breno G. S. Costa; Edward de Oliveira Ribeiro; Maristela Holanda; Priscila Solís Barreto; Aletéia Patrícia Favacho de Araújo

iberian conference on information systems and technologies | 2017