Publication


Featured research published by Maristela Holanda.


BMC Bioinformatics | 2013

Provenance in bioinformatics workflows.

Renato de Paula; Maristela Holanda; Luciana S. A. Gomes; Sérgio Lifschitz; Maria Emilia Telles Walter

In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition, a provenance simulator was created, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium; the isolate was collected in warm water in the Midwestern Region of Brazil, and its DNA samples were sequenced with an NGS machine.


Archive | 2012

Towards a Hybrid Federated Cloud Platform to Efficiently Execute Bioinformatics Workflows

Hugo Saldanha; Edward de Oliveira Ribeiro; Carlos Borges; Aletéia Patrícia Favacho de Araújo; Ricardo Gallon; Maristela Holanda; Maria Emilia Telles Walter; Roberto C. Togawa; João Carlos Setubal

The current generation of high-throughput DNA sequencing machines [1, 35, 66] can generate large amounts of DNA sequence data. For example, the HiSeq 2000 machine from Illumina, a current workhorse of genome centers, is capable of generating 600 giga base pairs of sequence in a single run [35]. The Human Microbiome project (https://commonfund.nih.gov/hmp) and the 1000 Genomes project (http://www.1000genomes.org) are two examples of projects that are generating terabyte-scale amounts of DNA sequence.


bioinformatics and biomedicine | 2013

ACOsched: A scheduling algorithm in a federated cloud infrastructure for bioinformatics applications

Gabriel S. S. de Oliveira; Edward de Oliveira Ribeiro; Diogo A. Ferreira; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Maria Emilia Telles Walter

Task scheduling in a federated cloud environment is a complex problem since there are several cloud providers presenting distinct memory and storage capacities that should be addressed. This article focuses on the task scheduling problem in BioNimbuZ, a federated cloud infrastructure for executing bioinformatics applications, which was previously proposed by our group. We present a scheduling algorithm based on Load Balancing Ant Colony (LBACO), called ACOsched, to perform efficient distribution of tasks by finding the best cloud in the federation to execute these tasks. We developed experiments using real biological data, executing the Bowtie mapping tool on one instance of BioNimbuZ, composed of two cloud providers, Amazon EC2 and a bioinformatics laboratory at the University of Brasilia, Brazil. The obtained results show that ACOsched led to a significant improvement in the makespan time of Bowtie executing in BioNimbuZ, when compared to the simple round robin algorithm called DynamicAHP, previously developed in this federated cloud infrastructure.
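
As a rough illustration of the ant-colony idea named in this abstract (not the ACOsched algorithm itself, whose details are not given here), the sketch below chooses a cloud with probability proportional to pheromone^alpha × heuristic^beta, the standard ACO transition rule, and then applies evaporation and deposit. The cloud names, the load values, the use of inverse load as the heuristic, and all parameter values are assumptions made for the example.

```python
import random

def selection_probabilities(pheromone, load, alpha=1.0, beta=2.0):
    """Probability of choosing each cloud under the standard ACO rule.

    pheromone: dict cloud -> pheromone level (learned desirability)
    load:      dict cloud -> current load (> 0); the heuristic is 1 / load,
               an assumption of this sketch
    """
    weights = {
        cloud: (pheromone[cloud] ** alpha) * ((1.0 / load[cloud]) ** beta)
        for cloud in pheromone
    }
    total = sum(weights.values())
    return {cloud: w / total for cloud, w in weights.items()}

def choose_cloud(pheromone, load, rng=random):
    """Sample one cloud according to the selection probabilities."""
    probs = selection_probabilities(pheromone, load)
    clouds = list(probs)
    return rng.choices(clouds, weights=[probs[c] for c in clouds], k=1)[0]

def evaporate_and_deposit(pheromone, chosen, quality, rho=0.1):
    """Evaporate every trail by (1 - rho), then reinforce the chosen cloud."""
    for cloud in pheromone:
        pheromone[cloud] *= (1.0 - rho)
    pheromone[chosen] += quality

# Hypothetical two-cloud federation: cloud_b is twice as loaded as cloud_a.
pheromone = {"cloud_a": 1.0, "cloud_b": 1.0}
load = {"cloud_a": 1.0, "cloud_b": 2.0}
probs = selection_probabilities(pheromone, load)
```

With equal pheromone, the less loaded cloud is favored; repeated deposits on clouds that finish tasks quickly would shift the probabilities further over time.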


Comparative and Functional Genomics | 2015

Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.

Rodrigo Aniceto; Rene Xavier; Valéria Monteze Guimarães; Fernanda Hondo; Maristela Holanda; Maria Emilia Telles Walter; Sérgio Lifschitz

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them is the management of the massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. Finding an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.


iberian conference on information systems and technologies | 2015

A mobile public participation geographic information system architecture for collecting opinions about public services

Breno D. C. Camargos; Maristela Holanda; Aletéia Patrícia Favacho de Araújo

One challenge for public service managers is to use collective intelligence to approximate the opinion of citizens for administrative decisions. The rapid advance of the Global Positioning System (GPS) and of Web 2.0 technologies, followed by greater accessibility to these technologies by citizens through smartphones, can be used to include the population that is present in public establishments in the administrative process of their city. The Public Participation Geographic Information System (PPGIS) is characterized by encouraging the population to create voluntary information with geographical features. In this context, this article proposes an architecture that uses geoprocessing to bring the public service user and the manager of these services closer together.


iberian conference on information systems and technologies | 2014

Business process modelling: A study case within the Brazilian Ministry of Planning, Budgeting and Management

Rafael Timóteo de Sousa; Flávio E. G. de Deus; Bruno Aires; Gabriela Ribeiro; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Sandra S. A. N. Vidal; Renan M. G. dos Santos; Fabiano P. Cortes

Business process modelling is nowadays an important phase of requirements engineering. The description of the business process aims to reduce the distance between the users of the system and its developers. This paper presents a case study on the application of business process mapping using BPMN (Business Process Model and Notation) to develop a system for human resources management within the Ministry of Planning, Budget and Management in Brazil. The system to be developed has an estimated effort of 60,000 function points and involves more than 500 people in the requirements phase and more in the subsequent phases of process mapping, development, validation and test.


bioinformatics and biomedicine | 2015

A study of genomic data provenance in NoSQL document-oriented database systems

Valéria Monteze Guimarães; Fernanda Hondo; Rodrigo Coutinho de Almeida; Harley Vera; Maristela Holanda; Aletéia Patrícia Favacho de Araújo; Maria Emilia Telles Walter; Sérgio Lifschitz

This work considers a scientific experiment as a computational workflow. Provenance models store details of each workflow execution, including produced data, computational tool parameters and their versions, among others. This way, scientists can review details of a particular workflow execution, compare information generated among different executions and plan new ones efficiently. In the bioinformatics domain, particularly in the presence of large volumes of data, persistency of the data generated during the workflow execution is still a research challenge. In this article, we present a study on provenance data storage for bioinformatics in a document-oriented NoSQL database system. We present data modeling issues and discuss an actual implementation in MongoDB.
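
To make the document-oriented idea concrete, here is a minimal sketch of how one workflow execution might be held as a single nested document. The field names, file names, tool name, and identifier are all hypothetical and are not the data model from the paper; the example uses only the Python standard library and does not connect to MongoDB.

```python
import json

# One workflow execution as a single nested document (hypothetical shape).
execution_doc = {
    "_id": "execution-001",           # hypothetical identifier
    "workflow": "genome-project",     # hypothetical workflow name
    "activities": [
        {
            "name": "filtering",
            "tool": {"name": "filter_tool", "version": "1.0"},  # hypothetical
            "parameters": {"min_quality": 20},
            "used": ["raw_reads.fastq"],
            "generated": ["filtered_reads.fastq"],
        },
    ],
}

def generated_files(doc):
    """List every file produced by any activity in the execution."""
    return [f for activity in doc["activities"] for f in activity["generated"]]

# A document store would persist a structure like this; here it is only
# round-tripped through JSON to show that it is plain, serializable data.
serialized = json.dumps(execution_doc, indent=2)
restored = json.loads(serialized)
```

Embedding tools, parameters, and produced files in one document keeps everything about an execution together, at the cost of duplicating tool descriptions across executions.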


cluster computing and the grid | 2014

A Storage Policy for a Hybrid Federated Cloud platform: A Case Study for Bioinformatics

Deric Lima; Breno Moura; Gabriel S. S. de Oliveira; Edward de Oliveira Ribeiro; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Roberto C. Togawa; Maria Emilia Telles Walter

Bioinformatics tools require large-scale processing, mainly because of very large databases that reach gigabytes in size. In federated cloud environments, although services and resources may be shared, storage is particularly difficult, due to the distinct computational capabilities and data management policies of the separate clouds. In this work, we propose a storage policy for BioNimbuZ, a hybrid federated cloud platform designed to execute bioinformatics applications. Our storage policy, BioClouZ, aims to make efficient choices when distributing and replicating files to the best available cloud resources in the federation in order to reduce computational time. BioClouZ uses four parameters (latency, uptime, free size and cost), weighted according to ad hoc tests, to model their influence on data storage and recovery. Experiments were performed with real biological data, executing a commonly used tool to map short reads to a reference genome in BioNimbuZ, composed of clouds executing in Amazon EC2, Azure and the University of Brasilia. The results showed that, when compared to the greedy algorithm first used in BioNimbuZ, the BioClouZ policy significantly improved the total execution time due to more efficient choices of the clouds to store the files. Other bioinformatics applications can be used with BioClouZ in BioNimbuZ as well, since the platform was designed independently from particular tools and databases.


bioinformatics and biomedicine | 2014

Storing provenance data of genome project workflows using graph database

Rodrigo Pinheiro; Bruno Aires; Aletéia Patrícia Favacho de Araújo; Maristela Holanda; Maria Emilia Telles Walter; Sérgio Lifschitz

Many scientific experiments are designed as computational workflows in bioinformatics. However, the amount of data generated increases at every phase of each execution, hindering the identification of the source and the transformation of data. Therefore, it has become necessary to create new tools to store data provenance, mainly which resources and parameters were used to generate the results, among other information, to validate and publish the experiment. In this paper, we propose using a graph database to store the data provenance of bioinformatics workflows with the PROV-DM model. To validate the model, we developed a simulator that worked as a logbook to capture data provenance. A workflow with real genomic data showed that very little additional data needs to be stored, which means that our provenance model can be easily included in genome projects.
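
A small in-memory stand-in can show why a graph suits provenance: finding where a result came from becomes a traversal. The sketch below uses two PROV-DM relations, wasGeneratedBy (entity to activity) and used (activity to entity); the node names are hypothetical, and a plain dictionary replaces the graph database, which the abstract does not name.

```python
from collections import deque

# Each edge is (source, relation, target), following PROV-DM direction:
# an entity wasGeneratedBy an activity; an activity used an entity.
EDGES = [
    ("mapped_reads", "wasGeneratedBy", "mapping_run"),
    ("mapping_run", "used", "filtered_reads"),
    ("mapping_run", "used", "reference_genome"),
    ("filtered_reads", "wasGeneratedBy", "filtering_run"),
    ("filtering_run", "used", "raw_reads"),
]

def build_graph(edges):
    """Build an adjacency list: node -> list of (relation, target)."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
    return graph

def lineage(graph, node):
    """Return every node reachable from `node`, in breadth-first order."""
    seen, order, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for _relation, target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order

graph = build_graph(EDGES)
ancestors = lineage(graph, "mapped_reads")
```

Here `ancestors` walks back from the mapped reads through both activities to the raw reads and the reference genome.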


bioinformatics and biomedicine | 2013

Automatic capture of provenance data in genome project workflows

Rodrigo Pinheiro; Maristela Holanda; Aletéia Patrícia Favacho de Araújo; Maria Emilia Telles Walter; Sérgio Lifschitz

Many scientific experiments are designed as computational workflows in the bioinformatics domain, which facilitates implementation and analysis. However, the amount of data generated increases at every phase of each execution, hindering the identification of the source and the data transformation. Therefore, it has become necessary to create new tools to verify automatically which resources and parameters were used to generate the results, among other information, to validate and publish the experiment. This functionality of automatically capturing data provenance has been receiving attention in the scientific community, primarily with regard to bioinformatics projects, due to the fact that the same workflow is executed several times with different parameters and versions of the tools. In this paper, we propose using a relational schema to automatically store data provenance using the PROV-DM model for workflows in bioinformatics projects.
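
As one possible reading of a relational layout for PROV-DM, the sketch below creates tables for entities and activities plus two relation tables (used, wasGeneratedBy) in an in-memory SQLite database. The table and column names, the tool name, and the sample rows are assumptions for illustration and are not the schema proposed in the paper.

```python
import sqlite3

# Hypothetical schema: two PROV-DM node types and two relations between them.
SCHEMA = """
CREATE TABLE entity (id TEXT PRIMARY KEY, label TEXT);
CREATE TABLE activity (
    id TEXT PRIMARY KEY, tool TEXT, tool_version TEXT, parameters TEXT
);
CREATE TABLE used (
    activity_id TEXT REFERENCES activity(id),
    entity_id TEXT REFERENCES entity(id)
);
CREATE TABLE was_generated_by (
    entity_id TEXT REFERENCES entity(id),
    activity_id TEXT REFERENCES activity(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Record one made-up execution step: a mapper turns raw reads into mapped reads.
conn.executemany(
    "INSERT INTO entity VALUES (?, ?)",
    [("e1", "raw reads"), ("e2", "mapped reads")],
)
conn.execute(
    "INSERT INTO activity VALUES (?, ?, ?, ?)",
    ("a1", "mapper", "1.0", "--threads 4"),
)
conn.execute("INSERT INTO used VALUES (?, ?)", ("a1", "e1"))
conn.execute("INSERT INTO was_generated_by VALUES (?, ?)", ("e2", "a1"))

# Which tool, version, and parameters produced entity e2?
rows = conn.execute(
    """
    SELECT a.tool, a.tool_version, a.parameters
    FROM was_generated_by AS g
    JOIN activity AS a ON a.id = g.activity_id
    WHERE g.entity_id = ?
    """,
    ("e2",),
).fetchall()
```

Storing the tool version and parameters on the activity row is what lets two executions of the same workflow be compared with an ordinary query.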

Collaboration


Dive into Maristela Holanda's collaborations.

Top Co-Authors

Sérgio Lifschitz

Pontifical Catholic University of Rio de Janeiro
