Publication


Featured research published by Fernanda Araujo Baião.


International Conference on Cloud Computing | 2010

SciCumulus: A Lightweight Cloud Middleware to Explore Many Task Computing Paradigm in Scientific Workflows

Daniel de Oliveira; Eduardo S. Ogasawara; Fernanda Araujo Baião; Marta Mattoso

Most large-scale scientific experiments modeled as scientific workflows produce large amounts of data and require workflow parallelism to reduce execution time. Some existing Scientific Workflow Management Systems (SWfMS) explore parallelism techniques such as parameter sweep and data fragmentation. In those systems, several computing resources are used to accomplish many computational tasks in homogeneous environments, such as multiprocessor machines or cluster systems. Cloud computing has become a popular high performance computing model in which (virtualized) resources are provided as services over the Web. Some scientists are starting to adopt the cloud model in scientific domains and are moving their scientific workflows (programs and data) from local environments to the cloud. Nevertheless, it is still difficult for scientists to express a parallel computing paradigm for a workflow on the cloud. Capturing distributed provenance data in the cloud is also an issue. Existing approaches for executing scientific workflows using parallel processing are mainly focused on homogeneous environments, whereas in the cloud the scientist has to manage new aspects such as initialization of virtualized instances, scheduling over different cloud environments, the impact of data transfer, and management of instance images. In this paper we propose SciCumulus, a cloud middleware that explores parameter sweep and data fragmentation parallelism in scientific workflow activities (with provenance support). It works between the SWfMS and the cloud. SciCumulus is designed considering cloud specificities. We have evaluated our approach by executing simulated experiments to analyze the overhead imposed by clouds on workflow execution time.
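The core parallelism idea described here, expanding a workflow activity into many independent cloud tasks by crossing parameter-sweep values with data fragments, can be sketched as follows. This is a minimal illustration only; the ActivityTask and expand_parameter_sweep names are hypothetical and are not part of SciCumulus's actual API.

```python
# Minimal sketch of parameter-sweep plus data-fragmentation parallelism.
# All names here are hypothetical, not SciCumulus's real interfaces.
from dataclasses import dataclass
from itertools import product
from typing import Dict, List

@dataclass
class ActivityTask:
    activity: str                   # workflow activity to run (a program wrapped by the SWfMS)
    parameters: Dict[str, object]   # one point of the parameter sweep
    input_fragment: str             # one fragment of the input data set

def expand_parameter_sweep(activity: str,
                           sweep: Dict[str, List[object]],
                           fragments: List[str]) -> List[ActivityTask]:
    """Combine every parameter combination with every data fragment, yielding
    many independent tasks that can be dispatched to virtualized instances."""
    names = list(sweep)
    tasks = []
    for values in product(*(sweep[n] for n in names)):
        for frag in fragments:
            tasks.append(ActivityTask(activity, dict(zip(names, values)), frag))
    return tasks

# Example: 3 x 2 parameter combinations over 4 data fragments -> 24 tasks.
tasks = expand_parameter_sweep(
    "blast_search",
    {"e_value": [1e-3, 1e-5, 1e-10], "matrix": ["BLOSUM62", "PAM250"]},
    [f"input_part_{i}.fasta" for i in range(4)],
)
print(len(tasks))  # 24
```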


Grid Computing | 2012

A Provenance-based Adaptive Scheduling Heuristic for Parallel Scientific Workflows in Clouds

Daniel de Oliveira; Kary A. C. S. Ocaña; Fernanda Araujo Baião; Marta Mattoso

In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, thus requiring the use of parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm where resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a complex task, since we have to take into account many different criteria and exploit the cloud's elasticity to optimize workflow execution. In this paper, we introduce an adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability, and financial cost. Besides scheduling workflow activities based on a three-objective cost model, this approach also scales resources up and down according to restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.
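As a rough illustration of a three-objective cost model of the kind described above, the sketch below ranks candidate virtual machines by a weighted sum of normalized execution time, failure rate, and monetary cost. The VirtualMachine fields, the normalization, and the default weights are assumptions for illustration and do not reproduce the paper's actual heuristic.

```python
# Illustrative sketch of a three-objective weighted cost model for picking a
# virtual machine for a workflow activation. Weights and normalization are
# assumptions for illustration, not the heuristic's published equations.
from dataclasses import dataclass
from typing import List

@dataclass
class VirtualMachine:
    name: str
    est_runtime_s: float    # predicted execution time for the activation
    failure_rate: float     # failures per execution, derived from provenance data
    price_per_hour: float   # monetary cost of the instance type

def rank_candidates(vms: List[VirtualMachine],
                    w_time: float = 0.5,
                    w_reliability: float = 0.3,
                    w_cost: float = 0.2) -> List[VirtualMachine]:
    """Normalize each criterion by the worst value among the candidates and
    rank by the weighted sum (lower score is better)."""
    max_t = max(vm.est_runtime_s for vm in vms)
    max_f = max(vm.failure_rate for vm in vms) or 1.0
    max_c = max(vm.price_per_hour * vm.est_runtime_s / 3600.0 for vm in vms)

    def score(vm: VirtualMachine) -> float:
        money = vm.price_per_hour * vm.est_runtime_s / 3600.0
        return (w_time * vm.est_runtime_s / max_t
                + w_reliability * vm.failure_rate / max_f
                + w_cost * money / max_c)

    return sorted(vms, key=score)

candidates = [
    VirtualMachine("small", est_runtime_s=900, failure_rate=0.02, price_per_hour=0.10),
    VirtualMachine("large", est_runtime_s=300, failure_rate=0.05, price_per_hour=0.45),
]
print(rank_candidates(candidates)[0].name)  # best trade-off under these weights
```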


Conference on Advanced Information Systems Engineering | 2009

A Method for Service Identification from Business Process Models in a SOA Approach

Leonardo Guerreiro Azevedo; Flávia Maria Santoro; Fernanda Araujo Baião; Jairo Souza; Kate Revoredo; Vinícios Pereira; Isolda Herlain

Various approaches to service development in SOA propose business processes as a starting point. However, there is a lack of systematic methods for service identification during business analysis. We believe that an integrated view of organizational business processes is needed to promote an effective SOA approach and to improve the understanding of IS requirements. In this context, we propose a method, with a detailed set of activities, for guiding the service designer in identifying the most appropriate set of services to support the organization's business activities. The method was applied in a real scenario at a Brazilian petroleum organization.


Archive | 2010

Towards a Taxonomy for Cloud Computing from an e-Science Perspective

Daniel de Oliveira; Fernanda Araujo Baião; Marta Mattoso

In the last few years, cloud computing has emerged as a computational paradigm that enables scientists to build more complex scientific applications, managing large data sets or high-performance computations based on distributed resources. By following this paradigm, scientists may use distributed resources (infrastructure, storage, databases, and applications) without having to deal with implementation or configuration details. In fact, there are many cloud computing environments already available for use. Despite its fast growth and adoption, there is no consensus on the definition of cloud computing, which makes it difficult to comprehend the cloud computing field as a whole and to correlate, classify, and compare the various existing proposals. Over the years, taxonomy techniques have been used to create models that allow for the classification of concepts within a domain. The main objective of this chapter is to apply taxonomy techniques to the cloud computing domain. This chapter discusses many aspects of cloud computing that are important from a scientific perspective. It contributes by proposing a taxonomy based on characteristics that are fundamental for scientific applications typically associated with the cloud paradigm.


Decision Support Systems | 2013

Detection of naming convention violations in process models for different languages

Henrik Leopold; Rami-Habib Eid-Sabbagh; Jan Mendling; Leonardo Guerreiro Azevedo; Fernanda Araujo Baião

Companies increasingly use business process modeling for documenting and redesigning their operations. However, due to the size of such modeling initiatives, they often struggle with the quality assurance of their model collections. While many model properties can already be checked automatically, there is a notable gap of techniques for checking linguistic aspects such as naming conventions of process model elements. In this paper, we address this problem by introducing an automatic technique for detecting violations of naming conventions. This technique is based on text corpora and independent of linguistic resources such as WordNet. Therefore, it can be easily adapted to the broad set of languages for which corpora exist. We demonstrate the applicability of the technique by analyzing nine process model collections from practice, including over 27,000 labels and covering three different languages. The results of the evaluation show that our technique yields stable results and can reliably deal with ambiguous cases. In this way, this paper provides an important contribution to the field of automated quality assurance of conceptual models.
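To make the corpus-based idea concrete, the sketch below flags activity labels that do not follow a verb-object naming convention by comparing corpus frequencies of the label's first word as a verb versus as a noun. The tiny frequency table and labels are fabricated for the example; the published technique works over full text corpora and handles a broader set of convention checks.

```python
# Minimal sketch of corpus-based naming-convention checking. The frequency
# table is made up for illustration; a real setup would derive it from a
# large annotated corpus for the target language.
from typing import Dict, Tuple

# word -> (frequency as verb, frequency as noun) derived from a corpus
CORPUS_FREQ: Dict[str, Tuple[int, int]] = {
    "create":  (9500, 120),
    "check":   (7200, 310),
    "invoice": (150, 8800),
    "order":   (900, 12000),
}

def violates_verb_object_style(label: str) -> bool:
    """Flag a label whose first word is more likely a noun than a verb."""
    first = label.split()[0].lower()
    verb_freq, noun_freq = CORPUS_FREQ.get(first, (0, 0))
    return noun_freq > verb_freq

for label in ["Create invoice", "Invoice creation", "Check order"]:
    status = "violation" if violates_verb_object_style(label) else "ok"
    print(f"{label}: {status}")   # "Invoice creation" is flagged
```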


Data and Knowledge Engineering | 2005

Managing structural genomic workflows using web services

Maria Cláudia Cavalcanti; Rafael Targino; Fernanda Araujo Baião; Shaila C. Rössle; Paulo Mascarello Bisch; Paulo F. Pires; Maria Luiza Machado Campos; Marta Mattoso

In silico scientific experiments encompass multiple combinations of program and data resources. Each resource combination in an execution flow is called a scientific workflow. In bioinformatics environments, program composition is a frequent operation that requires complex management. A scientist faces many challenges when building an experiment: finding the right program to use, choosing adequate parameters to tune, managing input/output data, and building and reusing workflows. Typically, these workflows are implemented using script languages because of their simplicity, despite their specificity and difficulty of reuse. In contrast, Web service technology was specifically conceived to encapsulate and combine programs and data, providing interoperation between applications on different platforms. The Web services approach is superior to scripts with regard to interoperability, scalability, and flexibility. We have combined metadata support with Web services within a framework that supports scientific workflows. While most works focus on metadata issues for managing and integrating heterogeneous scientific data sources, in this work we concentrate on metadata support for program management within workflows. We have used this framework with a real structural genomic workflow, showing its viability and evidencing its advantages.
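A minimal sketch of the general idea, workflow steps described by metadata and executed as chained web service calls rather than ad hoc scripts, is shown below. The WorkflowStep structure, the endpoint URLs, and run_workflow are hypothetical and are not the framework's actual API.

```python
# Sketch only: steps carry metadata (logical program name, endpoint,
# parameters) and are invoked as web services in sequence. Endpoints are
# placeholders, not real services.
from dataclasses import dataclass, field
from typing import Dict, List
import json
import urllib.request

@dataclass
class WorkflowStep:
    program: str                      # logical program name, e.g. "psi-blast"
    endpoint: str                     # web service wrapping the program
    parameters: Dict[str, str] = field(default_factory=dict)

def run_workflow(steps: List[WorkflowStep], initial_input: Dict) -> Dict:
    """Chain the steps, feeding each service's JSON output to the next one."""
    payload = initial_input
    for step in steps:
        body = json.dumps({"input": payload, "parameters": step.parameters}).encode()
        req = urllib.request.Request(step.endpoint, data=body,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:   # hypothetical services
            payload = json.load(resp)
    return payload
```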


Concurrency and Computation: Practice and Experience | 2012

An adaptive parallel execution strategy for cloud-based scientific workflows

Daniel de Oliveira; Eduardo S. Ogasawara; Kary A. C. S. Ocaña; Fernanda Araujo Baião; Marta Mattoso

Many of the existing large-scale scientific experiments modeled as scientific workflows are compute-intensive. Some scientific workflow management systems already explore parallel techniques, such as parameter sweep and data fragmentation, to improve performance. In those systems, computing resources are used to accomplish many computational tasks in high performance environments, such as multiprocessor machines or clusters. Meanwhile, cloud computing provides scalable and elastic resources that can be instantiated on demand during the course of a scientific experiment, without requiring its users to acquire expensive infrastructure or to configure many pieces of software. In fact, because of these advantages some scientists have already adopted the cloud model in their scientific experiments. However, this model also raises many challenges. When scientists execute scientific workflows that require parallelism, it is hard to decide a priori the amount of resources to use and how long they will be needed, because the allocation of these resources is elastic and based on demand. In addition, scientists have to manage new aspects such as initialization of virtual machines and the impact of data staging. SciCumulus is a middleware that manages the parallel execution of scientific workflows in cloud environments. In this paper, we introduce an adaptive approach for executing parallel scientific workflows in the cloud. This approach adapts itself according to the availability of resources during workflow execution: it checks the available computational power and dynamically tunes the workflow activity size to achieve better performance. Experimental evaluation showed the benefits of parallelizing scientific workflows using the adaptive approach of SciCumulus, which achieved performance improvements of up to 47.1%.
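The adaptive tuning described above, adjusting the size of activity groups to the computational power currently available, can be illustrated with a simple sizing rule: aim for one chunk of activations per available virtual machine, within bounds. The rule and all names below are assumptions for illustration, not SciCumulus's actual tuning algorithm.

```python
# Illustrative sketch: group small activations into larger chunks when few
# VMs are available, and shrink the chunks again when capacity grows.
import math
from typing import List

def choose_chunk_size(pending_activations: int, available_vms: int,
                      min_chunk: int = 1, max_chunk: int = 64) -> int:
    """Aim for one chunk per VM, bounded to avoid extreme granularities."""
    if available_vms <= 0:
        return max_chunk
    ideal = math.ceil(pending_activations / available_vms)
    return max(min_chunk, min(max_chunk, ideal))

def regroup(activations: List[str], available_vms: int) -> List[List[str]]:
    """Re-partition the pending activations for the current capacity."""
    size = choose_chunk_size(len(activations), available_vms)
    return [activations[i:i + size] for i in range(0, len(activations), size)]

work = [f"activation_{i}" for i in range(100)]
print(len(regroup(work, available_vms=25)))  # 25 chunks of 4
print(len(regroup(work, available_vms=5)))   # 5 chunks of 20
```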


Information Systems Management | 2008

Towards Collaboration Maturity in Business Processes: An Exploratory Study in Oil Production Processes

Andréa Magalhães Magdaleno; Claudia Cappelli; Fernanda Araujo Baião; Flávia Maria Santoro; Renata Mendes de Araujo

Organizations have been relying on collaboration for knowledge sharing and productivity improvement in order to reduce costs or boost revenue. However, organizations still cannot ensure that collaboration is properly conducted in daily work. This paper presents an approach to stimulating collaboration between professionals in an organization. The approach, which combines a BPM methodology with the CollabMM collaboration maturity model and its corresponding method, is the result of an exploratory study in a real setting at an oil company in Brazil. The project is a move towards improving decision-making during one of the company's business processes and establishing collaboration among professionals through information sharing.


Distributed and Parallel Databases | 2004

A Distribution Design Methodology for Object DBMS

Fernanda Araujo Baião; Marta Mattoso; Gerson Zaverucha

The design of distributed databases involves making decisions on the fragmentation and placement of data and programs across the sites of a computer network. The first phase of distribution design in a top-down approach is the fragmentation phase, which clusters into fragments the information accessed together by applications. Most distribution design algorithms propose either horizontal or vertical class fragmentation, but the user has no assistance in choosing between these techniques. In this work we present a detailed methodology for the design of distributed object databases that includes: (i) an analysis phase, which indicates the most adequate fragmentation technique to be applied to each class of the database schema; (ii) a horizontal class fragmentation algorithm; and (iii) a vertical class fragmentation algorithm. The analysis phase is responsible for driving the choice between the horizontal and vertical partitioning techniques, or even a combination of both, in order to assist distribution designers in the fragmentation phase of object databases. Experiments using our methodology have resulted in fragmentation schemas offering a high degree of parallelism together with a significant reduction in access to irrelevant data.
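The analysis phase's decision, suggesting horizontal, vertical, or hybrid fragmentation for a class based on how applications access it, can be sketched with a simple frequency-based heuristic. The thresholds, the QueryProfile model, and suggest_fragmentation are illustrative assumptions, not the published algorithm.

```python
# Illustrative sketch of an analysis-phase decision: inspect how applications
# access a class and suggest a fragmentation technique. Thresholds and the
# query model are assumptions for illustration only.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class QueryProfile:
    frequency: int                  # how often the application runs this query
    selected_attributes: Set[str]   # attributes it touches
    has_selective_predicate: bool   # restricts to a subset of the objects?

def suggest_fragmentation(all_attributes: Set[str],
                          queries: List[QueryProfile]) -> str:
    total = sum(q.frequency for q in queries) or 1
    # Share of accesses that select subsets of objects -> favors horizontal.
    horiz = sum(q.frequency for q in queries if q.has_selective_predicate) / total
    # Share of accesses that touch few attributes -> favors vertical.
    vert = sum(q.frequency for q in queries
               if len(q.selected_attributes) <= len(all_attributes) / 2) / total
    if horiz >= 0.5 and vert >= 0.5:
        return "hybrid (horizontal then vertical)"
    if horiz >= 0.5:
        return "horizontal"
    if vert >= 0.5:
        return "vertical"
    return "no fragmentation"

attrs = {"id", "name", "sequence", "annotation", "structure"}
profiles = [
    QueryProfile(10, {"id", "sequence"}, has_selective_predicate=True),
    QueryProfile(3, attrs, has_selective_predicate=False),
]
print(suggest_fragmentation(attrs, profiles))  # hybrid (horizontal then vertical)
```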


Future Generation Computer Systems | 2013

Performance evaluation of parallel strategies in public clouds: A study with phylogenomic workflows

Daniel de Oliveira; Kary A. C. S. Ocaña; Eduardo S. Ogasawara; Jonas Dias; João Carlos de A. R. Gonçalves; Fernanda Araujo Baião; Marta Mattoso

Data analysis is an exploratory process that demands high performance computing (HPC). SciPhylomics, for example, is a data-intensive workflow that aims at producing phylogenomic trees, based on an input set of protein sequences of genomes, to infer evolutionary relationships among living organisms. SciPhylomics can benefit from parallel processing techniques provided by existing approaches such as the SciCumulus cloud workflow engine and MapReduce implementations such as Hadoop. Despite some performance fluctuations, computing clouds provide a new dimension for HPC due to their elasticity and availability features. In this paper, we present a performance evaluation of SciPhylomics executions in a real cloud environment. The workflow was executed using two parallel execution approaches (SciCumulus and Hadoop) on the Amazon EC2 cloud. Our results reinforce the benefits of parallelizing data for the phylogenomic inference workflow using MapReduce-like parallel approaches in the cloud. The performance results demonstrate that this class of bioinformatics experiment is suitable for execution in the cloud despite its need for high performance capabilities. The evaluated workflow shares many features with other data-intensive workflows, providing initial evidence that these cloud execution results can be extrapolated to other classes of experiments.

Collaboration


Dive into Fernanda Araujo Baião's collaborations.

Top Co-Authors

Flávia Maria Santoro, Universidade Federal do Estado do Rio de Janeiro
Leonardo Guerreiro Azevedo, Federal University of Rio de Janeiro
Kate Revoredo, Universidade Federal do Estado do Rio de Janeiro
Claudia Cappelli, Universidade Federal do Estado do Rio de Janeiro
Marta Mattoso, Federal University of Rio de Janeiro
Jairo Francisco de Souza, Universidade Federal de Juiz de Fora
Mauro Lopes, Universidade Federal do Estado do Rio de Janeiro
João Carlos de A. R. Gonçalves, Universidade Federal do Estado do Rio de Janeiro
Sergio Puntar, Universidade Federal do Estado do Rio de Janeiro
Giancarlo Guizzardi, Free University of Bozen-Bolzano