Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Rafael Ferreira da Silva is active.

Publication


Featured research published by Rafael Ferreira da Silva.


Future Generation Computer Systems | 2015

Pegasus, a workflow management system for science automation

Ewa Deelman; Karan Vahi; Gideon Juve; Mats Rynge; Scott Callaghan; Philip J. Maechling; Rajiv Mayani; Weiwei Chen; Rafael Ferreira da Silva; Miron Livny; Kent Wenger

Modern science often requires the execution of large-scale, multi-stage simulation and data analysis pipelines to enable the study of complex systems. The amount of computation and data involved in these pipelines requires scalable workflow management systems that are able to reliably and efficiently coordinate and automate data movement and task execution on distributed computational resources: campus clusters, national cyberinfrastructures, and commercial and academic clouds. This paper describes the design, development and evolution of the Pegasus Workflow Management System, which maps abstract workflow descriptions onto distributed computing infrastructures. Pegasus has been used for more than twelve years by scientists in a wide variety of domains, including astronomy, seismology, bioinformatics, physics and others. This paper provides an integrated view of the Pegasus system, showing its capabilities that have been developed over time in response to application needs and to the evolution of the scientific computing platforms. The paper describes how Pegasus achieves reliable, scalable workflow execution across a wide variety of computing infrastructures.

Highlights: Comprehensive description of the Pegasus Workflow Management System. Detailed explanation of Pegasus workflow transformations. Data management in Pegasus. Earthquake science application example.
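To make the abstract-to-executable mapping concrete, here is a minimal sketch of the idea, not the actual Pegasus API: an abstract task names a logical transformation and logical files, and a planner resolves them against transformation and replica catalogs into site-specific jobs, inserting data staging where needed. All class and function names below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class AbstractTask:
    name: str        # logical transformation name, e.g. "extract"
    inputs: list     # logical file names the task consumes
    outputs: list    # logical file names the task produces

@dataclass
class ConcreteJob:
    executable: str  # site-specific command resolved from a transformation catalog
    args: list
    site: str

def plan(workflow, transformation_catalog, replica_catalog, site="campus-cluster"):
    """Map abstract tasks onto executable jobs for one site, adding a stage-in
    job whenever an input file is located somewhere else (greatly simplified)."""
    jobs = []
    for task in workflow:
        executable = transformation_catalog[(task.name, site)]
        for f in task.inputs:
            location = replica_catalog.get(f, site)  # default: already local
            if location != site:
                jobs.append(ConcreteJob("stage-in", [f"{location}:{f}"], site))
        jobs.append(ConcreteJob(executable, task.inputs + task.outputs, site))
    return jobs

# Tiny usage example with made-up catalog entries
wf = [AbstractTask("extract", inputs=["raw.dat"], outputs=["waveform.dat"])]
tc = {("extract", "campus-cluster"): "/opt/apps/extract"}
rc = {"raw.dat": "archive-site"}
print(plan(wf, tc, rc))
```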


IEEE Transactions on Medical Imaging | 2013

A Virtual Imaging Platform for Multi-Modality Medical Image Simulation

Tristan Glatard; Carole Lartizien; Bernard Gibaud; Rafael Ferreira da Silva; Germain Forestier; Frédéric Cervenansky; Martino Alessandrini; Hugues Benoit-Cattin; Olivier Bernard; Sorina Camarasu-Pop; Nadia Cerezo; Patrick Clarysse; Alban Gaignard; Patrick Hugonnard; Hervé Liebgott; Simon Marache; Adrien Marion; Johan Montagnat; Joachim Tabary; Denis Friboulet

This paper presents the Virtual Imaging Platform (VIP), a platform accessible at http://vip.creatis.insa-lyon.fr to facilitate the sharing of object models and medical image simulators, and to provide access to distributed computing and storage resources. A complete overview is presented, describing the ontologies designed to share models in a common repository, the workflow template used to integrate simulators, and the tools and strategies used to exploit computing and storage resources. Simulation results obtained in four image modalities and with different models show that VIP is versatile and robust enough to support large simulations. The platform currently has 200 registered users who consumed 33 years of CPU time in 2011.


Frontiers in Neuroinformatics | 2015

Reproducibility of neuroimaging analyses across operating systems

Tristan Glatard; Lindsay B. Lewis; Rafael Ferreira da Silva; Reza Adalat; Natacha Beck; Claude Lepage; Pierre Rioux; Marc-Etienne Rousseau; Tarek Sherif; Ewa Deelman; Najmeh Khalili-Mahani; Alan C. Evans

Neuroimaging pipelines are known to generate different results depending on the computing platform where they are compiled and executed. We quantify these differences for brain tissue classification, fMRI analysis, and cortical thickness (CT) extraction, using three of the main neuroimaging packages (FSL, Freesurfer and CIVET) and different versions of GNU/Linux. We also identify some causes of these differences using library and system call interception. We find that these packages use mathematical functions based on single-precision floating-point arithmetic whose implementations in operating systems continue to evolve. While these differences have little or no impact on simple analysis pipelines such as brain extraction and cortical tissue classification, their accumulation creates important differences in longer pipelines such as subcortical tissue classification, fMRI analysis, and cortical thickness extraction. With FSL, most Dice coefficients between subcortical classifications obtained on different operating systems remain above 0.9, but values as low as 0.59 are observed. Independent component analyses (ICA) of fMRI data differ between operating systems in one third of the tested subjects, due to differences in motion correction. With Freesurfer and CIVET, in some brain regions we find an effect of build or operating system on cortical thickness. A first step to correct these reproducibility issues would be to use more precise representations of floating-point numbers in the critical sections of the pipelines. The numerical stability of pipelines should also be reviewed.
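The subcortical comparisons above are reported as Dice coefficients; as a reminder of what that number measures, here is a small sketch that computes the Dice similarity of two binary segmentation masks with NumPy. The volumes are random placeholders, not neuroimaging data.

```python
import numpy as np

def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity of two binary masks: 2 * |A ∩ B| / (|A| + |B|)."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: define as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total

# Same label volume "produced" on two operating systems, with a small artificial difference
labels_os1 = np.random.rand(64, 64, 64) > 0.5
labels_os2 = labels_os1.copy()
labels_os2[:2] = ~labels_os2[:2]
print(round(dice_coefficient(labels_os1, labels_os2), 3))
```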


Workflows in Support of Large-Scale Science | 2013

Toward fine-grained online task characteristics estimation in scientific workflows

Rafael Ferreira da Silva; Gideon Juve; Ewa Deelman; Tristan Glatard; Frédéric Desprez; Douglas Thain; Benjamín Tovar; Miron Livny

Estimates of task characteristics such as runtime, disk space, and memory consumption are commonly used by scheduling algorithms and resource provisioning techniques to provide successful and efficient workflow executions. These methods assume that accurate estimates are available, but in production systems it is hard to compute them with good accuracy. In this work, we first profile three real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task needs based on these profiles. Our method estimates task runtime, disk space, and memory consumption based on the size of the tasks' input data. It looks for correlations between the parameters of a dataset, and if no correlation is found, the dataset is divided into smaller subsets using a clustering technique. Task behavior is estimated from the ratio of the parameter to the input data size when they are correlated, or from the mean value otherwise. However, task dependencies in scientific workflows lead to a chain of estimation errors. To correct such errors, we propose an online estimation process based on the MAPE-K loop, in which task executions are constantly monitored and estimates are updated accordingly. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach in which all task needs are estimated at once.
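As an illustration of the estimation scheme described above, the following is a minimal sketch, not the paper's implementation: a task's runtime is predicted from its input size via the mean ratio when the two are correlated, otherwise from the mean runtime, and an online estimator feeds measurements from completed tasks back into the knowledge base in MAPE-K fashion. The correlation threshold and fallback prior are illustrative assumptions.

```python
import numpy as np

def offline_estimate(input_sizes, runtimes, new_input_size, corr_threshold=0.8):
    """Predict runtime from input size: use the mean runtime/size ratio when the
    two are strongly correlated, otherwise fall back to the mean runtime."""
    sizes = np.asarray(input_sizes, dtype=float)
    times = np.asarray(runtimes, dtype=float)
    correlation = np.corrcoef(sizes, times)[0, 1]
    if abs(correlation) >= corr_threshold:
        return new_input_size * (times / sizes).mean()
    return times.mean()

class OnlineEstimator:
    """Greatly simplified MAPE-K loop: monitor completed tasks, then refresh the
    estimates used for tasks that have not run yet."""
    def __init__(self):
        self.sizes, self.times = [], []

    def monitor(self, input_size, measured_runtime):
        self.sizes.append(input_size)
        self.times.append(measured_runtime)

    def estimate(self, input_size, prior=60.0):
        if len(self.times) < 2:
            return prior  # knowledge base still empty: use a default prior
        return offline_estimate(self.sizes, self.times, input_size)

est = OnlineEstimator()
est.monitor(100, 12.0)
est.monitor(200, 23.5)
print(est.estimate(150))
```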


Future Generation Computer Systems | 2013

Monte Carlo simulation on heterogeneous distributed systems: A computing framework with parallel merging and checkpointing strategies

Sorina Camarasu-Pop; Tristan Glatard; Rafael Ferreira da Silva; Pierre Gueth; David Sarrut; Hugues Benoit-Cattin

This paper introduces an end-to-end framework for efficient computing and merging of Monte Carlo simulations on heterogeneous distributed systems. Simulations are parallelized using a dynamic load-balancing approach and multiple parallel mergers. Checkpointing is used to improve reliability and to enable incremental results merging from partial results. A model is proposed to analyze the behavior of the proposed framework and help tune its parameters. Experimental results obtained on a production grid infrastructure show that the model fits the real makespan with a relative error of maximum 10%, that using multiple parallel mergers reduces the makespan by 40% on average, that checkpointing enables the completion of very long simulations and that it can be used without penalizing the makespan.
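To illustrate checkpointing with incremental merging of partial results, here is a minimal sketch under assumed conventions (a shared checkpoint directory, JSON partial results, a single scalar tally); it is not the paper's framework, which also handles dynamic load balancing and multiple parallel mergers.

```python
import json
import pathlib

CHECKPOINT_DIR = pathlib.Path("checkpoints")  # assumed shared storage location

def checkpoint(worker_id: str, events_done: int, tally: float) -> None:
    """Persist a worker's partial Monte Carlo result so it survives failures and
    can be merged before the whole simulation finishes."""
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    path = CHECKPOINT_DIR / f"{worker_id}.json"
    path.write_text(json.dumps({"events": events_done, "tally": tally}))

def merge_partials() -> dict:
    """Incrementally merge all partial results written so far; one of several
    parallel mergers would run this over its share of the checkpoint files."""
    merged = {"events": 0, "tally": 0.0}
    for path in CHECKPOINT_DIR.glob("*.json"):
        partial = json.loads(path.read_text())
        merged["events"] += partial["events"]
        merged["tally"] += partial["tally"]
    return merged

checkpoint("worker-1", events_done=50_000, tally=123.4)
checkpoint("worker-2", events_done=30_000, tally=77.8)
print(merge_partials())
```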


International Conference on e-Science | 2013

Balanced Task Clustering in Scientific Workflows

Weiwei Chen; Rafael Ferreira da Silva; Ewa Deelman; Rizos Sakellariou

Scientific workflows can be composed of many fine-grained computational tasks. The runtime of these tasks may be shorter than the duration of system overheads, for example when using multiple resources of a cloud infrastructure. Task clustering is a runtime optimization technique that merges multiple short tasks into a single job so that the scheduling overhead is reduced and the overall runtime performance is improved. However, existing task clustering strategies only provide a coarse-grained approach that relies on an over-simplified workflow model. In our work, we examine the causes of Runtime Imbalance and Dependency Imbalance in task clustering. Next, we propose quantitative metrics to evaluate the severity of these two imbalance problems. Furthermore, we propose a series of task balancing methods to address them. Finally, we analyze the relationship between these metrics and the performance of the task balancing methods. A trace-based simulation shows our methods can significantly improve the runtime performance of two widely used workflows compared to the actual implementation of task clustering.
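One way to picture a Runtime Imbalance metric, offered here as a hedged sketch rather than the paper's exact definition, is the coefficient of variation of the clustered jobs' runtimes within one workflow level: zero means the clusters finish together, while larger values mean a few jobs dominate the level's completion time.

```python
import statistics

def runtime_imbalance(job_runtimes):
    """Coefficient of variation of clustered-job runtimes at one workflow level
    (illustrative metric; 0.0 = perfectly balanced)."""
    mean = statistics.mean(job_runtimes)
    return statistics.pstdev(job_runtimes) / mean if mean > 0 else 0.0

print(runtime_imbalance([300, 20, 20, 20]))  # one job dominates: high imbalance
print(runtime_imbalance([90, 90, 90, 90]))   # balanced clustering: 0.0
```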


Future Generation Computer Systems | 2015

Using imbalance metrics to optimize task clustering in scientific workflow executions

Weiwei Chen; Rafael Ferreira da Silva; Ewa Deelman; Rizos Sakellariou

Scientific workflows can be composed of many fine-grained computational tasks. The runtime of these tasks may be shorter than the duration of system overheads, for example when using multiple resources of a cloud infrastructure. Task clustering is a runtime optimization technique that merges multiple short-running tasks into a single job so that the scheduling overhead is reduced and the overall runtime performance is improved. However, existing task clustering strategies only provide a coarse-grained approach that relies on an over-simplified workflow model. In this work, we examine the causes of Runtime Imbalance and Dependency Imbalance in task clustering. Then, we propose quantitative metrics to evaluate the severity of the two imbalance problems. Furthermore, we propose a series of task balancing methods (horizontal and vertical) to address the load balance problem when performing task clustering for five widely used scientific workflows. Finally, we analyze the relationship between these metric values and the performance of the proposed task balancing methods. A trace-based simulation shows that our methods can significantly decrease the runtime of workflow applications when compared to a baseline execution. We also compare the performance of our methods with two algorithms described in the literature.

Highlights: Generalize the runtime imbalance and dependency imbalance problems in task clustering. Propose quantitative imbalance metrics to improve task clustering. Evaluate the imbalance metrics and balanced task clustering methods with five workflows.
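As a hedged sketch of horizontal load balancing, not the paper's exact algorithm, the longest-processing-time heuristic below packs the tasks of one workflow level into a fixed number of clustered jobs by always giving the next-longest task to the currently lightest job.

```python
import heapq

def balanced_clustering(task_runtimes, num_jobs):
    """Greedily pack tasks into `num_jobs` clustered jobs, assigning each task
    (longest first) to the job with the smallest accumulated runtime."""
    heap = [(0.0, j) for j in range(num_jobs)]  # (accumulated runtime, job index)
    heapq.heapify(heap)
    jobs = [[] for _ in range(num_jobs)]
    for task_id, runtime in sorted(enumerate(task_runtimes),
                                   key=lambda item: item[1], reverse=True):
        load, j = heapq.heappop(heap)
        jobs[j].append(task_id)
        heapq.heappush(heap, (load + runtime, j))
    return jobs

print(balanced_clustering([300, 20, 20, 20, 150, 10], num_jobs=2))
```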


International Conference on e-Science | 2014

Community Resources for Enabling Research in Distributed Scientific Workflows

Rafael Ferreira da Silva; Weiwei Chen; Gideon Juve; Karan Vahi; Ewa Deelman

A significant amount of recent research in scientific workflows aims to develop new techniques, algorithms and systems that can overcome the challenges of efficient and robust execution of ever larger workflows on increasingly complex distributed infrastructures. Since the infrastructures, systems and applications are complex, and their behavior is difficult to reproduce using physical experiments, much of this research is based on simulation. However, there exists a shortage of realistic datasets and tools that can be used for such studies. In this paper we describe a collection of tools and data that have enabled research in new techniques, algorithms, and systems for scientific workflows. These resources include: 1) execution traces of real workflow applications from which workflow and system characteristics such as resource usage and failure profiles can be extracted, 2) a synthetic workflow generator that can produce realistic synthetic workflows based on profiles extracted from execution traces, and 3) a simulator framework that can simulate the execution of synthetic workflows on realistic distributed infrastructures. This paper describes how we have used these resources to investigate new techniques for efficient and robust workflow execution, as well as to provide improvements to the Pegasus Workflow Management System or other workflow tools. Our goal in describing these resources is to share them with other researchers in the workflow research community.
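A synthetic workflow generator of the kind described above could, in the simplest case, sample task runtimes from per-task-type profiles extracted from execution traces. The sketch below assumes mean/standard-deviation profiles and generic task-type names with made-up values; the actual generator also reproduces the application's DAG structure and other resource-usage distributions.

```python
import random

def generate_synthetic_workflow(task_profiles, counts):
    """Create synthetic tasks by sampling runtimes from per-type (mean, stddev)
    profiles; structure and data dependencies are omitted in this sketch."""
    workflow = []
    for task_type, n in counts.items():
        mean, stddev = task_profiles[task_type]
        for i in range(n):
            runtime = max(0.1, random.gauss(mean, stddev))
            workflow.append({"id": f"{task_type}_{i}",
                             "type": task_type,
                             "runtime": runtime})
    return workflow

profiles = {"reproject": (13.6, 2.1), "background_fit": (10.9, 1.8)}  # made-up values
tasks = generate_synthetic_workflow(profiles, {"reproject": 5, "background_fit": 10})
print(len(tasks), tasks[0])
```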


International Conference on Parallel Processing | 2012

A science-gateway workload archive to study pilot jobs, user activity, bag of tasks, task sub-steps, and workflow executions

Rafael Ferreira da Silva; Tristan Glatard

Archives of distributed workloads acquired at the infrastructure level are known to lack information about users and application-level middleware. Science gateways provide consistent access points to the infrastructure and are therefore an interesting information source for coping with this issue. In this paper, we describe a workload archive acquired at the science-gateway level, and we show its added value on several case studies related to user accounting, pilot jobs, fine-grained task analysis, bags of tasks, and workflows. Results show that science-gateway workload archives can detect workload wrapped in pilot jobs, improve user identification, give information on distributions of data transfer times, make bag-of-tasks detection accurate, and retrieve characteristics of workflow executions. Some limits are also identified.
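Bag-of-tasks detection from a science-gateway archive can be illustrated with a simple heuristic, sketched below under the assumption that submissions from one user separated by less than a time threshold belong to the same bag; the paper's actual criterion may differ.

```python
def detect_bags_of_tasks(submission_times, gap_threshold=120.0):
    """Group one user's task submission timestamps (seconds) into bags of tasks:
    consecutive submissions closer than `gap_threshold` form one bag."""
    bags, current = [], []
    for t in sorted(submission_times):
        if current and t - current[-1] > gap_threshold:
            bags.append(current)
            current = []
        current.append(t)
    if current:
        bags.append(current)
    return bags

print(detect_bags_of_tasks([0, 5, 30, 600, 610]))  # -> [[0, 5, 30], [600, 610]]
```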


IEEE Internet Computing | 2016

Pegasus in the Cloud: Science Automation through Workflow Technologies

Ewa Deelman; Karan Vahi; Mats Rynge; Gideon Juve; Rajiv Mayani; Rafael Ferreira da Silva

The Pegasus Workflow Management System maps abstract, resource-independent workflow descriptions onto distributed computing resources. As a result of this planning process, Pegasus workflows are portable across different infrastructures, optimizable for performance and efficiency, and automatically map to many different storage systems and data flows. This approach makes Pegasus a powerful solution for executing scientific workflows in the cloud.

Collaboration


Dive into Rafael Ferreira da Silva's collaborations.

Top Co-Authors

Ewa Deelman, University of Southern California
Gideon Juve, University of Southern California
Mats Rynge, University of Southern California
Karan Vahi, University of Southern California
Frédéric Desprez, École normale supérieure de Lyon
Miron Livny, University of Wisconsin-Madison
Weiwei Chen, University of Southern California
Dariusz Król, AGH University of Science and Technology