César A. F. De Rose
Pontifícia Universidade Católica do Rio Grande do Sul
Publications
Featured research published by César A. F. De Rose.
Software: Practice and Experience | 2011
Rodrigo N. Calheiros; Rajiv Ranjan; Anton Beloglazov; César A. F. De Rose; Rajkumar Buyya
Cloud computing is a recent advancement wherein IT infrastructure and applications are provided as 'services' to end-users under a usage-based payment model. It can provision virtualized services on the fly, based on requirements (workload patterns and QoS) that vary with time. The application services hosted under the Cloud computing model have complex provisioning, composition, configuration, and deployment requirements. Evaluating the performance of Cloud provisioning policies, application workload models, and resource performance models in a repeatable manner under varying system and user configurations and requirements is difficult to achieve. To overcome this challenge, we propose CloudSim: an extensible simulation toolkit that enables modeling and simulation of Cloud computing systems and application provisioning environments. The CloudSim toolkit supports both system and behavior modeling of Cloud system components such as data centers, virtual machines (VMs), and resource provisioning policies. It implements generic application provisioning techniques that can be extended with ease and limited effort. Currently, it supports modeling and simulation of Cloud computing environments consisting of both single and inter-networked clouds (federations of clouds). Moreover, it exposes custom interfaces for implementing policies and provisioning techniques for allocation of VMs in inter-networked Cloud computing scenarios. Several researchers from organizations such as HP Labs in the U.S.A. are using CloudSim in their investigations of Cloud resource provisioning and energy-efficient management of data center resources. The usefulness of CloudSim is demonstrated by a case study involving dynamic provisioning of application services in a hybrid federated cloud environment. The results of this case study show that the federated Cloud computing model significantly improves application QoS under fluctuating resource and service demand patterns.
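The kind of model CloudSim supports can be illustrated with a toy analogue. The sketch below is not the CloudSim API; it is a minimal, self-contained discrete-event sketch in Python of the same ideas (hosts, VMs, and a pluggable allocation policy), with all names hypothetical.

```python
import heapq

class Host:
    """A physical server with a fixed number of processing elements (PEs)."""
    def __init__(self, host_id, pes):
        self.host_id, self.free_pes = host_id, pes

class FirstFitAllocationPolicy:
    """A pluggable provisioning policy, analogous in spirit to the policies
    CloudSim lets users override: place each VM on the first host that fits."""
    def allocate(self, hosts, vm_pes):
        for h in hosts:
            if h.free_pes >= vm_pes:
                h.free_pes -= vm_pes
                return h
        return None  # rejected: no host has enough free PEs

def simulate(requests, hosts, policy):
    """Process (arrival_time, vm_id, pes) provisioning events in time order.
    VM completion and deallocation are omitted for brevity."""
    heapq.heapify(requests)
    placements = {}
    while requests:
        _, vm_id, pes = heapq.heappop(requests)
        host = policy.allocate(hosts, pes)
        placements[vm_id] = host.host_id if host else None
    return placements

hosts = [Host(0, 8), Host(1, 8)]
requests = [(0.0, "vm-a", 4), (1.0, "vm-b", 8), (2.0, "vm-c", 6)]
print(simulate(requests, hosts, FirstFitAllocationPolicy()))
# -> {'vm-a': 0, 'vm-b': 1, 'vm-c': None}
```

The point of the design, in CloudSim as in the toy, is that the allocation policy is a separate, swappable object, so researchers can experiment with provisioning strategies without touching the simulation engine.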
Future Generation Computer Systems | 2011
Tiago C. Ferreto; Marco Aurelio Stelmar Netto; Rodrigo N. Calheiros; César A. F. De Rose
Virtualization has become a key technology for simplifying service management and reducing energy costs in data centers. One of the challenges faced by data centers is to decide when, how, and which virtual machines (VMs) have to be consolidated into a single physical server. Server consolidation involves VM migration, which has a direct impact on service response time. Most of the existing solutions for server consolidation rely on eager migrations, which try to minimize the number of physical servers running VMs. These solutions generate unnecessary migrations due to unpredictable workloads that require VM resizing. This paper proposes an LP formulation and heuristics to control VM migration, which prioritize virtual machines with steady capacity. We performed experiments using TU-Berlin and Google data center workloads to compare our migration control strategy against existing eager-migration-based solutions. We observed that avoiding migration of VMs with steady capacity reduces the number of migrations with minimal penalty in the number of physical servers.
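The paper's LP formulation is not reproduced here, but the heuristic's core idea can be sketched under stated assumptions: a VM counts as 'steady' if its recent demand varies little around its mean (the threshold below is illustrative), and only steady VMs are repacked, via first-fit decreasing, during consolidation.

```python
from statistics import mean, pstdev

def is_steady(usage_history, rel_threshold=0.1):
    """Label a VM 'steady' if its demand varies little around its mean."""
    return pstdev(usage_history) <= rel_threshold * mean(usage_history)

def consolidate(vms, server_capacity):
    """First-fit decreasing over steady VMs only; variable VMs stay put,
    avoiding migrations that workload changes would soon undo.
    vms: dict vm_id -> (current_server, usage_history)."""
    steady = sorted(
        (vm for vm, (_, hist) in vms.items() if is_steady(hist)),
        key=lambda vm: -max(vms[vm][1]))
    servers, plan = [], {}            # servers: residual capacities
    for vm in steady:
        demand = max(vms[vm][1])      # size by peak observed demand
        for i, free in enumerate(servers):
            if free >= demand:
                servers[i] -= demand
                plan[vm] = i
                break
        else:
            servers.append(server_capacity - demand)
            plan[vm] = len(servers) - 1
    return plan                       # vm -> target server index

vms = {"web": ("s1", [2.0, 2.1, 1.9]), "batch": ("s2", [1.0, 6.0, 2.5])}
print(consolidate(vms, server_capacity=8.0))   # only "web" is repacked
```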
Software: Practice and Experience | 2013
Rodrigo N. Calheiros; Marco Aurelio Stelmar Netto; César A. F. De Rose; Rajkumar Buyya
Cloud computing allows the deployment and delivery of application services for users worldwide. Software as a Service providers with a limited upfront budget can take advantage of Cloud computing and lease the required capacity on a pay-as-you-go basis, which also enables flexible and dynamic resource allocation according to service demand. One key challenge potential Cloud customers face before renting resources is knowing how their services will behave on a given set of resources and what costs are involved when growing and shrinking their resource pool. Most studies in this area rely on simulation-based experiments, which use simplified models of applications and computing environments. In order to better predict service behavior on Cloud platforms, we developed an integrated architecture based on both simulation and emulation. The proposed architecture, named EMUSIM, automatically extracts information about application behavior via emulation and then uses this information to generate the corresponding simulation model. We performed experiments using an image processing application as a case study and found that EMUSIM was able to accurately model the application via emulation and use the model to supply information about its potential performance on a Cloud provider. We also discuss our experience using EMUSIM to deploy applications on a real public Cloud provider. EMUSIM is based on an open source software stack and can therefore be extended to analyze the behavior of many other applications.
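A minimal sketch of the emulate-then-simulate pipeline, assuming a simple linear service-time model; the function and parameter names are illustrative, not EMUSIM internals. The 'emulation' step times the real application on small inputs, and the 'simulation' step extrapolates makespan and rental cost.

```python
import time

def profile(app, sizes):
    """'Emulation' step: run the real application on small inputs and
    record (size, service_time) samples."""
    samples = []
    for n in sizes:
        t0 = time.perf_counter()
        app(n)
        samples.append((n, time.perf_counter() - t0))
    return samples

def fit_linear(samples):
    """Least-squares fit of service_time ~ a*size + b to the samples."""
    n = len(samples)
    sx = sum(x for x, _ in samples); sy = sum(y for _, y in samples)
    sxx = sum(x * x for x, _ in samples); sxy = sum(x * y for x, y in samples)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def simulate_cost(model, workload_size, vms, vm_hour_price):
    """'Simulation' step: predict makespan and rental cost on `vms` VMs,
    assuming an ideal parallel split of the workload."""
    a, b = model
    makespan = a * (workload_size / vms) + b
    return makespan, vms * vm_hour_price * (makespan / 3600)

model = fit_linear(profile(lambda n: sum(i * i for i in range(n)),
                           [100_000, 200_000, 400_000]))
print(simulate_cost(model, 10_000_000, vms=8, vm_hour_price=0.10))
```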
Computer Software and Applications Conference | 2012
Vincent C. Emeakaroha; Tiago C. Ferreto; Marco Aurelio Stelmar Netto; Ivona Brandic; César A. F. De Rose
Cloud resources and services are offered based on Service Level Agreements (SLAs) that state usage terms and penalties in case of violations. Although there is a large body of work on SLA provisioning and monitoring at the infrastructure and platform layers, SLAs are usually simply assumed to be guaranteed at the application layer. However, application monitoring is a challenging task because metrics monitored at the platform or infrastructure layers cannot be easily mapped to the metrics required at the application layer. Sophisticated SLA monitoring across those layers, to avoid costly SLA penalties and maximize provider profit, is still an open research challenge. This paper proposes an application monitoring architecture named CASViD (Cloud Application SLA Violation Detection). CASViD monitors and detects SLA violations at the application layer, and includes tools for resource allocation, scheduling, and deployment. Unlike most existing monitoring architectures, CASViD focuses on application-level monitoring, which is relevant when multiple customers share the same resources in a Cloud environment. We evaluate our architecture in a real Cloud testbed using applications that exhibit heterogeneous behaviors, in order to investigate effective measurement intervals for efficient monitoring of different application types. The results show that our architecture, with a low intrusion level, is able to monitor applications, detect SLA violations, and suggest effective measurement intervals for various workloads.
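A hedged sketch of the detection loop such an architecture runs; the metric names, SLA thresholds, and random probe below are placeholders, not CASViD code. The measurement interval is the knob whose effective value the paper investigates: sampling too often raises intrusion, sampling too rarely misses short violations.

```python
import random, time

SLA = {"response_time_ms": 200.0, "throughput_rps": 50.0}

def sample_app_metrics():
    """Stand-in for an application-level probe; a real deployment would
    query the service itself rather than infrastructure counters."""
    return {"response_time_ms": random.gauss(150, 40),
            "throughput_rps": random.gauss(60, 10)}

def violations(metrics, sla):
    """Compare a sample against the SLA objectives."""
    out = []
    if metrics["response_time_ms"] > sla["response_time_ms"]:
        out.append("response_time_ms")
    if metrics["throughput_rps"] < sla["throughput_rps"]:
        out.append("throughput_rps")
    return out

def monitor(interval_s, rounds):
    """Periodic sampling loop; interval_s trades intrusion for coverage."""
    for _ in range(rounds):
        v = violations(sample_app_metrics(), SLA)
        if v:
            print("SLA violation detected:", v)
        time.sleep(interval_s)

monitor(interval_s=0.1, rounds=5)
```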
Parallel, Distributed and Network-Based Processing | 2014
Miguel G. Xavier; Marcelo Veiga Neves; César A. F. De Rose
Virtualization as a platform for resource-intensive applications, such as MapReduce (MR), has been the subject of many studies in recent years, as it brings benefits such as better manageability, overall resource utilization, security, and scalability. Nevertheless, because of its performance overhead, virtualization has traditionally been avoided in computing environments where performance is a critical factor. In this context, container-based virtualization can be considered a lightweight alternative to traditional hypervisor-based virtualization systems. In fact, there is a trend towards using containers in MR clusters to provide resource sharing and performance isolation (e.g., Mesos and YARN). However, there have been no studies evaluating the performance overhead of current container-based systems and their ability to provide performance isolation when running MR applications. In this work, we conducted experiments to compare and contrast current container-based systems (Linux-VServer, OpenVZ, and Linux Containers (LXC)) in terms of performance and manageability when running MR clusters. Our results show that although all container-based systems reach near-native performance for MapReduce workloads, LXC offers the best trade-off between performance and management capabilities (especially regarding performance isolation).
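A rough sketch of how such an overhead comparison can be harnessed, assuming a container named bench already exists; the launch prefixes are illustrative and environment-dependent, and real experiments would also measure isolation under co-located load, not just wall-clock slowdown.

```python
import subprocess, time

# Launch prefixes are illustrative placeholders; the actual invocation
# differs per container system and per host configuration.
RUNTIMES = {
    "native": [],
    "lxc": ["lxc-execute", "-n", "bench", "--"],
}

def run_benchmark(cmd, prefix):
    """Time one run of `cmd` under the given launch prefix."""
    t0 = time.perf_counter()
    subprocess.run(prefix + cmd, check=True)
    return time.perf_counter() - t0

def overhead_report(cmd):
    """Report each runtime's slowdown relative to native execution."""
    times = {name: run_benchmark(cmd, prefix)
             for name, prefix in RUNTIMES.items()}
    base = times["native"]
    return {name: t / base for name, t in times.items()}

print(overhead_report(["python3", "-c", "sum(i*i for i in range(10**7))"]))
```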
International Conference on Parallel Processing | 2009
Rodrigo N. Calheiros; Rajkumar Buyya; César A. F. De Rose
Distributed system emulators provide a valuable platform for testing network protocols and distributed applications on clusters and networks of workstations. However, for testers to benefit from these systems, an efficient and automatic mapping of hundreds, or even thousands, of virtual nodes to physical hosts is necessary, along with a mapping of the virtual links between guests to paths in the physical environment. In this paper we present a heuristic that maps both virtual machines to hosts and virtual links between virtual machines to paths in the real system. We define the problem being addressed, present our solution, and evaluate it in different usage scenarios.
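A greedy sketch in the spirit of (but not identical to) such a mapping heuristic: virtual nodes go to the hosts with the most free capacity, and each virtual link is routed over the widest residual path (max-bottleneck, found with a Dijkstra variant). The data-structure shapes are assumptions.

```python
import heapq

def widest_path(graph, src, dst):
    """Max-bottleneck ('widest') path via a Dijkstra variant.
    graph: {node: {neighbor: residual_bandwidth}}, every node present."""
    best = {src: float("inf")}
    heap = [(-float("inf"), src, [src])]
    while heap:
        neg_bw, u, path = heapq.heappop(heap)
        if u == dst:
            return -neg_bw, path
        for v, bw in graph[u].items():
            cand = min(-neg_bw, bw)
            if cand > best.get(v, 0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v, path + [v]))
    return 0, None

def map_virtual_network(vnodes, vlinks, host_caps, graph):
    """Greedy node mapping (largest demand to freest host), then link
    mapping over the widest residual path; mutates host_caps and graph.
    vnodes: {vnode: cpu_demand}; vlinks: [(va, vb, bandwidth)]."""
    placement = {}
    for vn, demand in sorted(vnodes.items(), key=lambda kv: -kv[1]):
        host = max(host_caps, key=host_caps.get)
        if host_caps[host] < demand:
            return None                    # no host can fit this node
        host_caps[host] -= demand
        placement[vn] = host
    routes = {}
    for va, vb, bw in vlinks:
        cap, path = widest_path(graph, placement[va], placement[vb])
        if cap < bw:
            return None                    # insufficient bandwidth
        for u, v in zip(path, path[1:]):   # reserve bandwidth on the path
            graph[u][v] -= bw; graph[v][u] -= bw
        routes[(va, vb)] = path
    return placement, routes
```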
Parallel Computing | 2007
César A. F. De Rose; Hans-Ulrich Heiss; Barry Linnert
Current processor allocation techniques for highly parallel systems use centralized, front-end-based algorithms, which restrict the applicable strategies to static allocation, low parallelism, and weak fault tolerance. To lift these restrictions, we are investigating a distributed approach to processor allocation in multicomputers, in which no centralized data structure holding the state of all processors exists. This approach allows the implementation of more complex allocation schemes and possibly dynamic allocation, where parallel applications can adapt their allocated processor partition to their actual demand at run time, resulting in more efficient utilization of system resources. Noncontiguous versions of a distributed dynamic processor allocation scheme are proposed and studied in this paper as an alternative for parallel programming models that allow dynamic task creation and deletion. Simulations compare the performance of the proposed dynamic strategies with static counterparts, and also with well-known centralized algorithms, in an environment with growing and shrinking processor demands. To demonstrate that dynamic allocation is feasible with current technologies, experimental results are presented for a 96-node SCI hpcLine Primergy Server cluster.
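A single-process toy of the idea, assuming each manager owns a disjoint slice of processors and no global free-list exists; the real scheme is distributed across nodes with message passing, which this sketch elides.

```python
class ProcessorManager:
    """One of several managers, each owning a disjoint slice of processors;
    there is deliberately no centralized free-list."""
    def __init__(self, procs):
        self.owned, self.free = set(procs), set(procs)

    def grab(self, n):
        taken = set(sorted(self.free)[:n])
        self.free -= taken
        return taken

    def release(self, procs):
        self.free |= procs & self.owned   # each proc returns to its owner

def allocate(managers, n):
    """Non-contiguous allocation: gather free processors manager by
    manager; roll back partial grabs if the request cannot be met."""
    got = set()
    for m in managers:
        got |= m.grab(n - len(got))
        if len(got) == n:
            return got
    release(managers, got)
    return None

def release(managers, procs):
    for m in managers:
        m.release(procs)

def grow(managers, partition, extra):
    """Dynamic allocation: a running application extends its partition."""
    more = allocate(managers, extra)
    return partition | more if more is not None else partition

managers = [ProcessorManager(range(0, 48)), ProcessorManager(range(48, 96))]
p = allocate(managers, 60)        # spans both managers, non-contiguously
p = grow(managers, p, 10)
print(len(p))                     # -> 70
release(managers, p)
```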
International Parallel and Distributed Processing Symposium | 2014
Marcelo Veiga Neves; César A. F. De Rose; Kostas Katrinis; Hubertus Franke
The rise of Internet of Things sensors, social networking, and mobile devices has led to an explosion of available data. Gaining insights into this data has given rise to the area of Big Data analytics. The MapReduce framework, as implemented in Hadoop, is one of the most popular frameworks for Big Data analysis. To handle ever-increasing data sizes, Hadoop is a scalable framework that allows dedicated, seemingly unbounded numbers of servers to participate in the analytics process. The response time of an analytics request is an important factor in time to value/insight. While the compute and disk I/O requirements can be scaled with the number of servers, scaling the system leads to increased network traffic. Arguably, the communication-heavy phase of MapReduce contributes significantly to the overall response time; the problem is further aggravated if communication patterns are heavily skewed, as is not uncommon in many MapReduce workloads. In this paper we present a system that reduces the impact of skew by transparently predicting data communication volume at runtime and mapping the many end-to-end flows among the various processes onto the underlying network, using emerging software-defined networking technologies to avoid hotspots in the network. Depending on the network oversubscription ratio, we demonstrate reductions in job completion time between 3% and 46% for popular MapReduce benchmarks such as Sort and Nutch.
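The controller's core placement decision can be sketched as follows, under assumptions: predicted shuffle volumes are given, candidate equal-cost paths are enumerated, and the largest flows are placed first on the path whose most-loaded link is currently lightest. Installing the resulting routes as SDN forwarding rules is omitted; all names are illustrative.

```python
def map_flows(flows, paths, link_load):
    """Greedy hotspot avoidance: place the largest predicted flows first,
    each on the path whose most-loaded link is currently lightest.
    flows: {flow_id: predicted_bytes}; paths: {path_id: [link, ...]}."""
    assignment = {}
    for flow, size in sorted(flows.items(), key=lambda kv: -kv[1]):
        best = min(paths, key=lambda p: max(link_load[l] for l in paths[p]))
        for l in paths[best]:
            link_load[l] += size
        assignment[flow] = best
    return assignment

# Predicted map->reduce shuffle volumes (skewed) and two equal-cost paths.
flows = {"m1->r1": 8_000, "m2->r1": 500, "m3->r2": 400}
paths = {"via-s1": ["a-s1", "s1-b"], "via-s2": ["a-s2", "s2-b"]}
print(map_flows(flows, paths, {l: 0 for ls in paths.values() for l in ls}))
# -> the large skewed flow gets one path; the small flows share the other
```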
Future Generation Computer Systems | 2008
César A. F. De Rose; Tiago C. Ferreto; Rodrigo N. Calheiros; Walfredo Cirne; Lauro Beltrão Costa; Daniel Fireman
As the adoption of grid computing in organizations expands, the need for wise utilization of different types of resources also increases. A volatile resource, such as a desktop computer, is a common type of resource found in grids. However, efficiently using other types of resources, such as space-shared resources (parallel supercomputers and clusters of workstations), is extremely important, since they can provide a great amount of computational power. Using space-shared resources in grids is not straightforward, since they require jobs to specify some parameters a priori, such as allocation time and number of processors. Current solutions (e.g., Grid Resource Allocation and Management (GRAM)) are based on the explicit definition of these parameters by the user. On the other hand, good progress has been made in supporting Bag-of-Tasks (BoT) applications on grids. This is a restricted model of parallelism in which tasks do not communicate among themselves, making recovery from failures a simple matter of re-executing tasks. As such, there is no need to specify a maximum number of resources, or a period of time during which resources must be executing the application, as space-shared resources require. This state of affairs, however, makes it hard for BoT applications running on grids to leverage space-shared resources. This paper presents an Explicit Allocation Strategy, in which an adaptor automatically fits grid requests to the resource in order to decrease the turnaround time of the application. We compare it with another strategy described in our previous work, called the Transparent Allocation Strategy, in which idle nodes of the space-shared resource are donated to the grid. As we shall see, both strategies provide good results. Moreover, they are complementary in the sense that they fulfill different usage roles. The Transparent Allocation Strategy enables a resource owner to raise utilization by offering cycles that would otherwise go to waste, while protecting the local workload from increased contention. The Explicit Allocation Strategy, conversely, allows a user to benefit from her access to space-shared resources in the grid, enabling her to submit tasks natively without having to craft (time, processors) requests.
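A sketch of what such an adaptor might do, with a stubbed queue-wait predictor (the hard part in practice); all names and the toy wait model are assumptions. Given N independent tasks of roughly known runtime, it enumerates request shapes and picks the one with the lowest estimated turnaround.

```python
import math

def craft_request(num_tasks, task_runtime, max_procs, wait_estimate):
    """Adaptor sketch: turn a BoT ('run these N tasks') into the explicit
    (processors, walltime) request a space-shared resource expects,
    choosing the shape with the lowest estimated turnaround (wait + run).
    wait_estimate(procs, walltime) stands in for a queue-wait predictor."""
    best = None
    for procs in range(1, max_procs + 1):
        walltime = math.ceil(num_tasks / procs) * task_runtime
        turnaround = wait_estimate(procs, walltime) + walltime
        if best is None or turnaround < best[0]:
            best = (turnaround, procs, walltime)
    return best  # (estimated_turnaround, processors, walltime)

# Toy wait model: bigger and longer requests wait longer in the queue.
est = lambda p, w: 30.0 * p + 0.1 * w
print(craft_request(num_tasks=100, task_runtime=60.0, max_procs=64,
                    wait_estimate=est))
```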
International Conference on Parallel Processing | 2011
Tiago C. Ferreto; César A. F. De Rose; Hans-Ulrich Heiss
Server consolidation is a vital mechanism in modern data centers for minimizing infrastructure expenses. In most cases, server consolidation requires migrating virtual machines between different physical servers. Although the downtime of live migration is negligible, the time needed to migrate all virtual machines can be substantial, delaying the completion of the consolidation process. This paper proposes a new server consolidation algorithm that guarantees migrations complete within a given maximum time. The migration time is estimated using the max-min fairness model, in order to account for the competition among migration flows for the network infrastructure. The algorithm was simulated using a real workload and shows a good consolidation ratio compared to other algorithms, while also guaranteeing a maximum migration time.
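Max-min fair rates over shared links can be computed with the classic progressive-filling procedure, sketched below; the surrounding consolidation algorithm is not reproduced, and the flow and link names are illustrative. Each migration's duration is then estimated as memory size divided by its max-min rate, which a consolidation plan can check against the deadline.

```python
def max_min_rates(flows, capacity):
    """Progressive filling: repeatedly saturate the most constrained link
    and freeze the rates of the flows crossing it.
    flows: {flow_id: [link, ...]}; capacity: {link: bandwidth}."""
    rate, remaining = {}, dict(capacity)
    active = {f: set(ls) for f, ls in flows.items()}
    while active:
        # fair share each link could still give to its unfrozen flows
        share = {l: remaining[l] / sum(l in ls for ls in active.values())
                 for l in remaining
                 if any(l in ls for ls in active.values())}
        bottleneck = min(share, key=share.get)
        r = share[bottleneck]
        for f in [f for f, ls in active.items() if bottleneck in ls]:
            rate[f] = r
            for l in active.pop(f):
                remaining[l] -= r
    return rate

def migration_times(memory, flows, capacity):
    """Estimate each VM migration's duration as memory / max-min rate."""
    rates = max_min_rates(flows, capacity)
    return {f: memory[f] / rates[f] for f in memory}

# Two concurrent migrations sharing an uplink; one also crosses a
# second, narrower link, which becomes its bottleneck.
flows = {"vm1": ["uplink"], "vm2": ["uplink", "rack2"]}
cap = {"uplink": 10.0, "rack2": 4.0}            # Gbit/s
mem = {"vm1": 16.0, "vm2": 16.0}                # Gbit of VM memory
print(migration_times(mem, flows, cap))
# -> vm2 is capped at 4 Gbit/s by rack2, leaving 6 Gbit/s for vm1
```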