Francesc Guim
Polytechnic University of Catalonia
Publication
Featured research published by Francesc Guim.
international conference on green computing | 2010
Ivan Rodero; Juan Jaramillo; Andres Quiroz; Manish Parashar; Francesc Guim; Stephen W. Poole
As energy efficiency and associated costs become key concerns, consolidated and virtualized data centers and clouds are attractive computing platforms for data- and compute-intensive applications. These platforms provide an abstraction of nearly unlimited computing resources through the elastic use of pools of consolidated resources, and they provide opportunities for higher utilization and energy savings. Recently, these platforms have also been considered for more traditional high-performance computing (HPC) applications that have typically targeted Grids and similar conventional HPC platforms. However, maximizing energy efficiency, cost-effectiveness, and utilization for these applications while ensuring performance and other Quality of Service (QoS) guarantees requires balancing important and extremely challenging tradeoffs. These include, for example, the tradeoff between the need to efficiently create and provision Virtual Machines (VMs) on data center resources and the need to accommodate the heterogeneous resource demands and runtimes of these applications. In this paper we present an energy-aware online provisioning approach for HPC applications on consolidated and virtualized computing platforms. Energy efficiency is achieved using a workload-aware, just-right dynamic provisioning mechanism and the ability to power down subsystems of a host system that are not required by the VMs mapped to it. We evaluate the presented approach using real HPC workload traces from widely distributed production systems. The results demonstrate that, compared to typical reactive or predefined provisioning, our approach achieves significant improvements in energy efficiency with an acceptable QoS penalty.
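A rough illustration of the "just-right" provisioning idea described above, not the authors' implementation: the VM/Host classes, the best-fit heuristic, and the subsystem sets are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    cores: int
    mem_gb: int
    subsystems: set  # subsystems this VM actually uses, e.g. {"cpu", "mem", "net"}

@dataclass
class Host:
    cores_free: int
    mem_free_gb: int
    vms: list = field(default_factory=list)

    def subsystems_needed(self) -> set:
        """Union of subsystems used by the mapped VMs; anything outside
        this set (e.g. local disk) is a candidate for powering down."""
        needed = set()
        for vm in self.vms:
            needed |= vm.subsystems
        return needed

def provision(vm, hosts):
    """Best-fit placement: pick the feasible host with the least leftover
    capacity, so lightly loaded hosts can be drained and powered off."""
    feasible = [h for h in hosts
                if h.cores_free >= vm.cores and h.mem_free_gb >= vm.mem_gb]
    if not feasible:
        return None  # a reactive policy would power on a new host here
    best = min(feasible, key=lambda h: (h.cores_free - vm.cores,
                                        h.mem_free_gb - vm.mem_gb))
    best.cores_free -= vm.cores
    best.mem_free_gb -= vm.mem_gb
    best.vms.append(vm)
    return best
```

The best-fit choice is one plausible way to make provisioning workload-aware; the point is that placement and subsystem power state are decided from the actual demands of the mapped VMs rather than from static host configurations.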
Archive | 2008
Ivan Rodero; Francesc Guim; Julita Corbalan; Liana Fong; Yanbin Liu; Seyed Masoud Sadjadi
A Grid resource broker for a Grid domain, also known as a meta-scheduler, is a middleware component that matches jobs to the available Grid resources of one or more IT organizations. A Grid meta-scheduler usually has its own interfaces for the functionalities it provides and its own job scheduling objectives. This situation causes two main problems: users lose uniform access to the Grid, and scheduling decisions are taken separately when they should be coordinated. These problems have been observed in efforts such as the HPC-Europa project, but they remain open. In this paper we discuss the requirements for more uniform access to Grids through a new approach to global brokering. As a result of this discussion of brokering requirements, we propose a meta-brokering design, a so-called meta-meta-scheduler, and discuss how it can be realized as a centralized model for the HPC-Europa project and as a distributed model for the LA Grid project.
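A minimal sketch of the meta-brokering idea, under stated assumptions: the Broker interface and the completion-time scoring are invented here for illustration; the paper's centralized and distributed realizations are considerably richer.

```python
class Broker:
    """One per Grid domain (e.g. a site broker in HPC-Europa)."""
    def __init__(self, name, estimate, submit):
        self.name = name
        self.estimate = estimate  # callable: job -> predicted completion time
        self.submit = submit      # callable: job -> domain-local job id

class MetaBroker:
    """Single uniform entry point that coordinates the domain brokers."""
    def __init__(self, brokers):
        self.brokers = brokers

    def submit(self, job):
        # Coordinated decision: choose the domain broker with the best
        # estimate instead of letting each broker decide in isolation.
        best = min(self.brokers, key=lambda b: b.estimate(job))
        return best.name, best.submit(job)
```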
job scheduling strategies for parallel processing | 2007
Francesc Guim; Julita Corbalan
The number of distributed high-performance computing architectures has grown rapidly in recent years, and systems composed of computational resources provided by several research centers and universities have become very popular. Job scheduling policies have been adapted to these new scenarios, in which several independent resources must be managed, and new policies have been designed to address issues such as multi-cluster environments, heterogeneous systems, and the geographical distribution of resources. Several centralized scheduling solutions have been proposed in the literature for these environments, such as centralized schedulers, centralized queues, and global controllers. These approaches use a single scheduling entity responsible for scheduling all jobs submitted by users. In this paper we propose the use of self-scheduling techniques for dispatching jobs submitted to a set of distributed computational hosts managed by independent schedulers (such as MOAB or LoadLeveler). It is a non-centralized, job-guided scheduling policy whose main goal is to minimize job wait time; scheduling decisions are made independently for each job rather than by a global policy that considers all jobs. On top of this, as part of the proposed solution, we demonstrate how job wait time prediction techniques can substantially improve the performance obtained in the described architecture.
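A compact sketch of the job-guided self-scheduling policy; predict_wait and submit are hypothetical stand-ins for the query and submission interfaces of independent schedulers such as MOAB or LoadLeveler.

```python
def self_schedule(job, clusters):
    """Each job decides for itself: ask every independently managed
    cluster for a predicted wait time and dispatch to the smallest one.
    No central queue or global controller is involved."""
    best = min(clusters, key=lambda c: c.predict_wait(job))
    best.submit(job)
    return best
```

This is where the wait-time prediction techniques matter: the quality of predict_wait directly determines how close the per-job decision gets to what a global scheduler would have chosen.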
Archive | 2007
Krzysztof Kurowski; Ariel Oleksiak; Jarek Nabrzyski; Agnieszka Kwiecien; Marcin Wojtkiewicz; Maciej Dyczkowski; Francesc Guim; Julita Corbalan; Jesús Labarta
To date, many existing Grid resource brokers select the best resources for computational jobs using basic resource parameters such as load. This approach is often insufficient: estimates of job start and execution times are needed in order to make more adequate decisions and to provide better quality of service to end users. Nevertheless, due to the heterogeneity of Grids and the often incomplete information available, the results of performance prediction methods may be very inaccurate; therefore, estimates of prediction errors should also be taken into consideration during the resource selection phase. In this paper we present a multi-criteria resource selection method based on estimates of job start and execution times together with prediction errors. To this end, we use the GRMS [28] and GPRES tools. Tests were conducted on workload traces recorded from a parallel machine at UPC; these traces cover three years of job information as recorded by the LoadLeveler batch management system. We show that the presented method can considerably improve the efficiency of resource selection decisions.
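A hedged sketch of the multi-criteria idea: rank resources not only by predicted completion time but also by the estimated prediction error, so uncertain estimates are penalized. The score form, the weights, and the method names are illustrative assumptions, not the GRMS/GPRES formulation.

```python
def score(resource, job, w_time=1.0, w_err=0.5):
    """Lower is better: predicted completion plus an uncertainty penalty."""
    start = resource.predict_start(job)    # predicted job start time
    run = resource.predict_runtime(job)    # predicted execution time
    err = resource.prediction_error(job)   # e.g. historical mean absolute error
    return w_time * (start + run) + w_err * err

def select(resources, job):
    """Multi-criteria selection over candidate resources."""
    return min(resources, key=lambda r: score(r, job))
```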
modeling, analysis, and simulation on computer and telecommunication systems | 2007
Francesc Guim; Julita Corbalan; Jesús Labarta
Job scheduling policies for HPC centers have been extensively studied in recent years, especially backfilling-based policies. Almost all of these studies have been carried out using simulation tools, which evaluate the performance of scheduling policies using workloads and a resource definition as input. To the best of our knowledge, all existing simulators use the runtime (either requested or real) provided in the workload as the basis of their simulations. However, the runtime of a job, even when executed on a fixed number of processors, depends on runtime issues such as the specific resource selection policy used to allocate the jobs or the jobs' resource requirements. This paper is the first part of a larger research project that analyzes the impact on system performance of considering the resource sharing of running jobs. To this end we have included in our job scheduler simulator (the Alvio simulator) a performance model that estimates the penalty introduced into the application runtime when memory bandwidth is shared. Experiments have been conducted with two resource selection policies, and we present the impact both on global performance metrics, such as average slowdown, and per job, such as the percentage of penalized runtime.
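An illustrative sketch of a shared-memory-bandwidth penalty model in the spirit of the one added to Alvio: when the bandwidth demanded by jobs co-scheduled on a node exceeds its capacity, every job's runtime is stretched. The linear stretch factor is an assumption made for clarity, not the paper's exact model.

```python
def penalized_runtime(base_runtime, job_bw, co_scheduled_bw, node_bw_capacity):
    """Stretch a job's runtime when the node's memory bus is saturated."""
    demand = job_bw + co_scheduled_bw
    if demand <= node_bw_capacity:
        return base_runtime               # no saturation, no penalty
    factor = demand / node_bw_capacity    # > 1 once the bus is oversubscribed
    return base_runtime * factor
```

A simulator using a model like this no longer treats the workload-supplied runtime as fixed: the same job finishes later or earlier depending on where the resource selection policy places it.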
Archive | 2008
Ivan Rodero; Francesc Guim; Julita Corbalan; Ariel Goyeneche
In large Grids, such as the National Grid Service (NGS), and other large distributed architectures, several scheduling entities are involved. Although a global scheduling approach would achieve higher performance and could increase the utilization of the global system, in these scenarios independent schedulers usually make their own scheduling decisions. In this paper we show how coordinated scheduling among the different centers, using data-mining prediction techniques, can substantially improve the performance of the global distributed infrastructure and can provide users with uniform access to all the heterogeneous Grid resources. We present the Grid Backfilling meta-scheduling policy, which optimizes the global utilization of system resources and substantially improves job response times. We also show how data-mining techniques applied to historical information can provide very suitable inputs for the Grid Backfilling meta-scheduling decisions.
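A sketch of the Grid Backfilling mechanic under stated assumptions: can_start, earliest_start, now, and predict_runtime are hypothetical stand-ins (predict_runtime playing the role of the data-mining model trained on historical traces), and the paper's policy operates on a richer coordinated view of the centers.

```python
def grid_backfill(queue, centers, predict_runtime):
    """Start head jobs where possible; otherwise reserve a start for the
    head job and backfill later jobs that are predicted to finish first."""
    started = []
    while queue and any(c.can_start(queue[0]) for c in centers):
        job = queue.pop(0)
        next(c for c in centers if c.can_start(job)).start(job)
        started.append(job)
    if not queue:
        return started
    # Head job cannot start anywhere: give it the earliest global reservation.
    reservation = min(c.earliest_start(queue[0]) for c in centers)
    for job in list(queue[1:]):
        for c in centers:
            # Backfill only jobs predicted to finish before the reservation.
            if c.can_start(job) and c.now() + predict_runtime(job) <= reservation:
                queue.remove(job)
                c.start(job)
                started.append(job)
                break
    return started
```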
Archive | 2008
Attila Kertesz; Ivan Rodero; Francesc Guim
Since the management and optimal utilization of highly dynamic Grid resources cannot be handled by the users themselves, various Grid resource brokers have been developed, supporting different Grids. To ease interoperability and the higher-level utilization of different resource brokers, we introduce a metadata model for storing broker capabilities and show how an implementation of this model can be realized. We believe this abstraction will help standardize inter-broker communication and enable more efficient Grid resource utilization.
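A minimal sketch of what a broker-capability metadata record and registry could look like; the field names are illustrative assumptions, not the paper's schema.

```python
from dataclasses import dataclass

@dataclass
class BrokerCapabilities:
    name: str
    supported_middleware: list       # e.g. ["Globus", "gLite"]
    job_types: list                  # e.g. ["sequential", "MPI"]
    supports_advance_reservation: bool
    supports_coallocation: bool

registry = {}

def register(caps):
    """Publish a broker's capabilities so other brokers can discover it."""
    registry[caps.name] = caps

def brokers_supporting(job_type):
    """Find brokers able to handle a given job type, e.g. 'MPI'."""
    return [c for c in registry.values() if job_type in c.job_types]
```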
high performance distributed computing | 2010
Ivan Rodero; Juan Jaramillo; Andres Quiroz; Manish Parashar; Francesc Guim
As energy efficiency and associated costs become key concerns, consolidated and virtualized data centers and clouds are attractive computing platforms for data- and compute-intensive applications. Recently, these platforms have also been considered for more traditional high-performance computing (HPC) applications. However, maximizing energy efficiency, cost-effectiveness, and utilization for these applications while ensuring performance and other Quality of Service (QoS) guarantees requires balancing important and extremely challenging tradeoffs. These include, for example, the tradeoff between the need to efficiently create and provision Virtual Machines (VMs) on data center resources and the need to accommodate the heterogeneous resource demands and runtimes of the applications that run on them. In this paper we propose an energy-aware online provisioning approach for HPC applications on consolidated and virtualized computing platforms. Energy efficiency is achieved using a workload-aware, just-right dynamic provisioning mechanism and the ability to power down subsystems of a host system that are not required by the VMs mapped to it. Our preliminary evaluations show that our approach can improve energy efficiency with an acceptable QoS penalty.
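Complementing the provisioning sketch given earlier for the related IGCC 2010 paper, this sketch estimates the power saved by switching off host subsystems that no mapped VM uses. The subsystem names and per-subsystem wattages are made-up placeholders.

```python
# Hypothetical idle power draw per switchable subsystem, in watts.
SUBSYSTEM_WATTS = {"disk": 10.0, "net": 5.0, "gpu": 150.0}

def power_saved(host_subsystems_on, subsystems_used_by_vms):
    """Watts recoverable by powering down subsystems unused by the VMs."""
    unused = host_subsystems_on - subsystems_used_by_vms
    return sum(SUBSYSTEM_WATTS.get(s, 0.0) for s in unused)
```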
cluster computing and the grid | 2006
Ivan Rodero; Francesc Guim; Julita Corbalan; Jesús Labarta
The description of jobs is a very important issue for the scheduling and management of Grid jobs. Since there are many different languages for describing Grid jobs, the GGF has presented the Job Submission Description Language (JSDL) to standardize the job submission language. We believe that JSDL is a good solution, but it has some deficiencies regarding parallelism. In this paper we propose an extension of JSDL to specify the parallelism details of Grid jobs. This extension is formulated in general terms to support current multilevel parallel applications and upcoming approaches in parallel programming models. We also discuss the suitability of multilevel parallel programming models for Grids, in particular MPI+OpenMP, since our project, the eNANOS project, is based on this hybrid programming model.
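A hedged illustration of the kind of multilevel-parallelism information such a JSDL extension would carry. The element names (ParallelEnvironment, Processes, ThreadsPerProcess) are invented here; the actual extension schema is the one defined in the paper.

```python
import xml.etree.ElementTree as ET

# Build a toy job description carrying hybrid MPI+OpenMP parallelism details.
job = ET.Element("JobDefinition")
app = ET.SubElement(job, "Application")
par = ET.SubElement(app, "ParallelEnvironment")      # hypothetical extension
ET.SubElement(par, "Model").text = "MPI+OpenMP"      # hybrid programming model
ET.SubElement(par, "Processes").text = "8"           # MPI processes
ET.SubElement(par, "ThreadsPerProcess").text = "4"   # OpenMP threads per process

print(ET.tostring(job, encoding="unicode"))
```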
job scheduling strategies for parallel processing | 2009
Francesc Guim; Ivan Rodero; Julita Corbalan
Job scheduling policies for HPC centers have been extensively studied in the last few years, especially backfilling-based policies. Almost all of these studies have been carried out using simulation tools, and all existing simulators use the runtime (either estimated or real) provided in the workload as the basis of their simulations. In our previous work we analyzed the impact on system performance of considering the resource sharing (memory bandwidth) of running jobs by including a new resource model in the Alvio simulator. Based on these studies we proposed the LessConsume and LessConsume Threshold resource selection policies, both oriented to reducing the saturation of shared resources and thereby increasing system performance. The results showed that both resource allocation policies can improve system performance by considering where jobs are finally allocated. Building on the LessConsume Threshold resource selection policy, we propose a new backfilling strategy: the Resource Usage Aware Backfilling job scheduling policy. This is a backfilling-based scheduling policy in which the algorithms that decide which job to execute and how jobs are backfilled are driven by different threshold configurations, and which considers how shared resources are used by the scheduled jobs. Rather than backfilling the first job that can be moved to the run queue based on arrival time or job size, it looks ahead at the next queued jobs and tries to allocate those that would experience a lower penalized runtime caused by resource-sharing saturation. In the paper we demonstrate how the exchange of scheduling information between the local resource manager and the scheduler can substantially improve system performance when resource sharing is considered. We show that it achieves response-time performance close to that of Shortest-Job-First backfilling with First Fit (oriented to improving the start time of allocated jobs), while providing a qualitative improvement in the number of killed jobs and in the percentage of penalized runtime.
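A sketch of the look-ahead step described above: instead of backfilling the first queued job that fits, each candidate is scored by the runtime penalty a LessConsume-style shared-resource model predicts for the slots where it would land, and the least-penalized one is chosen. predict_penalty and fits are hypothetical stand-ins for the simulator's model and placement check.

```python
def pick_backfill_candidate(queue, window, predict_penalty, fits):
    """queue: waiting jobs behind the head job; window: idle slots before
    the head job's reservation. Returns the fitting job with the smallest
    predicted resource-sharing penalty, or None if nothing fits."""
    candidates = [j for j in queue if fits(j, window)]
    if not candidates:
        return None
    return min(candidates, key=lambda j: predict_penalty(j, window))
```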