Eduardo Huedo
Complutense University of Madrid
Publication
Featured research published by Eduardo Huedo.
Software - Practice and Experience | 2004
Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Grids offer a dramatic increase in the number of available processing and storage resources that can be delivered to applications. However, efficient job submission and management remain far from accessible to ordinary scientists and engineers because of the dynamic and complex nature of Grids. This paper describes a new Globus-based framework that allows easier and more efficient execution of jobs in a ‘submit and forget’ fashion. The framework automatically performs the steps involved in job submission and also watches over the job's efficient execution. In order to obtain a reasonable degree of performance, job execution is adapted to dynamic resource conditions and application demands. Adaptation is achieved by supporting automatic application migration following performance degradation, ‘better’ resource discovery, requirement change, owner decision or remote resource failure. The framework is currently functional on any Grid testbed based on Globus because it does not require new system software to be installed in the resources. The paper also includes practical experiences of the behavior of our framework on the TRGP and UCM‐CAB testbeds.
Future Generation Computer Systems | 2007
Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
The latest version of the Globus Toolkit includes both pre-WS and WS GRAM services to submit, monitor, and control jobs on remote Grid resources. In the medium term, and until a full transition is accomplished, both pre-WS and WS GRAM services will coexist in Grid infrastructures. In this paper, we describe the modular architecture of the GridWay meta-scheduler, which allows the simultaneous and coordinated use of pre-WS and WS GRAM services and therefore eases the transition to a Web Service implementation of the Globus components. Such functionality is demonstrated on an infrastructure that comprises resources from a research testbed, based on the Globus Toolkit 4.0, and the EGEE production infrastructure, based on the LCG middleware. The Web Service implementation of Globus components has been optimized for flexibility, stability and scalability. However, part of the Grid community is still reluctant to transition to the Web Service model, mainly because of its supposed lower performance. We demonstrate that WS GRAM achieves a performance comparable to that of pre-WS GRAM.
Future Generation Computer Systems | 2009
Katia Leal; Eduardo Huedo; Ignacio Martín Llorente
In this paper we present a decentralized model for scheduling independent tasks in Federated Grids. This model consists of a set of meta-schedulers, one on each of the grid infrastructures of the Federated Grid. Each meta-scheduler has to implement a mapping strategy in order to improve two of the most common objective functions of task scheduling problems: makespan and resource performance. We consider four possible algorithms that have to provide a simple, decoupled, and coarse-grained solution that could be deployed in any Grid. The key feature of the algorithms is that they consider the performance of the infrastructures forming the Federated Grid, not only their state.
Parallel, Distributed and Network-Based Processing | 2004
Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Grids offer a dramatic increase in the number of available compute and storage resources that can be delivered to applications. This new computational infrastructure provides a promising platform to execute loosely coupled, high-throughput parameter sweep applications. This kind of application arises naturally in many scientific and engineering fields like bioinformatics, computational fluid dynamics (CFD), particle physics, etc. The efficient execution and scheduling of parameter sweep applications is challenging because of the dynamic and heterogeneous nature of grids. We present a scheduling algorithm built on top of the GridWay framework that combines: (i) adaptive scheduling to reflect the dynamic grid characteristics; (ii) adaptive execution to migrate running jobs to better resources and provide fault tolerance; (iii) re-use of common files between tasks to reduce the file transfer overhead. The efficiency of the approach is demonstrated in the execution of a CFD application on a highly heterogeneous research testbed.
Grid and Pervasive Computing | 2009
Constantino Vázquez Blanco; Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Grid computing involves the ability to harness together the power of computing resources. In this paper we push this philosophy forward and show technologies that enable the federation of grid infrastructures regardless of their interface. The aim is to provide the ability to build arbitrarily complex grid infrastructures able to sustain the demand required by any given service. Along the same lines, this paper also addresses mechanisms that can potentially be used to meet a given quality of service or satisfy the peak demands this service may have. These mechanisms imply the elastic growth of the grid infrastructure making use of cloud providers, regardless of whether they are commercial, like Amazon EC2 and GoGrid, or scientific, like Globus Nimbus. Both technologies, federation and dynamic provisioning, are demonstrated in two experiments. The first is designed to show the feasibility of the federation solution by harnessing resources of the TeraGrid, EGEE and Open Science Grid infrastructures through a single point of entry. The second experiment aims to show the overheads incurred in the process of offloading jobs to resources created in the cloud.
Parallel Computing | 2006
Rubén S. Montero; Eduardo Huedo; Ignacio Martín Llorente
Grids constitute a promising platform to execute loosely coupled, high-throughput parameter sweep applications, which arise naturally in many scientific and engineering fields like bioinformatics, computational fluid dynamics, particle physics, etc. In spite of the simple computational structure of these applications, their efficient execution and scheduling are challenging because of the dynamic and heterogeneous nature of Grids. In this work, we propose a benchmarking methodology to analyze the performance of computational Grids in the execution of high-throughput computing applications that combines: (i) a representative benchmark included in the NAS Grid Benchmark suite; (ii) a performance model that provides a way to parametrize and compare different Grids; and (iii) a set of application-level performance metrics to analyze and predict the performance of this kind of application. The benchmarking methodology will be applied to the performance analysis of a Globus-based research testbed that spans heterogeneous resources in five institutions.
Journal of Systems Architecture | 2006
Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Reliability, in terms of Grid component fault tolerance and minimum quality of service, is an important aspect that has to be addressed to foster Grid technology adoption. Software reliability is critically important in today's integrated and distributed systems, as it is often the weak link in system performance. In general, reliability is difficult to measure, especially in Grid environments, where evaluation methodologies are novel and controversial. This paper describes a straightforward procedure to analyze the reliability of computational grids from the viewpoint of an end user. The procedure is illustrated in the evaluation of a research Grid infrastructure based on Globus basic services and the GridWay meta-scheduler. The GridWay support for fault tolerance is also demonstrated in a production-level environment. Results show that GridWay is a reliable workload management tool for dynamic and faulty Grid environments. Transparently to the end user, GridWay is able to detect and recover from any of the Grid element failure, outage and saturation conditions specified by the reliability analysis procedure.
European Conference on Parallel Processing | 2003
Rubén S. Montero; Eduardo Huedo; Ignacio Martín Llorente
The ability to migrate running applications among different grid resources is generally accepted as the solution to adapt to dynamic resource load, availability and cost. In this paper we focus on opportunistic migration when a new resource becomes available in the Grid. In this situation the performance of the new host, the remaining execution time of the application, and the proximity of the new resource to the needed data become critical factors in deciding whether job migration is feasible and worthwhile. We discuss the extension of the GridWay framework to consider all of these factors in the resource selection and migration stages in order to improve response times of individual applications. The benefits of the new resource selector will be demonstrated for the execution of a computational fluid dynamics (CFD) code.
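The trade-off named in this abstract (host performance, remaining execution time, data proximity) can be illustrated with a simple cost comparison. The sketch below is only an illustration under stated assumptions, not GridWay's actual resource selector: the function name, its parameters, and the linear speed model are all hypothetical.

```python
def migration_worthwhile(remaining_work, current_speed, new_speed,
                         data_size, bandwidth_to_new, restart_overhead=0.0):
    """Estimate whether migrating a running job would finish it sooner.

    Hypothetical model, not taken from the paper:
      remaining_work:   work left, in arbitrary units
      current_speed:    units of work per second on the current host
      new_speed:        units of work per second on the candidate host
      data_size:        bytes that must be moved to the candidate host
      bandwidth_to_new: bytes per second towards the candidate host
      restart_overhead: fixed cost in seconds of stopping and restarting
    """
    if min(current_speed, new_speed, bandwidth_to_new) <= 0:
        raise ValueError("speeds and bandwidth must be positive")

    # Time to finish if the job stays where it is.
    time_if_stay = remaining_work / current_speed

    # Time to finish if the job moves: data transfer, restart, then compute.
    time_if_move = (data_size / bandwidth_to_new
                    + restart_overhead
                    + remaining_work / new_speed)

    return time_if_move < time_if_stay
```

Under this model a job that is nearly finished, or whose data is far from the new host, stays put even when the new host is faster, because the transfer cost outweighs the speed-up.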
Future Generation Computer Systems | 2011
Constantino Vázquez; Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Cloud computing is being built on top of established grid technology concepts. On the other hand, it is also true that cloud computing has much to offer to grid infrastructures. The aim of this paper is to provide the ability to build arbitrarily complex grid infrastructures able to sustain the demand required by any given service, taking advantage of the pay-per-use model and the seemingly unlimited capacity of the cloud computing paradigm. It addresses mechanisms that can potentially be used to meet a given quality of service or satisfy the peak demands this service may have. These mechanisms imply the elastic growth of the grid infrastructure making use of cloud providers, regardless of whether they are commercial, like Amazon EC2 and GoGrid, or scientific, like Globus Nimbus. This technology of dynamic provisioning is demonstrated in an experiment designed to show the overheads incurred in the process of offloading jobs to resources created in the cloud.
Future Generation Computer Systems | 2009
Eduardo Huedo; Rubén S. Montero; Ignacio Martín Llorente
Grid resource management has traditionally been limited to just two levels of hierarchy, namely local resource managers and metaschedulers. This results in a non-manageable, and thus not scalable, architecture, where each metascheduler has to be able to access thousands of resources, which also implies having detailed knowledge of their interfaces and configuration. This paper presents a recursive architecture allowing an arbitrary number of levels in the hierarchy. This way, resources can be arranged in different ways (for example, following organizational boundaries or aggregating them by similarity) while hiding the access details. An implementation of this architecture is shown, as well as its benefits in terms of autonomy, scalability, deployment and security. The proposed implementation is based on existing interfaces, allowing for standardization.
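The idea of a recursive hierarchy can be pictured as a composite structure in which a metascheduler exposes the same interface as the resources it manages, so it can itself be managed by a higher-level metascheduler. The following is a minimal sketch of that idea only; the class names, the slot-counting interface, and the "most free slots" policy are hypothetical and are not taken from the paper or from GridWay.

```python
class Resource:
    """Leaf node: a local resource manager with a fixed number of slots."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.jobs = []

    def free_slots(self):
        return self.slots - len(self.jobs)

    def submit(self, job):
        if self.free_slots() <= 0:
            raise RuntimeError(f"{self.name} is full")
        self.jobs.append(job)
        return self.name


class MetaScheduler:
    """Inner node: offers the same interface as a Resource, so levels nest."""

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)

    def free_slots(self):
        # Only an aggregate is exposed upwards; child details stay hidden.
        return sum(child.free_slots() for child in self.children)

    def submit(self, job):
        # Illustrative policy: forward to the child with the most free slots.
        best = max(self.children, key=lambda c: c.free_slots(), default=None)
        if best is None or best.free_slots() <= 0:
            raise RuntimeError(f"{self.name} has no free slots")
        return f"{self.name}/{best.submit(job)}"


# Usage: two sites grouped under one top-level metascheduler.
site_a = MetaScheduler("siteA", [Resource("c1", 2), Resource("c2", 1)])
site_b = MetaScheduler("siteB", [Resource("c3", 4)])
top = MetaScheduler("top", [site_a, site_b])
placement = top.submit("job1")
```

The top level never sees individual clusters; it only queries and submits to its direct children, which is what lets the hierarchy grow to an arbitrary number of levels.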