Frédéric Vivien
École normale supérieure de Lyon
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Frédéric Vivien.
Journal of Parallel and Distributed Computing | 2010
Mark Stillwell; David Schanzenbach; Frédéric Vivien; Henri Casanova
Commodity clusters are used routinely for deploying service hosting platforms. Due to hardware and operation costs, clusters need to be shared among multiple services. Crucial for enabling such shared hosting platforms is virtual machine (VM) technology, which allows consolidation of hardware resources. A key challenge, however, is to make appropriate decisions when allocating hardware resources to service instances. In this work we propose a formulation of the resource allocation problem in shared hosting platforms for static workloads with servers that provide multiple types of resources. Our formulation supports a mix of best-effort and QoS scenarios, and, via a precisely defined objective function, promotes performance, fairness, and cluster utilization. Further, this formulation makes it possible to compute a bound on the optimal resource allocation. We propose several classes of resource allocation algorithms, which we evaluate in simulation. We are able to identify an algorithm that achieves average performance close to the optimal across many experimental scenarios. Furthermore, this algorithm runs in only a few seconds for large platforms and thus is usable in practice.
cluster computing and the grid | 2009
Mark Stillwell; David Schanzenbach; Frédéric Vivien; Henri Casanova
We propose a novel approach for sharing cluster resources among competing jobs. The key advantage of our approach over current solutions is that it increases cluster utilization while optimizing a user-centric metric that captures both notions of performance and fairness. We motivate and formalize the corresponding resource allocation problem, determine its complexity, and propose several algorithms to solve it in the case of a static workload that consists of sequential jobs. Via extensive simulation experiments we identify an algorithm that runs quickly, that is always on par with or better than its competitors, and that produces resource allocations that are close to optimal. We find that the extension of our approach to parallel jobs leads to similarly good results. Finally, we explain how to extend our work to dynamicworkloads.
parallel computing | 1998
Pierre Boulet; Alain Darte; Georges-André Silber; Frédéric Vivien
Abstract In this paper, we survey loop parallelization algorithms, analyzing the dependence representations they use, the loop transformations they generate, the code generation schemes they require, and their ability to incorporate various optimizing criteria such as maximal parallelism detection, permutable loop detection, minimization of synchronizations, easiness of code generation, etc. We complete the discussion by presenting new results related to code generation and loop fusion for a particular class of multidimensional schedules called shifted linear schedules. We demonstrate that algorithms based on such schedules lead to simple codes.
ieee international conference on high performance computing data and analytics | 2011
Marin Bougeret; Henri Casanova; Mikaël Rabie; Yves Robert; Frédéric Vivien
This work provides an analysis of checkpointing strategies for minimizing expected job execution times in an environment that is subject to processor failures. In the case of both sequential and parallel jobs, we give the optimal solution for exponentially distributed failure inter-arrival times, which, to the best of our knowledge, is the first rigorous proof that periodic checkpointing is optimal. For non-exponentially distributed failures, we develop a dynamic programming algorithm to maximize the amount of work completed before the next failure, which provides a good heuristic for minimizing the expected execution time. Our work considers various models of job parallelism and of parallel checkpointing overhead. We first perform extensive simulation experiments assuming that failures follow Exponential or Weibull distributions, the latter being more representative of real-world systems. The obtained results not only corroborate our theoretical findings, but also show that our dynamic programming algorithm significantly outperforms previously proposed solutions in the case of Weibull failures. We then discuss results from simulation experiments that use failure logs from production clusters. These results confirm that our dynamic programming algorithm significantly outperforms existing solutions for real-world clusters.
Parallel Processing Letters | 1997
Alain Darte; Georges-André Silber; Frédéric Vivien
Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block, in other words, they are not able to take advantage of the structure of dependences between different statements. In this paper, we overcame this limitation by showing how the structure of the reduced dependence graph can be taken into account for detecting more permutable loops. Our method combines graph retiming techniques and graph scheduling techniques. It can be viewed as an extension of Wolf and Lams algorithm to the case of loops with multiple statements. Loan independent dependences play a particular role in our study, and we show how the way we handle them can be useful for fine-grain loop parallelization as well.
parallel, distributed and network-based processing | 2004
Arnaud Giersch; Yves Robert; Frédéric Vivien
This paper is devoted to scheduling a large collection of independent tasks onto heterogeneous clusters. The tasks depend upon (input) files which initially reside on a master processor. A given file may well be shared by several tasks. The role of the master is to distribute the files to the processors, so that they can execute the tasks. The objective for the master is to select which file to send to which slave, and in which order, so as to minimize the total execution time. The contribution of this paper is twofold. On the theoretical side, we establish complexity results that assess the difficulty of the problem. On the practical side, we design several new heuristics, which are shown to perform as efficiently as the best heuristics in [H. Casanova et al. (2000), (1999)] although their cost is an order of magnitude lower.
International Journal of Parallel Programming | 1997
Alain Darte; Frédéric Vivien
This paper presents an optimal algorithm for detecting line or medium grain parallelism in nested loops whose dependences are described by an approximation of distance vectors by polyhedra. In particular, this algorithm is optimal for the classical approximation by direction sectors. This result generalizes, to the case of several statements. Wolf and Lams algorithm which is optimal for a single statement. Our algorithm relies on a dependence uniformization process and on parallelization techniques related to system of uniform recurrence equations. It can also be viewed as a combination of both Allen and Kennedys algorithm and Wolf and Lams algorithm.
IEEE Transactions on Parallel and Distributed Systems | 2004
Arnaud Legrand; Hélène Renard; Yves Robert; Frédéric Vivien
We consider the mapping of iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The aim is to determine how to slice the application data into chunks, and to assign these chunks to the processors, so that the total execution time is minimized. One major difficulty is to embed a processor ring into a network that typically is not fully connected, so that some communication links have to be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link- sharing, and data distribution schemes.
IEEE Transactions on Computers | 2010
Anne Benoit; Loris Marchal; Jean-François Pineau; Yves Robert; Frédéric Vivien
Scheduling problems are already difficult on traditional parallel machines, and they become extremely challenging on heterogeneous clusters. In this paper, we deal with the problem of scheduling multiple applications, made of collections of independent and identical tasks, on a heterogeneous master-worker platform. The applications are submitted online, which means that there is no a priori (static) knowledge of the workload distribution at the beginning of the execution. The objective is to minimize the maximum stretch, i.e., the maximum ratio between the actual time an application has spent in the system and the time this application would have spent if executed alone. On the theoretical side, we design an optimal algorithm for the offline version of the problem (when all release dates and application characteristics are known beforehand). We also introduce a heuristic for the general case of online applications. On the practical side, we have conducted extensive simulations and MPI experiments, showing that we are able to deal with very large problem instances in a few seconds. Also, the solution that we compute totally outperforms classical heuristics from the literature, thereby fully assessing the usefulness of our approach.
ieee international conference on high performance computing data and analytics | 2013
Sheng Di; Yves Robert; Frédéric Vivien; Derrick Kondo; Cho-Li Wang; Franck Cappello
In this paper, we aim at optimizing fault-tolerance techniques based on a checkpointing/restart mechanism, in the context of cloud computing. Our contribution is three-fold. (1) We derive a fresh formula to compute the optimal number of checkpoints for cloud jobs with varied distributions of failure events. Our analysis is not only generic with no assumption on failure probability distribution, but also attractively simple to apply in practice. (2) We design an adaptive algorithm to optimize the impact of checkpointing regarding various costs like checkpointing/restart overhead. (3) We evaluate our optimized solution in a real cluster environment with hundreds of virtual machines and Berkeley Lab Checkpoint/Restart tool. Task failure events are emulated via a production trace produced on a large-scale Google data center. Experiments confirm that our solution is fairly suitable for Google systems. Our optimized formula outperforms Youngs formula by 3-10 percent, reducing wall-clock lengths by 50-100 seconds per job on average.