Renaud Lachaize
French Institute for Research in Computer Science and Automation
Publications
Featured research published by Renaud Lachaize.
Architectural Support for Programming Languages and Operating Systems | 2013
Mohammad Dashti; Alexandra Fedorova; Justin R. Funston; Fabien Gaud; Renaud Lachaize; Baptiste Lepers; Vivien Quéma; Mark Roth
NUMA systems are characterized by Non-Uniform Memory Access times, where accessing data in a remote node takes longer than a local access. NUMA hardware has been built since the late 1980s, and the operating systems designed for it were optimized for access locality: they co-located memory pages with the threads that accessed them, so as to avoid the cost of remote accesses. Unlike those older systems, modern NUMA hardware has much smaller remote wire delays, and so remote access costs per se are not the main concern for performance, as we discovered in this work. Instead, congestion on memory controllers and interconnects, caused by memory traffic from data-intensive applications, hurts performance far more. Because of that, memory placement algorithms must be redesigned to target traffic congestion, which requires an arsenal of techniques that go beyond optimizing locality. In this paper we describe Carrefour, an algorithm that addresses this goal. We implemented Carrefour in Linux and obtained performance improvements of up to 3.6× relative to the default kernel, as well as significant improvements compared to NUMA-aware patchsets available for Linux. Carrefour never degrades performance by more than 4% when memory placement cannot be improved. We present the design of Carrefour, discuss the challenges of implementing it on modern hardware, and draw insights about hardware support that would help optimize system software on future NUMA systems.
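Carrefour's placement decisions are made inside the kernel and are not exposed as a user API, but the general idea of spreading memory traffic can be illustrated with the standard Linux libnuma interface. The sketch below (an illustration only, not part of Carrefour) interleaves an allocation's pages across all nodes so that no single memory controller absorbs all of the traffic.

```c
/* Illustrative only: spread an allocation's pages round-robin across all
 * NUMA nodes with libnuma, so that memory traffic is not concentrated on
 * a single memory controller. Carrefour makes such decisions inside the
 * Linux kernel; this is just a user-space analogue of one technique.
 * Build with: gcc -o interleave interleave.c -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return EXIT_FAILURE;
    }

    size_t len = 64UL * 1024 * 1024;        /* 64 MiB working set */

    /* Pages are allocated round-robin across all allowed nodes,
     * balancing traffic over the memory controllers. */
    void *buf = numa_alloc_interleaved(len);
    if (buf == NULL) {
        perror("numa_alloc_interleaved");
        return EXIT_FAILURE;
    }

    memset(buf, 0, len);                    /* touch the pages */
    printf("interleaved %zu bytes across %d nodes\n",
           len, numa_max_node() + 1);

    numa_free(buf, len);
    return EXIT_SUCCESS;
}
```

Interleaving trades locality for balanced controller load, which is exactly the kind of trade-off a congestion-aware placement algorithm has to weigh per application.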
Cooperative Information Systems | 2004
Takoua Abdellatif; Emmanuel Cecchet; Renaud Lachaize
Clusters have become the de facto platform to scale J2EE application servers. Each tier of the server uses group communication to maintain consistency between replicated nodes. JGroups is the most commonly used Java middleware for group communication in J2EE open-source implementations. However, the scalability of this middleware and its impact on application server scalability have not yet been evaluated.
International Conference on Cluster Computing | 2002
Jørgen Sværke Hansen; Renaud Lachaize
In many clusters today, the local disks of a node are used only sporadically. This paper describes software support for sharing disks in clusters, where the disks are distributed across the nodes and can thus be combined into a high-performance storage system. Compared to centralized storage servers, such an architecture allows the total I/O capacity of the cluster to scale with the number of nodes and disks. Additionally, our software allows the functionality of remote disk access to be customized using a library of code modules. A prototype has been implemented on a cluster connected by a Scalable Coherent Interface (SCI) network, and performance measurements using both raw device access and a distributed file system show that performance is comparable to dedicated storage systems and that the overhead of the framework remains moderate even under high load. Thus, clusters that share disks distributed among their nodes allow both the application processing power and the total I/O capacity of the cluster to scale with the number of nodes.
Concurrency and Computation: Practice and Experience | 2004
Vivien Quéma; Renaud Lachaize; Emmanuel Cecchet
Resource management in a Grid computing environment raises several technical issues. The monitoring infrastructure must be scalable, flexible, configurable and adaptable to support thousands of devices in a highly dynamic environment where operational conditions are constantly changing.
ACM Queue | 2015
Fabien Gaud; Baptiste Lepers; Justin R. Funston; Mohammad Dashti; Alexandra Fedorova; Vivien Quéma; Renaud Lachaize; Mark Roth
Modern server-class systems are typically built as several multicore chips put together in a single system. Each chip has a local DRAM (dynamic random-access memory) module; together they are referred to as a node. Nodes are connected via a high-speed interconnect, and the system is fully coherent. This means that, transparently to the programmer, a core can issue requests to its node's local memory as well as to the memories of other nodes. The key distinction is that remote requests will take longer, because they are subject to longer wire delays and may have to jump several hops as they traverse the interconnect. Memory-access latency is hence non-uniform, because it depends on where a request originates and where it is destined. Such systems are referred to as NUMA (non-uniform memory access).
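The local-versus-remote distinction described above can be observed directly with libnuma. The following sketch assumes a machine with at least two NUMA nodes (node numbers 0 and 1 are assumptions) and simply times a pass over memory allocated on the local node versus a remote one; the size of the gap depends entirely on the platform.

```c
/* Illustrative sketch: compare access time to memory on the local node
 * versus a remote node. Build with: gcc -O2 -o numa_lat numa_lat.c -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static double touch_ms(volatile char *buf, size_t len)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < len; i += 64)          /* one access per cache line */
        buf[i]++;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

int main(void)
{
    if (numa_available() < 0 || numa_max_node() < 1) {
        fprintf(stderr, "need a NUMA machine with at least two nodes\n");
        return EXIT_FAILURE;
    }

    size_t len = 256UL * 1024 * 1024;
    numa_run_on_node(0);                          /* keep this thread on node 0 */

    char *local  = numa_alloc_onnode(len, 0);     /* memory on the local node   */
    char *remote = numa_alloc_onnode(len, 1);     /* memory on a remote node    */
    if (!local || !remote) {
        perror("numa_alloc_onnode");
        return EXIT_FAILURE;
    }
    memset(local, 1, len);                        /* fault the pages in first   */
    memset(remote, 1, len);

    printf("local pass : %.1f ms\n", touch_ms(local, len));
    printf("remote pass: %.1f ms\n", touch_ms(remote, len));

    numa_free(local, len);
    numa_free(remote, len);
    return EXIT_SUCCESS;
}
```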
International Conference on Distributed Computing Systems | 2010
Fabien Gaud; Sylvain Genevès; Renaud Lachaize; Baptiste Lepers; Fabien Mottet; Gilles Muller; Vivien Quéma
Many high-performance communicating systems are designed using the event-driven paradigm. As multicore platforms are now pervasive, it becomes crucial for such systems to take advantage of the available hardware parallelism. Event coloring is a promising approach in this regard. First, it allows programmers to simply and progressively inject support for the safe, parallel execution of multiple event handlers through the use of annotations. Second, it relies on a work-stealing algorithm to dynamically balance the execution of event handlers on the available cores. This paper studies the impact of the work-stealing algorithm on overall system performance. We first show that the only existing work-stealing algorithm designed for event-coloring runtimes is not always efficient: for instance, it causes a 33% performance degradation on a Web server. We then introduce several enhancements to improve the work-stealing behavior. An evaluation using both microbenchmarks and real applications, a Web server and the Secure File Server (SFS), shows that our system consistently outperforms a state-of-the-art runtime (Libasync-smp), with or without work stealing. In particular, in the Web server case, our new work-stealing algorithm improves performance by up to 25% compared to Libasync-smp without work stealing and by up to 73% compared to the Libasync-smp work-stealing algorithm.
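For readers unfamiliar with event coloring, the toy sketch below shows the core serialization rule: events that carry the same color are routed to the same per-core queue and therefore never run concurrently, while differently colored events may run in parallel on other cores. The types and names (event_t, queue_t, enqueue) are placeholders, not the Libasync-smp API, and the work-stealing policy that the paper actually studies is deliberately omitted.

```c
/* Toy sketch of event coloring (NOT the Libasync-smp API): a color is a
 * programmer annotation; same-colored handlers are serialized by mapping
 * the color to a single per-core queue, different colors run in parallel.
 * Build with: gcc -o coloring coloring.c -lpthread
 */
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 4
#define QCAP     128

typedef struct {
    unsigned color;                      /* handlers sharing a color are serialized */
    void (*handler)(unsigned color, int arg);
    int arg;
} event_t;

typedef struct {
    event_t ring[QCAP];
    int head, tail;
    pthread_mutex_t lock;
} queue_t;

static queue_t queues[NWORKERS];

static void enqueue(event_t ev)
{
    /* Static mapping: a given color always lands on the same worker's queue. */
    queue_t *q = &queues[ev.color % NWORKERS];
    pthread_mutex_lock(&q->lock);
    q->ring[q->tail++ % QCAP] = ev;
    pthread_mutex_unlock(&q->lock);
}

static void *worker(void *argp)
{
    queue_t *q = &queues[(long)argp];
    for (;;) {
        event_t ev;
        int have = 0;
        pthread_mutex_lock(&q->lock);
        if (q->head != q->tail) { ev = q->ring[q->head++ % QCAP]; have = 1; }
        pthread_mutex_unlock(&q->lock);
        if (have)
            ev.handler(ev.color, ev.arg); /* same-color events run one at a time */
        else
            return NULL;                  /* toy version: exit when queue drained */
    }
}

static void print_handler(unsigned color, int arg)
{
    printf("color %u handled request %d\n", color, arg);
}

int main(void)
{
    pthread_t tid[NWORKERS];
    for (long i = 0; i < NWORKERS; i++)
        pthread_mutex_init(&queues[i].lock, NULL);

    for (int r = 0; r < 16; r++)          /* e.g. color = connection identifier */
        enqueue((event_t){ .color = r % 8, .handler = print_handler, .arg = r });

    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (long i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}
```

The static color-to-queue mapping is what makes load imbalance possible in the first place; the paper's contribution is a stealing policy that migrates colors between cores without breaking the serialization guarantee.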
International Conference on Parallel and Distributed Systems | 2008
Michail D. Flouris; Renaud Lachaize; Angelos Bilas
High-performance storage systems are evolving towards decentralized commodity clusters that can scale in capacity, processing power, and network throughput. Building such systems requires: (a) Sharing physical resources among applications; (b) Sharing data among applications; (c) Allowing customized views of data for applications. Current solutions typically satisfy the first two requirements through a distributed file-system, resulting in monolithic, hard-to-manage storage systems. In this paper, we present Orchestra, a novel storage system that addresses all three requirements below the file-system by extending the block layer. To provide customized views, Orchestra allows applications to create semantically rich virtual block devices by combining simpler ones. To achieve efficient resource and data sharing, it supports block-level allocation and byte-range locking as in-band mechanisms. We implement Orchestra under Linux and use it to build a shared cluster file-system. We evaluate it on a 16-node cluster, finding that the flexibility offered by Orchestra introduces little overhead beyond mandatory communication and disk access costs.
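The notion of composing richer virtual block devices from simpler ones can be sketched with a uniform per-layer interface, as below. All names (blkdev_t, ramdisk_read, stripe_read) are hypothetical and the example lives in user space; Orchestra itself extends the block layer inside the kernel.

```c
/* Hypothetical sketch (names are ours, not Orchestra's): virtual block
 * devices expose one uniform interface and can be stacked, so a richer
 * device (here, a striping layer) is built by combining simpler ones.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLK_SIZE 512

typedef struct blkdev {
    int (*read_block)(struct blkdev *dev, unsigned long lba, void *buf);
    void *priv;                               /* layer-specific state */
} blkdev_t;

/* --- Bottom layer: a RAM-backed "disk" --------------------------------- */
typedef struct { unsigned char *data; unsigned long nblocks; } ramdisk_t;

static int ramdisk_read(blkdev_t *dev, unsigned long lba, void *buf)
{
    ramdisk_t *rd = dev->priv;
    if (lba >= rd->nblocks) return -1;
    memcpy(buf, rd->data + lba * BLK_SIZE, BLK_SIZE);
    return 0;
}

/* --- Stacked layer: stripe blocks round-robin over two children -------- */
typedef struct { blkdev_t *child[2]; } stripe_t;

static int stripe_read(blkdev_t *dev, unsigned long lba, void *buf)
{
    stripe_t *st = dev->priv;
    blkdev_t *c = st->child[lba % 2];
    return c->read_block(c, lba / 2, buf);
}

int main(void)
{
    /* Two simple devices... */
    ramdisk_t rd0 = { calloc(1024, BLK_SIZE), 1024 };
    ramdisk_t rd1 = { calloc(1024, BLK_SIZE), 1024 };
    if (!rd0.data || !rd1.data) return 1;
    blkdev_t d0 = { ramdisk_read, &rd0 }, d1 = { ramdisk_read, &rd1 };

    /* ...combined into one richer virtual device. */
    stripe_t st = { { &d0, &d1 } };
    blkdev_t striped = { stripe_read, &st };

    unsigned char buf[BLK_SIZE];
    if (striped.read_block(&striped, 7, buf) == 0)
        printf("block 7 read through the striping layer\n");

    free(rd0.data);
    free(rd1.data);
    return 0;
}
```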
Cluster Computing and the Grid | 2005
Renaud Lachaize; Jørgen Sværke Hansen
Efficient memory allocation and data transfer for cluster-based, data-intensive applications are difficult tasks. Changes in cluster interconnects and in application workloads usually require tuning of the application and network code. We propose separating control traffic from data transfer by accessing data through a DSM-like, cluster-wide shared buffer space and including only buffer references in the control messages. Using a generic API for accessing buffers allows data transfer to be tuned without changing the application code. A prototype, implemented in the context of a distributed storage system, has been validated with several networking technologies, showing that such a framework can combine performance and flexibility.
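A hedged illustration of the control/data separation: a control message carries only a small reference into the cluster-wide buffer space rather than the payload itself, so the bulk transfer can be performed by whatever mechanism suits the interconnect. The structures below (buf_ref_t, ctrl_msg_t) are our own placeholders, not the paper's API.

```c
/* Hypothetical illustration: the control path carries a buffer reference,
 * never the data. The payload is moved separately (RDMA, programmed I/O,
 * ...), which is what makes the transfer mechanism tunable.
 */
#include <stdint.h>
#include <stdio.h>

/* A reference into the shared buffer space: who owns it, where, how big. */
typedef struct {
    uint16_t node_id;       /* node holding the buffer        */
    uint32_t offset;        /* offset within that node's pool */
    uint32_t length;        /* payload length in bytes        */
} buf_ref_t;

/* A control message never embeds payload data, only a reference to it. */
typedef struct {
    uint8_t   opcode;       /* e.g. WRITE_REQUEST             */
    uint64_t  lba;          /* target disk block              */
    buf_ref_t payload;      /* where the data actually lives  */
} ctrl_msg_t;

enum { WRITE_REQUEST = 1 };

int main(void)
{
    /* Pretend 4 KiB of data was placed at offset 8192 of node 3's pool. */
    ctrl_msg_t msg = {
        .opcode  = WRITE_REQUEST,
        .lba     = 123456,
        .payload = { .node_id = 3, .offset = 8192, .length = 4096 },
    };

    /* Only sizeof(msg) bytes travel on the control path; the 4 KiB of
     * payload is fetched directly from the referenced buffer. */
    printf("control message: %zu bytes, payload referenced: %u bytes\n",
           sizeof(msg), msg.payload.length);
    return 0;
}
```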
Journal of Parallel and Distributed Computing | 2010
Michail D. Flouris; Renaud Lachaize; Konstantinos Chasapis; Angelos Bilas
High-performance storage systems are evolving towards decentralized commodity clusters that can scale in capacity, processing power, and network throughput. Building such systems requires: (a) Sharing physical resources among applications; (b) Sharing data among applications; (c) Allowing customized data views. Current solutions typically satisfy the first two requirements through a cluster file-system, resulting in monolithic, hard-to-manage systems. In this paper we present a storage system that addresses all three requirements by extending the block layer below the file-system. First, we discuss how our system provides customized (virtualized) storage views within a single node. Then, we discuss how it scales in clustered setups. To achieve efficient resource and data sharing we support block-level allocation and locking as in-band mechanisms. We implement a prototype under Linux and use it to build a shared cluster file-system. Our evaluation in a 24-node cluster setup concludes that our approach offers flexibility, scalability and reduced effort to implement new functionality.
International Conference on Cluster Computing | 2006
Michail D. Flouris; Renaud Lachaize; Angelos Bilas
To satisfy current and future application needs in a cost-effective manner, storage systems are evolving from monolithic disk arrays to networked storage architectures based on commodity components. So far, this architectural transition has mostly been envisioned as a way to scale capacity and performance. In this work we examine how the block-level interface exported by such networked storage systems can be extended to deal with reliability. Our goals are: (a) At the design level, to examine how strong reliability semantics can be offered at the block level; (b) At the implementation level, to examine the mechanisms required and how they may be provided in a modular and configurable manner. We first discuss how transactional-type semantics may be offered at the block level. We present a system design that uses the concept of atomic update intervals combined with existing block-level locking and snapshot mechanisms, in contrast to the more common journaling techniques. We discuss in detail the design of the associated mechanisms and the trade-offs and challenges involved in dividing the required functionality between the file-system and the block-level storage. Our approach is based on a unified and thus non-redundant set of mechanisms for providing reliability at both the block and file levels. Our design and implementation effectively provide a tunable, lightweight transaction mechanism to higher system and application layers. Finally, we describe how the associated protocols can be implemented in a modular way in a prototype storage system we are currently building. As our system is still being implemented, we do not present performance results.
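As a rough sketch of how an atomic update interval might be assembled from the block-level locking and snapshot mechanisms mentioned above, consider the control flow below. Every function name is a placeholder; this is not the paper's protocol, only the general shape of a lock/snapshot/commit-or-rollback sequence.

```c
/* Hypothetical control flow only (all function names are placeholders):
 * an "atomic update interval" built from block-range locking plus a
 * snapshot, instead of a write-ahead journal. On failure, the volume is
 * rolled back to the snapshot taken at the start of the interval.
 */
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the block-level mechanisms the abstract mentions. */
static void lock_blocks(unsigned long first, unsigned long count)   { (void)first; (void)count; }
static void unlock_blocks(unsigned long first, unsigned long count) { (void)first; (void)count; }
static int  snapshot_create(void)     { return 42; /* snapshot id */ }
static void snapshot_discard(int id)  { (void)id; }
static void snapshot_rollback(int id) { (void)id; }
static bool write_blocks(unsigned long first, unsigned long count)  { (void)first; (void)count; return true; }

/* One atomic update interval over a block range. */
static bool atomic_update(unsigned long first, unsigned long count)
{
    lock_blocks(first, count);              /* serialize conflicting writers */
    int snap = snapshot_create();           /* capture pre-interval state    */

    bool ok = write_blocks(first, count);   /* apply the in-place updates    */

    if (ok)
        snapshot_discard(snap);             /* commit: old state not needed  */
    else
        snapshot_rollback(snap);            /* abort: restore old state      */

    unlock_blocks(first, count);
    return ok;
}

int main(void)
{
    printf("update %s\n", atomic_update(1000, 8) ? "committed" : "rolled back");
    return 0;
}
```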