Patrick Marlier
University of Neuchâtel
Publications
Featured research published by Patrick Marlier.
European Conference on Computer Systems | 2010
Dave Christie; Jaewoong Chung; Stephan Diestelhorst; Michael P. Hohmuth; Martin T. Pohlack; Christof Fetzer; Martin Nowack; Torvald Riegel; Pascal Felber; Patrick Marlier; Etienne Rivière
AMD's Advanced Synchronization Facility (ASF) is an x86 instruction set extension proposal intended to simplify and speed up the synchronization of concurrent programs. In this paper, we report our experiences using ASF for implementing transactional memory. We have extended a C/C++ compiler to support language-level transactions and generate code that takes advantage of ASF. We use a software fallback mechanism for transactions that cannot be committed within ASF (e.g., because of hardware capacity limitations). Our evaluation uses a cycle-accurate x86 simulator that we have extended with ASF support. Building a complete ASF-based software stack allows us to evaluate the performance gains that a user-level program can obtain from ASF. Our measurements on a wide range of benchmarks indicate that the overheads traditionally associated with software transactional memories can be significantly reduced with the help of ASF.
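ASF was an instruction set proposal evaluated only in simulation, so the sketch below is not the paper's code: it uses Intel's publicly available RTM intrinsics to illustrate the same structure of attempting a hardware transaction a few times and falling back to a software (here, lock-based) path when the hardware aborts, e.g., on capacity overflow. Names such as `atomic_add` and `fallback_active` are illustrative.

```c
/* Hardware-first execution with a software fallback, sketched with Intel RTM
 * intrinsics instead of ASF (compile with: gcc -O2 -mrtm -pthread). */
#include <immintrin.h>
#include <pthread.h>

#define MAX_HW_RETRIES 3

static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile int fallback_active;     /* lets hardware transactions detect the fallback */

void atomic_add(long *counter, long delta)
{
    for (int attempt = 0; attempt < MAX_HW_RETRIES; attempt++) {
        unsigned status = _xbegin();
        if (status == _XBEGIN_STARTED) {
            if (fallback_active)         /* subscribe to the fallback flag */
                _xabort(0xff);
            *counter += delta;
            _xend();                     /* commit in hardware */
            return;
        }
        /* Aborted (conflict, capacity, ...): retry a few times. */
    }
    /* Software fallback, e.g. when the transaction exceeds hardware capacity. */
    pthread_mutex_lock(&fallback_lock);
    fallback_active = 1;
    *counter += delta;
    fallback_active = 0;
    pthread_mutex_unlock(&fallback_lock);
}
```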
IEEE Transactions on Parallel and Distributed Systems | 2010
Pascal Felber; Christof Fetzer; Patrick Marlier; Torvald Riegel
Software transactional memory (STM) is a concurrency control mechanism that is widely considered to be easier to use by programmers than other mechanisms such as locking. The first generations of STMs have either relied on visible read designs, which simplify conflict detection while pessimistically ensuring a consistent view of shared data to the application, or optimistic invisible read designs that are significantly more efficient but require incremental validation to preserve consistency, at a cost that increases quadratically with the number of objects read in a transaction. Most of the recent designs now use a “time-based” (or “time stamp-based”) approach to still benefit from the performance advantage of invisible reads without incurring the quadratic overhead of incremental validation. In this paper, we give an overview of the time-based STM approach and discuss its benefits and limitations. We formally introduce the first time-based STM algorithm, the Lazy Snapshot Algorithm (LSA). We study its semantics and the impact of its design parameters, notably multiversioning and dynamic snapshot extension. We compare it against other classical designs and we demonstrate that its performance is highly competitive, both for obstruction-free and lock-based STM designs.
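As a rough illustration of the time-based idea (not the paper's LSA code, and with illustrative names such as `g_clock` and `orec_t`): every location carries the timestamp of its last committing writer, a transaction records the global clock value at which it started, and an invisible read is consistent as long as the location's timestamp does not exceed that snapshot, so no per-read revalidation of the whole read set is needed.

```c
/* Minimal sketch of a time-based, invisible-read STM read operation.
 * A full LSA implementation would try to extend the snapshot by revalidating
 * the read set before aborting, and commits would advance g_clock and stamp
 * the written locations with the new time. */
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t g_clock;         /* global commit timestamp */

typedef struct {
    _Atomic uint64_t version;            /* time of the last commit to this location */
    _Atomic intptr_t value;
} orec_t;

typedef struct {
    uint64_t snapshot;                   /* upper bound on the validity of reads */
    int      aborted;
} txn_t;

void txn_start(txn_t *tx)
{
    tx->snapshot = atomic_load(&g_clock);
    tx->aborted  = 0;
}

intptr_t stm_read(txn_t *tx, orec_t *loc)
{
    uint64_t before = atomic_load(&loc->version);
    intptr_t value  = atomic_load(&loc->value);
    uint64_t after  = atomic_load(&loc->version);

    /* Consistent if the location was not updated concurrently or after the
     * snapshot was taken; otherwise abort (or extend the snapshot). */
    if (before != after || after > tx->snapshot)
        tx->aborted = 1;
    return value;
}
```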
Dependable Systems and Networks | 2011
Walther Maldonado; Patrick Marlier; Pascal Felber; Julia L. Lawall; Gilles Muller; Etienne Rivière
Software Transactional Memory (STM) is an optimistic concurrency control mechanism that simplifies the development of parallel programs. Still, the benefits of STM have not yet been demonstrated for reactive applications that require bounded response time for some of their operations. We propose to support such applications by allowing the developer to annotate some transaction blocks with deadlines. Based on previous execution statistics, we adjust the transaction execution strategy by decreasing the level of optimism as the deadline nears, using two modes of conservative execution, without overly limiting the progress of concurrent transactions. Our implementation comprises an STM extension for gathering statistics and implementing the execution mode strategies. We have also extended the Linux scheduler to disable preemption or migration of threads that are executing transactions with deadlines. Our experimental evaluation shows that our approach significantly improves the chance of a transaction meeting its deadline when its progress is hampered by conflicts.
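A minimal sketch of the mode-switching idea, assuming a hypothetical `work` callback that returns false when the transaction is aborted by a conflict and an `expected_ms` estimate supplied by the runtime's statistics; this is not the paper's API, which integrates the switch inside the STM itself.

```c
/* Run a deadline-annotated block: stay optimistic while there is enough slack,
 * switch to a conservative, serialized mode once the remaining time drops
 * below the transaction's expected duration (compile with -pthread). */
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t serial_mode = PTHREAD_MUTEX_INITIALIZER;

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

void run_with_deadline(bool (*work)(void), double deadline_ms, double expected_ms)
{
    for (;;) {
        if (deadline_ms - now_ms() < expected_ms) {
            /* Deadline is close: serialize so the block can no longer be
             * aborted by conflicting transactions. */
            pthread_mutex_lock(&serial_mode);
            while (!work())
                ;
            pthread_mutex_unlock(&serial_mode);
            return;
        }
        if (work())                      /* optimistic attempt */
            return;
        /* Aborted by a conflict: re-evaluate the remaining slack and retry. */
    }
}
```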
Symposium on Operating Systems Principles | 2015
Alexander Matveev; Nir Shavit; Pascal Felber; Patrick Marlier
This paper introduces read-log-update (RLU), a novel extension of the popular read-copy-update (RCU) synchronization mechanism that supports scalability of concurrent code by allowing unsynchronized sequences of reads to execute concurrently with updates. RLU overcomes the major limitations of RCU by allowing, for the first time, concurrency of reads with multiple writers, and providing automation that eliminates most of the programming difficulty associated with RCU programming. At the core of the RLU design is a logging and coordination mechanism inspired by software transactional memory algorithms. In a collection of micro-benchmarks in both the kernel and user space, we show that RLU both simplifies the code and matches or improves on the performance of RCU. As an example of its power, we show how it readily scales the performance of a real-world application, Kyoto Cabinet, a truly difficult concurrent programming feat to attempt in general, and in particular with classic RCU.
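RLU itself is too involved for a short snippet, so the sketch below only makes its starting point concrete: the classic RCU-style copy-and-publish pattern, written as a small userspace example with illustrative names (no liburcu). RLU's per-writer log and global clock generalize exactly this single-pointer publication step, so that one update can modify several objects and several writers can proceed concurrently.

```c
/* RCU-flavoured copy/publish: readers follow an atomically published pointer
 * without synchronizing with writers; a writer copies the object, updates the
 * copy, and publishes it with one pointer swap. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct { int threshold; int window; } config_t;

static _Atomic(config_t *) g_config;
static pthread_mutex_t g_writer_lock = PTHREAD_MUTEX_INITIALIZER;

void config_init(int threshold, int window)
{
    config_t *c = malloc(sizeof *c);
    c->threshold = threshold;
    c->window    = window;
    atomic_store_explicit(&g_config, c, memory_order_release);
}

/* Read side: no locks, sees one consistent version of the configuration. */
int read_threshold(void)
{
    config_t *snap = atomic_load_explicit(&g_config, memory_order_acquire);
    return snap->threshold;
}

/* Write side: writers are serialized by a lock, as is typical with RCU. */
void set_threshold(int t)
{
    pthread_mutex_lock(&g_writer_lock);
    config_t *old = atomic_load_explicit(&g_config, memory_order_relaxed);
    config_t *upd = malloc(sizeof *upd);
    *upd = *old;                          /* copy */
    upd->threshold = t;                   /* modify the copy */
    atomic_store_explicit(&g_config, upd, memory_order_release);
    pthread_mutex_unlock(&g_writer_lock);
    /* 'old' may only be freed after all readers that might still hold it have
     * quiesced (RCU grace period / RLU quiescence); leaked here for brevity. */
}
```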
International Symposium on Microarchitecture | 2010
Pascal Felber; E Rivière; W M Moreira; Derin Harmanci; Patrick Marlier; Stephan Diestelhorst; Michael P. Hohmuth; Martin T. Pohlack; Adrian Cristal; I Hur; Osman S. Unsal; P Stenström; A Dragojevic; Rachid Guerraoui; M Kapalka; Vincent Gramoli; U Drepper; S Tomić; Yehuda Afek; Guy Korland; Nir Shavit; Christof Fetzer; Martin Nowack; Torvald Riegel
The adoption of multi- and many-core architectures for mainstream computing undoubtedly brings profound changes in the way software is developed. In particular, the use of fine-grained locking as the multi-core programmer's coordination methodology is considered by more and more experts as a dead end. The transactional memory (TM) programming paradigm is a strong contender to become the approach of choice for replacing locks and implementing atomic operations in concurrent programming. Combining sequences of concurrent operations into atomic transactions allows a great reduction in the complexity of both programming and verification, by making parts of the code appear to execute sequentially without the need to program using fine-grained locking. Transactions remove from the programmer the burden of figuring out the interaction among concurrent operations that happen to conflict when accessing the same locations in memory. The EU-funded FP7 VELOX project designs, implements, and evaluates an integrated TM stack, spanning from the programming language down to hardware support, and including runtime and libraries, compilers, and application environments. This paper presents an overview of the VELOX TM stack and its associated challenges and contributions.
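To make the notion of a language-level atomic block concrete, here is a minimal example using GCC's transactional memory support (`-fgnu-tm`), which is in the spirit of the compiler extensions such a stack provides; it is not VELOX code.

```c
/* Account transfer expressed as an atomic block: the block executes atomically
 * and in isolation, with no explicit locks to acquire or order.
 * Compile with: gcc -fgnu-tm transfer.c */
#include <stdio.h>

static long accounts[2] = { 100, 100 };

void transfer(int from, int to, long amount)
{
    __transaction_atomic {
        accounts[from] -= amount;
        accounts[to]   += amount;
    }
}

int main(void)
{
    transfer(0, 1, 25);
    printf("%ld %ld\n", accounts[0], accounts[1]);
    return 0;
}
```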
International Conference on Principles of Distributed Systems | 2013
Yaroslav Hayduk; Anita Sobe; Derin Harmanci; Patrick Marlier; Pascal Felber
The actor model has been successfully used for scalable computing in distributed systems. Actors are objects with a local state, which can only be modified by the exchange of messages. One of the fundamental principles of actor models is to guarantee sequential message processing, which avoids typical concurrency hazards but limits the achievable message throughput. Preserving the sequential semantics of the actor model is, however, necessary for program correctness. In this paper, we propose to add support for speculative concurrent execution in actors using transactional memory (TM). Our approach is designed to operate with message passing and shared memory, and can thus take advantage of parallelism available on distributed and multi-core systems. The processing of each message is wrapped in a transaction executed atomically and in isolation, but concurrently with other messages. This allows us (1) to scale while keeping the dependability guarantees ensured by sequential message processing, and (2) to further increase the robustness of the actor model against threats, thanks to the rollback ability that comes for free with transactional processing of messages. We validate our design within the Scala programming language and the Akka framework. We show that the overhead of using transactions is hidden by the improved message processing throughput, thus leading to an overall performance gain.
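The paper's implementation targets Scala and Akka; the C-flavoured sketch below (with illustrative types and names, and GCC's `-fgnu-tm` atomic blocks standing in for the STM) only shows the core idea: several workers drain an actor's mailbox concurrently, and each handler runs inside a transaction so that conflicting accesses to the actor's state are detected and rolled back, preserving the appearance of sequential message processing.

```c
/* Speculative concurrent message processing: each handler executes atomically
 * and in isolation (compile with: gcc -fgnu-tm -pthread). */
#include <pthread.h>

#define MAILBOX_SIZE 1024

typedef struct { long delta; } message_t;

typedef struct {
    long            balance;             /* the actor's state */
    message_t       mailbox[MAILBOX_SIZE];
    int             head, tail;
    pthread_mutex_t mbox_lock;
} actor_t;

void actor_init(actor_t *a)
{
    a->balance = 0;
    a->head = a->tail = 0;
    pthread_mutex_init(&a->mbox_lock, NULL);
}

/* One message handler, wrapped in a transaction instead of being forced to
 * run strictly after the previous message completes. */
static void handle(actor_t *self, message_t m)
{
    __transaction_atomic {
        self->balance += m.delta;
    }
}

/* Worker loop: several worker threads may process messages of the same actor. */
void *worker(void *arg)
{
    actor_t *self = arg;
    for (;;) {
        message_t m;
        pthread_mutex_lock(&self->mbox_lock);
        if (self->head == self->tail) {          /* mailbox empty */
            pthread_mutex_unlock(&self->mbox_lock);
            break;
        }
        m = self->mailbox[self->head % MAILBOX_SIZE];
        self->head++;
        pthread_mutex_unlock(&self->mbox_lock);
        handle(self, m);                         /* speculative, atomic, isolated */
    }
    return NULL;
}
```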
Programming Models and Applications for Multicores and Manycores | 2015
Maria Carpen-Amarie; Patrick Marlier; Pascal Felber; Gaël Thomas
Over the last few years, managed runtime environments such as the Java Virtual Machine (JVM) have been increasingly used on large-scale multicore servers. The garbage collector (GC) is a critical component of the JVM and has a significant influence on the overall performance and efficiency of the running application. We perform a study of all available Java GCs, both in an academic environment (a set of benchmarks) and in a simulated real-life situation (a client-server application). We mainly focus on the three most widely used collectors: ParallelOld, ConcurrentMarkSweep, and G1. We find that they exhibit different behaviours in the two tested environments. In particular, the default Java GC, ParallelOld, proves to be stable and adequate in the first situation, while in the real-life scenario its use results in unacceptable pauses for the application threads. We believe that this is partly due to the memory requirements of the multicore server. The G1 GC performs notably poorly on the benchmarks when forced to run a full collection between iterations of the application. Moreover, even though the G1 and ConcurrentMarkSweep GCs introduce significantly lower pauses than ParallelOld in the client-server environment, they can still seriously impact the response time observed by the client. Pauses of around 3 seconds can make a real-time system unusable and may disrupt the communication between nodes in large-scale distributed systems.
Parallel Computing | 2015
Walther Maldonado; Patrick Marlier; Pascal Felber; Julia L. Lawall; Gilles Muller; Etienne Rivière
Software transactional memory (STM) is an optimistic concurrency control mechanism that simplifies parallel programming. However, there has been little interest in its applicability to reactive applications in which there is a required response time for certain operations. We propose supporting such applications by allowing programmers to associate time with atomic blocks in the form of deadlines and quality-of-service (QoS) requirements. Based on statistics of past executions, we adjust the execution mode of transactions by decreasing the level of optimism as the deadline approaches. In the presence of concurrent deadlines, we propose different conflict resolution policies. Execution mode switching mechanisms allow multiple deadlines to be met in a consistent manner, with potential QoS degradations split fairly among several threads as contention increases, while avoiding starvation. Our implementation consists of extensions to an STM runtime that allow gathering statistics and switching execution modes. We also propose novel contention managers adapted to transactional workloads subject to deadlines. The experimental evaluation shows that our approaches significantly improve the likelihood of a transaction meeting its deadline and QoS requirement, even in cases where progress is hampered by conflicts and other concurrent transactions with deadlines.
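As an illustration of what a deadline-aware conflict resolution policy can look like (a sketch under assumed names, not the paper's contention managers): when two transactions conflict, the one with the earlier deadline is favoured, and transaction age breaks ties so that deadline-free transactions are not starved.

```c
/* Decide which of two conflicting transactions should yield. */
#include <stdbool.h>
#include <stdint.h>

#define NO_DEADLINE UINT64_MAX

typedef struct {
    uint64_t deadline_ns;   /* absolute deadline, NO_DEADLINE if none */
    uint64_t start_ns;      /* start time, used as an age-based tie-breaker */
} txn_desc_t;

/* Returns true if 'self' should abort and let 'other' proceed. */
bool should_yield(const txn_desc_t *self, const txn_desc_t *other)
{
    if (self->deadline_ns != other->deadline_ns)
        return other->deadline_ns < self->deadline_ns;   /* earlier deadline wins */
    return other->start_ns < self->start_ns;             /* otherwise, older wins */
}
```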
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2010
Walther Maldonado; Patrick Marlier; Pascal Felber; Adi Suissa; Danny Hendler; Alexandra Fedorova; Julia L. Lawall; Gilles Muller
ACM Symposium on Parallel Algorithms and Architectures | 2011
Torvald Riegel; Patrick Marlier; Martin Nowack; Pascal Felber; Christof Fetzer