Cécile Germain-Renaud

Publication

Featured research published by Cécile Germain-Renaud.

Journal of Grid Computing | 2008

Scheduling for Responsive Grids

Cécile Germain-Renaud; Charles Loomis; Jakub T. Mościcki; Romain Texier

Grids are facing the challenge of seamless integration of the Grid power into everyday use. One critical component for this integration is responsiveness, the capacity to support on-demand computing and interactivity. Grid scheduling is involved at two levels in order to provide responsiveness: the policy level and the implementation level. The main contributions of this paper are as follows. First, we present a detailed analysis of the performance of the EGEE Grid with respect to responsiveness. Second, we examine two user-level schedulers located between the general scheduling layer and the application layer. These are the DIANE (distributed analysis environment) framework, a general-purpose overlay system, and a specialized, embedded scheduler for gPTM3D, an interactive medical image analysis application. Finally, we define and demonstrate a virtualization scheme, which achieves guaranteed turnaround time, supports schedulability analysis, and provides the basis for differentiated services. Both methods target a brokering-based system organized as a federation of batch-scheduled clusters, and an EGEE implementation is described.

IEEE Transactions on Knowledge and Data Engineering | 2014

Data Stream Clustering With Affinity Propagation

Xiangliang Zhang; Cyril Furtlehner; Cécile Germain-Renaud; Michèle Sebag

Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented StrAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.
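The rebuild-on-change idea in this abstract can be sketched as follows. The abstract does not name the statistical test, so this minimal sketch assumes a Page-Hinkley test for an upward shift in the mean; the class name, the `delta` and `threshold` parameters, and the `detect_changes` helper are all hypothetical, and the clustering step itself is left out.

```python
class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a stream (assumed test)."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude (assumed value)
        self.threshold = threshold  # alarm level (assumed value)
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def detect_changes(stream):
    """Return the indices at which a clustering model would be rebuilt."""
    detector = PageHinkley()
    changes = []
    for i, x in enumerate(stream):
        if detector.update(x):
            changes.append(i)  # a real system would rebuild its model here
            detector.reset()   # restart the test on the new distribution
    return changes
```

In a streaming setting, `x` would typically be some per-item score, for example the distance of a new item to its nearest exemplar, rather than the raw data; that choice is also an assumption of this sketch.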

Knowledge Discovery and Data Mining | 2009

Toward autonomic grids: analyzing the job flow with affinity streaming

Xiangliang Zhang; Cyril Furtlehner; Julien Perez; Cécile Germain-Renaud; Michèle Sebag

The Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007) provides an understandable, nearly optimal summary of a dataset, albeit with quadratic computational complexity. This paper, motivated by Autonomic Computing, extends AP to the data streaming framework. First, a hierarchical strategy is used to reduce the complexity to O(N^(1+ε)); the distortion loss incurred is analyzed in relation to the dimension of the data items. Second, a coupling with a change detection test is used to cope with non-stationary data distributions and to rebuild the model as needed. The presented approach, StrAP, is applied to the stream of jobs submitted to the EGEE Grid, providing an understandable description of the job flow and enabling the system administrator to spot some sources of failure online.

IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2011

The Grid Observatory

Cécile Germain-Renaud; Alain Cady; Philippe Gauron; Michel Jouvin; Charles Loomis; Janusz Martyniak; Julien Nauroy; Guillaume Philippon; Michèle Sebag

The goal of the Grid Observatory project (GO) is to contribute to an experimental theory of large grid systems by integrating the collection of data on the behaviour of the flagship European Grid Infrastructure (EGI) and its users, the development of models, and an ontology for the domain knowledge. The GO gives access to a database of grid usage traces available to the wider computer science community without the need of grid credentials. The paper presents the architecture of the digital curation process enacted by the GO and examples of their exploitation.

Journal of Grid Computing | 2010

Multi-objective Reinforcement Learning for Responsive Grids

Julien Perez; Cécile Germain-Renaud; Balázs Kégl; Charles Loomis

Grids organize resource sharing, a fundamental requirement of large scientific collaborations. Seamless integration of Grids into everyday use requires responsiveness, which can be provided by elastic Clouds, in the Infrastructure as a Service (IaaS) paradigm. This paper proposes a model-free resource provisioning strategy supporting both requirements. Provisioning is modeled as a continuous action-state space, multi-objective reinforcement learning (RL) problem, under realistic hypotheses; simple utility functions capture the high level goals of users, administrators, and shareholders. The model-free approach falls under the general program of autonomic computing, where the incremental learning of the value function associated with the RL model provides the so-called feedback loop. The RL model includes an approximation of the value function through an Echo State Network. Experimental validation on a real data-set from the EGEE Grid shows that introducing a moderate level of elasticity is critical to ensure a high level of user satisfaction.

Journal of Physics: Conference Series | 2015

The Higgs Machine Learning Challenge

Claire Adam-Bourdarios; Glen Cowan; Cécile Germain-Renaud; Isabelle Guyon; Balázs Kégl; D. Rousseau

The Higgs Machine Learning Challenge was an open data analysis competition that took place between May and September 2014. Samples of simulated data from the ATLAS Experiment at the LHC corresponding to signal events with Higgs bosons decaying to τ⁺τ⁻ together with background events were made available to the public through the website of the data science organization Kaggle (kaggle.com). Participants attempted to identify the search region in a space of 30 kinematic variables that would maximize the expected discovery significance of the signal process. One of the primary goals of the Challenge was to promote communication of new ideas between the Machine Learning (ML) and HEP communities. In this regard it was a resounding success, with almost 2,000 participants from HEP, ML and other areas. The process of understanding and integrating the new ideas, particularly from ML into HEP, is currently underway.
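The abstract does not give a formula for the expected discovery significance. One common approximation for a counting experiment is the approximate median significance (AMS), sketched below; the function name, the regularization term `b_reg`, and its default of 10 are assumptions of this sketch rather than details taken from the text above.

```python
import math


def ams(s, b, b_reg=10.0):
    """Approximate median significance of a selected region.

    s: expected number of signal events in the region
    b: expected number of background events in the region
    b_reg: regularization added to b (assumed default)
    """
    if s < 0 or b < 0:
        raise ValueError("s and b must be non-negative")
    total_b = b + b_reg
    # 2 * ((s + b) * ln(1 + s / b) - s), with b replaced by b + b_reg
    radicand = 2.0 * ((s + total_b) * math.log1p(s / total_b) - s)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, radicand))
```

When the signal is small compared with the background, this value is close to s divided by the square root of the regularized background, which is the familiar rule of thumb for significance.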

International Conference on Autonomic Computing | 2009

Responsive elastic computing

Julien Perez; Cécile Germain-Renaud; Balázs Kégl; Charles Loomis

Two production models are candidates for e-science computing: grids enable hardware and software sharing; clouds propose dynamic resource provisioning (elastic computing). Organized sharing is a fundamental requirement for large scientific collaborations; responsiveness, the ability to provide good response time, is a fundamental requirement for seamless integration of the large scale computing resources into everyday use. This paper focuses on a model-free resource provisioning strategy supporting both scenarios. The provisioning problem is modeled as a continuous action-state space, multi-objective reinforcement learning problem, under realistic hypotheses; the high level goals of users, administrators, and shareholders are captured through simple utility functions. We propose an implementation of this reinforcement learning framework, including an approximation of the value function through an Echo State Network, and we validate it on a real dataset.

Cluster Computing and the Grid | 2008

Grid Differentiated Services: A Reinforcement Learning Approach

Julien Perez; Cécile Germain-Renaud; Balázs Kégl; Charles Loomis

Large scale production grids are a major case for autonomic computing. Following the classical definition of Kephart, an autonomic computing system should optimize its own behavior in accordance with high level guidance from humans. The central tenet of this paper is that the combination of utility functions and reinforcement learning (RL) can provide a general and efficient method for dynamically allocating grid resources in order to optimize the satisfaction of both end-users and participating institutions. The flexibility of an RL-based system makes it possible to model the state of the grid, the jobs to be scheduled, and the high-level objectives of the various actors on the grid. RL-based scheduling can seamlessly adapt its decisions to changes in the distributions of inter-arrival time, QoS requirements, and resource availability. Moreover, it requires minimal prior knowledge about the target environment, including user requests and infrastructure. Our experimental results, both on a synthetic workload and a real trace, show that RL is not only a realistic alternative to empirical scheduler design, but can also outperform it.

Grid Computing | 2010

Discovering Piecewise Linear Models of Grid Workload

Tamás Éltető; Cécile Germain-Renaud; Pascal Bondon; Michèle Sebag

Despite extensive research focused on enabling QoS for grid users through economic and intelligent resource provisioning, no consensus has emerged on the most promising strategies. On top of intrinsically challenging problems, the complexity and size of the data have so far drastically limited the number of comparative experiments. An alternative to experimenting on real, large, and complex data is to look for well-founded and parsimonious representations. This study is based on exhaustive information about the gLite-monitored jobs from the EGEE grid, representative of a significant fraction of e-science computing activity in Europe. Our main contributions are twofold. First, we found that workload models for this grid can consistently be discovered from the real data, and that limiting the range of models to piecewise linear time series models is sufficiently powerful. Second, we present a bootstrapping strategy for building more robust models from the limited samples at hand.

Applied Soft Computing | 2012

Scalable structural break detection

T. Éltető; Nikolaus Hansen; Cécile Germain-Renaud; Pascal Bondon

This paper deals with a statistical model fitting procedure for non-stationary time series. This procedure selects the parameters of a piecewise autoregressive model using the Minimum Description Length principle. The existing chromosome representation of the piecewise autoregressive model and its corresponding optimisation algorithm are improved. First, we show that our proposed chromosome representation better captures the intrinsic properties of the piecewise autoregressive model. Second, we apply an optimisation algorithm, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), with which our setup converges faster to the optimal fit. Our proposed method achieves at least one order of magnitude performance improvement compared to the existing solution.