Publication


Featured research published by Anthony Simonet.


Future Generation Computer Systems | 2015

Active Data

Anthony Simonet; Gilles Fedak; Matei Ripeanu

The Big Data challenge consists in managing, storing, analyzing and visualizing huge and ever-growing data sets to extract sense and knowledge. As the volume of data grows exponentially, managing these data becomes correspondingly more complex. A key point is to handle the complexity of the data life cycle, i.e. the various operations performed on data: transfer, archiving, replication, deletion, etc. Indeed, data-intensive applications span a large variety of devices and e-infrastructures, which implies that many systems are involved in data management and processing. We propose Active Data, a programming model to automate and improve the expressiveness of data management applications. We first define the concept of data life cycle and introduce a formal model that exposes the data life cycle across heterogeneous systems and infrastructures. The Active Data programming model allows code execution at each stage of the data life cycle: routines provided by programmers are executed when a set of events (creation, replication, transfer, deletion) happen to any data item. We implement and evaluate the model with four use cases: a storage cache to Amazon-S3, a cooperative sensor network, an incremental implementation of the MapReduce programming model and automated data provenance tracking across heterogeneous systems. Altogether, these scenarios illustrate the suitability of the model for programming applications that manage distributed and dynamic data sets. We also show that applications that do not leverage the data life cycle can still benefit from Active Data to improve their performance. We present a formal model to represent the life cycle of data distributed and replicated on many systems. We leverage this model to propose a programming model that allows users to react to life cycle progression. We illustrate the approach with examples of applications that we programmed with this model.
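
The event-driven idea in this abstract (programmer-supplied routines run when a life cycle event happens to a data item) can be illustrated with a small sketch. This is not the Active Data API: the `LifeCycle` class, its `on` and `publish` methods, and the `dataset-1` identifier are all hypothetical names chosen for illustration, and the sketch does not reproduce the formal life cycle model the paper describes.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Event names taken from the abstract; everything else here is hypothetical.
EVENTS = ("creation", "replication", "transfer", "deletion")

Handler = Callable[[str, str], None]  # called as handler(data_id, event)


class LifeCycle:
    """Toy registry that runs user routines when a data item changes stage."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self.history: Dict[str, List[str]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Register a routine to run whenever `event` is published."""
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._handlers[event].append(handler)

    def publish(self, data_id: str, event: str) -> None:
        """Record the event for `data_id`, then run the matching routines."""
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self.history[data_id].append(event)
        for handler in self._handlers[event]:
            handler(data_id, event)


# Usage: react to creation and transfer; deletion has no handler registered.
log: List[str] = []
cycle = LifeCycle()
cycle.on("creation", lambda data_id, event: log.append(f"{data_id} created"))
cycle.on("transfer", lambda data_id, event: log.append(f"{data_id} transferred"))

cycle.publish("dataset-1", "creation")
cycle.publish("dataset-1", "transfer")
cycle.publish("dataset-1", "deletion")
```

In this sketch the systems that hold the data would call `publish`, while application code only registers handlers, which is one way to keep data management logic independent of any single storage or transfer system.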


parallel, distributed and network-based processing | 2015

Using Active Data to Provide Smart Data Surveillance to E-Science Users

Anthony Simonet; Kyle Chard; Gilles Fedak; Ian T. Foster

Modern scientific experiments often involve multiple storage and computing platforms, software tools, and analysis scripts. The resulting heterogeneous environments make data management operations challenging: the significant number of events and the absence of data integration make it difficult to track data provenance, manage sophisticated analysis processes, and recover from unexpected situations. Current approaches often require costly human intervention and are inherently error prone. The difficulties inherent in managing and manipulating such large and highly distributed datasets also limit automated sharing and collaboration. We study a real-world e-Science application involving terabytes of data, using three different analysis and storage platforms, and a number of applications and analysis processes. We demonstrate that, using a specialized data life cycle and programming model, Active Data, we can easily implement global progress monitoring and sharing, recover from unexpected events, and automate a range of tasks.


petascale data storage workshop | 2013

Active data: a data-centric approach to data life-cycle management

Anthony Simonet; Gilles Fedak; Matei Ripeanu; Samer Al-Kiswany

Data-intensive science offers new opportunities for innovation and discoveries, provided that large datasets can be handled efficiently. Data management for data-intensive science applications is challenging, requiring support for complex data life cycles, coordination across multiple sites, fault tolerance, and scalability to support tens of sites and petabytes of data. In this paper, we argue that data management for data-intensive science applications requires a fundamentally different management approach than the current ad-hoc, task-centric approach. We propose Active Data, a fundamentally novel paradigm for data life cycle management. Active Data follows two principles: it is data-centric and event-driven. We report on the Active Data programming model and its preliminary implementation, and discuss the benefits and limitations of the approach on recognized challenging data-intensive science use cases.


Archive | 2012

Active Data: A Programming Model for Managing Big Data Life Cycle

Anthony Simonet; Gilles Fedak; Matei Ripeanu


poster in Computing in High Energy and Nuclear Physics (CHEP'12) | 2012

FlyingGrid: from Volunteer Computing to Volunteer Cloud

Oleg Lodygensky; Etienne Urbah; Simon Dadoun; Anthony Simonet; Gilles Fedak; Simon Delamare; Derrick Kondo; Laurent Duflot; Xavier Garrido


ieee international conference on smart city socialcom sustaincom | 2015

D3-MapReduce: Towards MapReduce for Distributed and Dynamic Data Sets

Haiwu He; Anthony Simonet; Julio Anjos; Jose-Francisco Saray; Gilles Fedak; Bing Tang; Lu Lu; Xuanhua Shi; Hai Jin; Mircea Moca; Gheorghe Cosmin Silaghi; Asma Ben Cheikh; Heithem Abbes


Archive | 2015

New Results - MapReduce Computations on Hybrid Distributed Computations Infrastructures

Gilles Fedak; Julio Cesar Santos dos Anjos; Anthony Simonet


Archive | 2015

New Results - Desktop Grid Computing

Gilles Fedak; Anthony Simonet


Archive | 2015

New Results - Managing Big Data Life Cycle

Gilles Fedak; Anthony Simonet


Archive | 2015

New Software and Platforms - BitDew

Gilles Fedak; Anthony Simonet

Collaboration


Dive into Anthony Simonet's collaborations.

Top Co-Authors

Matei Ripeanu

University of British Columbia

Simon Delamare

Centre national de la recherche scientifique

Derrick Kondo

University of California

Ian T. Foster

Argonne National Laboratory

Kyle Chard

Argonne National Laboratory
