Publication


Featured research published by Alain Roy.


Journal of Physics: Conference Series | 2007

The open science grid

R. Pordes; D. Petravick; Bill Kramer; Doug Olson; Miron Livny; Alain Roy; P. Avery; K. Blackburn; Torre Wenaus; F. Würthwein; Ian T. Foster; Robert Gardner; Michael Wilde; Alan Blatecky; John McGee; Rob Quick

The Open Science Grid (OSG) provides a distributed facility in which Consortium members provide guaranteed and opportunistic access to shared computing and storage resources. OSG supports and evolves this infrastructure through activities covering operations, security, software, troubleshooting, the addition of new capabilities, support for existing communities, and engagement with new ones. The OSG SciDAC-2 project provides specific activities to manage and evolve the distributed infrastructure and support its use. The innovative aspects of the project are the maintenance and performance of a collaborative (shared and common) petascale national facility spanning tens of autonomous computing sites, serving many hundreds of users, transferring terabytes of data a day, executing tens of thousands of jobs a day, and providing robust and usable resources for scientific groups of all types and sizes. More information can be found at the OSG web site: www.opensciencegrid.org.


Computer Communications | 2004

End-to-end quality of service for high-end applications

Ian T. Foster; Markus Fidler; Alain Roy; Volker Sander; Linda Winkler

High-end networked applications such as distance visualization, distributed data analysis, and advanced collaborative environments have demanding quality of service (QoS) requirements. Particular challenges include concurrent flows with different QoS specifications, high-bandwidth flows, application-level monitoring and control, and end-to-end QoS across networks and other devices. We describe a QoS architecture and implementation that together help to address these challenges. The General-purpose Architecture for Reservation and Allocation (GARA) supports flow-specific QoS specification, immediate and advance reservation, and online monitoring and control of both individual resources and heterogeneous resource ensembles. Mechanisms provided by the Globus Toolkit are used to address resource discovery and security issues when resources span multiple administrative domains. Our prototype GARA implementation builds on differentiated services mechanisms to enable the coordinated management of two distinct flow types (foreground media flows and background bulk transfers) as well as the co-reservation of networks, CPUs, and storage systems. We present results obtained on a wide-area differentiated services testbed that demonstrate our ability to deliver QoS for realistic flows.
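The co-reservation idea above lends itself to a small illustration. The Python sketch below is purely illustrative and does not use the real GARA or Globus Toolkit APIs; the ReservationClient class, its reserve method, and all parameter names are invented for this example.

```python
# Hypothetical sketch of a GARA-style co-reservation request; class and
# method names are illustrative, not the actual GARA (Globus) API.
import time

class ReservationClient:
    """Toy stand-in for a QoS reservation service."""
    def __init__(self):
        self._next_id = 0
        self._reservations = {}

    def reserve(self, resource, start, duration, **qos):
        """Record a reservation; a real service would do admission control."""
        self._next_id += 1
        self._reservations[self._next_id] = (resource, start, duration, qos)
        return self._next_id

client = ReservationClient()
now = time.time()

# Immediate reservation for a foreground media flow (bandwidth in Mb/s).
media = client.reserve("network", start=now, duration=3600, bandwidth=40)

# Advance reservation for CPUs, co-scheduled to begin an hour from now.
cpus = client.reserve("cpu", start=now + 3600, duration=7200, count=16)

print(f"media flow reservation: {media}, cpu reservation: {cpus}")
```

The point is the shared abstraction: network bandwidth and CPUs are reserved through one interface, for immediate or future time windows.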


Workflows for e-Science | 2007

Workflow Management in Condor

Peter Couvares; Tevfik Kosar; Alain Roy; Jeff Weber; Kent Wenger

The Condor project began in 1988 and has evolved into a feature-rich batch system that targets high-throughput computing; that is, Condor ([262], [414]) focuses on providing reliable access to computing over long periods of time instead of highly tuned, high-performance computing for short periods of time or a small number of applications.
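The chapter's subject, Condor's DAGMan, expresses a workflow as a plain-text DAG file of jobs and dependencies. The sketch below generates such a file from Python; JOB, PARENT ... CHILD, and RETRY are real DAGMan keywords, while the node names and submit files are made up for illustration.

```python
# Sketch: generating a small DAGMan workflow description from Python.
# JOB, PARENT ... CHILD, and RETRY are DAGMan's documented keywords; the
# node and submit-file names here are invented for illustration.
dag_lines = [
    "JOB  prepare  prepare.sub",
    "JOB  analyze  analyze.sub",
    "JOB  collect  collect.sub",
    # analyze runs only after prepare succeeds; collect only after analyze.
    "PARENT prepare CHILD analyze",
    "PARENT analyze CHILD collect",
    # Retry a flaky node up to 3 times before failing the workflow.
    "RETRY analyze 3",
]

with open("pipeline.dag", "w") as f:
    f.write("\n".join(dag_lines) + "\n")

# Submitted with: condor_submit_dag pipeline.dag
```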


High Performance Distributed Computing | 2002

Flexibility, manageability, and performance in a Grid storage appliance

John M. Bent; Venkateshwaran Venkataramani; Nick LeRoy; Alain Roy; Joseph Stanley; Andrea C. Arpaci-Dusseau; Remzi H. Arpaci-Dusseau; Miron Livny

We present NeST, a flexible, software-only storage appliance designed to meet the storage needs of the Grid. NeST has three key features that make it well-suited for deployment in a Grid environment. First, NeST provides a generic data transfer architecture that supports multiple data transfer protocols (including GridFTP and NFS), and allows for the easy addition of new protocols. Second, NeST is dynamic, adapting itself on-the-fly so that it runs effectively on a wide range of hardware and software platforms. Third, NeST is Grid-aware, implying that features that are necessary for integration into the Grid, such as storage space guarantees, mechanisms for resource and data discovery, user authentication, and quality of service, are a part of the NeST infrastructure.
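NeST's generic data transfer architecture can be caricatured as a single storage interface behind which protocol handlers are registered. The Python sketch below is an invented illustration of that dispatch pattern, not NeST's actual code.

```python
# Illustrative sketch (not NeST's implementation) of generic dispatch:
# several transfer protocols registered behind one storage interface.
from typing import Callable, Dict

class StorageAppliance:
    def __init__(self):
        self._handlers: Dict[str, Callable[[str], bytes]] = {}

    def register_protocol(self, name, handler):
        """Adding a new protocol is just registering another handler."""
        self._handlers[name] = handler

    def get(self, protocol, path):
        return self._handlers[protocol](path)

def nfs_get(path):          # stand-in for a real NFS handler
    return f"NFS read of {path}".encode()

def gridftp_get(path):      # stand-in for a real GridFTP handler
    return f"GridFTP read of {path}".encode()

nest = StorageAppliance()
nest.register_protocol("nfs", nfs_get)
nest.register_protocol("gridftp", gridftp_get)
print(nest.get("gridftp", "/data/run42.dat"))
```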


International Conference on Cluster Computing | 2012

ERMS: An Elastic Replication Management System for HDFS

Zhendong Cheng; Zhongzhi Luan; You Meng; Yijing Xu; Depei Qian; Alain Roy; Ning Zhang; Gang Guan

The Hadoop Distributed File System (HDFS) is a distributed storage system that stores large-scale data sets reliably and streams those data sets to applications at high bandwidth. HDFS provides high performance, reliability, and availability by replicating data, typically keeping three copies of each block. The popularity of data in HDFS changes over time. To get better performance and higher disk utilization, the replication policy of HDFS should be elastic and adapt to data popularity. In this paper, we describe ERMS, an elastic replication management system for HDFS. ERMS provides an active/standby storage model for HDFS. It utilizes a complex event processing engine to distinguish real-time data types, and then dynamically creates extra replicas for hot data, cleans up these extra replicas when the data cools down, and uses erasure codes for cold data. ERMS also introduces a replica placement strategy for the extra replicas of hot data and erasure coding parities. Our experiments show that ERMS effectively improves the reliability and performance of HDFS and reduces storage overhead.
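A minimal sketch of the elastic-replication idea follows; the popularity thresholds and the hot/warm/cold mapping are invented for illustration, not taken from ERMS. The hdfs dfs -setrep command shown in the output is HDFS's real mechanism for changing a file's replication factor.

```python
# Sketch of an ERMS-style elastic replication policy (thresholds invented).
# "hdfs dfs -setrep" is a real HDFS command; here we only print it.
def target_replication(accesses_per_hour, base=3, hot=100, cold=1):
    """Map observed popularity to a replication factor."""
    if accesses_per_hour >= hot:
        return base + 3   # hot data: extra replicas for read bandwidth
    if accesses_per_hour <= cold:
        return 1          # cold data: single replica, protected by erasure codes
    return base           # warm data: HDFS default

for path, rate in [("/logs/today", 250), ("/logs/2010", 0), ("/logs/week", 20)]:
    rep = target_replication(rate)
    print(f"hdfs dfs -setrep -w {rep} {path}")
```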


Grid Resource Management | 2004

GARA: a uniform quality of service architecture

Alain Roy; Volker Sander

Many Grid applications, such as interactive and collaborative environments, can benefit from guarantees for resource performance or quality of service (QoS). Although QoS mechanisms have been developed for different types of resources, they are often difficult to use together because they have different semantics and interfaces. Moreover, many of them do not allow QoS requests to be made in advance of when they are needed. In this chapter, we describe GARA, which is a modular and extensible QoS architecture that allows users to make advance reservations for different types of QoS. We also describe our implementation of network QoS in detail.
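Advance reservation implies admission control: a new request may be granted only if, together with already-granted reservations, it never exceeds capacity during its window. The toy check below illustrates that logic; it is not GARA's implementation.

```python
# Toy admission-control check for advance reservations (not GARA code):
# accept a new request only if capacity is never exceeded in its window.
def admissible(existing, start, end, amount, capacity):
    """existing: list of (start, end, amount) already-granted reservations."""
    # Load only changes at reservation boundaries, so checking each
    # boundary point inside the requested window is sufficient.
    points = {start} | {s for s, e, a in existing if start <= s < end}
    for t in points:
        load = sum(a for s, e, a in existing if s <= t < e)
        if load + amount > capacity:
            return False
    return True

granted = [(0, 10, 60), (5, 15, 30)]           # (start, end, bandwidth)
print(admissible(granted, 8, 12, 20, 100))     # False: 60 + 30 + 20 > 100
print(admissible(granted, 12, 20, 20, 100))    # True: only 30 + 20 in window
```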


Grid Resource Management | 2004

Condor and preemptive resume scheduling

Alain Roy; Miron Livny

Condor is a batch job system that, unlike many other scheduling systems, allows users to access both dedicated computers and computers that are not always available, perhaps because they are used as desktop computers or are not under local control. This introduces a number of problems, some of which are solved by Condor's preemptive resume scheduling, the focus of this chapter. Preemptive resume scheduling allows jobs to be interrupted while running and then restarted later. Condor uses preemption in several ways to implement the policies supplied by users, computer owners, and system administrators.
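The essence of preemptive resume scheduling is that a vacated job loses only the work since its last checkpoint. The toy simulation below illustrates the idea; it is not Condor's implementation, and real Condor checkpoints the process image rather than an application-level counter.

```python
# Toy illustration (not Condor's code) of preemptive resume:
# a job checkpoints its progress, so preemption loses no completed work.
class Job:
    def __init__(self, total_steps):
        self.total = total_steps
        self.checkpoint = 0          # last committed step

    def run(self, steps_available):
        """Run until done or until the machine is reclaimed."""
        end = min(self.total, self.checkpoint + steps_available)
        for step in range(self.checkpoint, end):
            pass                     # real work would happen here
        self.checkpoint = end        # persist progress before vacating
        return self.checkpoint == self.total

job = Job(total_steps=100)
done = job.run(40)            # owner reclaims the desktop after 40 steps
done = job.run(70)            # later: resumes from step 40, finishes at 100
print(done, job.checkpoint)   # True 100
```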


Grid Computing | 2011

A Science Driven Production Cyberinfrastructure: the Open Science Grid

Mine Altunay; P. Avery; K. Blackburn; Brian Bockelman; M. Ernst; Dan Fraser; Robert Quick; Robert Gardner; Sebastien Goasguen; Tanya Levshina; Miron Livny; John McGee; Doug Olson; R. Pordes; Maxim Potekhin; Abhishek Singh Rana; Alain Roy; Chander Sehgal; I. Sfiligoi; Frank Wuerthwein

This article describes the Open Science Grid, a large distributed computational infrastructure in the United States that supports many different high-throughput scientific applications and partners (federates) with other infrastructures, nationally and internationally, to form multi-domain integrated distributed systems for science. The Open Science Grid consortium not only provides services and software to an increasingly diverse set of scientific communities, but also fosters a collaborative team of practitioners and researchers who use, support, and advance the state of the art in large-scale distributed computing. The scale of the infrastructure can be expressed by its daily throughput: around seven hundred thousand jobs, just under a million hours of computing, a million file transfers, and half a petabyte of data movement. In this paper we introduce and reflect on some of the OSG's capabilities, usage, and activities.


collaborative computing | 2008

Archer: A Community Distributed Computing Infrastructure for Computer Architecture Research and Education

Renato J. O. Figueiredo; P. Oscar Boykin; José A. B. Fortes; Tao Li; Jih-Kwon Peir; David Wolinsky; Lizy Kurian John; David R. Kaeli; David J. Lilja; Sally A. McKee; Gokhan Memik; Alain Roy; Gary S. Tyson

This paper introduces Archer, a community-based computing infrastructure supporting computer architecture research and education. The Archer system builds on virtualization techniques to provide a collaborative environment that facilitates sharing of computational resources and data among users. It integrates batch scheduling middleware to deliver high-throughput computing services aggregated from resources distributed across wide-area networks and owned by different participating entities in a seamless manner. The paper discusses the motivations that have led to the design of Archer, describes its core middleware components, and presents an analysis of the functionality and performance of the first wide-area deployment of Archer running a representative computer architecture simulation workload.


Concurrency and Computation: Practice and Experience | 2006

Transparent access to Grid resources for user software

S. Klous; Jaime Frey; Se-Chang Son; Douglas Thain; Alain Roy; Miron Livny; Jo van den Brand

Grid computing promises access to large amounts of computing power, but so far adoption of Grid computing has been limited to highly specialized experts, for three reasons. First, users are accustomed to batch systems, and interfaces to Grid software are often complex and different from those of batch systems. Second, users are accustomed to transparent file access, which Grid software does not conveniently provide. Third, efforts to achieve widespread coordination of computers while solving the first two problems are hampered when clusters are on private networks. Here we bring together a variety of software that allows users to use Grid resources almost transparently, as if they were local resources, while providing transparent access to files, even when private networks intervene. As a motivating example, the BaBar Monte Carlo production system is deployed on a truly distributed environment, the European DataGrid, without any modification to the application itself.
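The transparent file access requirement can be pictured as an interposition layer that routes local paths to the OS and remote names to a Grid transfer mechanism, so the application itself stays unmodified. The sketch below is an invented illustration of that routing idea (using HTTP as a stand-in for a Grid protocol), not the system described in the paper.

```python
# Sketch of the interposition idea (names invented; not the paper's code):
# one open() entry point that routes local paths to the OS and remote
# URLs to a remote-I/O mechanism, so applications need no changes.
import io
import urllib.request

def grid_open(path, mode="r"):
    if path.startswith(("http://", "https://")):
        # Stand-in for a real remote-I/O protocol (e.g. GridFTP).
        return io.TextIOWrapper(urllib.request.urlopen(path))
    return open(path, mode)    # plain local file access

# An unmodified application calls grid_open() exactly like open():
# with grid_open("https://example.org/events.dat") as f:
#     header = f.readline()
```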

Collaboration


Dive into Alain Roy's collaborations.

Top Co-Authors

Miron Livny
University of Wisconsin-Madison

P. Avery
University of Florida

Doug Olson
Lawrence Berkeley National Laboratory

F. Würthwein
University of California

Ian T. Foster
Argonne National Laboratory

Jaime Frey
University of Wisconsin-Madison

John McGee
Renaissance Computing Institute

K. Blackburn
California Institute of Technology