Kurt Stockinger
Lawrence Berkeley National Laboratory
Publication
Featured research published by Kurt Stockinger.
conference on high performance computing (supercomputing) | 2002
Ann L. Chervenak; Ewa Deelman; Ian T. Foster; Leanne Guy; Wolfgang Hoschek; Adriana Iamnitchi; Carl Kesselman; Peter Z. Kunszt; Matei Ripeanu; Bob Schwartzkopf; Heinz Stockinger; Kurt Stockinger; Brian Tierney
In wide area computing systems, it is often desirable to create remote read-only copies (replicas) of files. Replication can be used to reduce access latency, improve data locality, and/or increase robustness, scalability and performance for distributed applications. We define a replica location service (RLS) as a system that maintains and provides access to information about the physical locations of copies. An RLS typically functions as one component of a data grid architecture. This paper makes the following contributions. First, we characterize RLS requirements. Next, we describe a parameterized architectural framework, which we name Giggle (for GIGa-scale Global Location Engine), within which a wide range of RLSs can be defined. We define several concrete instantiations of this framework with different performance characteristics. Finally, we present initial performance results for an RLS prototype, demonstrating that RLS systems can be constructed that meet performance goals.
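The two-tier structure described above can be sketched as a minimal in-memory model: per-site local catalogs map logical file names to physical locations, and a global index records which catalogs hold a replica. This is only an illustration of the idea, not the Giggle implementation; all class and site names are invented, and a real RLS would use partitioned, soft-state indexes rather than a single map.

```python
from collections import defaultdict

class LocalReplicaCatalog:
    """Per-site catalog: logical file name -> physical file names at this site."""
    def __init__(self, site):
        self.site = site
        self.mappings = defaultdict(set)

    def register(self, lfn, pfn):
        self.mappings[lfn].add(pfn)

    def lookup(self, lfn):
        return self.mappings.get(lfn, set())

class ReplicaLocationIndex:
    """Global index: logical file name -> catalogs known to hold a replica.
    A single in-memory map stands in for the distributed index here."""
    def __init__(self):
        self.known = defaultdict(set)

    def notify(self, lfn, catalog):
        self.known[lfn].add(catalog)

    def locate(self, lfn):
        pfns = set()
        for catalog in self.known.get(lfn, ()):
            pfns |= catalog.lookup(lfn)
        return pfns

# Register one replica at each of two (hypothetical) sites, then resolve.
index = ReplicaLocationIndex()
cern, lbl = LocalReplicaCatalog("cern"), LocalReplicaCatalog("lbl")
for cat, pfn in [(cern, "gsiftp://cern/f1"), (lbl, "gsiftp://lbl/f1")]:
    cat.register("lfn:f1", pfn)
    index.notify("lfn:f1", cat)
print(sorted(index.locate("lfn:f1")))  # → ['gsiftp://cern/f1', 'gsiftp://lbl/f1']
```

The split matters for scale: lookups touch the compact global index first, and only the catalogs that actually hold the file are queried for physical names.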
ieee international conference on high performance computing data and analytics | 2003
William H. Bell; David G. Cameron; A. Paul Millar; Luigi Capozza; Kurt Stockinger; Floriano Zini
Computational grids process large, computationally intensive problems on small data sets. In contrast, data grids process large computational problems that in turn require evaluating, mining and producing large amounts of data. Replication, creating geographically disparate identical copies of data, is regarded as one of the major optimization techniques for reducing data access costs. In this paper, several replication algorithms are discussed. These algorithms were studied using the Grid simulator OptorSim. OptorSim provides a modular framework within which optimization strategies can be studied under different Grid configurations. The goal is to explore the stability and transient behaviour of selected optimization techniques. We detail the design and implementation of OptorSim and analyze various replication algorithms based on different Grid workloads.
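A replication algorithm of the kind compared here must repeatedly decide whether fetching a file into a full site cache is worth evicting another replica. The sketch below uses recent access counts as a stand-in for the cost/benefit estimates an OptorSim-style optimiser would compute; the function and its value metric are illustrative assumptions, not an algorithm from the paper.

```python
def decide_replication(requested, cache, capacity, access_history):
    """Decide whether to replicate `requested` (unit-sized files for
    simplicity) into a site cache, evicting the least valuable file.
    Value = recent access count, a placeholder for a real cost model."""
    if requested in cache:
        return cache  # already local, nothing to do
    if len(cache) < capacity:
        return cache | {requested}  # free space: always replicate
    victim = min(cache, key=lambda f: access_history.get(f, 0))
    if access_history.get(requested, 0) > access_history.get(victim, 0):
        return (cache - {victim}) | {requested}
    return cache  # replication not worthwhile, read remotely instead

hist = {"a": 5, "b": 1, "c": 9}
print(sorted(decide_replication("c", {"a", "b"}, 2, hist)))  # → ['a', 'c']
```

The interesting case is the last branch: unlike plain LRU, the strategy can refuse to replicate at all when the incoming file is predicted to be less useful than everything already cached.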
cluster computing and the grid | 2003
William H. Bell; David G. Cameron; R. Carvajal-Schiaffino; A. P. Millar; Kurt Stockinger; Floriano Zini
Optimising the use of Grid resources is critical for users to effectively exploit a Data Grid. Data replication is considered a major technique for reducing data access cost to Grid jobs. This paper evaluates a novel replication strategy, based on an economic model, that optimises both the selection of replicas for running jobs and the dynamic creation of replicas in Grid sites. In our model, optimisation agents are located on Grid sites and use an auction protocol for selecting the optimal replica of a data file and a prediction function to make informed decisions about local data replication. We evaluate our replication strategy with OptorSim, a Data Grid simulator developed by the authors. The experiments show that our proposed strategy results in a notable improvement over traditional replication strategies in a Grid environment.
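The auction idea can be illustrated with a one-round sealed-bid sketch: each site holding a replica bids its estimated delivery cost, and the requester picks the cheapest. The cost model (transfer time as file size over bandwidth), the site records, and all names are invented for illustration; the paper's protocol and prediction function are richer than this.

```python
def auction_replica(lfn, sites, requester):
    """Sealed-bid auction for the cheapest replica of `lfn`.
    Each holding site bids size / bandwidth to the requester;
    returns the winning site name, or None if no site has the file."""
    bids = {}
    for site in sites:
        if lfn in site["replicas"]:
            bids[site["name"]] = site["file_size"] / site["bandwidth_to"][requester]
    if not bids:
        return None
    return min(bids, key=bids.get)

sites = [
    {"name": "cern", "replicas": {"f1"}, "file_size": 100.0,
     "bandwidth_to": {"ral": 10.0}},   # bid: 10.0
    {"name": "fnal", "replicas": {"f1"}, "file_size": 100.0,
     "bandwidth_to": {"ral": 25.0}},   # bid: 4.0
]
print(auction_replica("f1", sites, "ral"))  # → fnal
```

Because bids encode each site's own view of its costs, optimisation stays local to the agents while the requester still obtains a globally reasonable choice.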
grid computing | 2002
William H. Bell; David G. Cameron; Luigi Capozza; A. Paul Millar; Kurt Stockinger; Floriano Zini
Computational Grids normally deal with large computationally intensive problems on small data sets. In contrast, Data Grids mostly deal with large computational problems that in turn require evaluating and mining large amounts of data. Replication is regarded as one of the major optimisation techniques for providing fast data access. Within this paper, several replication algorithms are studied. This is achieved using the Grid simulator OptorSim. OptorSim provides a modular framework within which optimisation strategies can be studied under different Grid configurations. The goal is to explore the stability and transient behaviour of selected optimisation techniques.
latin american web congress | 2003
David G. Cameron; R. Carvajal-Schiaffino; A.P. Millar; Caitriana Nicholson; Kurt Stockinger; Floriano Zini
Grid computing is fast emerging as the solution to the problems posed by the massive computational and data handling requirements of many current international scientific projects. Simulation of the grid environment is important to evaluate the impact of potential data handling strategies before being deployed on the grid. We look at the effects of various job scheduling and data replication strategies and compare them in a variety of grid scenarios, evaluating several performance metrics. We use the grid simulator OptorSim, and base our simulations on a world-wide grid testbed for data intensive high energy physics experiments. Our results show that the choice of scheduling and data replication strategies can have a large effect on both job throughput and the overall consumption of grid resources.
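A scheduler in this setting trades off site workload against data access cost. The toy cost function below (queue length plus a flat fetch cost per missing file) is an assumption made for illustration, not one of the paper's evaluated strategies, but it shows why combining the two terms can beat scheduling on either alone.

```python
def schedule_job(job_files, sites):
    """Pick the site minimising queue length plus the cost of
    fetching files the site does not hold locally. Weights are
    arbitrary placeholders for a real estimator."""
    def cost(site):
        missing = [f for f in job_files if f not in site["replicas"]]
        return site["queue_length"] + sum(site["fetch_cost"] for _ in missing)
    return min(sites, key=cost)["name"]

sites = [
    {"name": "busy_local", "replicas": {"f1", "f2"},
     "queue_length": 4, "fetch_cost": 3},   # cost: 4 + 0
    {"name": "idle_remote", "replicas": set(),
     "queue_length": 1, "fetch_cost": 3},   # cost: 1 + 6
]
print(schedule_job(["f1", "f2"], sites))  # → busy_local
```

Here data locality outweighs the longer queue; a workload-only scheduler would have sent the job to the idle site and paid two remote fetches.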
Lawrence Berkeley National Laboratory | 2009
Kesheng Wu; Sean Ahern; Edward W Bethel; Jacqueline H. Chen; Hank Childs; E. Cormier-Michel; Cameron Geddes; Junmin Gu; Hans Hagen; Bernd Hamann; Wendy S. Koegler; Jerome Lauret; Jeremy S. Meredith; Peter Messmer; Ekow J. Otoo; V Perevoztchikov; A. M. Poskanzer; Prabhat; Oliver Rübel; Arie Shoshani; Alexander Sim; Kurt Stockinger; Gunther H. Weber; W. M. Zhang
As scientific instruments and computer simulations produce more and more data, the task of locating the essential information to gain insight becomes increasingly difficult. FastBit is an efficient software tool to address this challenge. In this article, we present a summary of the key underlying technologies, namely bitmap compression, encoding, and binning. Together these techniques enable FastBit to answer structured (SQL) queries orders of magnitude faster than popular database systems. To illustrate how FastBit is used in applications, we present three examples involving a high-energy physics experiment, a combustion simulation, and an accelerator simulation. In each case, FastBit significantly reduces the response time and enables interactive exploration on terabytes of data.
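Of the three techniques named above, binning is the easiest to sketch: one bitmap per value bin, with bit i set when row i falls in that bin, so a range query becomes a bitwise OR over whole bins. This toy uses plain Python integers as uncompressed bitmaps and only handles thresholds that coincide with bin edges; FastBit additionally compresses the bitmaps and scans boundary bins, none of which is shown here.

```python
def build_binned_bitmaps(values, bin_edges):
    """One bitmap per bin: bit i is set iff values[i] lies in that bin.
    Bins are [-inf, e0), [e0, e1), ..., [e_last, inf)."""
    bitmaps = [0] * (len(bin_edges) + 1)
    for i, v in enumerate(values):
        b = sum(v >= e for e in bin_edges)  # index of the bin containing v
        bitmaps[b] |= 1 << i
    return bitmaps

def query_ge(bitmaps, bin_edges, threshold):
    """Row ids with value >= threshold, via OR of the bins at or above it.
    Simplification: threshold must be one of the bin edges, so no
    boundary-bin scan ("candidate check") is needed."""
    first = bin_edges.index(threshold) + 1
    result = 0
    for bm in bitmaps[first:]:
        result |= bm
    return [i for i in range(result.bit_length()) if result >> i & 1]

vals = [3, 12, 7, 25, 18]
bms = build_binned_bitmaps(vals, [10, 20])
print(query_ge(bms, [10, 20], 20))  # → [3] (only 25 is >= 20)
```

The query cost depends on the number of bins touched, not the number of rows, which is where the speedup over row-by-row scans comes from.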
high performance distributed computing | 2001
Dirk Düllmann; Wolfgang Hoschek; Javier Jaen-Martinez; Ben Segal; Asad Samar; Heinz Stockinger; Kurt Stockinger
Data grids are currently proposed as solutions to large-scale data management problems, including efficient file transfer and replication. Large amounts of data and the world-wide distribution of data stores contribute to the complexity of the data management challenge. Recent architecture proposals and prototypes deal with replication of read-only files but do not address the replica synchronisation problem. We propose a new data grid service, called the Grid Consistency Service (GCS), that sits on top of existing data grid services and allows for replica update synchronisation and consistency maintenance. We give models for the different levels of consistency provided to the Grid user, and discuss how they can be included in a replica consistency service for a data grid.
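The notion of graded consistency levels can be sketched as follows. The level names and the write logic are placeholders invented for this sketch, not the GCS paper's actual definitions: the point is only that stronger levels do more work at write time, while weaker levels defer propagation.

```python
from enum import Enum

class Consistency(Enum):
    """Illustrative consistency levels, weakest to strongest.
    Names are placeholders, not the paper's level definitions."""
    POSSIBLY_STALE = 0   # readers may see old versions indefinitely
    EVENTUAL = 1         # updates propagate asynchronously
    SYNCHRONOUS = 2      # all replicas updated before the write returns

def write(replicas, key, value, level):
    """Apply a write under the chosen level: a synchronous write updates
    every replica before returning; weaker levels update the primary
    and return the remaining updates as a pending queue (queue
    processing not shown)."""
    replicas[0][key] = value  # primary always updated
    if level is Consistency.SYNCHRONOUS:
        for r in replicas[1:]:
            r[key] = value
        return []
    return [(r, key, value) for r in replicas[1:]]

primary, mirror = {}, {}
write([primary, mirror], "f1", "v2", Consistency.SYNCHRONOUS)
print(mirror["f1"])  # → v2
```

Under the weaker levels the returned pending list is exactly the synchronisation debt a GCS-like service would have to drain in the background.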
Journal of Grid Computing | 2004
David G. Cameron; A. P. Millar; C. Nicholson; R. Carvajal-Schiaffino; Kurt Stockinger; Floriano Zini
Many current international scientific projects are based on large-scale applications that are both computationally complex and require the management of large amounts of distributed data. Grid computing is fast emerging as the solution to the problems posed by these applications. To evaluate the impact of resource optimisation algorithms, simulation of the Grid environment can be used to achieve important performance results before any algorithms are deployed on the Grid. In this paper, we study the effects of various job scheduling and data replication strategies and compare them in a variety of Grid scenarios using several performance metrics. We use the Grid simulator OptorSim, and base our simulations on a world-wide Grid testbed for data intensive high energy physics experiments. Our results show that scheduling algorithms which take into account both the file access cost of jobs and the workload of computing resources are the most effective at optimising computing and storage resources as well as improving the job throughput. The results also show that, in most cases, the economy-based replication strategies which we have developed improve the Grid performance under changing network loads.
cluster computing and the grid | 2002
Mark James Carman; Floriano Zini; Luciano Serafini; Kurt Stockinger
We are working on a system for the optimised access and replication of data on a Data Grid. Our approach is based on the use of an economic model that includes the actors and the resources in the Grid. Optimisation is obtained via interaction of the actors in the model, whose goals are maximising the profits and minimising the costs of data resource management. In the system, local optimisation results in global optimisation through emergent marketplace behaviour. In this paper we give an overview of our model and present part of the complex economic reasoning required to support this desired marketplace interaction model.
Archive | 2005
C. Nicholson; R. Carvajal-Schiaffino; Kurt Stockinger; Paul Millar; Floriano Zini; David G. Cameron
In large-scale Grids, the replication of files to different sites is an important data management mechanism which can reduce access latencies and give improved usage of resources such as network bandwidth, storage and computing power. In the search for an optimal data replication strategy, the Grid simulator OptorSim was developed as part of the European DataGrid project. Simulations of various HEP Grid scenarios have been undertaken using different job scheduling and file replication algorithms, with the experimental emphasis being on physics analysis use-cases. Previously, the CMS Data Challenge 2002 testbed and UK GridPP testbed were among those simulated; recently, our focus has been on the LCG testbed. A novel economy-based strategy has been investigated as well as more traditional methods, with the economic models showing distinct advantages for heavily loaded grids.