
Publication


Featured research published by David Pease.


IBM Systems Journal | 2003

IBM Storage Tank: a heterogeneous scalable SAN file system

Jai Menon; David Pease; Robert M. Rees; Linda Marie Duyanovich; Bruce Light Hillsberg

As the amount of data being stored in the open systems environment continues to grow, new paradigms for the attachment and management of data and the underlying storage of the data are emerging. One of the emerging technologies in this area is the storage area network (SAN). Using a SAN to connect large amounts of storage to large numbers of computers gives us the potential for new approaches to accessing, sharing, and managing our data and storage. However, existing operating systems and file systems are not built to exploit these new capabilities. IBM Storage Tank™ is a SAN-based distributed file system and storage management solution that enables many of the promises of SANs, including shared heterogeneous file access, centralized management, and enterprise-wide scalability. In addition, Storage Tank borrows policy-based storage and data management concepts from mainframe computers and makes them available in the open systems environment. This paper explores the goals of the Storage Tank project, the architecture used to achieve these goals, and the current and future plans for the technology.


IBM Systems Journal | 2003

Beyond backup toward storage management

Michael Allen Kaczmarski; Tricia Jiang; David Pease

The IBM Tivoli Storage Manager, a client/server product providing backup, archive, and space management functions in heterogeneous distributed environments, performs extensive storage management after client data have reached the server. Beyond minimizing the amount of data that a client needs to send on successive backup operations, Tivoli Storage Manager optimizes data placement for disaster recovery, for restore operations, and for fault tolerant access. It also adapts to changes in device technology. The original design points of the product in research have been expanded to provide a comprehensive set of functions that not only facilitate backup but also support content managers and deep storage applications. The design points and functions are described in this paper.
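The "minimizing the amount of data that a client needs to send on successive backup operations" idea above is the progressive-incremental model. A minimal sketch of that selection step, with hypothetical function and variable names (not the actual Tivoli Storage Manager code):

```python
# Illustrative sketch of progressive-incremental backup selection (hypothetical
# names, not the Tivoli Storage Manager implementation): only files whose size
# or modification time differ from the server's inventory are sent again.

def select_for_backup(client_files, server_inventory):
    """Return the client files that must be sent to the server.

    client_files:      {path: (size, mtime)} as seen on the client now
    server_inventory:  {path: (size, mtime)} recorded at the last backup
    """
    to_send = []
    for path, attrs in client_files.items():
        if server_inventory.get(path) != attrs:   # new or changed file
            to_send.append(path)
    return sorted(to_send)

client = {"/etc/hosts": (120, 1000), "/home/a.txt": (42, 2000), "/home/b.txt": (7, 2500)}
server = {"/etc/hosts": (120, 1000), "/home/a.txt": (40, 1900)}
print(select_for_backup(client, server))  # only the changed and the new file are sent
```

Everything the server already holds unchanged is skipped, which is why successive backups transfer far less than a full copy.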


IEEE Conference on Mass Storage Systems and Technologies | 2010

The Linear Tape File System

David Pease; Arnon Amir; Lucas Correia Villa Real; Brian Biskeborn; Michael Richmond; Atsushi Abe

While there are many financial and practical reasons to prefer tape storage over disk for various applications, the difficulty of using tape in a general way is a major inhibitor to its wider usage. We present a file system that takes advantage of a new generation of tape hardware to provide efficient access to tape using standard, familiar system tools and interfaces. The Linear Tape File System (LTFS) makes using tape as easy, flexible, portable, and intuitive as using other removable and sharable media, such as a USB drive.
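The core LTFS idea is a self-describing medium: file content lives on a sequential data partition, while a separate index maps names to extents, so the tape can be browsed like any file system. A toy sketch of that split, with invented structures (the real on-tape format is an XML index on a dedicated index partition):

```python
# Toy model of the LTFS two-partition concept (hypothetical structures, not
# the real LTFS on-tape format): content is appended to a "data partition",
# and an "index partition" maps each name to its (offset, length) extent.

class ToyLTFS:
    def __init__(self):
        self.data = bytearray()   # data partition: append-only byte stream
        self.index = {}           # index partition: name -> (offset, length)

    def write_file(self, name, content):
        self.index[name] = (len(self.data), len(content))
        self.data.extend(content)

    def read_file(self, name):
        off, length = self.index[name]
        return bytes(self.data[off:off + length])

    def listdir(self):
        return sorted(self.index)

tape = ToyLTFS()
tape.write_file("report.txt", b"quarterly numbers")
tape.write_file("logo.png", b"\x89PNG...")
print(tape.listdir())                  # ['logo.png', 'report.txt']
print(tape.read_file("report.txt"))    # b'quarterly numbers'
```

Because the index travels with the medium, any system that understands the format can mount the tape and list its contents without external metadata.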


IEEE International Workshop on Policies for Distributed Systems and Networks | 2005

Policy-based information lifecycle management in a large-scale file system

Mandis Beigi; Murthy V. Devarakonda; Rohit Jain; Marc Adam Kaplan; David Pease; Jim Rubas; Upendra Sharma; Akshat Verma

Policy-based file lifecycle management is important for balancing storage utilization and for regulatory conformance. It poses two important challenges: the need for a simple yet effective policy design, and an implementation that scales to billions of files. This paper describes the design and an innovative implementation technique of policy-based lifecycle management in a prototype built as part of IBM's new SAN file system. The policy specification leverages a key abstraction in the file system called storage pools and its ability to support location independence for files. The policy implementation combines concurrent policy execution with a policy-decision cache to enable scaling to billions of files under normal usage patterns.
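The decision-cache idea scales because placement depends only on a few policy-relevant attributes, not on the file itself: files sharing those attributes reuse one cached decision. A hedged sketch with invented rule and pool names (not the SAN file system implementation):

```python
# Sketch of a policy-decision cache (hypothetical API and pool names): rules
# are evaluated once per distinct attribute key, and every later file with the
# same key hits the cache instead of re-running the rule list.

def make_placement_resolver(rules):
    """rules: ordered list of (predicate, pool_name); first match wins."""
    cache = {}
    stats = {"evaluations": 0}

    def resolve(suffix, owner):
        key = (suffix, owner)             # policy-relevant attributes only
        if key not in cache:
            stats["evaluations"] += 1     # full rule scan happens only here
            cache[key] = next((pool for pred, pool in rules if pred(suffix, owner)),
                              "default-pool")
        return cache[key]

    return resolve, stats

rules = [(lambda s, o: s == ".mp4", "archive-pool"),
         (lambda s, o: o == "db", "gold-pool")]
resolve, stats = make_placement_resolver(rules)
for f in [(".mp4", "alice"), (".mp4", "alice"), (".log", "db"), (".mp4", "alice")]:
    resolve(*f)
print(stats["evaluations"])  # 2 distinct attribute keys, not 4 files
```

Under realistic workloads the number of distinct attribute keys is tiny compared with the number of files, which is where the billions-of-files scaling comes from.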


IEEE Conference on Mass Storage Systems and Technologies | 2005

An architecture for lifecycle management in very large file systems

Akshat Verma; Upendra Sharma; Jim Rubas; David Pease; Marc Adam Kaplan; Rohit Jain; Murthy V. Devarakonda; Mandis Beigi

We present STEPS, a policy-based architecture for lifecycle management (LCM) in a mass-scale distributed file system. The STEPS architecture is designed in the context of IBM's SAN file system (SFS) and leverages the parallelism and scalability offered by SFS, while providing a centralized point of control for policy-based management. The architecture uses novel concepts such as a policy cache and rate-controlled migration for efficient and non-intrusive execution of the LCM functions, while ensuring that it scales to very large numbers of files. The architecture has been implemented and used for lifecycle management in a distributed deployment of SFS with heterogeneous data. We conducted experiments on the implementation to study its performance. We observed that STEPS scales well as both the number and the size of the file objects hosted by SFS increase. The performance study also demonstrated that most of the efficiency of policy execution derives from the policy cache. Further, a rate-control mechanism is necessary to ensure that users are isolated from LCM operations.
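Rate-controlled migration, as named above, can be pictured as a token-bucket budget: the migrator may move only so many bytes per scheduling interval, so foreground users never see migration saturate the I/O path. A minimal sketch under that assumption (invented class and parameter names, not the STEPS code):

```python
# Illustrative token-bucket rate control for background migration (hypothetical
# names): each tick refills a byte budget, and migration stops for the tick
# once the next file would exceed the remaining budget.

class RateControlledMigrator:
    def __init__(self, bytes_per_tick):
        self.budget_per_tick = bytes_per_tick
        self.tokens = bytes_per_tick

    def tick(self):
        self.tokens = self.budget_per_tick   # refill each scheduling interval

    def migrate(self, queue):
        """queue: list of (name, size); moves files in order within budget."""
        moved = []
        while queue and queue[0][1] <= self.tokens:
            name, size = queue.pop(0)
            self.tokens -= size
            moved.append(name)
        return moved

m = RateControlledMigrator(bytes_per_tick=100)
queue = [("a", 60), ("b", 50), ("c", 30)]
print(m.migrate(queue))   # ['a']: 'b' would exceed the 100-byte budget
m.tick()
print(m.migrate(queue))   # ['b', 'c'] move on the next tick
```

Tuning the per-tick budget trades migration latency against interference with user I/O, which matches the isolation result reported in the abstract.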


Distributed Systems Operations and Management | 2003

Eos: An Approach of Using Behavior Implications for Policy-Based Self-Management

Sandeep M. Uttamchandani; Carolyn L. Talcott; David Pease

Systems are becoming exceedingly complex to manage. As such, there is an increasing trend towards developing systems that are self-managing. Policy-based infrastructures have been used to provide a limited degree of automation by associating actions with system events. In the context of self-managing systems, the existing policy-specification model fails to capture the following: a) the impact of a rule on system behavior (behavior implications), which is required for automated decision-making; and b) learning mechanisms for refining the invocation heuristics by monitoring the impact of rules.


IEEE Conference on Mass Storage Systems and Technologies | 2005

Security vs performance: tradeoffs using a trust framework

Aameek Singh; Kaladhar Voruganti; Sandeep Gopisetty; David Pease; Linda Marie Duyanovich; Ling Liu

We present an architecture of a trust framework that can be used to intelligently trade off between security and performance in a SAN file system. The primary idea is to differentiate between the various clients in the system based on their trustworthiness and provide them with differing levels of security and performance. A client's trustworthiness reflects its expected behavior and is evaluated in an online fashion using a customizable trust model. We also describe the interface of the trust framework with an example block-level security solution for an out-of-band virtualization based SAN file system (SAN FS). The proposed framework can be easily extended to provide differential treatment based on data sensitivity, using a configurable parameter of the trust model. This allows associating stringent security requirements with more sensitive data, while trading off security for better performance for less critical data, a situation regularly desired in an enterprise.
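The mechanism above can be pictured as a score-to-tier mapping: an online trust score per client selects how much checking its I/O receives, and a data-sensitivity knob tightens the thresholds. A hedged sketch with invented thresholds, tier names, and update rule (not the paper's trust model):

```python
# Illustrative trust-to-security-level mapping (all thresholds, tier names,
# and the update rule are hypothetical): trusted clients get a fast path,
# untrusted ones get full verification; sensitivity scales the thresholds.

def security_level(trust_score, data_sensitivity=1.0):
    """Higher sensitivity tightens the thresholds; returns a protection tier."""
    effective = trust_score / data_sensitivity
    if effective >= 0.8:
        return "no-extra-checks"        # trusted client, fast path
    if effective >= 0.4:
        return "integrity-checks"       # verify integrity on every block
    return "full-verification"          # integrity plus capability revalidation

def update_trust(score, behaved_well, step=0.1):
    """Online update: reward good behavior, punish violations harder."""
    score += step if behaved_well else -3 * step
    return min(1.0, max(0.0, score))

s = 0.9
print(security_level(s))                        # fast path for a trusted client
s = update_trust(s, behaved_well=False)         # a violation drops the score
print(security_level(s))                        # now every block is checked
print(security_level(s, data_sensitivity=2.0))  # sensitive data is stricter still
```

Asymmetric updates (violations cost more than good behavior earns) are a common choice in trust models so a misbehaving client cannot quickly buy back its fast path.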


IEEE International Conference on Cloud Engineering | 2013

PDS Cloud: Long Term Digital Preservation in the Cloud

Simona Rabinovici-Cohen; John Marberg; Kenneth Nagin; David Pease

The emergence of the cloud and advanced object-based storage services provides opportunities to support novel models for long-term preservation of digital assets. Among the benefits of this approach is leveraging the cloud's inherent scalability and redundancy to dynamically adapt to the evolving needs of digital preservation. PDS Cloud is an OAIS-based preservation-aware storage service employing multiple heterogeneous cloud providers. It materializes the logical concept of a preservation information-object into physical cloud storage objects. Preserved information can be interpreted by deploying virtual appliances in the compute cloud, provisioned with cloud storage data objects together with their designated rendering software. PDS Cloud has a hierarchical data model supporting independent tenants whose assets are organized in multiple aggregations based on content and value. Continuous changes to data objects, life-cycle activities, virtual appliances, and cloud providers are applied in a manner transparent to the client. PDS Cloud is being developed as an infrastructure component of the European Union ENSURE project, where it is used for preservation of medical and financial data.


IEEE Conference on Mass Storage Systems and Technologies | 2014

DedupT: Deduplication for tape systems

Abdullah Gharaibeh; Cornel Constantinescu; Maohua Lu; Ramani R. Routray; Anurag Sharma; Prasenjit Sarkar; David Pease; Matei Ripeanu

Deduplication is a commonly-used technique on disk-based storage pools. However, deduplication has not been used for tape-based pools: tape characteristics, such as high mount and seek times combined with data fragmentation resulting from deduplication create a toxic combination that leads to unacceptably high retrieval times. This work proposes DedupT, a system that efficiently supports deduplication on tape pools. This paper (i) details the main challenges to enable efficient deduplication on tape libraries, (ii) presents a class of solutions based on graph-modeling of similarity between data items that enables efficient placement on tapes; and (iii) presents the design and evaluation of novel cross-tape and on-tape chunk placement algorithms that alleviate tape mount time overhead and reduce on-tape data fragmentation. Using 4.5 TB of real-world workloads, we show that DedupT retains at least 95% of the deduplication efficiency. We show that DedupT mitigates major retrieval time overheads, and, due to reading less data, is able to offer better restore performance compared to the case of restoring non-deduplicated data.
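The graph-modeling idea above can be illustrated with a greedy co-location pass: treat files as nodes, shared chunks as edge weights, and pull the most similar files onto the same tape so a restore does not fan out across mounts. A deliberately simplified sketch (hypothetical heuristic, not the DedupT placement algorithms):

```python
# Simplified similarity-driven tape placement in the spirit of DedupT
# (invented greedy heuristic): files sharing many chunks are co-located so
# deduplicated data does not scatter a restore across many tape mounts.

def shared_chunks(a, b):
    return len(set(a) & set(b))

def place_on_tapes(files, tape_capacity):
    """files: {name: [chunk ids]}; returns a list of tapes (lists of names)."""
    remaining = dict(files)
    tapes = []
    while remaining:
        # seed each tape with the largest remaining file
        seed = max(remaining, key=lambda n: len(remaining[n]))
        tape, chunks = [seed], set(remaining.pop(seed))
        while len(tape) < tape_capacity and remaining:
            # pull in the file sharing the most chunks with this tape so far
            best = max(remaining, key=lambda n: shared_chunks(remaining[n], chunks))
            if shared_chunks(remaining[best], chunks) == 0:
                break                     # nothing similar left; start a new tape
            chunks |= set(remaining.pop(best))
            tape.append(best)
        tapes.append(tape)
    return tapes

files = {"a": [1, 2, 3, 4], "b": [3, 4, 5], "c": [9, 10], "d": [10, 11]}
print(place_on_tapes(files, tape_capacity=2))  # similar pairs share a tape
```

Here "a" and "b" overlap in chunks 3 and 4, and "c" and "d" share chunk 10, so each pair lands on one tape; a placement ignoring similarity could force two mounts per restore.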


IEEE Conference on Mass Storage Systems and Technologies | 2005

A hybrid access model for storage area networks

Aameek Singh; Kaladhar Voruganti; Sandeep Gopisetty; David Pease; Ling Liu

We present HSAN, a hybrid storage area network, which uses both in-band (like NFS (R. Sandberg et al., 1985)) and out-of-band virtualization (like SAN FS (J. Menon et al., 2003)) access models. HSAN uses hybrid servers that can serve as both metadata and NAS servers to intelligently decide the access model for each request, based on the characteristics of the requested data. This is in contrast to existing efforts that merely provide concurrent support for both models and do not exploit model appropriateness for the requested data. The HSAN hybrid model is implemented using low-overhead cache-admission and cache-replacement schemes and aims to improve overall response times for a wide variety of workloads. Preliminary analysis of the hybrid model indicates performance improvements over both models.
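The per-request decision can be sketched as a simple policy: answer small or cache-hot requests in-band, NAS-style, and redirect large transfers out-of-band so clients fetch blocks directly from the SAN. The threshold and predicate below are invented for illustration, not the HSAN decision logic:

```python
# Illustrative per-request access-model choice (hypothetical policy, not the
# HSAN algorithm): small or NAS-cached data is served in-band; large files are
# redirected out-of-band, where the server returns only the block layout.

def choose_access_model(size_bytes, in_nas_cache, inband_limit=64 * 1024):
    if in_nas_cache or size_bytes <= inband_limit:
        return "in-band"        # hybrid server ships the data itself
    return "out-of-band"        # client fetches blocks directly from the SAN

print(choose_access_model(4 * 1024, in_nas_cache=False))          # in-band
print(choose_access_model(10 * 1024 * 1024, in_nas_cache=False))  # out-of-band
print(choose_access_model(10 * 1024 * 1024, in_nas_cache=True))   # cached: in-band
```

The intuition is that in-band service amortizes well for small or hot data, while out-of-band access avoids funneling bulk transfers through the metadata/NAS server.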
