Publication


Featured research published by Ian F. Adams.


ACM Transactions on Storage | 2012

Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories

Ian F. Adams; Mark W. Storer; Ethan L. Miller

The scope of archival systems is expanding beyond cheap tertiary storage: scientific and medical data is increasingly digital, and the public has a growing desire to digitally record their personal histories. Driven by the increase in cost efficiency of hard drives and the rise of the Internet, content archives have become a means of providing the public with fast, cheap access to long-term data. Unfortunately, designers of purpose-built archival systems are forced either to rely on workload behavior obtained from a narrow, anachronistic view of archives as simply cheap tertiary storage, or to extrapolate from marginally related enterprise workload data and traditional library access patterns. To close this knowledge gap and provide relevant input for the design of effective long-term data storage systems, we studied the workload behavior of several systems within this expanded archival storage space. Our study examined several scientific and historical archives, covering a mixture of purposes, media types, and access models (public versus private). Our findings show that, for more traditional private scientific archival storage, files have become larger, but update rates have remained largely unchanged. However, in the public content archives we observed, we saw behavior that diverges from the traditional “write-once, read-maybe” behavior of tertiary storage. Our study shows that the majority of such data is modified (sometimes unnecessarily) relatively frequently, and that indexing services such as Google and internal data management processes may routinely access large portions of an archive, accounting for most of the accesses. Based on these observations, we identify areas for improving the efficiency and performance of archival storage systems.


IEEE International Conference on High Performance Computing Data and Analytics | 2012

Usage behavior of a large-scale scientific archive

Ian F. Adams; Brian A. Madden; Joel Cameron Frank; Mark W. Storer; Ethan L. Miller; Gene Harano

Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of file-level activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Our examination of system usage showed that, while a subset of users were responsible for most of the activity, this activity was widely distributed at the file level. We also show that the physical grouping of files and directories on media can improve archival storage system performance. Based on our observations, we provide suggestions and guidance for both future scientific archival system designs and improved tracing of archival activity.


Modeling, Analysis, and Simulation on Computer and Telecommunication Systems | 2014

An Economic Perspective of Disk vs. Flash Media in Archival Storage

Preeti Gupta; Avani Wildani; Ethan L. Miller; Daniel C. Rosenthal; Ian F. Adams; Christina E. Strong; Andy Hospodor

For three decades, Kryder's law correctly predicted an exponential increase in bit density on disk platters, leading to an exponential drop in cost per gigabyte, and thus to an entrenched expectation that if data could be stored for a few years the incremental cost of storing it forever would be minimal. However, disk now is over 7 times as expensive as Kryder's law would have predicted, and industry projections suggest that in 2020 the gap will reach 200 times, disrupting this expectation. Our model shows that archives based upon alternative media are surprisingly cost competitive with archives based upon traditional disk media over the long term. We propose using Archival Flash for long-term data preservation, with the trade-off between longer data retention period and lower write cycles.


ACM International Conference on Systems and Storage | 2013

Examining extended and scientific metadata for scalable index designs

Aleatha Parker-Wood; Darrell D. E. Long; Brian A. Madden; Ian F. Adams; Michael McThrow; Avani Wildani

While file system metadata is well characterized by a variety of workload studies, scientific metadata is much less well understood. We characterize scientific metadata, in order to better understand the implications for index design. Based on our findings, existing solutions for either file system or scientific search will not suffice for indexing a large scientific file system. We describe the problems with existing solutions, and suggest column stores as an alternative approach.


Modeling, Analysis, and Simulation on Computer and Telecommunication Systems | 2014

PERSES: Data Layout for Low Impact Failures

Avani Wildani; Ethan L. Miller; Ian F. Adams; Darrell D. E. Long

Growth in disk capacity continues to outpace advances in read speed and device reliability. This has led to storage systems spending increasing amounts of time in a degraded state while failed disks reconstruct. Users and applications that do not use the data on the failed or degraded drives are negligibly impacted by the failure, increasing the perceived performance of the system. We leverage this observation with PERSES, a statistical data allocation scheme to reduce the performance impact of reconstruction after disk failure. PERSES reduces degradation from the perspective of the user by clustering data on disks such that data with high probability of co-access is placed on the same device as often as possible. Trace-driven simulations show that, by laying out data with PERSES, we can reduce the perceived time lost due to failure over three years by up to 80% compared to arbitrary allocation.


Modeling, Analysis, and Simulation on Computer and Telecommunication Systems | 2010

Examining Energy Use in Heterogeneous Archival Storage Systems

Ian F. Adams; Ethan L. Miller; Mark W. Storer

Controlling energy usage in data centers, and storage in particular, continues to rise in importance. Many systems and models have examined energy efficiency through intelligent spin-down of disks and novel data layouts, yet little work has been done to examine how power usage over the course of months to years is impacted by the characteristics of the storage devices chosen for use. Long-term power usage is particularly important for archival storage systems, since it is a large contributor to overall system cost. In this work, we begin exploring the impact that broad policies (e.g., utilize high-bandwidth devices first) have upon the power efficiency of a disk-based archival storage system of heterogeneous devices over the course of a year. Using a discrete event simulator, we found that even simple heuristic policies for allocating space can have significant impact on the power usage of a system. We show that our system growth policies can cause power usage to vary from 10% higher to 18% lower than a naive random data allocation scheme. We also found that under low read rates power is dominated by that used in standby modes. Most interestingly, we found cases where concentrating data on fewer devices yielded increased power usage.


Petascale Data Storage Workshop | 2008

Logan: Automatic management for evolvable, large-scale, archival storage

Mark W. Storer; Kevin M. Greenan; Ian F. Adams; Ethan L. Miller; Darrell D. E. Long; Kaladhar Voruganti

Archival storage systems designed to preserve scientific data, business data, and consumer data must maintain and safeguard tens to hundreds of petabytes of data on tens of thousands of media for decades. Such systems are currently designed in the same way as higher-performance, shorter-term storage systems, which have a useful lifetime but must be replaced in their entirety via a “fork-lift” upgrade. Thus, while existing solutions can provide good energy efficiency and relatively low cost, they do not adapt well to continuous improvements in technology, becoming less efficient relative to current technology as they age. In an archival storage environment, this paradigm implies an endless series of wholesale migrations and upgrades to remain efficient and up to date. Our approach, Logan, manages node addition, removal, and failure on a distributed network of intelligent storage appliances, allowing the system to gradually evolve as device technology advances. By automatically handling most of the common administration chores (integrating new devices into the system, managing groups of devices that work together to provide redundancy, and recovering from failed devices), Logan reduces management overhead and thus cost. Logan can also improve cost and space efficiency by identifying and decommissioning outdated devices, thus reducing space and power requirements for the archival storage system.


Modeling, Analysis, and Simulation on Computer and Telecommunication Systems | 2013

Single-Snapshot File System Analysis

Avani Wildani; Ian F. Adams; Ethan L. Miller

Metadata snapshots are a common method for gaining insight into file systems due to their small size and relative ease of acquisition. Since they are static, most researchers have used them for relatively simple analyses such as file size distributions and age of files. We hypothesize that it is possible to gain much richer insights into file system and user behavior by clustering features in metadata snapshots and comparing the entropy within clusters to the entropy within natural partitions such as directory hierarchies. We discuss several different methods for gaining deeper insights into metadata snapshots, and show a small proof of concept using data from Los Alamos National Laboratories. In our initial work, we see evidence that it is possible to identify user locality information, traditionally the purview of dynamic traces, using a single static snapshot.
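
The entropy comparison described in the abstract above can be illustrated with a small sketch. This is not the authors' code: the record fields (`dir`, `cluster`, `owner`), the choice of file owner as the feature, and the precomputed cluster labels are all assumptions made for illustration. It computes the size-weighted mean Shannon entropy of one feature within each group, once for directories and once for clusters.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mean_group_entropy(records, group_key, feature):
    """Size-weighted mean entropy of `feature` within each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[feature])
    n = len(records)
    return sum(len(vals) / n * entropy(vals) for vals in groups.values())

# Hypothetical snapshot rows; the cluster labels are assumed to come from
# some clustering step that is not shown here.
snapshot = [
    {"dir": "/proj/a", "cluster": 0, "owner": "alice"},
    {"dir": "/proj/a", "cluster": 1, "owner": "bob"},
    {"dir": "/proj/b", "cluster": 0, "owner": "alice"},
    {"dir": "/proj/b", "cluster": 1, "owner": "bob"},
]

dir_entropy = mean_group_entropy(snapshot, "dir", "owner")
cluster_entropy = mean_group_entropy(snapshot, "cluster", "owner")
```

In this toy data the clusters separate owners perfectly while the directories mix them, so the cluster partition has lower entropy than the directory partition. A lower within-cluster entropy is the kind of signal the abstract suggests may reveal user locality.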


Modeling, Analysis, and Simulation on Computer and Telecommunication Systems | 2013

Validating Storage System Instrumentation

Ian F. Adams; Mark W. Storer; Avani Wildani; Ethan L. Miller; Brian A. Madden

There is a large body of work (such as system administration and intrusion detection) that relies upon storage system logs and snapshots. These solutions rely on accurate system records; however, little effort has been made to verify the correctness of logging instrumentation and log reliability. We present a solution, called ExDiff, that uses expectation differencing to validate storage system logs. Our solution can identify development errors such as the omission of a logging point and runtime errors such as log crashes. ExDiff uses metadata snapshots and activity logs to predict the expected state of the system and compares that with the system's actual state. Mismatches between the expected and actual metadata states can then be used to highlight gaps in log coverage, as well as aid in identifying specific types of missing entries. We show that ExDiff provides valuable insight to system designers, administrators and researchers by accurately identifying gaps in log coverage, providing clues useful in isolating specific types of missing log entries, and highlighting potential misunderstandings in logged action.
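
A minimal sketch of the expectation-differencing idea from the abstract above, under simplifying assumptions: a snapshot is modeled as a mapping from path to file size, and the log holds only `create`, `write`, and `delete` entries. The function names, the log format, and the gap labels are hypothetical and are not taken from ExDiff itself.

```python
def replay(snapshot, log):
    """Predict the expected state by applying logged operations to an earlier snapshot."""
    state = dict(snapshot)
    for op, path, size in log:
        if op in ("create", "write"):
            state[path] = size
        elif op == "delete":
            state.pop(path, None)
    return state

def expectation_diff(expected, actual):
    """Classify mismatches between the expected and the actual metadata state."""
    return {
        # Expected to exist but absent: a delete was probably not logged.
        "unlogged_delete": sorted(p for p in expected if p not in actual),
        # Exists but was never expected: a create was probably not logged.
        "unlogged_create": sorted(p for p in actual if p not in expected),
        # Present in both with different sizes: a write was probably not logged.
        "unlogged_write": sorted(
            p for p in expected if p in actual and expected[p] != actual[p]
        ),
    }

# Hypothetical data: an earlier snapshot, the activity log, and a later snapshot.
before = {"/a": 10, "/b": 20}
log = [("write", "/a", 15), ("create", "/c", 5)]
after = {"/a": 15, "/c": 7, "/d": 1}

gaps = expectation_diff(replay(before, log), after)
```

Each non-empty list in `gaps` points at a type of log entry that appears to be missing, which mirrors how the abstract describes using mismatches to highlight gaps in log coverage.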


IEEE International Conference on Cloud Computing Technology and Science | 2009

Maximizing efficiency by trading storage for computation

Ian F. Adams; Darrell D. E. Long; Ethan L. Miller; Shankar Pasupathy; Mark W. Storer

Collaboration

Top Co-Authors

Avani Wildani

University of California

Mark W. Storer

University of California
