Is this you? Create Your Porfile

Mark W. Storer

University of California, Santa Cruz

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Mark W. Storer is active.

Explore More

Publication

Featured researches published by Mark W. Storer.

workshop on storage security and survivability | 2008

Secure data deduplication

Mark W. Storer; Kevin M. Greenan; Darrell D. E. Long; Ethan L. Miller

As the world moves to digital storage for archival purposes, there is an increasing demand for systems that can provide secure data storage in a cost-effective manner. By identifying common chunks of data both within and between files and storing them only once, deduplication can yield cost savings by increasing the utility of a given amount of storage. Unfortunately, deduplication exploits identical content, while encryption attempts to make all content appear random; the same content encrypted with two different keys results in very different ciphertext. Thus, combining the space efficiency of deduplication with the secrecy aspects of encryption is problematic. We have developed a solution that provides both data security and space efficiency in single-server storage and distributed storage systems. Encryption keys are generated in a consistent manner from the chunk data; thus, identical chunks will always encrypt to the same ciphertext. Furthermore, the keys cannot be deduced from the encrypted chunk data. Since the information each user needs to access and decrypt the chunks that make up a file is encrypted using a key known only to the user, even a full compromise of the system cannot reveal which chunks are used by which users.

ACM Transactions on Storage | 2009

POTSHARDS—a secure, recoverable, long-term archival storage system

Mark W. Storer; Kevin M. Greenan; Ethan L. Miller; Kaladhar Voruganti

Users are storing ever-increasing amounts of information digitally, driven by many factors including government regulations and the publics desire to digitally record their personal histories. Unfortunately, many of the security mechanisms that modern systems rely upon, such as encryption, are poorly suited for storing data for indefinitely long periods of time; it is very difficult to manage keys and update cryptosystems to provide secrecy through encryption over periods of decades. Worse, an adversary who can compromise an archive need only wait for cryptanalysis techniques to catch up to the encryption algorithm used at the time of the compromise in order to obtain “secure” data. To address these concerns, we have developed POTSHARDS, an archival storage system that provides long-term security for data with very long lifetimes without using encryption. Secrecy is achieved by using unconditionally secure secret splitting and spreading the resulting shares across separately managed archives. Providing availability and data recovery in such a system can be difficult; thus, we use a new technique, approximate pointers, in conjunction with secure distributed RAID techniques to provide availability and reliability across independent archives. To validate our design, we developed a prototype POTSHARDS implementation. In addition to providing us with an experimental testbed, this prototype helped us to understand the design issues that must be addressed in order to maximize security.

workshop on storage security and survivability | 2006

Long-term threats to secure archives

Mark W. Storer; Kevin M. Greenan; Ethan L. Miller

Archival storage systems are designed for a write-once, read-maybe usage model which places an emphasis on the long-term preservation of their data contents. In contrast to traditional storage systems in which data lifetimes are measured in months or possibly years, data lifetimes in an archival system are measured in decades. Secure archival storage has the added goal of providing controlled access to its long-term contents. In contrast, public archival systems aim to ensure that their contents are available to anyone.Since secure archival storage systems must store data over much longer periods of time, new threats emerge that affect the security landscape in many novel, subtle ways. These security threats endanger the secrecy, availability and integrity of the archival storage contents. Adequate understanding of these threats is essential to effectively devise new policies and mechanisms to guard against them. We discuss many of these threats in this new context to fill this gap, and show how existing systems meet (or fail to meet) these threats.

ACM Transactions on Storage | 2012

Analysis of Workload Behavior in Scientific and Historical Long-Term Data Repositories

Ian F. Adams; Mark W. Storer; Ethan L. Miller

The scope of archival systems is expanding beyond cheap tertiary storage: scientific and medical data is increasingly digital, and the public has a growing desire to digitally record their personal histories. Driven by the increase in cost efficiency of hard drives, and the rise of the Internet, content archives have become a means of providing the public with fast, cheap access to long-term data. Unfortunately, designers of purpose-built archival systems are either forced to rely on workload behavior obtained from a narrow, anachronistic view of archives as simply cheap tertiary storage, or extrapolate from marginally related enterprise workload data and traditional library access patterns. To close this knowledge gap and provide relevant input for the design of effective long-term data storage systems, we studied the workload behavior of several systems within this expanded archival storage space. Our study examined several scientific and historical archives, covering a mixture of purposes, media types, and access models---that is, public versus private. Our findings show that, for more traditional private scientific archival storage, files have become larger, but update rates have remained largely unchanged. However, in the public content archives we observed, we saw behavior that diverges from the traditional “write-once, read-maybe” behavior of tertiary storage. Our study shows that the majority of such data is modified---sometimes unnecessarily---relatively frequently, and that indexing services such as Google and internal data management processes may routinely access large portions of an archive, accounting for most of the accesses. Based on these observations, we identify areas for improving the efficiency and performance of archival storage systems.

Third IEEE International Security in Storage Workshop (SISW'05) | 2005

POTSHARDS: storing data for the long-term without encryption

Kevin M. Greenan; Mark W. Storer; Ethan L. Miller; Carlos Maltzahn

Many archival storage systems rely on keyed encryption to ensure privacy. A data object in such a system is exposed once the key used to encrypt the data is compromised. When storing data for as long as a few decades or centuries, the use of keyed encryption becomes a real concern. The exposure of a key is bounded by computation effort and management of encryption keys becomes as much of a problem as the management of the data the key is protecting. POTSHARDS is a secure, distributed, very long-term archival storage system that eliminates the use of keyed encryption through the use of unconditionally secure secret sharing. A (m, n) unconditionally secure secret sharing scheme splits an object up into n shares, which provably gives no information about the object, unless m of the shares collaborate. POTSHARDS separates security and redundancy by utilizing two levels of secret sharing. This allows for secure reconstruction upon failure and more flexible storage patterns. The data structures used in POTSHARDS are organized in such a way that an unauthorized user attempting to collect shares will not go unnoticed since it is very difficult to launch a targeted attack on the system. A malicious user would have a difficult time finding the shares for a particular file in a timely or efficient manner. Since POTSHARDS provides secure storage for arbitrarily long periods of time, its data structures include built-in support for consistency checking and data migration. This enables reliable data churning and the movement of data between storage devices

ieee international conference on high performance computing data and analytics | 2012

Usage behavior of a large-scale scientific archive

Ian F. Adams; Brian A. Madden; Joel Cameron Frank; Mark W. Storer; Ethan L. Miller; Gene Harano

Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, we analyzed three years of filelevel activities from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Our examination of system usage showed that, while a subset of users were responsible for most of the activity, this activity was widely distributed at the file level. We also show that the physical grouping of files and directories on media can improve archival storage system performance. Based on our observations, we provide suggestions and guidance for both future scientific archival system designs as well as improved tracing of archival activity.

modeling, analysis, and simulation on computer and telecommunication systems | 2010

Examining Energy Use in Heterogeneous Archival Storage Systems

Ian F. Adams; Ethan L. Miller; Mark W. Storer

Controlling energy usage in data centers, and storage in particular, continues to rise in importance. Many systems and models have examined energy efficiency through intelligent spin-down of disks and novel data layouts, yet little work has been done to examine how power usage over the course of months to years is impacted by the characteristics of the storage devices chosen for use. Long-term power usage is particularly important for archival storage systems, since it is a large contributor to overall system cost. In this work, we begin exploring the impact that broad policies (e.g. utilize high-bandwidth devices first) have upon the power efficiency of a disk based archival storage system of heterogeneous devices over the course of a year. Using a discrete event simulator, we found that even simple heuristic policies for allocating space can have significant impact on the power usage of a system. We show that our system growth policies can cause power usage to vary from 10% higher to 18% lower than a naive random data allocation scheme. We also found that under low read rates power is dominated by that used in standby modes. Most interestingly, we found cases where concentrating data on fewer devices yielded increased power usage.

petascale data storage workshop | 2008

Logan: Automatic management for evolvable, large-scale, archival storage

Mark W. Storer; Kevin M. Greenan; Ian F. Adams; Ethan L. Miller; Darrell D. E. Long; Kaladhar Voruganti

Archival storage systems designed to preserve scientific data, business data, and consumer data must maintain and safeguard tens to hundreds of petabytes of data on tens of thousands of media for decades. Such systems are currently designed in the same way as higher-performance, shorter-term storage systems, which have a useful lifetime but must be replaced in their entirety via a ldquofork-liftrdquo upgrade. Thus, while existing solutions can provide good energy efficiency and relatively low cost, they do not adapt well to continuous improvements in technology, becoming less efficient relative to current technology as they age. In an archival storage environment, this paradigm implies an endless series of wholesale migrations and upgrades to remain efficient and up to date. Our approach, Logan, manages node addition, removal, and failure on a distributed network of intelligent storage appliances, allowing the system to gradually evolve as device technology advances. By automatically handling most of the common administration chores-integrating new devices into the system, managing groups of devices that work together to provide redundancy, and recovering from failed devices-Logan reduces management overhead and thus cost. Logan can also improve cost and space efficiency by identifying and decommissioning outdated devices, thus reducing space and power requirements for the archival storage system.

modeling, analysis, and simulation on computer and telecommunication systems | 2013

Validating Storage System Instrumentation

Ian F. Adams; Mark W. Storer; Avani Wildani; Ethan L. Miller; Brian A. Madden

There is a large body of work-such as system administration and intrusion detection-that relies upon storage system logs and snapshots. These solutions rely on accurate system records, however, little effort has been made to verify the correctness of logging instrumentation and log reliability. We present a solution, called ExDiff, that uses expectation differencing to validate storage system logs. Our solution can identify development errors such as the omission of a logging point and runtime errors such as log crashes. ExDiff uses metadata snapshots and activity logs to predict the expected state of the system and compares that with the systems actual state. Mismatches between the expected and actual metadata states can then be used to highlight gaps in log coverage, as well as aid in identifying specific types of missing entries. We show that ExDiff provides valuable insight to system designers, administrators and researchers by accurately identifying gaps in log coverage, providing clues useful in isolating specific types of missing log entries, and highlighting potential misunderstandings in logged action.

file and storage technologies | 2008