Publication


Featured research published by John R. Douceur.


Operating Systems Design and Implementation | 2002

Farsite: federated, available, and reliable storage for an incompletely trusted environment

Atul Adya; William J. Bolosky; Miguel Castro; Gerald Cermak; Ronnie Chaiken; John R. Douceur; Jon Howell; Jacob R. Lorch; Marvin M. Theimer; Roger Wattenhofer

Farsite is a secure, scalable file system that logically functions as a centralized file server but is physically distributed among a set of untrusted computers. Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol; it is designed to be scalable by using a distributed hint mechanism and delegation certificates for pathname translations; and it achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. We report on the design of Farsite and the lessons we have learned by implementing much of that design.
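The Byzantine-fault-tolerant protocol mentioned above rests on the classic bound that a group of n machines tolerates at most ⌊(n−1)/3⌋ arbitrary faults. A minimal illustration of that arithmetic (not Farsite's code; the function names are ours):

```python
def max_byzantine_faults(n: int) -> int:
    """Largest number of arbitrarily faulty members a BFT group of n
    machines can tolerate, from the classic n >= 3f + 1 bound."""
    return (n - 1) // 3

def min_group_size(f: int) -> int:
    """Smallest group that still reaches agreement despite f Byzantine faults."""
    return 3 * f + 1
```

So, for example, a directory group of 7 machines withstands 2 malicious or arbitrarily failing members.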


Measurement and Modeling of Computer Systems | 2000

Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs

William J. Bolosky; John R. Douceur; David Ely; Marvin M. Theimer

We consider an architecture for a serverless distributed file system that does not assume mutual trust among the client computers. The system provides security, availability, and reliability by distributing multiple encrypted replicas of each file among the client machines. To assess the feasibility of deploying this system on an existing desktop infrastructure, we measure and analyze a large set of client machines in a commercial environment. In particular, we measure and report results on disk usage and content; file activity; and machine uptimes, lifetimes, and loads. We conclude that the measured desktop infrastructure would passably support our proposed system, providing availability on the order of one unfilled file request per user per thousand days.
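The availability claim can be sanity-checked with a back-of-the-envelope model: if replicas reside on machines that are up with independent probability p, a file is unreachable only when all of its hosts are down at once. A sketch using hypothetical uptime figures, not the paper's measurements:

```python
def file_unavailability(uptimes):
    """Probability that every host of a file's replicas is down at once,
    assuming independent machine availability."""
    p = 1.0
    for u in uptimes:
        p *= 1.0 - u
    return p

# Hypothetical desktops that are each up 85% of the time, 3 replicas:
miss = file_unavailability([0.85, 0.85, 0.85])  # roughly 0.0034
```

Adding replicas drives the miss probability down geometrically, which is why even modest desktop uptimes can support a usable file service.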


Measurement and Modeling of Computer Systems | 1999

A large-scale study of file-system contents

John R. Douceur; William J. Bolosky

We collect and analyze a snapshot of data from 10,568 file systems of 4801 Windows personal computers in a commercial environment. The file systems contain 140 million files totaling 10.5 TB of data. We develop analytical approximations for distributions of file size, file age, file functional lifetime, directory size, and directory depth, and we compare them to previously derived distributions. We find that file and directory sizes are fairly consistent across file systems, but file lifetimes vary widely and are significantly affected by the job function of the user. Larger files tend to be composed of blocks sized in powers of two, which noticeably affects their size distribution. File-name extensions are strongly correlated with file sizes, and extension popularity varies with user job function. On average, file systems are only half full.
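File sizes in such studies are commonly approximated with a lognormal-style model; a simple way to fit one is to estimate the mean and spread of log file size. A stdlib-only sketch with made-up sample sizes (the study itself covers 140 million files):

```python
import math
import statistics

def fit_lognormal(sizes):
    """Estimate the mean and standard deviation of log2(file size),
    i.e. fit a lognormal-style model to positive file sizes."""
    logs = [math.log2(s) for s in sizes if s > 0]
    return statistics.mean(logs), statistics.stdev(logs)

def lognormal_cdf(x, mu, sigma):
    """Fraction of files no larger than x bytes under the fitted model."""
    z = (math.log2(x) - mu) / (sigma * math.sqrt(2))
    return 0.5 * (1 + math.erf(z))

# Made-up sample of file sizes in bytes:
sample = [512, 1024, 2048, 4096, 4096, 8192, 16384, 65536]
mu, sigma = fit_lognormal(sample)
```

Comparing such a fitted curve against the empirical distribution is how one checks claims like "file sizes are fairly consistent across file systems."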


ACM Special Interest Group on Data Communication | 2008

Donnybrook: enabling large-scale, high-speed, peer-to-peer games

Ashwin R. Bharambe; John R. Douceur; Jacob R. Lorch; Thomas Moscibroda; Jeffrey Pang; Srinivasan Seshan; Xinyu Zhuang

Without well-provisioned dedicated servers, modern fast-paced action games limit the number of players who can interact simultaneously to 16-32. This is because interacting players must frequently exchange state updates, and high player counts would exceed the bandwidth available to participating machines. In this paper, we describe Donnybrook, a system that enables epic-scale battles without dedicated server resources, even in a fast-paced game with tight latency bounds. It achieves this scalability through two novel components. First, it reduces bandwidth demand by estimating what players are paying attention to, thereby enabling it to reduce the frequency of sending less important state updates. Second, it overcomes resource and interest heterogeneity by disseminating updates via a multicast system designed for the special requirements of games: that they have multiple sources, are latency-sensitive, and have frequent group membership changes. We present user study results using a prototype implementation based on Quake III that show our approach provides a desirable user experience. We also present simulation results that demonstrate Donnybrook's efficacy in enabling battles of up to 900 players.
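The first component, attention-driven update scheduling, can be caricatured in a few lines: score each remote player by how much attention the local player plausibly pays them, and reserve high-frequency updates for the top few. The scoring weights and fields below are hypothetical, not Donnybrook's actual estimator:

```python
def attention_score(distance, aim_angle, recency):
    """Hypothetical attention estimate: players who are nearby, close to
    the crosshair, or recently interacted with score higher."""
    return 1.0 / (1.0 + distance) + 1.0 / (1.0 + aim_angle) + recency

def choose_interest_set(players, k):
    """Pick the k players this node attends to most; they receive
    high-frequency state updates, everyone else gets rare guidance."""
    ranked = sorted(players,
                    key=lambda p: attention_score(p["distance"],
                                                  p["aim_angle"],
                                                  p["recency"]),
                    reverse=True)
    return {p["id"] for p in ranked[:k]}

players = [
    {"id": "A", "distance": 1.0,   "aim_angle": 0.0,  "recency": 0.9},
    {"id": "B", "distance": 50.0,  "aim_angle": 80.0, "recency": 0.0},
    {"id": "C", "distance": 2.0,   "aim_angle": 5.0,  "recency": 0.1},
    {"id": "D", "distance": 100.0, "aim_angle": 1.0,  "recency": 0.0},
]
```

Keeping the interest set small bounds each machine's outbound update bandwidth regardless of total player count.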


European Conference on Computer Systems | 2011

Cycles, cells and platters: an empirical analysis of hardware failures on a million consumer PCs

Edmund B. Nightingale; John R. Douceur; Vince Orgovan

We present the first large-scale analysis of hardware failure rates on a million consumer PCs. We find that many failures are neither transient nor independent. Instead, a large portion of hardware-induced failures are recurrent: a machine that crashes from a fault in hardware is up to two orders of magnitude more likely to crash a second time. For example, machines with at least 30 days of accumulated CPU time over an 8 month period had a 1 in 190 chance of crashing due to a CPU subsystem fault. Further, machines that crashed once had a probability of 1 in 3.3 of crashing a second time. Our study examines failures due to faults within the CPU, DRAM and disk subsystems. Our analysis spans desktops and laptops, CPU vendors, overclocking, underclocking, generic vs. brand-name machines, and characteristics such as machine speed and calendar age. Among our many results, we find that CPU fault rates are correlated with the number of cycles executed, underclocked machines are significantly more reliable than machines running at their rated speed, and laptops are more reliable than desktops.
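The headline recurrence numbers reduce to simple conditional-probability arithmetic, which makes the "orders of magnitude" claim easy to check:

```python
# Figures reported in the study: a 1-in-190 chance of a CPU-subsystem
# crash over 8 months (machines with >= 30 days of CPU time), and a
# 1-in-3.3 chance of a second crash after a first one.
p_first = 1 / 190
p_second_given_first = 1 / 3.3

# A machine that has crashed once is roughly 58x likelier to crash
# again, consistent with "up to two orders of magnitude".
relative_risk = p_second_given_first / p_first
```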


Symposium on Operating Systems Principles | 1997

Distributed schedule management in the Tiger video fileserver

William J. Bolosky; Robert P. Fitzgerald; John R. Douceur

Tiger is a scalable, fault-tolerant video file server constructed from a collection of computers connected by a switched network. All content files are striped across all of the computers and disks in a Tiger system. In order to prevent conflicts for a particular resource between two viewers, Tiger schedules viewers so that they do not require access to the same resource at the same time. In the abstract, there is a single, global schedule that describes all of the viewers in the system. In practice, the schedule is distributed among all of the computers in the system, each of which has a possibly partially inconsistent view of a subset of the schedule. By using such a relaxed consistency model for the schedule, Tiger achieves scalability and fault tolerance while still providing the consistent, coordinated service required by viewers.
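Conceptually, the global schedule is a ring of time slots, one viewer per slot; with identical round-robin striping, viewers in distinct slots never read the same disk at the same instant. A toy, centralized version of the admission decision (Tiger's real schedule is distributed and only loosely consistent):

```python
def admit_viewer(occupied, nslots):
    """Admit a new viewer into the first free slot of the schedule ring,
    or refuse if every slot is taken (system at capacity)."""
    for slot in range(nslots):
        if slot not in occupied:
            occupied.add(slot)
            return slot
    return None

def disk_for(slot, t, ndisks):
    """Under round-robin striping, the disk a slot's viewer reads at
    time t; distinct slots map to distinct disks at every instant."""
    return (slot + t) % ndisks
```

Because each slot owns a distinct disk at any instant, admission control alone guarantees conflict-free disk access.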


IEEE Symposium on Security and Privacy | 2011

Memoir: Practical State Continuity for Protected Modules

Bryan Parno; Jacob R. Lorch; John R. Douceur; James Mickens; Jonathan M. McCune

To protect computation, a security architecture must safeguard not only the software that performs it but also the state on which the software operates. This requires more than just preserving state confidentiality and integrity, since, e.g., software may err if its state is rolled back to a correct but stale version. For this reason, we present Memoir, the first system that fully ensures the continuity of a protected software module's state. In other words, it ensures that a module's state remains persistently and completely inviolate. A key contribution of Memoir is a technique to ensure rollback resistance without making the system vulnerable to system crashes. It does this by using a deterministic module, storing a concise summary of the module's request history in protected NVRAM, and allowing only safe request replays after crashes. Since frequent NVRAM writes are impractical on modern hardware, we present a novel way to leverage limited trusted hardware to minimize such writes. To ensure the correctness of our design, we develop formal, machine-verified proofs of safety. To demonstrate Memoir's practicality, we have built it and conducted evaluations demonstrating that it achieves reasonable performance on real hardware. Furthermore, by building three useful Memoir-protected modules that rely critically on state continuity, we demonstrate Memoir's versatility.
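The "concise summary of the module's request history" can be modeled as a hash chain whose head lives in protected NVRAM: after a crash, a recovery log is accepted only if replaying it reproduces that head exactly. A simplified single-process sketch (SHA-256 stands in for whatever primitive Memoir actually uses):

```python
import hashlib

def extend_summary(summary: bytes, request: bytes) -> bytes:
    """Fold one request into the history summary (a hash chain)."""
    return hashlib.sha256(summary + request).digest()

class ModuleState:
    """Toy deterministic module whose NVRAM summary permits only safe
    replays after a crash; real Memoir also minimizes NVRAM writes."""
    def __init__(self):
        self.summary = b"\x00" * 32  # stands in for the trusted NVRAM slot
        self.history = []            # untrusted log on ordinary storage

    def handle(self, request: bytes):
        self.history.append(request)
        self.summary = extend_summary(self.summary, request)

    def replay_is_safe(self, log):
        """Accept a crash-recovery log only if replaying it reproduces
        exactly the NVRAM summary (no rollback, no reordering)."""
        s = b"\x00" * 32
        for req in log:
            s = extend_summary(s, req)
        return s == self.summary
```

A truncated or reordered log hashes to a different head, so stale-state replay attacks are detected before any request is re-executed.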


Symposium on Reliable Distributed Systems | 2001

Optimizing file availability in a secure serverless distributed file system

John R. Douceur; Roger Wattenhofer

Farsite is a secure, scalable, distributed file system that logically functions as a centralized file server but that is physically realized on a set of client desktop computers. Farsite provides security, reliability and availability by storing replicas of each file on multiple machines. It continuously monitors machine availability and relocates replicas as necessary to maximize the effective availability of the system. We evaluate several replica placement methods using large-scale simulation with machine availability data from over 50,000 desktop computers. We find that initially placing replicas in an availability-sensitive fashion yields pathological results, whereas very good results are obtained by random initial placement followed by incremental improvement using a scalable, distributed, fault-tolerant and attack-resistant hill-climbing algorithm. The algorithm is resilient to severe restrictions on communication and replica placement, and it does not excessively co-locate replicas of different files on the same set of machines.
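The incremental-improvement idea can be sketched as a toy hill climber: place replicas randomly, then repeatedly move the weakest replica of the worst-off file to a better machine. This centralized caricature ignores the paper's distribution, fault tolerance, attack resistance, and co-location constraints; the availability figures are hypothetical:

```python
import random

def file_availability(replicas, avail):
    """Chance at least one replica's machine is up (independence assumed)."""
    p_all_down = 1.0
    for m in replicas:
        p_all_down *= 1.0 - avail[m]
    return 1.0 - p_all_down

def hill_climb(files, machines, avail, rounds=1000, seed=0):
    """Random initial placement, then greedy improvement: move the
    weakest replica of the worst-off file to the best outside machine."""
    rng = random.Random(seed)
    placement = {f: rng.sample(machines, 3) for f in files}
    for _ in range(rounds):
        worst = min(files, key=lambda f: file_availability(placement[f], avail))
        reps = placement[worst]
        weakest = min(reps, key=lambda m: avail[m])
        outsiders = [m for m in machines if m not in reps]
        best = max(outsiders, key=lambda m: avail[m])
        if avail[best] <= avail[weakest]:
            break  # no single move helps the worst-off file
        reps[reps.index(weakest)] = best
    return placement

# Hypothetical per-machine availabilities:
avail = {0: 0.3, 1: 0.4, 2: 0.5, 3: 0.9, 4: 0.95, 5: 0.99}
placement = hill_climb(["f1", "f2"], list(avail), avail)
```

The paper's finding is that this random-then-improve shape beats availability-sensitive initial placement, which tends to pile replicas onto the same few highly available machines.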


Symposium on Operating Systems Principles | 1999

Progress-based regulation of low-importance processes

John R. Douceur; William J. Bolosky

MS Manners is a mechanism that employs progress-based regulation to prevent resource contention with low-importance processes from degrading the performance of high-importance processes. The mechanism assumes that resource contention that degrades the performance of a high-importance process will also retard the progress of the low-importance process. MS Manners detects this contention by monitoring the progress of the low-importance process and inferring resource contention from a drop in the progress rate. This technique recognizes contention over any system resource, as long as the performance impact on contending processes is roughly symmetric. MS Manners employs statistical mechanisms to deal with stochastic progress measurements; it automatically calibrates a target progress rate, so no manual tuning is required; it supports multiple progress metrics from applications that perform several distinct tasks; and it orchestrates multiple low-importance processes to prevent measurement interference. Experiments with two low-importance applications show that MS Manners can reduce the degradation of high-importance processes by up to an order of magnitude.
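The core feedback loop is small: calibrate an uncontended progress rate, then treat any measurement well below it as evidence of contention and back off. A stripped-down sketch with hypothetical threshold and smoothing constants (the real system's statistical machinery is richer):

```python
class ProgressRegulator:
    """Minimal sketch of progress-based regulation: infer resource
    contention from a drop in a low-importance task's progress rate."""
    def __init__(self, threshold=0.7, alpha=0.1):
        self.target = None        # calibrated uncontended progress rate
        self.threshold = threshold
        self.alpha = alpha        # smoothing factor for calibration

    def observe(self, rate):
        """Feed one progress-rate measurement; return True to keep
        running, False to back off (suspected contention)."""
        if self.target is None:
            self.target = rate    # first sample seeds the calibration
            return True
        if rate < self.threshold * self.target:
            return False          # progress dropped: yield the resource
        # Progress is healthy: refine the target estimate.
        self.target += self.alpha * (rate - self.target)
        return True
```

Because the signal is the task's own progress, the scheme works for contention over any resource with roughly symmetric impact, without naming the resource.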


European Conference on Computer Systems | 2006

The SMART way to migrate replicated stateful services

Jacob R. Lorch; Atul Adya; William J. Bolosky; Ronnie Chaiken; John R. Douceur; Jon Howell

Many stateful services use the replicated state machine approach for high availability. In this approach, a service runs on multiple machines to survive machine failures. This paper describes SMART, a new technique for changing the set of machines where such a service runs, i.e., migrating the service. SMART improves upon existing techniques in three important ways. First, SMART allows migrations that replace non-failed machines. Thus, SMART enables load balancing and lets an automated system replace failed machines. Such autonomic migration is an important step toward full autonomic operation, in which administrators play a minor role and need not be available twenty-four hours a day, seven days a week. Second, SMART can pipeline concurrent requests, a useful performance optimization. Third, prior published migration techniques are described in insufficient detail to admit implementation, whereas our description of SMART is complete. In addition to describing SMART, we also demonstrate its practicality by implementing it, evaluating our implementation's performance, and using it to build a consistent, replicated, migratable file system. Our experiments demonstrate the performance advantage of pipelining concurrent requests, and show that migration has only a minor and temporary effect on performance.
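A key idea in this line of work is that the replica set is itself replicated state, so migration is just another request run through the state machine. A single-process sketch of that bookkeeping (it elides the paper's overlapping of old and new configurations and the pipelining machinery):

```python
class SmartService:
    """Toy replicated-state-machine log in which a configuration change
    is an ordinary logged request; each configuration is recorded with
    the log index at which it takes effect."""
    def __init__(self, machines):
        self.log = []
        self.config = list(machines)
        self.config_history = [(0, list(machines))]  # (first index, members)

    def execute(self, request):
        self.log.append(request)
        if isinstance(request, tuple) and request[0] == "reconfigure":
            # The new member set governs subsequent log entries; the real
            # system runs old and new configurations side by side briefly.
            self.config = list(request[1])
            self.config_history.append((len(self.log), self.config))
```

Because the change is ordered in the log like any other request, every replica agrees on exactly when the membership switches, with no out-of-band coordination.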
