Publication


Featured research published by Austin Donnelly.


File and Storage Technologies | 2008

Write off-loading: Practical power management for enterprise storage

Dushyanth Narayanan; Austin Donnelly; Antony I. T. Rowstron

In enterprise data centers, power usage is a problem impacting server density and the total cost of ownership. Storage uses a significant fraction of the power budget, and there are no widely deployed power-saving solutions for enterprise storage systems. The traditional view is that spinning disks down is ineffective for enterprise workloads because idle periods are too short. We analyzed block-level traces from 36 volumes in an enterprise data center for one week and concluded that significant idle periods exist, and that they can be further increased by modifying the read/write patterns using write off-loading. Write off-loading allows write requests on spun-down disks to be temporarily redirected to persistent storage elsewhere in the data center. The key challenge is doing this transparently and efficiently at the block level, without sacrificing consistency or failure resilience. We describe our write off-loading design and implementation that achieves these goals. We evaluate it by replaying portions of our traces on a rack-based testbed. Results show that just spinning disks down when idle saves 28--36% of energy, and write off-loading further increases the savings to 45--60%.
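The basic mechanism can be pictured with a small sketch (hypothetical and simplified to a dictionary-backed volume; the actual system works transparently at the block level): while the disk is spun down, writes are logged elsewhere in the data center, reads consult the off-load log first so the newest version is always returned, and the log is reclaimed once the disk spins back up.

```python
# Illustrative sketch of write off-loading, not the authors' implementation.
# Writes issued while the local disk is spun down are redirected to a remote
# log; reads check the log first so consistency is preserved, and the log is
# drained back to the disk after it spins up.

class OffloadVolume:
    def __init__(self, local_disk, remote_log):
        self.disk = local_disk          # dict-like: block number -> data
        self.log = remote_log           # dict-like, hosted elsewhere in the data center
        self.spun_down = False

    def write(self, block, data):
        if self.spun_down:
            self.log[block] = data      # off-load the write, keep the disk asleep
        else:
            self.disk[block] = data

    def read(self, block):
        if block in self.log:           # the off-loaded copy is the newest version
            return self.log[block]
        if self.spun_down:
            self.spin_up()              # a miss in the log forces a spin-up
        return self.disk.get(block)

    def spin_up(self):
        self.spun_down = False
        self.reclaim()

    def reclaim(self):
        # Write off-loaded blocks back to the local disk, then invalidate them.
        for block, data in list(self.log.items()):
            self.disk[block] = data
            del self.log[block]
```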


European Conference on Computer Systems | 2009

Migrating server storage to SSDs: analysis of tradeoffs

Dushyanth Narayanan; Eno Thereska; Austin Donnelly; Sameh Elnikety; Antony I. T. Rowstron

Recently, flash-based solid-state drives (SSDs) have become standard options for laptop and desktop storage, but their impact on enterprise server storage has not been studied. Provisioning server storage is challenging. It requires optimizing for the performance, capacity, power and reliability needs of the expected workload, all while minimizing financial costs. In this paper we analyze a number of workload traces from servers in both large and small data centers, to decide whether and how SSDs should be used to support each. We analyze both complete replacement of disks by SSDs, as well as use of SSDs as an intermediate tier between disks and DRAM. We describe an automated tool that, given device models and a block-level trace of a workload, determines the least-cost storage configuration that will support the workload's performance, capacity, and fault-tolerance requirements. We found that replacing disks by SSDs is not a cost-effective option for any of our workloads, due to the low capacity per dollar of SSDs. Depending on the workload, the capacity per dollar of SSDs needs to increase by a factor of 3--3000 for an SSD-based solution to break even with a disk-based solution. Thus, without a large increase in SSD capacity per dollar, only the smallest volumes, such as system boot volumes, can be cost-effectively migrated to SSDs. The benefit of using SSDs as an intermediate caching tier is also limited: fewer than 10% of our workloads can reduce provisioning costs by using an SSD tier at today's capacity per dollar, and fewer than 20% can do so at any SSD capacity per dollar. Although SSDs are much more energy-efficient than enterprise disks, the energy savings are outweighed by the hardware costs, and comparable energy savings are achievable with low-power SATA disks.
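The kind of provisioning calculation the tool automates can be sketched as follows: for each candidate device type, the number of devices required is set by whichever requirement binds (capacity, random IOPS, or sequential bandwidth), and the cheapest configuration that meets every requirement wins. The device figures below are invented placeholders, not numbers from the paper.

```python
# Sketch of least-cost provisioning: size each device type to the binding
# requirement and compare total cost. All numbers are illustrative only.
import math

devices = {
    "enterprise_disk": {"gb": 300, "iops": 300,  "mbps": 100, "cost": 150},
    "ssd":             {"gb": 64,  "iops": 5000, "mbps": 200, "cost": 700},
}

def provision(workload, dev):
    """Return (device count, total cost) for one device type."""
    count = max(
        math.ceil(workload["gb"] / dev["gb"]),
        math.ceil(workload["iops"] / dev["iops"]),
        math.ceil(workload["mbps"] / dev["mbps"]),
        1,
    )
    return count, count * dev["cost"]

workload = {"gb": 2000, "iops": 1200, "mbps": 150}   # derived from a block-level trace
for name, dev in devices.items():
    count, cost = provision(workload, dev)
    print(f"{name}: {count} devices, ${cost}")
```

With these made-up numbers both configurations are capacity-bound and the SSD option costs far more, which is the shape of the trade-off the paper quantifies.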


ACM Special Interest Group on Data Communication | 2010

Symbiotic routing in future data centers

Hussam Abu-Libdeh; Paolo Costa; Antony I. T. Rowstron; Greg O'Shea; Austin Donnelly

Building distributed applications that run in data centers is hard. The CamCube project explores the design of a shipping-container-sized data center with the goal of building an easier platform on which to build these applications. CamCube replaces the traditional switch-based network with a 3D torus topology, with each server directly connected to six other servers. As in other proposals, e.g. DCell and BCube, multi-hop routing in CamCube requires servers to participate in packet forwarding. To date, as in existing data centers, these approaches have all provided a single routing protocol for the applications. In this paper we explore whether allowing applications to implement their own routing services is advantageous, and whether we can support it efficiently. This is based on the observation that, due to the flexibility offered by the CamCube API, many applications implemented their own routing protocol in order to achieve specific application-level characteristics, such as trading off higher latency for better path convergence. Using large-scale simulations we demonstrate the benefits and network-level impact of running multiple routing protocols. We demonstrate that applications are more efficient and do not generate additional control traffic overhead. This motivates us to design an extended routing service allowing easy implementation of application-specific routing protocols on CamCube. Finally, we demonstrate that the additional performance overhead incurred when using the extended routing service on a prototype CamCube is very low.
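In a 3D torus, each server can be addressed by an (x, y, z) coordinate and is directly linked to its six neighbours (one step in either direction along each axis, with wrap-around). The sketch below shows a minimal greedy routing over such coordinates; it only illustrates the base topology, and CamCube's point is precisely that services can substitute their own routing policies for something like this.

```python
# Sketch of greedy routing on a k x k x k 3D torus: at each hop, step along
# some axis where the wrap-around distance to the destination is non-zero.
# Illustrative only; CamCube services can implement their own routing.

K = 8  # hypothetical side length of the torus

def step_towards(a, b):
    """Shortest signed step from coordinate a to b on a ring of size K."""
    d = (b - a) % K
    if d == 0:
        return 0
    return 1 if d <= K // 2 else -1

def next_hop(node, dest):
    """Return a neighbour one hop closer to dest, or None if already there."""
    for axis in range(3):
        s = step_towards(node[axis], dest[axis])
        if s != 0:
            hop = list(node)
            hop[axis] = (hop[axis] + s) % K
            return tuple(hop)
    return None

def route(src, dest):
    path, node = [src], src
    while node != dest:
        node = next_hop(node, dest)
        path.append(node)
    return path

print(route((0, 0, 0), (7, 2, 5)))  # wraps around in x instead of taking 7 hops
```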


Symposium on Operating Systems Principles | 2009

Fast byte-granularity software fault isolation

Miguel Castro; Manuel Costa; Jean-Philippe Martin; Marcus Peinado; Periklis Akritidis; Austin Donnelly; Paul Barham; Richard Black

Bugs in kernel extensions remain one of the main causes of poor operating system reliability despite proposed techniques that isolate extensions in separate protection domains to contain faults. We believe that previous fault isolation techniques are not widely used because they cannot isolate existing kernel extensions with low overhead on standard hardware. This is a hard problem because these extensions communicate with the kernel using a complex interface and they communicate frequently. We present BGI (Byte-Granularity Isolation), a new software fault isolation technique that addresses this problem. BGI uses efficient byte-granularity memory protection to isolate kernel extensions in separate protection domains that share the same address space. BGI ensures type safety for kernel objects and it can detect common types of errors inside domains. Our results show that BGI is practical: it can isolate Windows drivers without requiring changes to the source code and it introduces a CPU overhead between 0 and 16%. BGI can also find bugs during driver testing. We found 28 new bugs in widely used Windows drivers.
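The checking model behind byte-granularity isolation can be illustrated with a toy rights table: a protection domain may only write bytes it has been explicitly granted, and every instrumented write is checked against the table. This is only a conceptual sketch; BGI itself uses compact per-byte rights tables and compiler-inserted checks, not anything resembling the Python below.

```python
# Toy sketch of byte-granularity rights checking. Each domain may only write
# bytes it has been granted; a failed check indicates a containment violation.

class RightsTable:
    def __init__(self):
        self.rights = set()                      # (domain, byte address) pairs

    def grant_write(self, domain, addr, length):
        self.rights.update((domain, a) for a in range(addr, addr + length))

    def revoke(self, domain, addr, length):
        self.rights.difference_update(
            (domain, a) for a in range(addr, addr + length))

    def check_write(self, domain, addr, length):
        for a in range(addr, addr + length):
            if (domain, a) not in self.rights:
                raise PermissionError(
                    f"domain {domain!r} has no write right on byte {a:#x}")

table = RightsTable()
table.grant_write("nic_driver", 0x1000, 64)    # kernel grants a 64-byte buffer
table.check_write("nic_driver", 0x1000, 64)    # within the grant: allowed
# table.check_write("nic_driver", 0x1040, 1)   # one byte past the grant: raises
```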


European Conference on Computer Systems | 2011

Sierra: practical power-proportionality for data center storage

Eno Thereska; Austin Donnelly; Dushyanth Narayanan

Online services hosted in data centers show significant diurnal variation in load levels. Thus, there is significant potential for saving power by powering down excess servers during the troughs. However, while techniques like VM migration can consolidate computational load, storage state has always been the elephant in the room preventing this powering down. Migrating storage is not a practical way to consolidate I/O load. This paper presents Sierra, a power-proportional distributed storage subsystem for data centers. Sierra allows powering down of a large fraction of servers during troughs without migrating data and without imposing extra capacity requirements. It addresses the challenges of maintaining read and write availability, avoiding performance degradation, and preserving consistency and fault tolerance for general I/O workloads, through a set of techniques including power-aware layout, a distributed virtual log, recovery and migration techniques, and predictive gear scheduling. Replaying live traces from a large, real service (Hotmail) on a cluster shows power savings of 23%. Savings of 40--50% are possible with more complex optimizations.
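One way to picture power-aware layout and predictive gear scheduling: with r replicas per object laid out so that each replica group maps onto a distinct set of servers, the system can run in "gear" g by keeping only g of the r groups powered, and g is chosen ahead of time from predicted load. The sketch below uses invented numbers and omits the distributed virtual log, recovery and migration machinery the real system needs to keep writes available.

```python
# Sketch of predictive gear scheduling: pick the smallest gear (number of
# powered replica groups, out of R) whose throughput covers predicted load.
# All figures are illustrative, not taken from the paper.

R = 3                         # replica groups (replication factor)
GROUP_THROUGHPUT = 10_000     # requests/sec one powered group can serve
HEADROOM = 1.2                # provisioning margin over the prediction

def choose_gear(predicted_load):
    """Return how many of the R replica groups to keep powered."""
    for gear in range(1, R + 1):
        if gear * GROUP_THROUGHPUT >= predicted_load * HEADROOM:
            return gear
    return R

# Hourly load predictions for a diurnal workload (illustrative values).
hourly_load = [4_000, 3_000, 2_500, 6_000, 15_000, 24_000, 28_000, 18_000]
print([choose_gear(load) for load in hourly_load])   # [1, 1, 1, 1, 2, 3, 3, 3]
```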


IFIP International Conference on Intelligence in Networks (Telecommunication Network Intelligence) | 2000

A Network Based Replay Portal

Jacobus E. van der Merwe; Cormac J. Sreenan; Austin Donnelly; Andrea Basso; Charles Robert Kalmanek

We describe a network-based video replay service that uses broadband technologies on the Internet as a replacement for current analog or digital TV offerings, providing the same quality and user experience. The capacity used by current offerings (e.g. on a cable access network) is freed up for use by the new service. The schedule-based broadcast paradigm that users are accustomed to is emulated, while at the same time on-demand viewing based on personal preference or a subscription profile is offered. This hybrid offering can lead to bandwidth savings in the access network and allows the service to interact with other services on a common packet-based infrastructure.


International Conference on Network Protocols | 2004

Ethernet topology discovery without network assistance

Richard Black; Austin Donnelly; Cédric Fournet

This work addresses the problem of layer 2 topology discovery. Current techniques concentrate on using SNMP to query information from Ethernet switches. In contrast, we present a technique that infers the Ethernet (layer 2) topology without assistance from the network elements by injecting suitable probe packets from the end-systems and observing where they are delivered. We describe the algorithm, formally characterize its correctness and completeness, and present our implementation and experimental results. Performance results show that, although originally aimed at the home and small office, the techniques scale to much larger networks.
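The raw signal the inference works from is ordinary switch behaviour: switches learn source MACs per port and flood frames whose destination MAC they have not learned, so which end-systems observe a given probe depends on where the sender sits and what the switches have already learned. The toy simulation below models only that behaviour; the actual inference algorithm, probe sequencing and proofs in the paper are considerably more involved.

```python
# Toy model of the switch behaviour that probe-based topology discovery
# exploits: per-port MAC learning plus flooding of unlearned destinations.
# Observing which hosts receive a probe is the signal the inference uses.

class Switch:
    def __init__(self, name):
        self.name = name
        self.ports = {}     # port id -> attached host name (str) or Switch
        self.table = {}     # learned source MAC -> port id

    def attach(self, port, neighbour):
        self.ports[port] = neighbour

    def port_of(self, neighbour):
        return next(p for p, n in self.ports.items() if n is neighbour)

    def receive(self, frame, in_port, delivered):
        src, dst = frame
        self.table[src] = in_port                       # learn the source MAC
        if dst in self.table:
            out_ports = [self.table[dst]]               # forward on learned port
        else:
            out_ports = [p for p in self.ports if p != in_port]   # flood
        for p in out_ports:
            nbr = self.ports[p]
            if isinstance(nbr, Switch):
                nbr.receive(frame, nbr.port_of(self), delivered)
            else:
                delivered.add(nbr)                      # reached an end-system

s1, s2 = Switch("s1"), Switch("s2")
s1.attach(1, "A"); s1.attach(2, "B"); s1.attach(3, s2)
s2.attach(1, s1); s2.attach(2, "C")

seen = set()
s1.receive(("A", "unlearned"), 1, seen)   # probe to an unlearned MAC: flooded
print(seen)                               # {'B', 'C'}

seen = set()
s2.receive(("C", "A"), 2, seen)           # A has been learned: delivered only to A
print(seen)                               # {'A'}
```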


Very Large Data Bases | 2006

Delay aware querying with seaweed

Dushyanth Narayanan; Austin Donnelly; Richard Mortier; Antony I. T. Rowstron

Large, highly distributed data sets are poorly supported by current query technologies. Applications such as endsystem-based network management are characterized by data stored on large numbers of endsystems, with frequent local updates and relatively infrequent global one-shot queries. The challenges are scale (10^3 to 10^9 endsystems) and endsystem unavailability. In such large systems, a significant fraction of endsystems and their data will be unavailable at any given time. Existing methods to provide high data availability despite endsystem unavailability involve centralizing, redistributing or replicating the data; these methods do not scale to systems of this size. We advocate a design that trades query delay for completeness, incrementally returning results as endsystems become available. We also introduce the idea of completeness prediction, which provides the user with explicit feedback about this delay/completeness trade-off. Completeness prediction is based on replication of compact data summaries and availability models. This metadata is orders of magnitude smaller than the data. Seaweed is a scalable query infrastructure supporting incremental results, online in-network aggregation and completeness prediction. It is built on a distributed hash table (DHT), but unlike previous DHT-based approaches it does not redistribute data across the network. It exploits the DHT infrastructure for failure-resilient metadata replication, query dissemination, and result aggregation. We analytically compare Seaweed's scalability against other approaches and also evaluate the Seaweed prototype running on a large-scale network simulator driven by real-world traces.
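Completeness prediction can be sketched in a few lines: given a per-endsystem model of the probability that it becomes reachable within a wait of t, the predicted completeness of a one-shot query after waiting t is the expected fraction of endsystems covered by then. The exponential model and rates below are invented for illustration; the paper's availability models and data summaries are richer.

```python
# Toy completeness prediction: predicted completeness after waiting t hours is
# the mean probability that each endsystem has become reachable by then.
# The per-endsystem exponential rates are made-up illustrative values.
import math

rates_per_hour = {"es_a": 2.0, "es_b": 0.5, "es_c": 0.1, "es_d": 1.0}

def p_reachable_within(rate, t_hours):
    """P(endsystem becomes reachable within t) under a simple exponential model."""
    return 1.0 - math.exp(-rate * t_hours)

def predicted_completeness(t_hours):
    probs = [p_reachable_within(r, t_hours) for r in rates_per_hour.values()]
    return sum(probs) / len(probs)

for t in (0.25, 1, 4, 12):
    print(f"wait {t:>5} h -> predicted completeness {predicted_completeness(t):.0%}")
```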


Local Computer Networks | 2000

IP route lookups as string matching

Austin Donnelly; Tim Deegan

An IP route lookup can be considered as a string matching problem on the destination address. Finite state automata (FSA) are a flexible and efficient way to match strings. This paper describes how a routing table can be encoded as an FSA and how, through a process of state reduction, we can obtain an optimal representation. This gives insights into the basic properties of the longest-prefix match problem.
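The unreduced form of such an automaton is just a binary trie over the destination address bits: each routing prefix marks a state with its next hop, and a lookup walks the address bit by bit, remembering the last marked state it passed, which is the longest matching prefix. A small sketch follows; the state reduction the paper uses to obtain an optimal representation is omitted.

```python
# Sketch: a routing table as a binary trie (an unreduced automaton over
# address bits). Longest-prefix match is the last marked state visited while
# consuming the destination address bit by bit. State reduction is omitted.

class Node:
    def __init__(self):
        self.child = [None, None]   # transitions on bit 0 / bit 1
        self.next_hop = None        # set if some prefix ends in this state

def add_route(root, prefix, length, next_hop):
    node = root
    for i in range(length):
        bit = (prefix >> (31 - i)) & 1
        if node.child[bit] is None:
            node.child[bit] = Node()
        node = node.child[bit]
    node.next_hop = next_hop

def lookup(root, addr):
    node, best = root, root.next_hop
    for i in range(32):
        node = node.child[(addr >> (31 - i)) & 1]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop        # longest match seen so far
    return best

def ip(dotted):
    a, b, c, d = map(int, dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

root = Node()
add_route(root, ip("10.0.0.0"), 8, "if0")    # 10.0.0.0/8
add_route(root, ip("10.1.0.0"), 16, "if1")   # 10.1.0.0/16
print(lookup(root, ip("10.1.2.3")))          # if1: the longer prefix wins
print(lookup(root, ip("10.9.9.9")))          # if0
```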


International Conference on Data Engineering | 2006

Seaweed: Distributed Scalable Ad Hoc Querying

Richard Mortier; Dushyanth Narayanan; Austin Donnelly; Antony I. T. Rowstron

Many emerging applications such as wide-area network management need to query large, structured, highly distributed datasets. Seaweed is a distributed scalable infrastructure for querying such datasets. In this paper we describe its architecture and design features, using the Anemone network management system as a motivating example. The main contribution is a design supporting accurate query planning and efficient execution across a large number of unreliable endsystems. In contrast to prior work, Seaweed supports ad hoc querying in addition to continuous querying. The paper describes the solutions adopted by Seaweed: latency-based cost estimation, availability-based scheduling, and meta-data aggregation.
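Availability-based scheduling can be pictured simply: rather than repeatedly polling endsystems that are offline, the planner schedules each sub-query for the time its availability model predicts the endsystem will be reachable, and the latency-based cost estimate for the whole query follows from the latest of those times. The model below is a deliberately trivial stand-in for the techniques in the paper, with invented prediction values.

```python
# Toy availability-based scheduling: schedule each endsystem's sub-query at
# its predicted next-online time; the estimated query latency is the latest
# such time. Predicted times (seconds from now) are invented; 0 = online now.

predicted_online = {"es1": 0, "es2": 0, "es3": 1800, "es4": 7200}

def schedule(predictions):
    plan = sorted(predictions.items(), key=lambda kv: kv[1])
    estimated_latency = max(predictions.values())
    return plan, estimated_latency

plan, latency = schedule(predicted_online)
for endsystem, t in plan:
    print(f"query {endsystem} at t+{t}s")
print(f"estimated completion latency: {latency}s")
```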
