Dan R. K. Ports
University of Washington
Publications
Featured research published by Dan R. K. Ports.
Symposium on Cloud Computing | 2014
Jialin Li; Naveen Kr. Sharma; Dan R. K. Ports; Steven D. Gribble
Interactive services often have large-scale parallel implementations. To deliver fast responses, the median and tail latencies of a service's components must be low. In this paper, we explore the hardware, OS, and application-level sources of poor tail latency in high-throughput servers executing on multi-core machines. We model these network services as a queuing system in order to establish the best-achievable latency distribution. Using fine-grained measurements of three different servers (a null RPC service, Memcached, and Nginx) on Linux, we then explore why these servers exhibit significantly worse tail latencies than queuing models alone predict. The underlying causes include interference from background processes, request re-ordering caused by poor scheduling or constrained concurrency models, suboptimal interrupt routing, CPU power saving mechanisms, and NUMA effects. We systematically eliminate these factors and show that Memcached can achieve a median latency of 11 μs and a 99.9th percentile latency of 32 μs at 80% utilization on a four-core system. In comparison, a naïve deployment of Memcached at the same utilization on a single-core system has a median latency of 100 μs and a 99.9th percentile latency of 5 ms. Finally, we demonstrate that tradeoffs exist between throughput, energy, and tail latency.
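As a rough illustration of the queuing baseline the paper measures against, the sketch below simulates an M/M/1 FIFO server at 80% utilization and reports median and 99.9th-percentile latency. The service time and sample count are illustrative choices, not the paper's parameters.

```python
import random

def percentile(samples, p):
    """Return the p-th percentile (0-100) of a list of samples."""
    xs = sorted(samples)
    k = min(len(xs) - 1, int(round(p / 100.0 * (len(xs) - 1))))
    return xs[k]

def mm1_latencies(n, utilization, service_time_us):
    """Simulate an M/M/1 FIFO queue; return per-request latencies (us).

    Poisson arrivals, exponential service: the kind of idealized model
    used as a best-achievable baseline for tail latency.
    """
    arrival_rate = utilization / service_time_us  # requests per microsecond
    t = 0.0           # arrival clock
    free_at = 0.0     # time the server next becomes idle
    latencies = []
    for _ in range(n):
        t += random.expovariate(arrival_rate)     # next arrival
        start = max(t, free_at)                   # wait if server is busy
        service = random.expovariate(1.0 / service_time_us)
        free_at = start + service
        latencies.append(free_at - t)             # queueing + service time
    return latencies

random.seed(42)
lat = mm1_latencies(n=200_000, utilization=0.8, service_time_us=10.0)
print(f"median    : {percentile(lat, 50):7.1f} us")
print(f"99.9th pct: {percentile(lat, 99.9):7.1f} us")
```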
Very Large Data Bases | 2012
Dan R. K. Ports; Kevin Grittner
This paper describes our experience implementing PostgreSQL's new serializable isolation level. It is based on the recently developed Serializable Snapshot Isolation (SSI) technique. This is the first implementation of SSI in a production database release as well as the first in a database that did not previously have a lock-based serializable isolation level. We reflect on our experience and describe how we overcame some of the resulting challenges, including the implementation of a new lock manager, a technique for ensuring memory usage is bounded, and integration with other PostgreSQL features. We also introduce an extension to SSI that improves performance for read-only transactions. We evaluate PostgreSQL's serializable isolation level using several benchmarks and show that it achieves performance only slightly below that of snapshot isolation, and significantly outperforms the traditional two-phase locking approach on read-intensive workloads.
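From the application's side, SSI surfaces as retryable aborts. The following is a minimal client-side sketch, assuming psycopg2 (2.8+) and a reachable PostgreSQL; the DSN and the accounts table are placeholders, and this is not code from the paper, which describes the server-side implementation.

```python
import psycopg2
from psycopg2 import errors

def transfer(dsn, src, dst, amount, max_retries=5):
    """Run a money transfer at SERIALIZABLE, retrying on SQLSTATE 40001."""
    conn = psycopg2.connect(dsn)
    try:
        for _ in range(max_retries):
            try:
                with conn:  # one transaction per iteration; commits on exit
                    with conn.cursor() as cur:
                        cur.execute(
                            "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                        cur.execute(
                            "UPDATE accounts SET balance = balance - %s "
                            "WHERE id = %s", (amount, src))
                        cur.execute(
                            "UPDATE accounts SET balance = balance + %s "
                            "WHERE id = %s", (amount, dst))
                return  # committed successfully
            except errors.SerializationFailure:
                continue  # SSI aborted this transaction; safe to retry
        raise RuntimeError("transaction kept failing serialization checks")
    finally:
        conn.close()
```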
Symposium on Operating Systems Principles | 2015
Irene Zhang; Naveen Kr. Sharma; Adriana Szekeres; Arvind Krishnamurthy; Dan R. K. Ports
Application programmers increasingly prefer distributed storage systems with strong consistency and distributed transactions (e.g., Google's Spanner) for their strong guarantees and ease of use. Unfortunately, existing transactional storage systems are expensive to use -- in part because they require costly replication protocols, like Paxos, for fault tolerance. In this paper, we present a new approach that makes transactional storage systems more affordable: we eliminate consistency from the replication protocol while still providing distributed transactions with strong consistency to applications. We present TAPIR -- the Transactional Application Protocol for Inconsistent Replication -- the first transaction protocol to use a novel replication protocol, called inconsistent replication, that provides fault tolerance without consistency. By enforcing strong consistency only in the transaction protocol, TAPIR can commit transactions in a single round-trip and order distributed transactions without centralized coordination. We demonstrate the use of TAPIR in a transactional key-value store, TAPIR-KV. Compared to conventional systems, TAPIR-KV provides better latency and throughput.
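The single-round-trip fast path rests on a client-side superquorum check. The toy function below, with made-up vote strings standing in for real messages, tests whether ⌈3f/2⌉ + 1 of the 2f + 1 replicas returned matching results (the fast-quorum size for inconsistent replication's consensus operations); everything else in the protocol is omitted.

```python
import math
from collections import Counter

def fast_path_result(f, responses):
    """Return the agreed result if a fast quorum matches, else None.

    f         -- number of tolerated replica failures (2f+1 replicas total)
    responses -- list of results received from individual replicas
    """
    superquorum = math.ceil(3 * f / 2) + 1
    winner, count = Counter(responses).most_common(1)[0]
    return winner if count >= superquorum else None

# f=1 (3 replicas): fast path needs all 3 matching responses.
print(fast_path_result(1, ["prepare-ok", "prepare-ok", "prepare-ok"]))  # fast path
print(fast_path_result(1, ["prepare-ok", "prepare-ok", "abort"]))       # None: slow path
```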
ACM Transactions on Computer Systems | 2016
Simon Peter; Jialin Li; Irene Zhang; Dan R. K. Ports; Doug Woos; Arvind Krishnamurthy; Thomas E. Anderson; Timothy Roscoe
Recent device hardware trends enable a new approach to the design of network server operating systems. In a traditional operating system, the kernel mediates access to device hardware by server applications, to enforce process isolation as well as network and disk security. We have designed and implemented a new operating system, Arrakis, that splits the traditional role of the kernel in two. Applications have direct access to virtualized I/O devices, allowing most I/O operations to skip the kernel entirely, while the kernel is re-engineered to provide network and disk protection without kernel mediation of every operation. We describe the hardware and software changes needed to take advantage of this new abstraction, and we illustrate its power by showing improvements of 2-5x in latency and 9x in throughput for a popular persistent NoSQL store relative to a well-tuned Linux implementation.
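A toy model of that control-plane/data-plane split might look like the sketch below; the class and method names are invented, not the Arrakis API. The structural point is that the kernel is involved once, at setup, and every subsequent I/O operation touches only the application's own queue.

```python
from collections import deque

class Kernel:
    """Control plane: allocates virtualized device queues, sets policy."""
    def create_vnic_queue(self, app_name):
        print(f"kernel: mapped a protected NIC queue into {app_name}")
        return deque()  # stands in for a memory-mapped descriptor ring

class App:
    """Data plane: drives the queue directly; the kernel is not involved."""
    def __init__(self, name, kernel):
        self.name = name
        self.rxtx = kernel.create_vnic_queue(name)  # one-time setup call

    def send(self, packet):
        self.rxtx.append(packet)   # direct device access, no syscall

    def poll(self):
        return self.rxtx.popleft() if self.rxtx else None

app = App("nosql-store", Kernel())
app.send(b"GET key42")
print(app.poll())
```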
International Workshop on Peer-to-Peer Systems | 2005
Austin T. Clements; Dan R. K. Ports; David R. Karger
Arpeggio is a peer-to-peer file-sharing network based on the Chord lookup primitive. Queries for data whose metadata matches a certain criterion are performed efficiently by using a distributed keyword-set index, augmented with index-side filtering. We introduce index gateways, a technique for minimizing index maintenance overhead. Because file data is large, Arpeggio employs subrings to track live source peers without the cost of inserting the data itself into the network. Finally, we introduce postfetching, a technique that uses information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization, and a content distribution system tuned to the requirements and capabilities of a peer-to-peer network.
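A keyword-set index can be sketched in a few lines. In the sketch below a plain dict stands in for the Chord DHT, and the cap K on indexed set sizes is an illustrative choice rather than Arpeggio's: each file is registered under every keyword subset of size at most K, so a query for any small keyword set is one lookup plus index-side filtering.

```python
import hashlib
from itertools import combinations

K = 3  # maximum keyword-set size that gets its own index entry

def set_key(keywords):
    """Stable hash of a keyword set (stands in for a Chord key)."""
    return hashlib.sha1("|".join(sorted(keywords)).encode()).hexdigest()

index = {}  # hashed keyword set -> list of (filename, full keyword set)

def register(filename, keywords):
    for r in range(1, min(K, len(keywords)) + 1):
        for subset in combinations(sorted(keywords), r):
            index.setdefault(set_key(subset), []).append((filename, keywords))

def query(keywords, extra_filter=None):
    """One lookup for the keyword set, then index-side filtering."""
    hits = index.get(set_key(keywords), [])
    return [f for f, meta in hits if extra_filter is None or extra_filter(meta)]

register("song.ogg", {"jazz", "live", "1959"})
print(query({"jazz", "live"}))  # -> ['song.ogg']
```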
Symposium on Operating Systems Principles | 2005
Dan R. K. Ports; Austin T. Clements; Erik D. Demaine
The availability of previous file versions is invaluable for recovering from file corruption or user errors such as accidental deletions. Versioned file systems address this need by retaining earlier versions of changed files. Many existing file systems, such as Plan 9, WAFL, AFS, and others, use a snapshotting approach: they record and archive the state of the file system at periodic intervals. However, this fails to capture modifications that are made between snapshots. Our system, PersiFS, is continuously versioned, meaning that it stores every modification, and thus allows access to the file system state as it appeared at any specified time. To make this feasible, we use a number of efficient data structures to optimize both access time and disk space.
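At its core, continuous versioning reduces to a time-indexed lookup. Here is a minimal in-memory sketch of that idea, with none of PersiFS's space-efficient on-disk data structures; the class and timestamps are illustrative.

```python
import bisect
import time

class VersionedFile:
    def __init__(self):
        self.times = []     # sorted write timestamps
        self.contents = []  # contents[i] was written at times[i]

    def write(self, data, ts=None):
        self.times.append(ts if ts is not None else time.time())
        self.contents.append(data)

    def read_as_of(self, ts):
        """Return the file contents as they appeared at time ts."""
        i = bisect.bisect_right(self.times, ts) - 1
        return self.contents[i] if i >= 0 else None

f = VersionedFile()
f.write(b"v1", ts=100)
f.write(b"v2", ts=200)
print(f.read_as_of(150))  # b'v1': the state between the two writes
```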
Symposium on Operating Systems Principles | 2017
Jialin Li; Ellis Michael; Dan R. K. Ports
Distributed storage systems aim to provide strong consistency and isolation guarantees on an architecture that is partitioned across multiple shards for scalability and replicated for fault tolerance. Traditionally, achieving all of these goals has required an expensive combination of atomic commitment and replication protocols -- introducing extensive coordination overhead. Our system, Eris, takes a different approach. It moves a core piece of concurrency control functionality, which we term multi-sequencing, into the datacenter network itself. This network primitive takes on the responsibility for consistently ordering transactions, and a new lightweight transaction protocol ensures atomicity. The end result is that Eris avoids both replication and transaction coordination overhead: we show that it can process a large class of distributed transactions in a single round-trip from the client to the storage system without any explicit coordination between shards or replicas in the normal case. It provides atomicity, consistency, and fault tolerance with less than 10% overhead -- achieving throughput 3.6-35x higher and latency 72-80% lower than a conventional design on standard benchmarks.
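A toy model of multi-sequencing follows, with the caveat that Eris places the sequencer in the datacenter network (e.g., a programmable switch), not in host software as here. The sequencer stamps each transaction with a per-shard sequence number for every shard it touches, so each shard can both order transactions and detect drops by watching for gaps in its own counter.

```python
from collections import defaultdict

class MultiSequencer:
    def __init__(self):
        self.counters = defaultdict(int)  # shard id -> last sequence number

    def stamp(self, txn_id, shards):
        """Assign the next per-shard sequence number for each touched shard."""
        stamps = {}
        for s in shards:
            self.counters[s] += 1
            stamps[s] = self.counters[s]
        return txn_id, stamps

class Shard:
    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.expected = 1

    def deliver(self, txn_id, stamps):
        seq = stamps[self.shard_id]
        if seq != self.expected:  # a gap means a transaction was dropped
            raise RuntimeError(f"shard {self.shard_id}: missed txn before {txn_id}")
        self.expected += 1
        return txn_id             # safe to execute in sequence order

seq = MultiSequencer()
a, b = Shard("A"), Shard("B")
txn, stamps = seq.stamp("t1", ["A", "B"])
print(a.deliver(txn, stamps), b.deliver(txn, stamps))
```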
Symposium on Cloud Computing | 2016
Brandon Holt; James Bornholt; Irene Zhang; Dan R. K. Ports; Mark Oskin; Luis Ceze
Distributed applications and web services, such as online stores or social networks, are expected to be scalable, available, responsive, and fault-tolerant. To meet these steep requirements in the face of high round-trip latencies, network partitions, server failures, and load spikes, applications use eventually consistent datastores that allow them to weaken the consistency of some data. However, making this transition is highly error-prone because relaxed consistency models are notoriously difficult to understand and test. In this work, we propose a new programming model for distributed data that makes consistency properties explicit and uses a type system to enforce consistency safety. With the Inconsistent, Performance-bound, Approximate (IPA) storage system, programmers specify performance targets and correctness requirements as constraints on persistent data structures and handle uncertainty about the result of datastore reads using new consistency types. We implement a prototype of this model in Scala on top of an existing datastore, Cassandra, and use it to make performance/correctness tradeoffs in two applications: a ticket sales service and a Twitter clone. Our evaluation shows that IPA prevents consistency-based programming errors and adapts consistency automatically in response to changing network conditions, performing comparably to weak consistency and 2-10× faster than strong consistency.
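The consistency-type idea can be sketched outside Scala; the classes below are invented Python stand-ins for the paper's types, not its API. The point is that a performance-bound read returns an Interval the caller must explicitly unpack, so uncertainty about the datastore's answer cannot be silently ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Result of a weak/approximate read: the true value lies within."""
    low: int
    high: int

class TicketCounter:
    def __init__(self, remaining):
        self._remaining = remaining
        self._max_drift = 5  # illustrative bound on replica divergence

    def read_strong(self) -> int:
        # Would pay for quorum coordination in a real datastore.
        return self._remaining

    def read_interval(self) -> Interval:
        # Fast, weakly consistent read: an answer within a known error bound.
        v = self._remaining
        return Interval(max(0, v - self._max_drift), v + self._max_drift)

tickets = TicketCounter(remaining=3)
estimate = tickets.read_interval()
if estimate.low > 0:
    print("definitely tickets left; sell without coordinating")
elif estimate.high == 0:
    print("definitely sold out")
else:
    print("uncertain; fall back to a strong read:", tickets.read_strong())
```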
Asia-Pacific Workshop on Systems | 2014
Danyang Zhuo; Qiao Zhang; Dan R. K. Ports; Arvind Krishnamurthy; Thomas E. Anderson
Although rare in absolute terms, undetected CPU, memory, and disk errors occur often enough at datacenter scale to significantly affect overall system reliability and availability. In this paper, we propose a new failure model, called Machine Fault Tolerance, and a new abstraction, a replicated write-once trusted table, to provide improved resilience to these types of failures. Since most machine failures manifest in application server and operating system code, we assume a Byzantine model for those parts of the system. However, by assuming that the hypervisor and network are trustworthy, we are able to reduce the overhead of machine-fault masking to be close to that of non-Byzantine Paxos.
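The write-once table is simple to state. Below is a minimal single-node sketch of the abstraction (the paper replicates it and roots it in the trusted hypervisor): the first write to a key is permanent, so a later conflicting write from a possibly faulty server is rejected rather than trusted.

```python
class WriteOnceTable:
    def __init__(self):
        self._table = {}

    def write(self, key, value):
        """Record value at key; only the first writer can succeed."""
        if key in self._table:
            return self._table[key] == value  # re-writing the same value is ok
        self._table[key] = value
        return True

    def read(self, key):
        return self._table.get(key)

t = WriteOnceTable()
print(t.write("slot-1", "commit txn 17"))  # True: first write wins
print(t.write("slot-1", "commit txn 99"))  # False: conflicting write rejected
print(t.read("slot-1"))                    # 'commit txn 17'
```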
Very Large Data Bases | 2017
Babak Salimi; Corey Cole; Dan R. K. Ports; Dan Suciu
Causal inference from observational data is a subject of active research and development in statistics and computer science. Many statistical software packages have been developed for this purpose. However, these toolkits do not scale to large datasets. We propose and demonstrate ZaliQL: a SQL-based framework for drawing causal inference from observational data. ZaliQL supports state-of-the-art methods for causal inference and runs at scale within the PostgreSQL database system. In addition, we built a visual interface to wrap around ZaliQL. In our demonstration, we will use this GUI to show a live investigation of the causal effect of different weather conditions on flight delays.
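To show the flavor of the computation ZaliQL scales, here is an exact-matching treatment-effect estimate in plain Python over made-up flight rows; ZaliQL itself expresses the equivalent logic as SQL queries inside PostgreSQL, and the covariate choice here is purely illustrative.

```python
from collections import defaultdict

# (treated, covariate stratum, outcome): did it snow?, airport, delay minutes
flights = [
    (True,  "SEA", 40), (False, "SEA", 10),
    (True,  "SEA", 50), (False, "SEA", 20),
    (True,  "ORD", 90), (False, "ORD", 30),
]

def matched_ate(rows):
    """Treatment effect via exact matching: compare within each stratum."""
    groups = defaultdict(lambda: {True: [], False: []})
    for treated, stratum, outcome in rows:
        groups[stratum][treated].append(outcome)
    effects, weights = [], []
    for g in groups.values():
        if g[True] and g[False]:  # only strata with both groups match
            diff = sum(g[True]) / len(g[True]) - sum(g[False]) / len(g[False])
            effects.append(diff)
            weights.append(len(g[True]) + len(g[False]))
    return sum(e * w for e, w in zip(effects, weights)) / sum(weights)

print(f"estimated effect of snow on delay: {matched_ate(flights):.1f} min")
```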