Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ken Yocum is active.

Publication


Featured research published by Ken Yocum.


ACM Special Interest Group on Data Communication (SIGCOMM) | 2007

Cloud control with distributed rate limiting

Barath Raghavan; Kashi Venkatesh Vishwanath; Sriram Ramabhadran; Ken Yocum; Alex C. Snoeren

Today's cloud-based services integrate globally distributed resources into seamless computing platforms. Provisioning and accounting for the resource usage of these Internet-scale applications presents a challenging technical problem. This paper presents the design and implementation of distributed rate limiters, which work together to enforce a global rate limit across traffic aggregates at multiple sites, enabling the coordinated policing of a cloud-based service's network traffic. Our abstraction not only enforces a global limit, but also ensures that congestion-responsive transport-layer flows behave as if they traversed a single, shared limiter. We present two designs - one general purpose, and one optimized for TCP - that allow service operators to explicitly trade off between communication costs and system accuracy, efficiency, and scalability. Both designs are capable of rate limiting thousands of flows with negligible overhead (less than 3% in the tested configuration). We demonstrate that our TCP-centric design is scalable to hundreds of nodes while robust to both loss and communication delay, making it practical for deployment in nationwide service providers.
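
A minimal sketch of the core idea, assuming a demand-proportional allocation policy; the names (LocalLimiter, rebalance, GLOBAL_LIMIT) are illustrative, and the paper's two actual designs involve considerably more machinery for estimating demand and tolerating loss:

```python
import time

GLOBAL_LIMIT = 1000.0  # bytes/sec across all sites (illustrative)

class LocalLimiter:
    """Token bucket enforcing one site's share of the global limit."""
    def __init__(self, share):
        self.rate = share              # current local allocation
        self.tokens = share
        self.last = time.monotonic()
        self.demand = 0.0              # bytes requested this epoch

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.demand += nbytes
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

def rebalance(limiters):
    """Periodic gossip step: redistribute the global limit in proportion
    to each site's recently observed demand, so the aggregate never
    exceeds GLOBAL_LIMIT while busier sites receive larger shares."""
    total = sum(l.demand for l in limiters)
    for l in limiters:
        share = (l.demand / total) if total else 1.0 / len(limiters)
        l.rate = GLOBAL_LIMIT * share
        l.demand = 0.0
```

How often rebalance() runs is the communication-versus-accuracy knob the paper exposes to operators.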


IEEE Communications Magazine | 2001

End system optimizations for high-speed TCP

Jeffrey S. Chase; Andrew J. Gallatin; Ken Yocum

The delivered TCP performance on high-speed networks is often limited by the sending and receiving hosts, rather than by the network hardware or the TCP protocol implementation itself. In this case, systems can achieve higher bandwidth by reducing host overheads through a variety of optimizations above and below the TCP protocol stack, given support from the network interface. This article surveys the most important of these optimizations and illustrates their effects quantitatively with empirical results from an experimental network delivering up to 2 Gb/s of end-to-end TCP bandwidth.
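
One host-side optimization above the stack, sizing socket buffers to the bandwidth-delay product, can be sketched in a few lines; the figures are illustrative and the kernel may clamp the requested values:

```python
import socket

# A 2 Gb/s path with a 1 ms RTT carries 2e9 / 8 * 1e-3 = 250 KB in flight,
# so socket buffers smaller than that stall the sender waiting for ACKs.
BDP_BYTES = int(2e9 / 8 * 1e-3)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BDP_BYTES)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BDP_BYTES)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```

The lower-layer optimizations the article surveys (checksum offload, zero-copy send paths, interrupt coalescing, larger frames) live in the driver and network interface rather than in application code.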


Symposium on Cloud Computing (SoCC) | 2010

Stateful bulk processing for incremental analytics

Dionysios Logothetis; Christopher Olston; Benjamin Reed; Kevin C. Webb; Ken Yocum

This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step computations over successive generations of data inflows, such as weekly web crawls, daily image/video uploads, log files, and growing social networks. While programmers may simply re-run the entire dataflow when new data arrives, this is grossly inefficient, increasing result latency and squandering hardware resources and energy. Alternatively, programmers may use prior results to incrementally incorporate the changes. However, current large-scale data processing tools, such as Map-Reduce or Dryad, limit how programmers incorporate and use state in data-parallel programs. Straightforward approaches to incorporating state can result in custom, fragile code and disappointing performance. This work presents a generalized architecture for continuous bulk processing (CBP) that raises the level of abstraction for building incremental applications. At its core is a flexible, groupwise processing operator that takes state as an explicit input. Unifying stateful programming with a data-parallel operator affords several fundamental opportunities for minimizing the movement of data in the underlying processing system. As case studies, we show how one can use a small set of flexible dataflow primitives to perform web analytics and mine large-scale, evolving graphs in an incremental fashion. Experiments with our prototype using real-world data indicate significant data movement and running time reductions relative to current practice. For example, incrementally computing PageRank using CBP can reduce data movement by 46% and cut running time in half.
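
A toy sketch of the groupwise-processing idea with state as an explicit input; the names are hypothetical, and CBP's real operator also controls grouping, routing, and when groups run:

```python
def groupwise(state, new_records, update):
    """CBP-style operator sketch: prior state is an explicit input keyed
    like the data, and only groups present in the new input are touched."""
    for key, value in new_records:
        state[key] = update(state.get(key), value)
    return state

# Example: maintaining link counts over an evolving web crawl.
bump = lambda old, v: (old or 0) + v
state = {}
groupwise(state, [("a.com", 1), ("b.com", 1), ("a.com", 1)], bump)
groupwise(state, [("a.com", 1), ("c.com", 1)], bump)
print(state)   # {'a.com': 3, 'b.com': 1, 'c.com': 1}
```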


ACM Transactions on Computer Systems | 2011

DieCast: Testing Distributed Systems with an Accurate Scale Model

Diwaker Gupta; Kashi Venkatesh Vishwanath; Marvin McNett; Amin Vahdat; Ken Yocum; Alex C. Snoeren; Geoffrey M. Voelker

Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors remains a difficult problem. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior at such scales. Testing large services should ideally be done at the same scale and configuration as the target deployment, which can be technically and economically infeasible. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines across a much smaller number of physical machines in a test harness. We show how to accurately scale CPU, network, and disk to provide the illusion that each VM matches a machine in the original service in terms of both available computing resources and communication behavior. We present the architecture and evaluation of a system we built to support such experimentation and discuss its limitations. We show that for a variety of services---including a commercial high-performance cluster-based file system---and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.
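
The scaling arithmetic behind this approach can be sketched as follows; the numbers are illustrative, not taken from the paper:

```python
def diecast_scaling(num_nodes, num_physical, cpu_ghz, nic_gbps):
    """Multiplex VMs onto fewer machines and dilate time by the same
    factor, so each VM perceives the resources of an original machine."""
    vms_per_host = -(-num_nodes // num_physical)     # ceiling division
    tdf = vms_per_host                               # time dilation factor
    perceived_cpu = (cpu_ghz / vms_per_host) * tdf   # 1/tdf of real CPU, time slowed tdf-fold
    perceived_nic = (nic_gbps / vms_per_host) * tdf  # NIC share likewise restored
    return vms_per_host, tdf, perceived_cpu, perceived_nic

# A 100-node service on 10 physical machines: each of the 10 VMs per host
# perceives a full 3 GHz CPU and 1 Gb/s NIC, at 10x real-time cost.
print(diecast_scaling(100, 10, cpu_ghz=3.0, nic_gbps=1.0))
```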


Symposium on Operating Systems Principles (SOSP) | 2005

To infinity and beyond: time warped network emulation

Diwaker Gupta; Ken Yocum; Marvin McNett; Alex C. Snoeren; Amin Vahdat; Geoffrey M. Voelker

This work explores the viability and benefits of time dilation - providing the illusion to an operating system and its applications that time is passing at a rate different from real time. For example, we may wish to convince a system that for every 10 seconds of wall clock time, only one second of time passes in the host's dilated time frame. This enables external stimuli to appear to take place at higher rates than would be physically possible. For example, a host dilated by a factor of 10 receiving data from a network interface at a real rate of 1 Gb/s believes it is receiving data at 10 Gb/s.
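
The arithmetic is simple enough to state directly; this reproduces the abstract's own example of a dilation factor of 10:

```python
def perceived_rate(real_rate_gbps, tdf):
    """Events arriving at real_rate appear tdf times faster in the
    guest's dilated time frame."""
    return real_rate_gbps * tdf

def guest_seconds(wall_clock_seconds, tdf):
    """Guest time elapsed per interval of wall-clock time."""
    return wall_clock_seconds / tdf

print(perceived_rate(1.0, tdf=10))   # 10.0 Gb/s perceived from a 1 Gb/s NIC
print(guest_seconds(10.0, tdf=10))   # 1.0 guest second per 10 real seconds
```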


ACM Special Interest Group on Data Communication (SIGCOMM) | 2013

High-fidelity switch models for software-defined network emulation

Danny Yuxing Huang; Ken Yocum; Alex C. Snoeren

Software-defined networks (SDNs) depart from traditional network architectures by explicitly allowing third-party software access to the network's control plane. Thus, SDN protocols such as OpenFlow give network operators the ability to innovate by authoring or buying network controller software independent of the hardware. However, this split design can make planning and designing large SDNs even more challenging than traditional networks. While existing network emulators allow operators to ascertain the behavior of traditional networks when subjected to a given workload, we find that current approaches fail to account for significant vendor-specific artifacts in the SDN switch control path. We benchmark OpenFlow-enabled switches from three vendors and illustrate how differences in their implementation dramatically impact latency and throughput. We present a measurement methodology and emulator extension to reproduce these control-path performance artifacts, restoring the fidelity of emulation.
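
A sketch of the kind of control-path benchmark described here; install_rule is a hypothetical callback standing in for whatever OpenFlow library sends each flow_mod and waits for the switch's acknowledgment:

```python
import statistics
import time

def benchmark_flow_setup(install_rule, num_rules=1000):
    """Time each rule installation to expose vendor-specific control-path
    latency; install throughput follows from the same measurements."""
    latencies = []
    for i in range(num_rules):
        start = time.monotonic()
        install_rule(i)                  # hypothetical: blocks until ACKed
        latencies.append(time.monotonic() - start)
    latencies.sort()
    return {
        "median_ms": 1e3 * statistics.median(latencies),
        "p99_ms": 1e3 * latencies[int(0.99 * (len(latencies) - 1))],
        "rules_per_sec": num_rules / sum(latencies),
    }
```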


Very Large Data Bases (VLDB) | 2008

Ad-hoc data processing in the cloud

Dionysios Logothetis; Ken Yocum

Ad-hoc data processing has proven to be a critical paradigm for Internet companies processing large volumes of unstructured data. However, the emergence of cloud-based computing, where storage and CPU are outsourced to multiple third-parties across the globe, implies large collections of highly distributed and continuously evolving data. Our demonstration combines the power and simplicity of the MapReduce abstraction with a wide-scale distributed stream processor, Mortar. While our incremental MapReduce operators avoid data re-processing, the stream processor manages the placement and physical data flow of the operators across the wide area. We demonstrate a distributed web indexing engine against which users can submit and deploy continuous MapReduce jobs. A visualization component illustrates both the incremental indexing and index searches in real time.
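
A toy version of an incremental MapReduce operator in this spirit; Mortar's actual interface and operator-placement logic are far richer:

```python
from collections import Counter

class IncrementalMapReduce:
    """Map runs only over newly arrived records; reduce folds the partial
    result into prior output, so old data is never re-processed."""
    def __init__(self, map_fn):
        self.map_fn = map_fn
        self.result = Counter()

    def push(self, new_records):
        partial = Counter()
        for rec in new_records:
            for key, value in self.map_fn(rec):
                partial[key] += value
        self.result.update(partial)      # incremental reduce (a sum)
        return self.result

# Example: a continuously updated term index over arriving documents.
job = IncrementalMapReduce(lambda doc: [(w, 1) for w in doc.split()])
job.push(["the cloud stores data", "data moves to the cloud"])
print(job.push(["the index updates incrementally"]))
```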


PLOS Biology | 2014

Redefining Genomic Privacy: Trust and Empowerment

Yaniv Erlich; James B. Williams; David Glazer; Ken Yocum; Nita A. Farahany; Maynard V. Olson; Arvind Narayanan; Lincoln Stein; Jan A. Witkowski; Robert C. Kain

Current models of protecting human subjects create a zero-sum game of privacy versus data utility. We propose shifting the paradigm to techniques that facilitate trust between researchers and participants.


Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) | 2003

Toward scaling network emulation using topology partitioning

Ken Yocum; Ethan Eade; Julius Degesys; David Becker; Jeffrey S. Chase; Amin Vahdat

Scalability is the primary challenge to studying large complex network systems with network emulation. This paper studies topology partitioning, assigning disjoint pieces of the network topology across processors, as a technique to increase emulation capacity with increasing hardware resources. We develop methods to create partitions based on expected communication across the topology. Our evaluation methodology quantifies the communication overhead or efficiency of the resulting partitions. We implement and contrast three partitioning strategies in ModelNet, a large-scale network emulator, using different topologies and uniform communication patterns. Results show that standard graph partitioning algorithms can double the efficiency of the emulation for Internet-like topologies relative to random partitioning.
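
The efficiency metric at stake can be illustrated with a toy edge-cut computation; ModelNet itself applies real graph-partitioning algorithms rather than this hand-built assignment:

```python
import random

def edge_cut(edges, assignment):
    """Fraction of topology links whose endpoints land on different
    emulation hosts; a lower cut means less inter-machine traffic."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v]) / len(edges)

# Toy topology: two 4-node clusters joined by a single link (3,4).
edges = [(0,1),(0,2),(0,3),(1,2),(1,3),(2,3),
         (4,5),(4,6),(4,7),(5,6),(5,7),(6,7),(3,4)]
nodes = range(8)

rng = random.Random(0)
random_assign = {n: rng.randrange(2) for n in nodes}
cluster_assign = {n: 0 if n < 4 else 1 for n in nodes}
print(edge_cut(edges, random_assign))    # typically around 0.5
print(edge_cut(edges, cluster_assign))   # 1/13: only the bridging link is cut
```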


Conference on High Performance Computing (Supercomputing) | 2006

Improving grid resource allocation via integrated selection and binding

Yang-Suk Kee; Ken Yocum; Andrew A. Chien; Henri Casanova

Discovering and acquiring appropriate, complex resource collections in large-scale distributed computing environments is a fundamental challenge and is critical to application performance. This paper presents a new formulation of the resource selection problem and a new solution to the resource selection and binding problem called integrated selection and binding. Composition operators in our resource description language and efficient data organization enable our approach to allocate complex resource collections efficiently and effectively, even in the presence of competition for resources. Our empirical evaluation shows that the integrated approach can produce solutions of significantly higher quality, at a higher success rate and lower cost, than the traditional separate approach. The integrated approach sustains its success rate under as much as 15%-60% lower resource availability than the separate approach. Moreover, most requests achieve at least a 98th-percentile rank and can be served in 6 seconds with a population of 1 million hosts.
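
One way to picture the integrated strategy, sketched with hypothetical try_bind/release allocator callbacks: a binding failure caused by a competing request simply advances the search down the ranked candidate list instead of forcing selection to start over:

```python
import random

def integrated_select_and_bind(candidates, try_bind, release, need):
    """Select and bind in one pass over a ranked candidate list,
    rolling back the partial allocation if too few binds succeed."""
    bound = []
    for resource in candidates:              # ranked best-first
        if try_bind(resource):
            bound.append(resource)
            if len(bound) == need:
                return bound
    for resource in bound:                   # insufficient resources: roll back
        release(resource)
    return None

# Toy allocator in which each bind succeeds with 70% probability,
# mimicking competition from concurrent requests.
rng = random.Random(1)
print(integrated_select_and_bind(range(20), lambda r: rng.random() < 0.7,
                                 lambda r: None, need=10))
```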

Collaboration


Dive into Ken Yocum's collaborations.

Top Co-Authors

Diwaker Gupta

University of California
