Younggyun Koh

Georgia Institute of Technology

Publication

Featured research published by Younggyun Koh.

international symposium on performance analysis of systems and software | 2007

An Analysis of Performance Interference Effects in Virtual Environments

Younggyun Koh; Rob C. Knauerhase; Paul Brett; Mic Bowman; Zhihua Wen; Calton Pu

Virtualization is an essential technology in modern datacenters. Despite advantages such as security isolation, fault isolation, and environment isolation, current virtualization techniques do not provide effective performance isolation between virtual machines (VMs). Specifically, hidden contention for physical resources impacts performance differently in different workload configurations, causing significant variance in observed system throughput. To this end, characterizing workloads that generate performance interference is important in order to maximize overall utility. In this paper, we study the effects of performance interference by looking at system-level workload characteristics. In a physical host, we allocate two VMs, each of which runs a sample application chosen from a wide range of benchmark and real-world workloads. For each combination, we collect performance metrics and runtime characteristics using an instrumented Xen hypervisor. Through subsequent analysis of collected data, we identify clusters of applications that generate certain types of performance interference. Furthermore, we develop mathematical models to predict the performance of a new application from its workload characteristics. Our evaluation shows our techniques were able to predict performance with an average error of approximately 5%.

international conference on cloud computing | 2010

Understanding Performance Interference of I/O Workload in Virtualized Cloud Environments

Xing Pu; Ling Liu; Yiduo Mei; Sankaran Sivathanu; Younggyun Koh; Calton Pu

Server virtualization offers the ability to slice large, underutilized physical servers into smaller, parallel virtual machines (VMs), enabling diverse applications to run in isolated environments on a shared hardware platform. Effective management of virtualized cloud environments introduces new and unique challenges, such as efficient CPU scheduling for virtual machines and effective allocation of virtual machines to handle both CPU-intensive and I/O-intensive workloads. Although a fair number of research projects have been dedicated to measuring, scheduling, and resource management of virtual machines, there is still a lack of in-depth understanding of the performance factors that can impact the efficiency and effectiveness of resource multiplexing and resource scheduling among virtual machines. In this paper, we present our experimental study on the performance interference in parallel processing of CPU-intensive and network-intensive workloads in the Xen Virtual Machine Monitor (VMM). We conduct extensive experiments to measure the performance interference among VMs running network I/O workloads that are either CPU bound or network bound. Based on our experiments and observations, we conclude with four key findings that are critical to effective management of virtualized cloud environments for both cloud service providers and cloud consumers. First, running network-intensive workloads in isolated environments on a shared hardware platform can lead to high overheads due to extensive context switches and events in the driver domain and VMM. Second, co-locating CPU-intensive workloads in isolated environments on a shared hardware platform can incur high CPU contention due to the demand for fast memory page exchanges in the I/O channel. Third, running CPU-intensive workloads and network-intensive workloads in conjunction incurs the least resource contention, delivering higher aggregate performance. Last but not least, identifying factors that impact the total demand of the exchanged memory pages is critical to the in-depth understanding of the interference overheads in the I/O channel in the driver domain and VMM.

IEEE Transactions on Services Computing | 2013

Who Is Your Neighbor: Net I/O Performance Interference in Virtualized Clouds

Xing Pu; Ling Liu; Yiduo Mei; Sankaran Sivathanu; Younggyun Koh; Calton Pu; Yuanda Cao

User-perceived performance continues to be the most important QoS indicator in cloud-based data centers today. Effective allocation of virtual machines (VMs) to handle both CPU-intensive and I/O-intensive workloads is a crucial performance management capability in virtualized clouds. Although a fair amount of research has been dedicated to measuring and scheduling jobs among VMs, there is still a lack of in-depth understanding of the performance factors that impact the efficiency and effectiveness of resource multiplexing and scheduling among VMs. In this paper, we present experimental research on performance interference in parallel processing of CPU-intensive and network-intensive workloads on the Xen virtual machine monitor (VMM). Based on our study, we conclude with five key findings that are critical for effective performance management and tuning in virtualized clouds. First, colocating network-intensive workloads in isolated VMs incurs high overheads of switches and events in Dom0 and the VMM. Second, colocating CPU-intensive workloads in isolated VMs incurs high CPU contention due to fast I/O processing in the I/O channel. Third, running CPU-intensive and network-intensive workloads in conjunction incurs the least resource contention, delivering higher aggregate performance. Fourth, the performance of a network-intensive workload is insensitive to CPU assignment among VMs, whereas adaptive CPU assignment among VMs is critical to a CPU-intensive workload: the more CPUs pinned to Dom0, the worse the performance achieved by the CPU-intensive workload. Last, due to fast I/O processing in the I/O channel, the limitation on the grant table is a potential bottleneck in Xen. We argue that identifying the factors that impact the total demand of exchanged memory pages is important to the in-depth understanding of interference costs in Dom0 and the VMM.

acm symposium on applied computing | 2009

Fast networking with socket-outsourcing in hosted virtual machine environments

Hideki Eiraku; Yasushi Shinjo; Calton Pu; Younggyun Koh; Kazuhiko Kato

This paper proposes a novel method of achieving fast networking in hosted virtual machine (VM) environments. This method, called socket-outsourcing, replaces the socket layer in a guest operating system (OS) with the socket layer of the host OS. Socket-outsourcing increases network performance by eliminating duplicate message copying in both the host OS and the guest OS. Furthermore, socket-outsourcing significantly enhances inter-VM communication within the same host OS since it enables network packets to bypass the protocol stack in guest OSes. Socket-outsourcing was implemented in two representative operating systems (Linux and NetBSD) and on two virtual machine monitors (Linux KVM and PansyVM). These virtual machine monitors provided support for socket-outsourcing through shared memory, event queues, and VM-specific Remote Procedure Call between a guest OS and a host OS. The experimental results revealed that a guest OS outsourcing the socket layer achieved the same network throughput as a native OS using up to four Gigabit Ethernet links. Moreover, the benchmark results obtained from an N-tier Web application that generated a significant amount of inter-VM communication indicated that socket-outsourcing improved performance by up to 45 percent compared with conventional hosted VM environments.

automated software engineering | 2005

Clearwater: extensible, flexible, modular code generation

Galen S. Swint; Calton Pu; Gueyoung Jung; Wenchang Yan; Younggyun Koh; Qinyi Wu; Charles Consel; Akhil Sahai; Koichi Moriyama

Distributed applications typically interact with a number of heterogeneous and autonomous components that evolve independently. Methodical development of such applications can benefit from approaches based on domain-specific languages (DSLs). However, the evolution and customization of heterogeneous components introduce significant challenges to accommodating the syntax and semantics of a DSL in addition to the heterogeneous platforms on which they must run. In this paper, we address the challenge of implementing code generators for two such DSLs that are flexible (resilient to changes in generators or input formats), extensible (able to support multiple output targets and multiple input variants), and modular (generated code can be re-written). Our approach, Clearwater, leverages XML and XSLT standards: XML supports extensibility and mutability for in-progress specification formats, and XSLT provides flexibility and extensibility for multiple target languages. Modularity arises from using XML meta-tags in the code generator itself, which supports controlled addition, subtraction, or replacement to the generated code via XML-weaving. We discuss the use of our approach and show its advantages in two non-trivial code generators: the Infopipe Stub Generator (ISG) to support distributed flow applications, and the Automated Composable Code Translator to support automated distributed application deployment. As an example, the ISG accepts as input an XML description and generates output for C, C++, or Java using a number of communications platforms such as sockets and publish-subscribe.

network computing and applications | 2009

Improving Virtualized Windows Network Performance by Delegating Network Processing

Younggyun Koh; Calton Pu; Yasushi Shinjo; Hideki Eiraku; Go Saito; Daiyuu Nobori

Virtualized environments are important building blocks in consolidated data centers and cloud computing. Full virtualization (FV) allows unmodified guest OSes to run on virtualization-aware microprocessors. However, the significant overhead of device emulation in FV has caused high I/O overhead. Current implementations based on paravirtualization can only reduce such overhead partially. This paper describes the Linsock approach that applies the outsourcing method to speed up I/O in FV environments by combining a different guest OS and host OS. Concretely, Linsock replaces the guest Windows’ network processing with the host Linux kernel’s on the same machine. Linsock has been implemented on the Linux Kernel-based Virtual Machine (KVM) as the host virtual machine (VM) environment. Our measurement results with Linsock show a significant performance increase of more than 300% compared with device paravirtualization in a 10Gbps Ethernet networking environment. In addition, Linsock also yields a fourfold increase in inter-VM communication performance.

local computer networks | 2006

Efficient Packet Processing in User-Level OSes: A Study of UML

Younggyun Koh; Calton Pu; Sapan Bhatia; Charles Consel

Network server consolidation has become popular through virtualization technology that builds secure, isolated network systems on shared hardware. One of the virtualization techniques used is that of user-level operating systems (ULOSes). However, the isolation and security they bring come at the price of performance, as virtualization introduces a number of overheads into the system. Such overheads can be surprisingly large, especially for complex OS modules like network protocol stacks. Our studies of the TCP/IP stack in user-mode Linux (UML), an implementation of a ULOS, attribute the resulting slow-downs to three main sources: the execution of privileged code, memory management across layers, and additional instructions to execute. To mitigate these bottlenecks, we present five optimization techniques that improve network performance significantly, reducing packet processing latency by 60% and increasing network throughput threefold. Furthermore, the network throughput of the improved ULOS is comparable to that of native Linux up to gigabit speeds.

network computing and applications | 2004

Infopipes: the ISL/ISG implementation evaluation

Galen S. Swint; Calton Pu; Younggyun Koh; Ling Liu; Wenchang Yan; Charles Consel; Koichi Moriyama; Jonathan Walpole

We provide a performance comparison of generated Infopipes that have been translated and the Spi/XlP variant of Infopipe specification into executable code. Infopipes are abstractions to support information flow applications. These tools are evaluated through a realistic application: a continuous image streaming program. We implement the application in C and compare its performance to both a hand-written application and one that uses SunRPC.

international conference on distributed computing systems | 2017

The Millibottleneck Theory of Performance Bugs, and Its Experimental Verification

Calton Pu; Joshua Kimball; Chien-An Lai; Tao Zhu; Jack Li; Junhee Park; Qingyang Wang; Deepal Jayasinghe; Pengcheng Xiong; Simon Malkowski; Qinyi Wu; Gueyoung Jung; Younggyun Koh; Galen S. Swint

The performance of n-tier web-facing applications often suffers from the response-time long-tail problem. With relatively low resource utilization (less than 50%) and the majority of requests returning within a few milliseconds, a non-negligible number of normally short requests may take seconds to return. We propose the millibottleneck theory of performance bugs (that lead to long-tail problems). Several case studies have confirmed the millibottlenecks (that last a few tens to hundreds of milliseconds) as causal agents of long requests. A concrete example (garbage collection) illustrates the experimental verification of millibottlenecks. An open source fine-grain monitoring toolkit is being developed to facilitate the experimental research on millibottlenecks.

Archive | 2002