Publication


Featured research published by Dongwoo Lee.


IEEE Conference on Mass Storage Systems and Technologies | 2015

Improving performance by bridging the semantic gap between multi-queue SSD and I/O virtualization framework

Tae Yong Kim; Dong Hyun Kang; Dongwoo Lee; Young Ik Eom

Virtualization has become one of the most helpful techniques, and today it is prevalent in several computing environments including desktops, data centers, and enterprises. However, an I/O scalability issue in virtualized environments still needs to be addressed, because the I/O layers are implemented obliviously to the I/O behaviors of virtual machines (VMs). In particular, when a multi-queue solid state drive (SSD) is used as secondary storage, each VM reveals a semantic gap that degrades its overall performance by up to 74%. This is due to two key problems. First, the multi-queue SSD increases the likelihood of lock contention. Second, even though both the host machine and the multi-queue SSD provide multiple I/O queues for I/O parallelism, the existing Virtio-Blk-Data-Plane supports only a single I/O queue, served by one I/O thread, for submitting all I/O requests. In this paper, we propose a novel approach, including the design of virtual CPU (vCPU)-dedicated queues and I/O threads, which efficiently distributes lock contention and addresses the parallelism issue of Virtio-Blk-Data-Plane in virtualized environments. We design our approach on this principle, allocating a dedicated queue and an I/O thread for each vCPU to reduce the semantic gap. We implement our approach on Linux 3.17, modifying both the Virtio-Blk frontend driver of the guest OS and the Virtio-Blk backend driver of Quick Emulator (QEMU) 2.1.2. Our experimental results with various I/O traces clearly show that our design improves I/O operations per second (IOPS) in virtualized environments by up to 167% over existing QEMU.
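The key design point in this abstract is that each vCPU gets its own request queue and I/O thread, so submissions from different vCPUs no longer contend on one lock. Below is a minimal user-space sketch of that queue-selection idea; the names (vblk_queue, submit_on_vcpu) and the fixed queue depth are hypothetical illustrations, not the actual Virtio-Blk-Data-Plane code.

```c
/* Hypothetical sketch of per-vCPU request queues: each vCPU submits to its
 * own queue, so lock contention is confined to that vCPU's submissions. */
#include <pthread.h>
#include <stdio.h>

#define NR_VCPUS    4
#define QUEUE_DEPTH 128

struct io_request { unsigned long sector; unsigned long len; };

struct vblk_queue {
    pthread_mutex_t lock;            /* contended only by one vCPU */
    struct io_request ring[QUEUE_DEPTH];
    int head, tail;
};

static struct vblk_queue queues[NR_VCPUS];

/* Route the request to the queue owned by the submitting vCPU. */
static int submit_on_vcpu(int vcpu, struct io_request req)
{
    struct vblk_queue *q = &queues[vcpu % NR_VCPUS];
    int ret = 0;

    pthread_mutex_lock(&q->lock);
    if ((q->tail + 1) % QUEUE_DEPTH == q->head) {
        ret = -1;                    /* queue full */
    } else {
        q->ring[q->tail] = req;
        q->tail = (q->tail + 1) % QUEUE_DEPTH;
    }
    pthread_mutex_unlock(&q->lock);
    return ret;
}

int main(void)
{
    for (int i = 0; i < NR_VCPUS; i++)
        pthread_mutex_init(&queues[i].lock, NULL);

    struct io_request req = { .sector = 2048, .len = 4096 };
    if (submit_on_vcpu(1, req) == 0)
        printf("request queued on vCPU 1's dedicated queue\n");
    return 0;
}
```

In this sketch, a dedicated I/O thread per vCPU would drain each ring independently, which is the parallelism the paper attributes to its design.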


IEEE Transactions on Consumer Electronics | 2015

Effective flash-based SSD caching for high performance home cloud server

Dongwoo Lee; Changwoo Min; Young Ik Eom

In the home cloud environment, the storage performance of home cloud servers, which govern connected devices and provide resources through virtualization features, is critical to the end-user experience. To improve the storage performance of virtualized home cloud servers in a cost-effective manner, caching schemes using flash-based solid state drives (SSDs) have been widely studied. Although previous studies successfully narrowed the speed gap between memory and hard disk drives, they focused on how to manage the cache space and paid less attention to using it efficiently with regard to the characteristics of flash-based SSDs. Moreover, SSD caching has typically been used as a read-only cache due to two well-known limitations of SSDs: slow writes and limited lifespan. Since storage access in virtual machines is performed in a more complex and costly manner, these limitations affect storage performance even more significantly. This paper proposes a novel SSD caching scheme and virtual disk image format, named sequential virtual disk (SVD), for achieving high-performance home cloud environments. The proposed techniques are based on the workload characteristics, in which synchronous random writes dominate, while taking into consideration the characteristics of flash memory and the storage stack of virtualized systems. Unlike previous studies, the proposed caching scheme uses the SSD as a read-write cache to effectively mitigate the performance degradation caused by synchronous random writes. The prototype was evaluated with several realistic workloads, showing that the proposed scheme improves storage access performance by 21% to 112% and reduces the number of erasures on the SSD by about 56% on average.
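The central trick described here is to absorb synchronous random writes in the SSD cache by appending them in log order and tracking their location in a mapping table, so the flash device only ever sees sequential writes. The following is a much-simplified sketch of that idea, assuming a plain in-memory map; svd_cache, svd_write, and the eviction-free ring are invented for illustration and are not the paper's SVD format.

```c
/* Illustrative sketch: random guest writes are appended sequentially to the
 * SSD cache, and an in-memory table maps virtual-disk blocks to cache slots. */
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define CACHE_BLOCKS 1024
#define DISK_BLOCKS  65536

struct svd_cache {
    long map[DISK_BLOCKS];            /* disk block -> cache slot, -1 if absent */
    char data[CACHE_BLOCKS][BLOCK_SIZE];
    long next;                        /* next sequential slot to fill */
};

static void svd_init(struct svd_cache *c)
{
    memset(c, 0, sizeof(*c));
    for (long i = 0; i < DISK_BLOCKS; i++)
        c->map[i] = -1;
}

/* A random write lands at the next sequential slot, regardless of its
 * logical block number, so the SSD only ever sees sequential writes. */
static void svd_write(struct svd_cache *c, long blk, const char *buf)
{
    long slot = c->next;
    c->next = (c->next + 1) % CACHE_BLOCKS;   /* real code would evict/flush */
    memcpy(c->data[slot], buf, BLOCK_SIZE);
    c->map[blk] = slot;
}

int main(void)
{
    static struct svd_cache cache;
    char buf[BLOCK_SIZE] = "dirty block";

    svd_init(&cache);
    svd_write(&cache, 40961, buf);    /* "random" logical block number */
    svd_write(&cache, 7, buf);        /* still written sequentially */
    printf("block 7 cached at slot %ld\n", cache.map[7]);
    return 0;
}
```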


International Conference on Ubiquitous Information Management and Communication | 2013

Cgroups-based scheduling scheme for heterogeneous workloads in smart TV systems

Younghyun Joo; Dongwoo Lee; Jung-Hoon Kim; Young Ik Eom

A smart TV has changed from an appliance that handles only multimedia content into a high-tech device that provides more diverse and valuable services based on fast networks and multicore environments. Furthermore, a smart TV is expected to play the role of a home server that cooperates with other IT appliances. For this reason, analyzing the various workloads and understanding their properties in these systems has become increasingly important. It is hard to meet users' requirements with the given hardware resources, and providing many services simultaneously causes resource limitations in smart TV systems, i.e., it is hard to provide the various services smoothly. In this paper, we analyze the resource consumption properties of smart TV workloads and the response latency problem for user requests when several workloads run at the same time. We then propose a cgroups-based CPU scheduling scheme that guarantees responsiveness to user requests as well as video quality.
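The scheme partitions CPU time between the interactive, user-facing workload and background workloads using cgroups. A minimal sketch of that style of configuration through the cgroup filesystem follows; the group names (tv_interactive, tv_background) and share values are hypothetical, and it assumes a cgroup v1 cpu controller mounted at /sys/fs/cgroup/cpu with the groups already created.

```c
/* Hypothetical sketch: give the interactive smart-TV workload a larger CPU
 * weight than background services via the cgroup v1 cpu controller. */
#include <stdio.h>
#include <unistd.h>

static int write_str(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%s", val);
    fclose(f);
    return 0;
}

int main(void)
{
    /* Weights are illustrative: interactive tasks get 4x the CPU share. */
    write_str("/sys/fs/cgroup/cpu/tv_interactive/cpu.shares", "4096");
    write_str("/sys/fs/cgroup/cpu/tv_background/cpu.shares", "1024");

    /* Classify the current process as an interactive task. */
    char pid[16];
    snprintf(pid, sizeof(pid), "%d", getpid());
    write_str("/sys/fs/cgroup/cpu/tv_interactive/tasks", pid);
    return 0;
}
```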


The Journal of Supercomputing | 2013

Light-weight kernel instrumentation framework using dynamic binary translation

Dongwoo Lee; Inhyuk Kim; Jeehong Kim; Hyung Kook Jun; Won Tae Kim; Sang-Won Lee; Young Ik Eom

Mobile platforms such as Android and iOS, which are based on typical operating systems, have been widely adopted in various computing devices, from smartphones to smart TVs. Along with this, the need for a kernel instrumentation framework has also grown, for efficient development and debugging of the kernel itself and its components. Although existing approaches provide some information about the kernel state, including physical register values and primitive memory maps, it is hard for developers to understand and exploit this information. Moreover, the excessive analysis overhead of existing approaches makes them impractical for real systems. Meanwhile, there have been several studies on analyzing user-level applications using dynamic binary translation, and such tools are now widely used. In this paper, by extending this idea of dynamic binary translation from user-level applications to the kernel, we propose a new dynamic kernel instrumentation framework. Our framework focuses on modules such as device drivers, rather than the kernel itself, since modules comprise a large portion of OS development. Because kernel modules are executed frequently, a dynamic kernel instrumentation framework should guarantee the quality of the translated target code; however, costly optimizations for high execution performance can harm overall performance. Therefore, to improve the performance of both translation and execution, we propose a lightweight translator based on a pseudo-machine-instruction representation and table-based translation instead of a typical intermediate representation. We implement our framework on a Linux system, and our experimental evaluations show that it can instrument the target quite effectively with nominal overhead.
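The translator avoids building a conventional intermediate representation and instead dispatches each (pseudo-)instruction through a lookup table of emitters. The sketch below shows only that tabular-dispatch structure; the opcode set and emitter functions are invented for illustration and do not reflect the framework's actual instruction format.

```c
/* Illustrative sketch of table-based translation: each (pseudo-)opcode maps
 * directly to an emitter function, so no intermediate representation is built
 * while translating a basic block of a kernel module. */
#include <stdio.h>

enum opcode { OP_LOAD, OP_STORE, OP_CALL, OP_MAX };

struct insn { enum opcode op; unsigned long operand; };

typedef void (*emit_fn)(const struct insn *);

static void emit_load(const struct insn *i)  { printf("  load  %#lx\n", i->operand); }
static void emit_store(const struct insn *i) { printf("  store %#lx\n", i->operand); }
static void emit_call(const struct insn *i)  { printf("  call  %#lx (instrumented)\n", i->operand); }

/* Tabular dispatch: opcode value indexes straight into the emitter table. */
static const emit_fn emit_table[OP_MAX] = {
    [OP_LOAD]  = emit_load,
    [OP_STORE] = emit_store,
    [OP_CALL]  = emit_call,
};

static void translate_block(const struct insn *block, int n)
{
    for (int i = 0; i < n; i++)
        emit_table[block[i].op](&block[i]);
}

int main(void)
{
    const struct insn block[] = {
        { OP_LOAD,  0xffff880012340000UL },
        { OP_CALL,  0xffffffff81000000UL },
        { OP_STORE, 0xffff880012340008UL },
    };
    printf("translating basic block:\n");
    translate_block(block, 3);
    return 0;
}
```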


International Conference on Ubiquitous Information Management and Communication | 2014

Delegating OpenGL commands to host for hardware support in virtualized environments

Younghyun Joo; Dongwoo Lee; Young Ik Eom

Today, virtualization is a very important technology widely used in various areas, from small mobile devices to virtual machine (VM) servers for large-scale cloud computing. Thanks to extensive research on virtualized environments, hypervisors now provide CPU and memory resources to VMs with performance close to that of a native machine. However, device virtualization techniques, especially those for GPU devices, have been studied less than other virtualization techniques, which is a chief obstacle to graphics processing in virtualized environments. Since a VM cannot access the physical GPU device directly, existing GPU virtualization techniques have limitations on 3D acceleration. In particular, they spend more time on graphics processing because they fall back to software rendering in the Mesa software fallback module of the guest OS. In this paper, we propose a GPU virtualization technique that improves OpenGL graphics performance. By using a concurrent I/O request queue between the host emulation process and the guest OS, the GPU device can be accessed directly. Our scheme avoids graphics processing in the graphics stack of the guest OS and also reduces vmexit overheads, while the emulation process performs the graphics processing with GPU hardware rendering. Our evaluation shows that the proposed technique achieves about a 2.5x higher frame rate than existing Mesa software rendering.
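The mechanism hinges on a concurrent request queue shared by the guest OS and the host emulation process: the guest serializes OpenGL commands into the queue instead of rendering them in its own software stack, and the host drains the queue and replays the commands with hardware rendering. A minimal single-producer/single-consumer ring-buffer sketch of that queue is below; the command encoding and names (cmd_ring, gl_cmd) are assumptions, not the paper's interface.

```c
/* Hypothetical sketch of the guest/host command queue: the guest enqueues
 * serialized GL commands, the host emulation process dequeues and replays
 * them on the physical GPU. Single-producer/single-consumer ring buffer. */
#include <stdatomic.h>
#include <stdio.h>

#define RING_SIZE 256

struct gl_cmd { int opcode; float args[4]; };   /* e.g. glClearColor(r,g,b,a) */

struct cmd_ring {
    struct gl_cmd slots[RING_SIZE];
    _Atomic unsigned head;   /* consumed by the host */
    _Atomic unsigned tail;   /* produced by the guest */
};

static int guest_enqueue(struct cmd_ring *r, struct gl_cmd c)
{
    unsigned tail = atomic_load(&r->tail);
    unsigned head = atomic_load(&r->head);
    if (tail - head == RING_SIZE)
        return -1;                          /* ring full */
    r->slots[tail % RING_SIZE] = c;
    atomic_store(&r->tail, tail + 1);       /* publish to the host side */
    return 0;
}

static int host_dequeue(struct cmd_ring *r, struct gl_cmd *out)
{
    unsigned head = atomic_load(&r->head);
    if (head == atomic_load(&r->tail))
        return -1;                          /* empty */
    *out = r->slots[head % RING_SIZE];
    atomic_store(&r->head, head + 1);
    return 0;
}

int main(void)
{
    static struct cmd_ring ring;
    struct gl_cmd clear = { .opcode = 1, .args = { 0, 0, 0, 1 } };
    struct gl_cmd replay;

    guest_enqueue(&ring, clear);            /* guest side */
    if (host_dequeue(&ring, &replay) == 0)  /* host side: replay on real GPU */
        printf("host replays opcode %d with hardware rendering\n", replay.opcode);
    return 0;
}
```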


International Conference on Computational Science and Its Applications | 2011

A Fast Lock-Free User Memory Space Allocator for Embedded Systems

Dongwoo Lee; Jung-Hoon Kim; Ung Mo Kim; Young Ik Eom; Hyung Kook Jun; Won Tae Kim

Many embedded systems have been improved in hardware, with larger memory and multiple cores. Following these improvements, applications that demand a very high number of operations per second have appeared. These applications often use dynamic memory allocation, but existing allocators do not scale well, so the performance of such applications is limited by the allocator. Moreover, because applications running on embedded systems are rarely powered off, the external fragmentation problem is critical. This paper introduces a lock-free, scalable allocator that removes synchronization cost and lowers fragmentation. Our allocator maintains a per-thread heap and allocates memory of a close size rather than the exact requested size, in order to reduce synchronization cost and allocation/deallocation time. Our results on a test application running with 1 to 32 threads demonstrate that our allocator yields low average fragmentation and improves overall program performance over the standard Linux allocator by up to a factor of 60 on 32 threads, and by up to a factor of 10 over the next best allocator we tested.
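Two ideas carry the design: a per-thread heap, so the common allocation path touches no shared state, and rounding each request up to a nearby size class rather than serving exact sizes, which keeps free lists simple and fragmentation low. The sketch below illustrates only that fast path; the names are invented, and the lock-free refill from a shared pool is reduced to a plain malloc() stand-in.

```c
/* Simplified sketch: each thread keeps its own free lists per size class, so
 * the common path needs no lock at all. Refilling from a shared pool (which
 * the paper handles lock-free) is reduced here to a plain malloc(). */
#include <stdio.h>
#include <stdlib.h>

#define NUM_CLASSES 8                 /* size classes 16, 32, 64, ... 2048 bytes */

struct free_block { struct free_block *next; };

/* One heap per thread: no synchronization on the fast path. */
static _Thread_local struct free_block *freelist[NUM_CLASSES];

static int size_class(size_t size)
{
    int cls = 0;
    size_t cap = 16;
    while (cap < size && cls < NUM_CLASSES - 1) { cap <<= 1; cls++; }
    return cls;
}

static void *tl_alloc(size_t size)
{
    int cls = size_class(size);
    struct free_block *b = freelist[cls];

    if (b) {                          /* reuse a block of the "close" size */
        freelist[cls] = b->next;
        return b;
    }
    return malloc((size_t)16 << cls); /* stand-in for the lock-free refill */
}

static void tl_free(void *p, size_t size)
{
    int cls = size_class(size);
    struct free_block *b = p;
    b->next = freelist[cls];          /* push back onto this thread's list */
    freelist[cls] = b;
}

int main(void)
{
    void *p = tl_alloc(100);          /* rounded up to the 128-byte class */
    tl_free(p, 100);
    void *q = tl_alloc(90);           /* same class: reuses the freed block */
    printf("reused block: %s\n", p == q ? "yes" : "no");
    return 0;
}
```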


International Conference on Consumer Electronics | 2015

Effective SSD caching for high-performance home cloud server

Dongwoo Lee; Changwoo Min; Young Ik Eom

In this paper, we propose a novel SSD caching scheme and virtual disk image format, named SVD (Sequential Virtual Disk). The proposed techniques are based on the workload characteristics of home cloud servers, in which synchronous random writes dominate. Unlike previous studies, we use the SSD as a read-write cache in order to effectively mitigate the performance degradation caused by synchronous random writes.


Journal of KIISE | 2015

SSD Caching for Improving Performance of Virtualized IoT Gateway

Dongwoo Lee; Young Ik Eom

It is important to improve storage performance in the home cloud environment within a virtualized IoT gateway, since application performance depends deeply on storage. Although SSD caching has been applied to improve storage, it has typically been used only as a read cache due to the limitations of SSDs, such as poor write performance and limited write endurance. However, improving the performance of write operations in the home cloud server is also important for the end-user experience. This paper proposes a novel SSD caching scheme that considers write data as well as read data. We validate the enhancement in random-write performance obtained by transforming random writes into sequential patterns.


International Conference on Consumer Electronics | 2014

Lock-free memory allocator without garbage collection on multicore embedded devices

Youngjoong Cho; Dongwoo Lee; Hyung Kook Jun; Young Ik Eom

This paper suggests a new memory allocator for embedded devices. We design a memory allocator that uses a lock-free mechanism to reduce the overheads incurred by locks in multicore environments. Experimental results show that our memory allocator is faster and more scalable than existing memory allocators when executing multi-threaded applications on embedded devices. On average, the performance improvement of our memory allocator is 26% in the producer-consumer pattern and 65% in the general pattern.


International Conference on Big Data and Smart Computing | 2014

A paravirtualized file system for accelerating file I/O

Kihong Lee; Dongwoo Lee; Young Ik Eom

With several new virtualization technologies, virtual machines have gradually achieved higher performance. However, I/O-intensive workloads still suffer from performance degradation due to CPU mode switching and duplicated I/O stacks. In this paper, we propose a framework for improving file I/O performance in a virtualized environment, which consists of a paravirtualized file system, a shared queue, and an I/O-dedicated thread. The key idea is to handle file I/O requests without mode switching and to bypass the guest I/O stack. We implemented a prototype and measured its performance. The results show that our framework achieves 1.2-1.6 times better performance than virtio, and most vmexits are eliminated.
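Requests are forwarded from the guest to a host-side, I/O-dedicated thread through shared memory, so a file operation completes without a CPU mode switch and without traversing the guest's own I/O stack. The sketch below models that hand-off with a single shared request descriptor polled by a dedicated thread; the descriptor layout and names (fio_req, io_thread) are illustrative, not the paper's interface.

```c
/* Illustrative sketch: the guest fills a shared request descriptor and a
 * host-side, I/O-dedicated thread polls it and performs the file operation
 * directly, so the guest needs no mode switch for the request. */
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

enum { REQ_FREE, REQ_PENDING, REQ_DONE };

struct fio_req {
    _Atomic int state;
    char path[64];
    char buf[256];
    ssize_t result;
};

static struct fio_req shared;          /* stands in for guest/host shared memory */

/* Host side: the I/O-dedicated thread busy-polls for work. */
static void *io_thread(void *arg)
{
    (void)arg;
    while (atomic_load(&shared.state) != REQ_PENDING)
        ;                              /* a real design would also allow sleeping */
    int fd = open(shared.path, O_RDONLY);
    shared.result = (fd >= 0) ? read(fd, shared.buf, sizeof(shared.buf)) : -1;
    if (fd >= 0)
        close(fd);
    atomic_store(&shared.state, REQ_DONE);
    return NULL;
}

int main(void)
{
    pthread_t host;
    pthread_create(&host, NULL, io_thread, NULL);

    /* Guest side: post a read request and wait for completion. */
    snprintf(shared.path, sizeof(shared.path), "/etc/hostname");
    atomic_store(&shared.state, REQ_PENDING);
    while (atomic_load(&shared.state) != REQ_DONE)
        ;
    printf("read %zd bytes without a guest-side I/O stack\n", shared.result);

    pthread_join(host, NULL);
    return 0;
}
```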

Collaboration


Dive into Dongwoo Lee's collaborations.

Top Co-Authors

Young Ik Eom, Sungkyunkwan University
Hyung Kook Jun, Electronics and Telecommunications Research Institute
Jeehong Kim, Sungkyunkwan University
Won Tae Kim, Electronics and Telecommunications Research Institute
Changwoo Min, Georgia Institute of Technology
Inhyeok Kim, Sungkyunkwan University
Kihong Lee, Sungkyunkwan University
Yeji Nam, Sungkyunkwan University