Publication


Featured research published by Wooseok Chang.


international conference on big data | 2015

Evaluating different distributed-cyber-infrastructure for data and compute intensive scientific application

Arghya Kusum Das; Seung-Jong Park; Jae-Ki Hong; Wooseok Chang

Scientists are increasingly using current state-of-the-art big data analytic software (e.g., Hadoop, Giraph, etc.) for their data-intensive applications over HPC environments. However, understanding and designing the hardware environment that these data- and compute-intensive applications require for good performance is challenging. With this motivation, we evaluated the performance of big data software over three different distributed cyber-infrastructures: a traditional HPC cluster called SuperMikeII, a regular datacenter called SwatIII, and a novel MicroBrick-based hyperscale system called CeresII, using our own benchmark, Parallel Genome Assembler (PGA). PGA is developed atop Hadoop and Giraph and serves as a good real-world example of a data- as well as compute-intensive workload. To evaluate the impact of both individual hardware components and overall organization, we changed the configuration of SwatIII in different ways. Comparing the individual impact of different hardware components (e.g., network, storage and memory) over different clusters, we observed a 70% improvement in the Hadoop workload and almost 35% improvement in the Giraph workload in SwatIII over SuperMikeII by using SSDs (thus increasing the disk I/O rate) and scaling it up in terms of memory (which increases caching). Then, we provide significant insight into the efficient and cost-effective organization of these hardware components. Here, the MicroBrick-based CeresII prototype shows performance similar to SuperMikeII while giving a more than 2-times improvement in performance/$ in the entire benchmark test.


international congress on big data | 2014

Performance Implications of SSDs in Virtualized Hadoop Clusters

Sungyong Ahn; Sangkyu Park; Jae-Ki Hong; Wooseok Chang

Big data involves massive volumes of data for which traditional techniques are not effective. Apache Hadoop is currently one of the most popular software frameworks supporting big data analysis. As Hadoop clusters grow larger, building them in virtualized environments is drawing great attention. However, optimizing the performance of a Hadoop cluster in a virtualized environment is difficult because of the virtualization overhead. In this paper, the performance implications of SSDs in virtualized Hadoop clusters are identified, and the overhead of virtualization is shown to be minimized with SSDs. The study presented in this paper reveals that the main virtualization overhead is an I/O bottleneck due to the fragmented and randomized I/O workload aggravated by virtualization. However, SSDs tolerate this workload better than HDDs. As a result, the virtualization overhead with SSDs is much less than with HDDs. Also, in the case of SSDs, the virtualized Hadoop cluster sustains good performance regardless of the number of VMs.


international middleware conference | 2015

Cgroup++: Enhancing I/O Resource Management of Linux Cgroup on NUMA Systems with NVMe SSDs

Junghi Min; Sungyong Ahn; Kwanghyun La; Wooseok Chang; Jihong Kim

For container-based virtualization such as Linux containers (LXC), efficient and proportional resource sharing is an important design requirement. However, existing container resource management techniques do not adequately meet this requirement on modern server machines, especially NUMA machines with NVMe SSDs. In this paper, we propose an efficient proportional-share Linux Cgroup, called Cgroup++, for container-based virtualization. Unlike Cgroup, Cgroup++ takes into account the storage asymmetry of modern NUMA machines in managing storage I/O requests. By exploiting the storage asymmetry in scheduling CPU cores for a given Cgroup instance, Cgroup++ improves the I/O performance of the Cgroup instance. Cgroup++ also supports proportional I/O sharing among multiple Cgroup instances using a weight-based throttling scheme in the I/O throttling layer for the NVMe SSDs.


international conference on cloud computing | 2017

Augmenting Amdahl's Second Law: A Theoretical Model to Build Cost-Effective Balanced HPC Infrastructure for Data-Driven Science

Arghya Kusum Das; Jae-Ki Hong; Sayan Goswami; Richard Platania; Kisung Lee; Wooseok Chang; Seung-Jong Park; Ling Liu

High-performance analysis of big data demands more computing resources, forcing similar growth in computation cost. So, the challenge for HPC system designers is providing not only high performance but also high performance at lower cost. For a high-performance yet cost-effective cyberinfrastructure, we propose a new system model augmenting Amdahl's second law for balanced systems to optimize the price-performance ratio. We express the optimal balance among CPU speed, I/O bandwidth and DRAM size (i.e., Amdahl's I/O and memory numbers) in terms of application characteristics and hardware cost. Considering Xeon processors and recent hardware prices, we showed that a system needs almost 0.17 GBPS of I/O bandwidth and 3 GB of DRAM per GHz of CPU speed to minimize the price-performance ratio for data- and compute-intensive applications. To substantiate our claim, we evaluate three different cluster architectures: 1) SuperMikeII, a traditional HPC cluster, 2) SwatIII, a regular datacenter, and 3) CeresII, a novel MicroBrick-based hyperscale system. CeresII, with 6 Xeon-D1541 cores (2 GHz/core), 1 NVMe SSD (2 GBPS I/O bandwidth) and 64 GB DRAM per node, closely resembles the optimum produced by our model. Consequently, in terms of price-performance ratio, CeresII outperformed both SuperMikeII (by 65-85%) and SwatIII (by 40-50%) for data- and compute-intensive Hadoop benchmarks (TeraSort and WordCount) and our own benchmark genome assembler based on Hadoop and Giraph.


acm symposium on applied computing | 2016

Improving I/O performance of NVMe SSD on virtual machines

Jung-kil Kim; Sungyong Ahn; Kwanghyun La; Wooseok Chang

The ever-increasing demand for effective resource utilization in data centers has resulted in the dramatic development of various virtualization environments. Furthermore, the requirement for rapid processing of large data has not only caused the replacement of spinning disks with flash-based SSDs but has also led to the implementation of efficient software I/O stacks for SSDs. The software I/O stacks in most hypervisors have been developed for SATA interface-based storage. Therefore, the high throughput and various functionalities provided by NVMe-based SSDs cannot be fully utilized. Also, it was found that the inefficiency of the existing storage I/O is due to the non-optimized I/O stack of virtual machines in a hypervisor. In this paper, we propose a new I/O architecture that optimizes the I/O path by eliminating the overhead of user-level threads, bypassing unnecessary I/O routines and improving the interrupt delivery delay. The purpose of the proposed architecture is to enhance throughput and scalability by mitigating the overhead of the existing software stack, to take full advantage of NVMe SSDs. Experimental results with a real system show that the proposed approach improves I/O performance by up to 47% compared to the existing approach.


Archive | 2013

MEMORY OPERATION TIMING CONTROL METHOD AND MEMORY SYSTEM USING THE SAME

Jea-young Kwon; Shine Kim; Seongjun Ahn; Wooseok Chang; Dawoon Jung


Archive | 2011

MULTI-PROCESSOR DEVICE AND INTER-PROCESS COMMUNICATION METHOD THEREOF

Won-Seok Jung; Wooseok Chang; Hyoung-Jin Yun


Archive | 2014

FLEXIBLE SERVER SYSTEM

Bum-Jun Kim; Wooseok Chang; Hye-Won Kang; Em-Hwan Kim; Kwanghyun La; Su-Hwan Park; Jang-won Lee


Archive | 2015

STORAGE DEVICE AND GARBAGE COLLECTION METHOD OF DATA STORAGE SYSTEM HAVING THE STORAGE DEVICE

Wooseok Chang; Kangho Roh; Jong-Won Lee


Archive | 2014

DISTRIBUTED PROCESSING METHOD

Jae-Ki Hong; Wooseok Chang

Collaboration


Dive into Wooseok Chang's collaborations.

Top Co-Authors

Arghya Kusum Das

Louisiana State University

Seung-Jong Park

Louisiana State University
