Hyungsoo Jung | Researchain

Publication

Featured research published by Hyungsoo Jung.

international conference on parallel and distributed systems | 2008

MRBench: A Benchmark for MapReduce Framework

Kiyoung Kim; Kyungho Jeon; Hyuck Han; Shin Gyu Kim; Hyungsoo Jung; Heon Young Yeom

MapReduce is Google's programming model for easy development of scalable parallel applications that process huge quantities of data on many clusters. Due to its convenience and efficiency, MapReduce is used in various applications (e.g., Web search services and online analytical processing). However, there are only a few good benchmarks for evaluating MapReduce implementations with realistic test sets. In this paper, we present MRBench, a benchmark for evaluating MapReduce systems. MRBench focuses on processing business-oriented queries and concurrent data modifications. To this end, we build MRBench to deal with large volumes of relational data and execute highly complex queries. With MRBench, users can evaluate the performance of MapReduce systems while varying environmental parameters such as data size and the number of (map/reduce) tasks. Our extensive experimental results show that MRBench is a useful tool for benchmarking the capability of answering critical business questions.

international conference on cloud computing | 2009

A RESTful Approach to the Management of Cloud Infrastructure

Hyuck Han; Shin Gyu Kim; Hyungsoo Jung; Heon Young Yeom; Changho Yoon; Jong-Won Park; Yongwoo Lee

Recently, REpresentational State Transfer (REST) has been proposed as an alternative architecture for Web services. In the era of Cloud and Web 2.0, many complex Web service-based systems such as e-Business and e-Government applications have adopted REST. Unfortunately, the REST approach has been applied to few cases in management systems, especially management systems for cloud computing infrastructures. In this paper, we design and implement a RESTful Cloud Management System (CMS). Managed elements can be modeled as resources in REST, and operations in existing systems can be evaluated using the four methods of REST or a combination of them. We also show how components of existing management systems can be realized as REST-style Web services.

international conference on management of data | 2013

A scalable lock manager for multicores

Hyungsoo Jung; Hyuck Han; Alan Fekete; Gernot Heiser; Heon Young Yeom

Modern implementations of DBMS software are intended to take advantage of high core counts that are becoming common in high-end servers. However, we have observed that several database platforms, including MySQL, Shore-MT, and a commercial system, exhibit throughput collapse as load increases, even for a workload with little or no logical contention for locks. Our analysis of MySQL identifies latch contention within the lock manager as the bottleneck responsible for this collapse. We design a lock manager with reduced latching, implement it in MySQL, and show that it avoids the collapse and generally improves performance. Our efficient implementation of a lock manager is enabled by a staged allocation and de-allocation of locks. Locks are pre-allocated in bulk, so that the lock manager only has to perform simple list-manipulation operations during the acquire and release phases of a transaction. De-allocation of the lock data-structures is also performed in bulk, which enables the use of fast implementations of lock acquisition and release, as well as concurrent deadlock checking.
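The staged allocation described in this abstract can be illustrated with a toy, single-threaded sketch. The names used here (`LockPool`, `acquire`, `release_all`, `reclaim`) are hypothetical and are not taken from the paper or from MySQL. The sketch only shows lock records being pre-allocated in bulk, handed out through simple list operations, and reclaimed in bulk.

```python
from collections import defaultdict


class LockPool:
    """Toy model of staged lock allocation (illustrative, not the paper's code)."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.free = []                  # pre-allocated, unused lock records
        self.held = defaultdict(list)   # txn id -> lock records in use
        self.retired = []               # released records awaiting bulk reclaim
        self.allocations = 0            # number of bulk allocations performed

    def _refill(self):
        # Allocate a whole batch at once so the common path never allocates.
        self.free.extend({"txn": None, "item": None} for _ in range(self.batch_size))
        self.allocations += 1

    def acquire(self, txn, item):
        if not self.free:
            self._refill()
        rec = self.free.pop()           # simple list manipulation
        rec["txn"], rec["item"] = txn, item
        self.held[txn].append(rec)
        return rec

    def release_all(self, txn):
        # Release phase: set the records aside; do not free them one by one.
        self.retired.extend(self.held.pop(txn, []))

    def reclaim(self):
        # Bulk de-allocation: return every retired record to the free list.
        for rec in self.retired:
            rec["txn"] = rec["item"] = None
        self.free.extend(self.retired)
        count = len(self.retired)
        self.retired = []
        return count


pool = LockPool(batch_size=4)
for item in ("a", "b", "c"):
    pool.acquire(txn=1, item=item)
pool.release_all(txn=1)
reclaimed = pool.reclaim()
```

A real lock manager would also need conflict detection, wait queues, and thread-safe access to these lists, all of which this sketch deliberately leaves out.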

IEEE Transactions on Parallel and Distributed Systems | 2012

Cashing in on the Cache in the Cloud

Hyuck Han; Young Choon Lee; Woong Shin; Hyungsoo Jung; Heon Young Yeom; Albert Y. Zomaya

Over the past decades, caching has become the key technology used for bridging the performance gap across memory hierarchies via temporal or spatial localities; in particular, the effect is prominent in disk storage systems. Applications that involve heavy I/O activities, which are common in the cloud, probably benefit the most from caching. The use of local volatile memory as cache might be a natural alternative, but many well-known restrictions, such as capacity and the utilization of host machines, hinder its effective use. In addition to technical challenges, providing cache services in clouds encounters a major practical issue (quality of service or service level agreement issue) of pricing. Currently, (public) cloud users are limited to a small set of uniform and coarse-grained service offerings, such as High-Memory and High-CPU in Amazon EC2. In this paper, we present the cache as a service (CaaS) model as an optional service to typical infrastructure service offerings. Specifically, the cloud provider sets aside a large pool of memory that can be dynamically partitioned and allocated to standard infrastructure services as disk cache. We first investigate the feasibility of providing CaaS with the proof-of-concept elastic cache system (using dedicated remote memory servers) built and validated on the actual system, and practical benefits of CaaS for both users and providers (i.e., performance and profit, respectively) are thoroughly studied with a novel pricing scheme. Our CaaS model helps to leverage the cloud economy greatly in that 1) the extra user cost for I/O performance gain is minimal, if any, and 2) the provider's profit increases due to improvements in server consolidation resulting from that performance gain. Through extensive experiments with eight resource allocation strategies, we demonstrate that our CaaS model can be a promising cost-efficient solution for both users and providers.

international conference on computer communications | 2011

Adaptive delay-based congestion control for high bandwidth-delay product networks

Hyungsoo Jung; Shin Gyu Kim; Heon Young Yeom; Sooyong Kang; Lavy Libman

The design of an end-to-end Internet congestion control protocol that could achieve high utilization, fair sharing of bottleneck bandwidth, and fast convergence while remaining TCP-friendly is an ongoing challenge that continues to attract considerable research attention. This paper presents ACP, an Adaptive end-to-end Congestion control Protocol that achieves the above goals in high bandwidth-delay product networks where TCP becomes inefficient. The main contribution of ACP is a new form of congestion window control, combining the estimation of the bottleneck queue size and a measure of fair sharing. Specifically, upon detecting congestion, ACP decreases the congestion window size by the exact amount required to empty the bottleneck queue while maintaining high utilization, while the increases of the congestion window are based on a “fairness ratio” metric of each flow, which ensures fast convergence to a fair equilibrium. We demonstrate the benefits of ACP using both ns-2 simulation and experimental measurements of a Linux prototype implementation. In particular, we show that the new protocol is TCP-friendly and allows TCP and ACP flows to coexist in various circumstances, and that ACP indeed behaves more fairly than other TCP variants under heterogeneous round-trip times (RTT).
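The window decrease described in this abstract can be sketched as follows. The abstract does not give ACP's queue estimator, so this sketch assumes a TCP Vegas-style backlog estimate, `cwnd * (rtt - base_rtt) / rtt`, as a stand-in. The function names and the `min_cwnd` floor are hypothetical. The "fairness ratio" increase rule is not specified in the abstract and is therefore not sketched.

```python
def estimate_backlog(cwnd, rtt, base_rtt):
    """Vegas-style estimate of this flow's packets queued at the bottleneck.

    Assumption: this is a stand-in estimator, not the one defined by ACP.
    """
    if rtt <= 0 or base_rtt <= 0:
        raise ValueError("RTT values must be positive")
    return cwnd * max(rtt - base_rtt, 0.0) / rtt


def drain_queue(cwnd, rtt, base_rtt, min_cwnd=1.0):
    """Shrink the window by the estimated backlog, never below min_cwnd."""
    return max(cwnd - estimate_backlog(cwnd, rtt, base_rtt), min_cwnd)


# Window of 100 packets, measured RTT 125 ms, propagation RTT 100 ms.
new_cwnd = drain_queue(cwnd=100.0, rtt=125, base_rtt=100)
```

With these inputs the estimated backlog is 20 packets, so the window drops from 100 to 80 instead of being halved as in standard TCP.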

conference on high performance computing (supercomputing) | 2005

Design and Implementation of Multiple Fault-Tolerant MPI over Myrinet (M^3)

Hyungsoo Jung; Dongin Shin; Hyuck Han; Jai Wug Kim; Heon Young Yeom; Jong-Suk Lee

Advances in network technology and computing power have inspired the emergence of high-performance cluster computing systems. While cluster management and hardware high-availability tools are readily available, practical and easily deployable fault-tolerant systems have not been successfully adopted commercially. We present a fault-tolerant system, Multiple fault-tolerant MPI over Myrinet (M^3), that differs in notable respects from other proposed fault-tolerant systems in the literature. M^3 is built on top of Myrinet since it is regarded as one of the best solutions for high-performance networks and is widely used in cluster computing systems because it can provide a high-speed switching network that is an inevitable ingredient in interconnecting clusters of workstations or PCs. M^3 is a user-transparent checkpointing system for multiple fault-tolerant MPI implementation that is primarily based on the coordinated checkpointing protocol. M^3 supports three critical functionalities that are necessary for fault tolerance: a light-weight failure detection mechanism, dynamic process management that includes process migration, and a consistent checkpoint and recovery mechanism. The features of M^3 are that it requires no modifications of application code and that it preserves much of the high-performance characteristics of Myrinet. This paper describes the architecture of M^3, its detailed design principles and comprehensive implementation issues. We also propose practical solutions for those involved in constructing highly available cluster systems for parallel programming systems. Experimental results substantiate our assertion that M^3 can be a good candidate for practically deployable fault-tolerant systems in very-large and high-performance Myrinet clusters and that its protocol can be applied to a wide variety of parallel communication libraries without difficulty.

Cluster Computing | 2011

Scatter-Gather-Merge: An efficient star-join query processing algorithm for data-parallel frameworks

Hyuck Han; Hyungsoo Jung; Hyeonsang Eom; Heon Young Yeom

A data-parallel framework is very attractive for large-scale data processing since it enables such an application to easily process a huge amount of data on commodity machines. MapReduce, a popular data-parallel framework, is used in various fields such as web search, data mining and data warehouses; it has proven to be very practical for such data-parallel applications. A star-join query is a popular query in data warehouses, which are a current target domain of data-parallel frameworks. This article proposes a new algorithm that efficiently processes star-join queries in data-parallel frameworks such as MapReduce and Dryad. Our star-join algorithm for general data-parallel frameworks is called Scatter-Gather-Merge, and it processes star-join queries in a constant number of computation steps, even as the number of participating dimension tables increases. By adopting Bloom filters, Scatter-Gather-Merge eliminates a non-trivial amount of I/O. We also show that Scatter-Gather-Merge can be easily applied to MapReduce. Our experimental results in both cluster and cloud environments show that Scatter-Gather-Merge outperforms existing approaches.
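The role a Bloom filter can play in a star join may be sketched as below. This is not the Scatter-Gather-Merge algorithm itself, whose steps the abstract does not spell out. It only shows how a filter built from the qualifying dimension keys can discard fact rows early. The tables, the `region == "EU"` predicate, and the `BloomFilter` class are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical dimension table: key -> region; the query keeps region == "EU".
dimension = {1: "EU", 2: "US", 3: "EU", 4: "ASIA"}
# Hypothetical fact rows: (dimension key, sales amount).
facts = [(1, 10), (2, 20), (3, 30), (4, 40), (1, 50)]

selected = BloomFilter()
for key, region in dimension.items():
    if region == "EU":
        selected.add(key)

# Cheap pre-filter: drops most non-matching fact rows before the real join.
candidates = [row for row in facts if selected.might_contain(row[0])]

# Exact join removes any false positives the filter let through.
joined = [(key, amount) for key, amount in candidates if dimension.get(key) == "EU"]
```

Because a Bloom filter never produces false negatives, every matching fact row survives the pre-filter, and the exact join afterwards keeps the result correct.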

Journal of Information Science and Engineering | 2011

Improving MapReduce Performance by Exploiting Input Redundancy

Shin Gyu Kim; Hyuck Han; Hyungsoo Jung; Hyeonsang Eom; Heon Young Yeom

The proliferation of data parallel programming on large clusters has set a new research avenue: accommodating numerous types of data-intensive applications with a feasible plan. Behind the many research efforts, we can observe that there exists a nontrivial amount of redundant I/O in the execution of data-intensive applications. This redundancy problem arises as an emerging issue in the recent literature because even the locality-aware scheduling policy in a MapReduce framework is not effective in a cluster environment where storage nodes cannot provide a computation service. In this article, we introduce SplitCache for improving the performance of data-intensive OLAP-style applications by reducing redundant I/O in a MapReduce framework. The key strategy to achieve the goal is to eliminate such I/O redundancy, especially when different applications read common input data within an overlapped time period; SplitCache caches the first input stream in the computing nodes and reuses it for future demands. We also design a cache-aware task scheduler that plays an important role in achieving high cache utilization. In execution of the TPC-H benchmark, we achieved 64.3% faster execution and 83.48% reduction in network traffic on average.
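The caching and scheduling idea in this abstract can be illustrated with a toy model. All names here (`SplitCacheNode`, `read_split`, `pick_node`) and the fallback placement rule are hypothetical. The sketch only assumes that a compute node keeps a split after its first remote read and that the scheduler prefers a node that already holds it.

```python
class SplitCacheNode:
    """Toy compute node that caches input splits after the first remote read."""

    def __init__(self, storage):
        self.storage = storage      # stands in for remote storage nodes
        self.cache = {}             # split id -> cached data
        self.remote_reads = 0

    def read_split(self, split_id):
        if split_id not in self.cache:
            self.remote_reads += 1  # only the first reader pays the network cost
            self.cache[split_id] = self.storage[split_id]
        return self.cache[split_id]


def pick_node(nodes, split_id):
    """Cache-aware placement: prefer a node that already holds the split."""
    for node in nodes:
        if split_id in node.cache:
            return node
    # Assumed fallback: choose the node with the fewest cached splits.
    return min(nodes, key=lambda n: len(n.cache))


storage = {"s0": [1, 2, 3], "s1": [4, 5, 6]}
nodes = [SplitCacheNode(storage), SplitCacheNode(storage)]

# Two applications scan the same splits within an overlapping time window.
totals = []
for _app in range(2):
    total = 0
    for split_id in storage:
        total += sum(pick_node(nodes, split_id).read_split(split_id))
    totals.append(total)

remote_reads = sum(node.remote_reads for node in nodes)
```

In this toy run the second application is served entirely from the caches, so each split crosses the network only once.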

asia pacific web conference | 2006

HVEM grid: experiences in constructing an electron microscopy grid

Hyuck Han; Hyungsoo Jung; Heon Young Yeom; Hee S. Kweon; Jysoo Lee

This paper proposes HVEM-Grid, which is the cornerstone for tele-instrumentation infrastructure. The proposed architecture is mainly oriented toward people whose primary work is to get access to a remotely located instrument, a High Voltage Electron Microscope (HVEM). Our architecture is designed to meet all the requirements for allowing the user to 1) control every single part of the HVEM in a fine-grained manner, 2) check the HVEM and observe various states of the specimen, and 3) manipulate high-resolution 2-D images of the specimen. In that respect, this paper suggests an HVEM Grid, built upon the concepts of the Grid and Web Services, that satisfies various types of user groups.

acm symposium on applied computing | 2010

Harnessing input redundancy in a MapReduce framework

Shin Gyu Kim; Hyuck Han; Hyungsoo Jung; Hyeonsang Eom; Heon Young Yeom

The proliferation of data parallel programming on large clusters has set a new research avenue: accommodating numerous types of data-intensive applications with a feasible plan. Behind the many research efforts, we can observe that there exists a nontrivial amount of redundant I/O in the execution of data-intensive applications. Even the locality-aware scheduling policy in a MapReduce framework is not effective in a cluster environment where storage nodes cannot provide a computation service. In this paper, we introduce SplitCache to improve the performance of data-intensive OLAP-style applications by reducing redundant I/O in a MapReduce framework. The key strategy to achieve the goal is to cut down the I/O redundancy of reading common input data among applications. SplitCache caches the first input stream in the computing nodes and reuses it for future demand. In execution of the TPC-H benchmark, we achieved 65.5% faster execution and 87% reduction in network traffic on average.
