Publication


Featured research published by Shin-Dug Kim.


International Conference on Computer Design | 1999

Design and evaluation of a selective compressed memory system

Jang-Soo Lee; Won-Kee Hong; Shin-Dug Kim

This research explores the potential of on-chip cache compression, which can reduce not only the cache miss ratio but also the miss penalty if main memory is also managed in compressed form. However, decompression time has a critical effect on memory access time, and variable-sized compressed blocks tend to increase the design complexity of a compressed cache architecture. This paper suggests several techniques to reduce the decompression overhead and to manage compressed blocks efficiently, including selective compression, fixed space allocation for compressed blocks, parallel decompression, and the use of a decompression buffer. Moreover, a simple compressed cache architecture based on these techniques and its management method are proposed. Results from trace-driven simulation show that this approach can provide around a 35% decrease in the on-chip cache miss ratio as well as a 53% decrease in data traffic over conventional memory systems. A large amount of the decompression overhead can also be eliminated, reducing the average memory access time by up to 20% against conventional memory systems.
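The selective-compression idea above can be sketched in a few lines: a block is kept compressed only when it fits in a fixed-size slot, so compressed blocks never have variable sizes. This is a minimal illustration, not the paper's implementation; the use of `zlib` as the compressor and the half-block slot size are assumptions for the sketch.

```python
import zlib

BLOCK_SIZE = 64  # cache block size in bytes (assumed for illustration)

def store_block(data: bytes) -> tuple[bytes, bool]:
    """Selective compression with fixed space allocation: keep the block
    compressed only if it fits in half a block, so every compressed block
    occupies a fixed-size slot and design complexity stays low."""
    compressed = zlib.compress(data)
    if len(compressed) <= BLOCK_SIZE // 2:
        return compressed, True   # stored compressed
    return data, False            # stored uncompressed, no decompression cost

# Highly regular data compresses well and fits the fixed slot,
# while incompressible data is stored as-is to avoid decompression overhead.
zeros_payload, zeros_compressed = store_block(b"\x00" * BLOCK_SIZE)
ramp_payload, ramp_compressed = store_block(bytes(range(BLOCK_SIZE)))
```

The point of the fixed slot is that address calculation for compressed blocks stays as simple as for uncompressed ones, which is the design-complexity argument in the abstract.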


IEICE Electronics Express | 2009

Sub-grouped superblock management for high-performance flash storages

Jung-Wook Park; Gi-Ho Park; Charles C. Weems; Shin-Dug Kim

In this paper we describe a new superblock management scheme to overcome the problem of increased erase operations that results from increasing the degree of interleaving of memory banks in flash-memory-based storage devices. To improve performance, superblock management is used to increase the degree of linear interleaving of flash memory banks. However, increased interleaving may significantly increase the number of erase operations, thus decreasing device lifetime. The proposed management scheme efficiently separates hot and cold data into two different sub-groups, dramatically increasing the efficiency of superblock merging. According to our simulation results, the number of erase operations decreases by around 27.3 percent, which is enough to significantly lengthen overall device lifetime. Read performance is only slightly degraded by our approach.
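The hot/cold separation at the heart of the scheme can be sketched with a simple write-frequency classifier: pages updated often end up in the hot sub-group, so superblock merges mostly touch stable cold data. The threshold value and the counter-based policy are assumptions for this sketch, not the paper's actual classifier.

```python
from collections import Counter

HOT_THRESHOLD = 4  # writes before a logical page counts as "hot" (assumed)

class SubGroupedSuperblock:
    """Toy model of hot/cold sub-grouping: frequently updated pages migrate
    to the hot sub-group, keeping the cold sub-group stable so that merge
    operations (and hence erases) touch far fewer valid pages."""
    def __init__(self):
        self.write_counts = Counter()
        self.hot, self.cold = set(), set()

    def write(self, page: int):
        self.write_counts[page] += 1
        if self.write_counts[page] >= HOT_THRESHOLD:
            self.cold.discard(page)  # promote: page has proven update-intensive
            self.hot.add(page)
        else:
            self.cold.add(page)      # still cold until it crosses the threshold
```

A page written five times lands in the hot sub-group, while a page written once stays cold; merging the cold sub-group then copies mostly valid, unchanging pages.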


Proceedings of the Workshop on Heterogeneous Processing | 1992

Augmenting the Optimal Selection Theory for Superconcurrency

Mu-Cheng Wang; Shin-Dug Kim; Mark A. Nichols; Richard F. Freund; Howard Jay Siegel; Wayne G. Nation

An approach for finding the optimal configuration of heterogeneous computer systems to solve supercomputing problems is presented. Superconcurrency, as a form of distributed heterogeneous supercomputing, is an approach for matching and managing an optimally configured suite of super-speed machines to minimize the execution time of a given task. The approach performs best when the computational requirements for a given set of tasks are diverse. A supercomputing application task is decomposed into a collection of code segments, where the processing requirement is homogeneous within each code segment. The optimal selection theory has been proposed to choose the optimal configuration of machines for a supercomputing problem. This technique is based on code profiling and analytical benchmarking. Here, the previously presented optimal selection theory is augmented in two ways: the performance of code segments on non-optimal machine choices is incorporated, and non-uniform decompositions of code segments are considered.
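The baseline matching step described above can be sketched as a table lookup: profiling classifies each code segment by computation type, and analytical benchmarking supplies per-machine time estimates. The machine names, segment types, and timing numbers below are purely illustrative, and the augmented theory's handling of non-optimal choices and non-uniform decompositions is deliberately omitted.

```python
# Estimated execution time of each code-segment type on each machine,
# as would come from code profiling and analytical benchmarking.
# All names and numbers are invented for illustration.
bench = {
    "vector":   {"cray": 1.0, "mimd": 3.0, "simd": 2.0},
    "scalar":   {"cray": 2.0, "mimd": 1.5, "simd": 4.0},
    "parallel": {"cray": 3.0, "mimd": 1.0, "simd": 1.2},
}

def select_machines(segments: dict) -> tuple[dict, float]:
    """Greedy form of the selection step: assign every code segment to the
    machine with the lowest benchmarked time for its computation type,
    then sum the per-segment times as the estimated execution time."""
    plan = {seg: min(bench[kind], key=bench[kind].get)
            for seg, kind in segments.items()}
    total = sum(bench[kind][plan[seg]] for seg, kind in segments.items())
    return plan, total
```

This is the case the original theory optimizes; the augmentation matters exactly when the best machine for a segment is unavailable or a segment cannot be decomposed uniformly.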


Journal of Systems Architecture | 2000

An on-chip cache compression technique to reduce decompression overhead and design complexity

Jang-Soo Lee; Won-Kee Hong; Shin-Dug Kim

This research explores a compressed memory hierarchy model which can increase both the effective memory space and the bandwidth of each level of the memory hierarchy. It is well known that decompression time has a critical effect on memory access time, and variable-sized compressed blocks tend to increase the design complexity of compressed memory systems. This paper proposes a selective compressed memory system (SCMS) incorporating a compressed cache architecture and its management method. To reduce or hide decompression overhead, the SCMS employs several effective techniques, including selective compression, parallel decompression, and the use of a decompression buffer. In addition, a fixed memory space allocation method is used to achieve efficient management of the compressed blocks. Trace-driven simulation shows that the SCMS approach can not only reduce the on-chip cache miss ratio and data traffic by about 35% and 53%, respectively, but also achieve a 20% reduction in average memory access time (AMAT) over conventional memory systems (CMS). Moreover, this approach can provide lower memory traffic at a lower cost than the CMS with some architectural enhancement. Most importantly, the SCMS is an attractive approach for future computer systems because it offers high performance in cases of long DRAM latency and limited bus bandwidth.


IEEE Computer Architecture Letters | 2002

A Low Power TLB Structure for Embedded Systems

Jin-Hyuck Choi; Jung-Hoon Lee; Seh-Woong Jeong; Shin-Dug Kim; Charles C. Weems

We present a new two-level TLB (translation look-aside buffer) architecture that integrates a 2-way banked filter TLB with a 2-way banked main TLB. The objective is to reduce power consumption in embedded processors by distributing the accesses to TLB entries across the banks in a balanced manner. First, an advanced filtering technique is devised to reduce access power by adopting a sub-bank structure. Second, a bank-associative structure is applied to each level of the TLB hierarchy. Simulation results show that the Energy*Delay product can be reduced by about 40.9% compared to a fully associative TLB, 24.9% compared to a micro-TLB with 4+32 entries, and 12.18% compared to a micro-TLB with 16+32 entries.
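The access path of such a banked two-level TLB can be sketched as follows: one bank is selected from the virtual page number, the small filter bank is probed first, and only on a filter miss is the larger main bank probed, with the translation then promoted into the filter. The bank-select rule (low VPN bit) and the FIFO-style filter eviction are assumptions for the sketch, not details from the paper.

```python
class TwoLevelBankedTLB:
    """Sketch of a 2-way banked filter TLB backed by a 2-way banked main
    TLB. Only one small bank is probed per access, which is where the
    power saving comes from in this style of design."""
    def __init__(self, filter_entries: int = 4):
        self.filter = [{}, {}]   # two small filter banks: vpn -> pfn
        self.main = [{}, {}]     # two larger main banks: vpn -> pfn
        self.filter_entries = filter_entries

    def lookup(self, vpn: int):
        bank = vpn & 1           # bank select by low VPN bit (assumed policy)
        if vpn in self.filter[bank]:
            return self.filter[bank][vpn], "filter-hit"
        if vpn in self.main[bank]:
            pfn = self.main[bank][vpn]
            # promote into the filter so the next access stays cheap
            if len(self.filter[bank]) >= self.filter_entries:
                self.filter[bank].pop(next(iter(self.filter[bank])))  # FIFO evict
            self.filter[bank][vpn] = pfn
            return pfn, "main-hit"
        return None, "miss"      # would trigger a page-table walk

    def fill(self, vpn: int, pfn: int):
        self.main[vpn & 1][vpn] = pfn
```

Repeated accesses to the same page hit in the tiny filter bank, so the large main bank is rarely activated; that access-count asymmetry is what the Energy*Delay numbers in the abstract measure.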


IEEE Pervasive Computing | 2009

Personalized Service Discovery in Ubiquitous Computing Environments

Kyung-Lang Park; Uram H. Yoon; Shin-Dug Kim

In ubiquitous computing environments, users want to discover the most appropriate service to support their tasks. Because the most appropriate service depends on user preferences and context, service discovery protocols should personalize results. A service discovery framework based on the virtual personal space (VPS), that is, a virtual administrative domain of services managed for the user, aims to provide this personalization. In this framework, personal operating middleware embedded in a personal device manages a set of contextually close services in the user's VPS. An inference module supports this management. Laboratory evaluations show that the VPS framework helps users find high-quality, appropriate services.


IEEE International Conference on High Performance Computing, Data, and Analytics | 1997

Task scheduling in distributed computing systems with a genetic algorithm

Sung-Ho Woo; Sung-Bong Yang; Shin-Dug Kim; Tack-Don Han

Scheduling a directed acyclic graph (DAG) which represents the precedence relations of the tasks of a parallel program in a distributed computing system (DCS) is known as an NP-complete problem except for some special cases. Many heuristic-based methods have been proposed under various models and assumptions. A DCS can be classified in two types according to the characteristics of the processors on a network: a distributed homogeneous system (DHOS) and a distributed heterogeneous system (DHES). The paper defines a general model for a DHOS and a DHES and presents a genetic algorithm (GA) to solve the task scheduling problem in the defined DCS. The performance of the proposed GA is compared with the list scheduling algorithm in a DHOS and with the one-level reach-out greedy algorithm (OLROG) in a DHES. The proposed GA has shown better performance in various environments than other scheduling methods.
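A stripped-down version of such a GA can be sketched as follows: a chromosome is a task-to-processor assignment, fitness is the makespan, and evolution uses elitist survival, one-point crossover, and point mutation. Precedence constraints from the DAG and the heterogeneous cost model are omitted here, and all parameter values (population size, mutation rate) are assumptions for the sketch.

```python
import random

def makespan(assign, task_cost, n_procs):
    """Completion time when tasks run on their assigned processors
    (DAG precedence constraints are omitted in this sketch)."""
    loads = [0] * n_procs
    for task, proc in enumerate(assign):
        loads[proc] += task_cost[task]
    return max(loads)

def ga_schedule(task_cost, n_procs, pop=30, gens=60, seed=0):
    """Minimal GA: chromosomes map tasks to processors, fitness is the
    makespan; the fitter half survives each generation and breeds the
    rest via one-point crossover with occasional point mutation."""
    rng = random.Random(seed)
    n = len(task_cost)
    popn = [[rng.randrange(n_procs) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=lambda a: makespan(a, task_cost, n_procs))
        survivors = popn[: pop // 2]
        children = []
        while len(survivors) + len(children) < pop:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:  # point mutation: reassign one task
                child[rng.randrange(n)] = rng.randrange(n_procs)
            children.append(child)
        popn = survivors + children
    best = min(popn, key=lambda a: makespan(a, task_cost, n_procs))
    return best, makespan(best, task_cost, n_procs)
```

The full algorithm in the paper additionally encodes task ordering so that precedence relations are respected; this sketch only shows the assignment-search core that the GA explores.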


Microprocessors and Microsystems | 2011

A hybrid flash translation layer design for SLC-MLC flash memory based multibank solid state disk

Jung-Wook Park; Seung-Ho Park; Charles C. Weems; Shin-Dug Kim

This paper presents the design of a NAND flash based solid state disk (SSD) which can support the various storage access patterns commonly observed in a PC environment. It is based on a hybrid model of high-performance SLC (single-level cell) NAND and low-cost MLC (multi-level cell) NAND flash memories. Typically, SLC NAND has a higher transfer rate and greater cell endurance than MLC NAND flash memory. MLC NAND, on the other hand, benefits from lower price and higher capacity. In order to achieve higher performance than traditional SSDs, an interleaving technique that places NAND flash chips in parallel is essential. However, using the traditional FTL (flash translation layer) on an SSD with only MLC NAND chips is inefficient because the size of a logical block becomes large as the mapping address unit grows. In this paper, we propose an HFTL (hybrid flash translation layer) which makes use of chained blocks, combining SLC NAND and MLC NAND flash memories in parallel. Experimental results show that for most of the traces studied, the HFTL in an SSD configuration composed of 80% MLC NAND and 20% SLC NAND memories can improve performance compared to other solid state disk configurations composed of either SLC NAND or MLC NAND flash memory alone.
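The placement decision in such a hybrid design can be sketched as a routing policy: small or frequently updated writes go to the fast, durable SLC region, while large cold writes go to the cheap, high-capacity MLC region. The actual HFTL uses chained-block mapping; this sketch shows only the routing idea, and `small_threshold` is an assumed parameter, not a value from the paper.

```python
SLC_CAPACITY_RATIO = 0.2  # 20% SLC / 80% MLC split evaluated in the paper

def route_write(length_sectors: int, is_hot: bool,
                small_threshold: int = 8) -> str:
    """Toy placement policy for a hybrid SLC/MLC SSD: hot or small writes
    benefit from SLC's speed and endurance, while large cold sequential
    writes are cheap to store in MLC."""
    if is_hot or length_sectors <= small_threshold:
        return "SLC"
    return "MLC"
```

Because most PC workloads mix many small random updates with a few large sequential transfers, even this crude split keeps the small SLC region busy with the traffic that hurts MLC most.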


International Symposium on Low Power Electronics and Design | 2003

A selective filter-bank TLB system

Jung-Hoon Lee; Gi-Ho Park; Sung-Bae Park; Shin-Dug Kim

We present a selective filter-bank translation lookaside buffer (TLB) system with low power consumption for embedded processors. The proposed TLB is constructed as multiple banks, with a small two-entry buffer, called a filter-bank buffer, located above each associated bank. Either a filter-bank buffer or a main bank TLB can be selectively accessed based on two bits in the filter-bank buffer. Energy savings are achieved by reducing the number of entries accessed at a time, through the filtering and banking mechanisms. The overhead of the proposed TLB turns out to be negligible compared with other hierarchical structures. Simulation results show that the Energy*Delay product can be reduced by about 88% compared with a fully associative TLB, 75% with respect to a filter TLB, and 51% relative to a banked-filter TLB.


Journal of Systems Architecture | 2000

A new cache architecture based on temporal and spatial locality

Jung-Hoon Lee; Jang-Soo Lee; Shin-Dug Kim

A data cache system is designed as a low-power/high-performance cache structure for embedded processors. A direct-mapped cache is a favorite choice for short cycle times but suffers from a high miss rate. Hence the proposed dual data cache is an approach to improve the miss ratio of a direct-mapped cache without affecting its access time. The proposed cache system can exploit temporal and spatial locality effectively by maximizing the effective cache memory space for any given cache size. It consists of two caches: a direct-mapped cache with a small block size and a fully associative spatial buffer with a large block size. Temporal locality is exploited by caching candidate small blocks selectively into the direct-mapped cache. Spatial locality is exploited aggressively by fetching multiple neighboring small blocks whenever a cache miss occurs. According to the results of comparison and analysis, similar performance can be achieved using a cache one quarter the size of a conventional direct-mapped cache, and power consumption of the proposed cache can be reduced by around 4% compared with the victim cache configuration.
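The dual structure can be sketched as two cooperating lookups: the direct-mapped cache of small blocks serves temporal locality, while a tiny fully associative buffer of large blocks captures spatial locality and promotes referenced small blocks on a hit. The block sizes, capacities, and FIFO replacement below are assumptions for the sketch, not the paper's evaluated configuration.

```python
from collections import OrderedDict

SMALL_BLOCK = 8    # bytes per direct-mapped block (assumed)
LARGE_BLOCK = 32   # bytes per spatial-buffer block = 4 small blocks (assumed)

class DualDataCache:
    """Sketch of the dual data cache: a direct-mapped cache of small blocks
    plus a small fully associative spatial buffer of large blocks that is
    filled with neighboring small blocks on every miss."""
    def __init__(self, dm_sets: int = 64, spatial_entries: int = 8):
        self.dm = [None] * dm_sets        # direct-mapped: one tag per set
        self.dm_sets = dm_sets
        self.spatial = OrderedDict()      # large-block tags, FIFO order
        self.spatial_entries = spatial_entries

    def access(self, addr: int) -> str:
        small_tag = addr // SMALL_BLOCK
        large_tag = addr // LARGE_BLOCK
        if self.dm[small_tag % self.dm_sets] == small_tag:
            return "dm-hit"
        if large_tag in self.spatial:
            # promote the referenced small block for future temporal reuse
            self.dm[small_tag % self.dm_sets] = small_tag
            return "spatial-hit"
        # miss: fetch the whole large block, i.e. the neighbors come along
        if len(self.spatial) >= self.spatial_entries:
            self.spatial.popitem(last=False)
        self.spatial[large_tag] = True
        return "miss"
```

A miss on one word brings in the surrounding large block, so a later access to a neighboring small block hits in the spatial buffer and is only then promoted into the direct-mapped cache; that selectivity is what keeps the direct-mapped side unpolluted.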

Collaboration


Dive into Shin-Dug Kim's collaboration.

Top Co-Authors

Jung-Hoon Lee

Gyeongsang National University


Charles C. Weems

University of Massachusetts Amherst
