Kyungsu Kang | Researchain

Publication

Featured research published by Kyungsu Kang.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Runtime Power Management of 3-D Multi-Core Architectures Under Peak Power and Temperature Constraints

Kyungsu Kang; Jungsoo Kim; Sungjoo Yoo; Chong-Min Kyung

3-D integration is a new technology that overcomes the limitations of 2-D integrated circuits, e.g., power and delay caused by long interconnect wires, by stacking multiple dies to increase logic integration density. However, chip-level power and peak temperature are the major performance limiters in 3-D multi-core architectures. In this paper, we propose a runtime power management method for 3-D multi-core systems constrained by both peak power and temperature, in order to maximize the instruction throughput. The proposed method exploits dynamic temperature slack (defined as peak temperature constraint minus current temperature) and workload characteristics (e.g., instructions per cycle and memory-boundness) as well as thermal characteristics of 3-D stacking architectures. Compared with existing thermal-aware power management solutions for 3-D multi-core systems, our method yields up to 34.2% (average 18.5%) performance improvement in terms of instructions per second without significant additional energy consumption.
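The abstract defines dynamic temperature slack as the peak temperature constraint minus the current temperature. A minimal sketch of that definition follows; the function name, the 85 °C limit, and the per-core readings are hypothetical values chosen for illustration and do not come from the paper.

```python
def temperature_slack(peak_limit_c: float, current_c: float) -> float:
    """Dynamic temperature slack: peak temperature constraint minus current temperature."""
    return peak_limit_c - current_c


# Hypothetical peak constraint and per-core readings in degrees Celsius.
PEAK_LIMIT_C = 85.0
core_temps_c = {"core0": 70.0, "core1": 82.5, "core2": 85.0}

# A larger slack means more thermal headroom; zero means the core is at the limit.
slacks = {core: temperature_slack(PEAK_LIMIT_C, t) for core, t in core_temps_c.items()}

for core, slack in slacks.items():
    print(f"{core}: slack = {slack:.1f} C")
```

How the slack is combined with workload and thermal characteristics to pick power settings is not described in the abstract, so the sketch stops at the definition.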

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2010

Temperature-Aware Integrated DVFS and Power Gating for Executing Tasks With Runtime Distribution

Kyungsu Kang; Jungsoo Kim; Sungjoo Yoo; Chong-Min Kyung

At high operating temperatures, chip cooling is crucial due to the exponential temperature dependence of leakage current. However, traditional cooling methods, e.g., power/clock gating applied when a temperature threshold is reached, often cause excessive performance degradation. In this paper, we propose a method for delivering lower energy consumption by integrating cooling and running in a temperature-aware manner without incurring a performance penalty. In order to further reduce the energy consumption, we exploit the runtime distribution of each sub-segment of a task, called a “bin,” in an analytical manner such that the time budget for cooling in each bin is allocated in proportion to the probability of the occurrence of the bin. We apply the proposed method to two realistic software programs (an H.264 decoder and ray tracing) and a benchmark program (equake). The experimental results show that the proposed method yields an additional 19.4%-27.2% reduction in energy consumption compared with existing methods.
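The proportional allocation mentioned in the abstract (cooling time per bin in proportion to the bin's probability of occurrence) can be sketched as below. This is only an illustration of the proportionality rule: the function name, the total budget, and the probabilities are assumptions, and the paper's analytical formulation is not reproduced here.

```python
from typing import List, Sequence


def allocate_cooling_budget(total_budget: float, bin_probabilities: Sequence[float]) -> List[float]:
    """Split a cooling time budget across bins in proportion to each bin's probability."""
    if total_budget < 0:
        raise ValueError("total_budget must be non-negative")
    if any(p < 0 for p in bin_probabilities):
        raise ValueError("probabilities must be non-negative")
    total_p = sum(bin_probabilities)
    if total_p <= 0:
        raise ValueError("at least one bin must have non-zero probability")
    # Normalizing by total_p keeps the shares proportional even if the inputs do not sum to 1.
    return [total_budget * p / total_p for p in bin_probabilities]


# Hypothetical example: 10 time units of cooling budget over three bins.
budgets = allocate_cooling_budget(10.0, [0.5, 0.25, 0.25])
print(budgets)
```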

International Symposium on Circuits and Systems | 2011

Thermal-aware energy minimization of 3D-stacked L3 cache with error rate limitation

Woojin Yun; Kyungsu Kang; Chong-Min Kyung

Three-dimensional (3D) memory stacking, which enables stacking memory on top of a microprocessor or chip-multiprocessor (CMP), is one of the most promising applications of 3D integration technology to meet memory bandwidth challenges. However, the high power density, i.e., power dissipation per unit volume, due to the high integration incurs temperature-related problems such as reduced reliability of 3D-stacked memory. Error correcting codes (ECCs) are commonly used to deal with soft errors and thereby enhance system reliability. In this paper, we present the effects of temperature, refresh period, and ECC policy on the reliability and power consumption of 3D-stacked embedded DRAM (eDRAM). To minimize the energy consumption of the 3D-stacked eDRAM without violating the error-rate limitation, refresh period and ECC policy must be controlled in a temperature-aware manner. Experimental results show that the proposed adaptive ECC policy with varying temperature achieves a reduction of energy consumption by up to 26% compared with a fixed ECC policy under a given error-rate constraint.

Great Lakes Symposium on VLSI | 2011

Design and management of 3D-stacked NUCA cache for chip multiprocessors

Jongpil Jung; Kyungsu Kang; Chong-Min Kyung

Power and delay caused by long on-chip interconnections are becoming major issues in chip multiprocessor design. Both network-on-chip (NoC) and three-dimensional integration are promising ways to mitigate the interconnection problem. In this paper, we explore the design of a 3D-stacked non-uniform cache architecture (NUCA) with an on-chip network. In addition, this paper investigates the problem of partitioning the shared L2 cache among multiple concurrently executing applications in order to improve the system performance in terms of instructions per cycle. The proposed design is evaluated in an integrated power, performance, and temperature simulator. Experimental results show that the proposed method enhances system performance by 23.3% and reduces energy consumption by 17.9% for a 16-core processor system compared with a conventional design.

Asia and South Pacific Design Automation Conference | 2015

THOR: Orchestrated thermal management of cores and networks in 3D many-core architectures

Jinho Lee; Junwhan Ahn; Kiyoung Choi; Kyungsu Kang

Most previous research on thermal management of many-core architectures focuses on the control of either core resources or network resources only, even though both have significant thermal impacts. This paper proposes a holistic thermal management method that applies dynamic voltage/frequency scaling to cores and routers together to maximize system performance under a temperature constraint. The proposed method first determines a power budget given in aggregate weighted power for every pillar of vertically adjacent tiles. Then it performs voltage/frequency assignment under the budget while exploiting the characteristics of the applications. Experiments show that our approach outperforms existing methods.
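The notion of an aggregate weighted power per pillar of vertically adjacent tiles can be illustrated with a short sketch. The abstract does not say how the weights or the budget are derived, so the per-layer weights, the budget value, and all names below are hypothetical; the code only shows a weighted sum over the tiles in a pillar and a comparison against a budget.

```python
from typing import Dict, List, Sequence


def aggregate_weighted_power(tile_powers_w: Sequence[float], layer_weights: Sequence[float]) -> float:
    """Weighted sum of the power of the tiles stacked in one pillar (one tile per layer)."""
    if len(tile_powers_w) != len(layer_weights):
        raise ValueError("need exactly one weight per layer")
    return sum(w * p for w, p in zip(layer_weights, tile_powers_w))


def pillars_over_budget(
    pillars: Dict[str, Sequence[float]],
    layer_weights: Sequence[float],
    budget: float,
) -> List[str]:
    """Names of pillars whose aggregate weighted power exceeds the budget."""
    return [
        name
        for name, powers in pillars.items()
        if aggregate_weighted_power(powers, layer_weights) > budget
    ]


# Hypothetical two-layer stack; the second layer is weighted more heavily (assumed values).
layer_weights = [1.0, 2.0]
pillars = {"A": [1.0, 1.0], "B": [2.0, 2.0]}  # per-layer tile power in watts

violators = pillars_over_budget(pillars, layer_weights, budget=4.0)
print(violators)
```

In a real controller, the pillars flagged here would be candidates for a lower voltage/frequency setting; that assignment step is not sketched because the abstract gives no detail on it.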

International Symposium on Quality Electronic Design | 2014

Temperature-aware runtime power management for chip-multiprocessors with 3-D stacked cache

Kyungsu Kang; Giovanni De Micheli; Seunghan Lee; Chong-Min Kyung

The advent of 3-D fabrication technology makes it possible to stack a large amount of last-level cache memory onto a multi-core die to reduce off-chip memory accesses and thus increase system performance. However, the higher power density (i.e., power dissipation per unit volume) of 3-D integrated circuits (ICs) might incur temperature-related problems in reliability, leakage power, system performance, and cooling cost. In this paper, we propose a runtime solution to maximize the performance (i.e., instruction throughput) of chip-multiprocessors with 3-D stacked last-level cache memory without violating the thermal constraint. The proposed method combines runtime cache tuning (e.g., cache-way partitioning, cache-way power-gating, cache data placement) with per-core dynamic voltage/frequency scaling (DVFS) in a temperature-aware manner. Experimental results show that the integrated method offers a 23% performance improvement on average in terms of instructions per second (IPS) compared with temperature-aware runtime cache tuning only.

International Symposium on Quality Electronic Design | 2011

Maximizing throughput of temperature-constrained multi-core systems with 3D-stacked cache memory

Kyungsu Kang; Jongpil Jung; Sungjoo Yoo; Chong-Min Kyung

Three-dimensional integration has the potential to increase integration density and to reduce communication latency of chip-multiprocessors (CMPs). However, high power density (i.e., power dissipation per unit volume) due to the high integration incurs temperature-related problems in reliability, power consumption, performance, and system cooling cost. In this paper, we propose a design-time solution for temperature-constrained multi-core systems with 3D stacked cache memory in order to maximize the instruction throughput. The proposed method combines power gating of memory banks in the 3D stacked cache memory, which adapts cache partitioning [7], and dynamic voltage and frequency scaling (DVFS) of each core in a temperature-aware manner. Experimental results show that the proposed method offers up to 32% (average 15%) performance improvement in terms of instructions per second (IPS) compared with an existing method which only performs cache partitioning without temperature consideration.

International Conference on Multimedia and Expo | 2007

Search Area Selective Reuse Algorithm in Motion Estimation

Heejun Shim; Kyungsu Kang; Chong-Min Kyung

Motion estimation has been widely studied and used to improve coding efficiency with little data access for power saving. The conventional search area reuse algorithm requires little memory access because it reuses the search area, but it suffers from coding efficiency degradation in fast-motion video sequences. In this paper, we propose a search area selective reuse algorithm. The proposed algorithm selects the search center so as to reduce memory access in slow-motion regions and to track real motion in fast-motion regions, according to the motion vector and search center information of the neighboring block. Compared with the conventional reuse algorithm, our experimental results show that the proposed algorithm prevents the coding efficiency degradation with a small memory access overhead by tracking real motion in fast-motion frames.

IEEE Transactions on Very Large Scale Integration Systems | 2015

Runtime Thermal Management for 3-D Chip-Multiprocessors With Hybrid SRAM/MRAM L2 Cache

Seunghan Lee; Kyungsu Kang; Chong-Min Kyung

Nonvolatile memory such as magnetic RAM (MRAM) offers high cell density and low leakage power while suffering from long write latency and high write energy, compared with SRAM. 3-D integration technology using through-silicon vias enables stacking disparate memory technologies (e.g., SRAM and MRAM) together onto chip-multiprocessors (CMPs). The use of hybrid memories as an on-chip cache can take advantage of the best characteristics that each technology offers. However, the inherent high power density and heat removal limitation in 3-D integrated circuits may incur temperature-related problems. In this paper, we propose a runtime thermal management method for CMPs with the 3-D stacked hybrid SRAM/MRAM L2 cache. The proposed method combines dynamic cache management such as resource allocation, way-based power gating, and data migration with dynamic voltage and frequency scaling of processing cores in a temperature- and energy-aware manner. Experimental results show that the proposed runtime method with the 3-D stacked hybrid L2 cache offers up to 107.37% (55.28% on average) performance improvement and 88.47% (47.65% on average) energy efficiency improvement compared with existing thermal management methods with 3-D stacked SRAM-based L2 cache.

International SoC Design Conference | 2012

Temperature-aware energy minimization of 3D-stacked L2 DRAM cache through DVFS

Woojin Yun; Jongpil Jung; Kyungsu Kang; Chong-Min Kyung

Three-dimensional (3D) memory stacking is one of the most promising applications in 3D integration to solve the limited memory bandwidth problem in 2D integrated circuits (ICs). However, the high power density, i.e., power dissipation per unit volume, due to the high integration density of 3D ICs may incur high operating temperature and thus cause low reliability as well as high power consumption. In this paper, we describe the effects of temperature, supply voltage, and L2 cache access rate on both power consumption and reliability of 3D-stacked L2 DRAM cache. Also, we propose a dynamic voltage and frequency scaling (DVFS) scheme for 3D-stacked L2 DRAM cache which can be adapted to either each cache bank or each group of cache banks while taking into account both error rate and temperature-induced power consumption. Experimental results show that the proposed DVFS scheme achieves a reduction of energy consumption by up to 21.5% compared with a conventional scheme under a given error-rate constraint.
