Publication


Featured research published by Kyungwook Chang.


International Symposium on Low Power Electronics and Design | 2015

Power benefit study of monolithic 3D IC at the 7nm technology node

Kyungwook Chang; Kartik Acharya; Saurabh Sinha; Brian Cline; Greg Yeric; Sung Kyu Lim

Monolithic 3D IC (M3D) is one potential technology to help with the challenges of continued circuit power and performance scaling. In this paper, for the first time, the power benefits of M3D using a 7nm FinFET technology are investigated. The predictive 7nm Process Design Kit (PDK) and standard cell library for both high performance (HP) and low standby power (LSTP) device technologies are built based on the NanGate 45nm PDK using accurate dimensional, material, and electrical parameters from publications and a commercial-grade tool flow. In addition, we implement full-chip M3D GDS layouts using both 7nm HP and LSTP cells and industry-standard physical design tools, and evaluate the resulting full-chip power, performance, and area metrics. Our study first shows that 7nm HP M3D designs outperform 7nm HP 2D designs by 16.8% in terms of iso-performance total power reduction. Moreover, 7nm LSTP M3D designs reduce the total power consumption by 14.3% compared to their 2D counterparts. This convincingly demonstrates the power benefits of M3D technologies in both high-performance and low-power future-generation devices.


International SoC Design Conference | 2008

Mapping control intensive kernels onto coarse-grained reconfigurable array architecture

Kyungwook Chang; Kiyoung Choi

A typical embedded application can be considered a mixture of computation-intensive and control-intensive parts. Existing coarse-grained reconfigurable array architectures show high performance for the computation-intensive part, but cannot handle the control-intensive part efficiently, thereby degrading the overall performance. This paper presents an approach to cope with this limitation by using kernel-level speculative execution. Simulation results show that our approach increases the average performance of the deblocking filter for a luma macroblock and a chroma macroblock by more than 18 and 42 times, respectively, compared to a conventional software implementation.


Applied Reconfigurable Computing | 2010

Memory-centric communication architecture for reconfigurable computing

Kyungwook Chang; Kiyoung Choi

This paper presents a memory-centric communication architecture for a reconfigurable array of processing elements, which reduces the communication overhead by establishing a direct communication channel through a memory between the array and other masters in the system. To limit the area cost, we do not use a multi-port memory; instead, we divide the memory into multiple memory units, each having a single port. The masters and the memory units have a one-to-one mapping through a simple crossbar switch, which switches whenever data transfer is needed. Experimental results show that the proposed architecture achieves a 76% performance improvement over the conventional architecture.
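
The one-to-one mapping between masters and single-port memory units can be illustrated with a minimal Python sketch. It is a toy software model, not the paper's hardware design: the class name `Crossbar`, the method `switch`, and the master labels `"cpu"` and `"array"` are all hypothetical and chosen only for illustration. The point it shows is that handing data to another master amounts to swapping ownership of a memory unit rather than copying its contents.

```python
class Crossbar:
    """Toy model: each master owns exactly one single-port memory unit at a time."""

    def __init__(self, masters, unit_size):
        # One single-port memory unit per master (illustrative assumption).
        self.units = [[0] * unit_size for _ in masters]
        # mapping[master] -> index of the memory unit that master currently owns
        self.mapping = {m: i for i, m in enumerate(masters)}

    def write(self, master, addr, value):
        # A master can only reach the unit it is currently mapped to.
        self.units[self.mapping[master]][addr] = value

    def read(self, master, addr):
        return self.units[self.mapping[master]][addr]

    def switch(self, a, b):
        # Swap ownership of two memory units; no data is copied.
        self.mapping[a], self.mapping[b] = self.mapping[b], self.mapping[a]


xbar = Crossbar(["cpu", "array"], unit_size=4)
xbar.write("cpu", 0, 42)       # the CPU fills its unit with input data
xbar.switch("cpu", "array")    # hand that unit over to the reconfigurable array
result = xbar.read("array", 0) # the array now sees the CPU's data
```

After the switch, the CPU is mapped to the unit the array previously owned, so both masters keep exclusive single-port access throughout.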


Design Automation Conference | 2016

Match-making for monolithic 3D IC: finding the right technology node

Kyungwook Chang; Saurabh Sinha; Brian Cline; Greg Yeric; Sung Kyu Lim

Monolithic 3D IC (M3D) has the potential to provide a breakthrough in the power and performance scaling challenges. We, for the first time, present a comprehensive study of M3D on a commercial design across multiple technology nodes. The performance and power impact of M3D is investigated using a commercial, in-order, 32-bit application processor, implemented on foundry 28nm and 14/16nm process nodes, as well as a predictive 7nm node. We study the factors across the technology nodes that affect the efficiency of M3D, and propose a roadmap for optimum technology and design interaction that will enable the full entitlement of M3D.


IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum | 2010

Automatic mapping of control-intensive kernels onto coarse-grained reconfigurable array architecture with speculative execution

Ganghee Lee; Kyungwook Chang; Kiyoung Choi

Coarse-grained reconfigurable array architectures have drawn increasing attention due to their good performance and flexibility. In general, they show high performance for compute-intensive kernel code, but cannot handle control-intensive parts efficiently, thereby degrading the overall performance. In this paper, we present automatic mapping of control-intensive kernels onto a coarse-grained reconfigurable array architecture by using kernel-level speculative execution. Experimental results show that our automatic mapping tool successfully handles control-intensive kernels for the coarse-grained reconfigurable array architecture. In particular, it improves the performance of the H.264 deblocking filters for luma and chroma by more than 26 and 16 times, respectively, compared to a conventional software implementation. Compared to the approach using predicated execution, the proposed approach achieves a 2.27 times performance improvement.


International Symposium on Quality Electronic Design | 2016

Monolithic 3D IC design: Power, performance, and area impact at 7nm

Kartik Acharya; Kyungwook Chang; Bon Woong Ku; Shreepad Panth; Saurabh Sinha; Brian Cline; Greg Yeric; Sung Kyu Lim

In this paper, we present a comprehensive study of full-chip power, performance, and area metrics for monolithic 3D (M3D) IC designs at the 7nm technology node. We investigate the benefits of M3D designs using our predictive 7nm FinFET libraries. This paper outlines detailed iso-performance power comparisons between M3D and 2D full-chip GDSII designs using both 7nm high performance (HP) and low stand-by power (LSTP) library cells. We achieve significant wire-length and buffer reduction with 7nm HP M3D designs over their 2D counterparts, and thus greater power savings at high iso-performance frequency. In addition, this power saving is also realized in 7nm LSTP M3D designs running at low iso-performance frequencies. We also study the impact of clock tree design on the clock power consumption in M3D designs. Lastly, we demonstrate the impact of clock tree partitioning on the total power of full-chip M3D designs. Our experiments show that 7nm HP and LSTP M3D designs outperform their 2D counterparts by 12% and 10% on average, respectively.


International Conference on Computer-Aided Design | 2016

Cascade2D: A design-aware partitioning approach to monolithic 3D IC with 2D commercial tools

Kyungwook Chang; Saurabh Sinha; Brian Cline; Raney Southerland; Michael Doherty; Greg Yeric; Sung Kyu Lim

Monolithic 3D IC (M3D) can continue to improve power, performance, area and cost beyond traditional Moore's law scaling limitations by leveraging the third dimension and fine-grained monolithic inter-tier vias (MIVs). Several recent studies present methodologies to implement M3D designs, but most, if not all, of these studies implement the top and bottom tiers separately after partitioning, which results in inaccurate buffer insertion. In this paper, we present a new methodology called ‘Cascade2D’ that utilizes design and micro-architecture insight to partition and implement an M3D design using 2D commercial tools. By modeling MIVs with sets of anchor cells and dummy wires, we implement and optimize both the top and bottom tiers simultaneously in a single 2D design. M3D designs of a commercial, in-order, 32-bit application processor at the foundry 28nm, 14/16nm and predictive 7nm technology nodes are implemented using this new methodology and we investigate the power, performance and area improvements over 2D designs. Our new methodology consistently outperforms the state-of-the-art M3D design flow with up to 4× better power savings. In the best-case scenario, M3D designs from the Cascade2D flow show 25% better performance at iso-power and 20% lower power at iso-performance.


IEEE Transactions on Very Large Scale Integration Systems | 2017

Impact and Design Guideline of Monolithic 3-D IC at the 7-nm Technology Node

Kyungwook Chang; Kartik Acharya; Saurabh Sinha; Brian Cline; Greg Yeric; Sung Kyu Lim

Monolithic 3-D (M3D) IC is one of the potential technologies to break through the challenges of continued circuit power and performance scaling. In this paper, for the first time, we demonstrate the power benefits of M3D and present design guidelines in a 7-nm FinFET technology node. The predictive 7-nm process design kit (PDK) and the standard cell library using both high-performance (HP) and low-standby-power (LSTP) device technologies are developed based on the NanGate 45-nm PDK using accurate dimensional, material, and electrical parameters from publications and a commercial-grade tool flow. We implement full-chip M3D designs utilizing industry-standard physical design tools, and gauge the impact of M3D technology on performance, power, and area metrics. We also provide design guidelines as well as a new partitioning methodology to improve M3D design quality. This paper shows that M3D designs outperform their 2-D counterparts by 16% and 16.5% on average in terms of iso-performance total power reduction with the 7-nm HP and LSTP cell libraries, respectively. This demonstrates the power benefits of M3D technology in both HP and low-power future-generation devices.


International Conference on Hybrid Information Technology | 2009

Coarse-grained reconfigurable architecture for multiple application domains: a case study

Manhwee Jo; Ganghee Lee; Kyungwook Chang; Kyuseung Han; Kiyoung Choi; Hoonmo Yang; Kiwook Yoon

In this paper, we propose a coarse-grained reconfigurable architecture that supports both integer and floating-point application domains. Our coarse-grained reconfigurable architecture has an 8x8 array of integer processing elements to execute 64 integer operations or 32 floating-point operations in parallel. In order to show the feasibility of the proposed architecture, we use an MPEG4 decoder as an integer-type application and a physics engine for 3D graphics as a floating-point-type application. We first analyze these applications and optimize their implementation on the proposed architecture at the system level. Then we implement the proposed architecture on an FPGA board, which decodes 12 frames per second of MPEG4 CIF images and executes up to 160 million floating-point operations per second for the physics engine at a 20MHz clock frequency.
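
The throughput figures in this abstract can be related with a short back-of-the-envelope calculation. The sketch below uses only the numbers quoted above; treating 32 floating-point operations per clock cycle as the theoretical peak (that is, assuming each parallel operation completes in one cycle) is an assumption made here for illustration and is not stated in the abstract.

```python
# Figures quoted in the abstract
CLOCK_HZ = 20_000_000            # 20 MHz FPGA clock
PARALLEL_FP_OPS = 32             # floating-point operations executed in parallel
REPORTED_FLOPS = 160_000_000     # "up to 160 million" FP operations per second

# Assumption (not from the abstract): one batch of parallel operations per cycle.
assumed_peak_flops = PARALLEL_FP_OPS * CLOCK_HZ

# Average floating-point operations completed per clock cycle at the reported rate
ops_per_cycle = REPORTED_FLOPS / CLOCK_HZ

# Fraction of the assumed peak reached by the physics engine
fraction_of_assumed_peak = REPORTED_FLOPS / assumed_peak_flops

print(f"assumed peak: {assumed_peak_flops / 1e6:.0f} MFLOPS")
print(f"reported:     {ops_per_cycle:.1f} FP ops per cycle")
print(f"fraction of assumed peak: {fraction_of_assumed_peak:.0%}")
```

Under that one-operation-per-cycle assumption, the reported rate corresponds to 8 floating-point operations per cycle, a quarter of the assumed peak.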


International Symposium on Physical Design | 2018

Compact-2D: A Physical Design Methodology to Build Commercial-Quality Face-to-Face-Bonded 3D ICs

Bon Woong Ku; Kyungwook Chang; Sung Kyu Lim

The recent advancement of wafer bonding technology offers fine-grained and silicon-space overhead-free 3D interconnections in face-to-face (F2F) bonded 3D ICs. In this paper, we propose a full-chip RTL-to-GDSII physical design solution to build high-density and commercial-quality two-tier F2F-bonded 3D ICs. The state-of-the-art flow named Shrunk-2D (S2D) requires shrinking of standard cells and interconnects by a factor of 50% to fit into the target 3D footprint of a two-tier design. This, unfortunately, necessitates commercial place/route engines that handle one node smaller geometries, which can be challenging and costly. Our flow named Compact-2D (C2D) does not require any geometry shrinking. Instead, C2D implements a 2D IC with scaled interconnect RC parasitics, and contracts the layout to the F2F design footprint. In addition, C2D offers post-tier-partitioning optimization that is shown to be effective in fixing timing violations caused by inter-tier 3D routing, which is completely missing in S2D. Lastly, we present a methodology to recycle the routing result of post-tier-partitioning optimization for final GDSII generation. Our experimental results show that at iso-performance, C2D offers up to 26.8% power reduction and 15.6% silicon area savings over commercial 2D ICs without any routing resource overhead.

Collaboration


Top Co-Authors

Sung Kyu Lim
Georgia Institute of Technology

Brian Cline
University of Michigan

Bon Woong Ku
Georgia Institute of Technology

Saurabh Sinha
University of Johannesburg

Kiyoung Choi
Seoul National University

Kartik Acharya
Georgia Institute of Technology