Bon Woong Ku
Georgia Institute of Technology
Publication
Featured research published by Bon Woong Ku.
international symposium on quality electronic design | 2016
Kartik Acharya; Kyungwook Chang; Bon Woong Ku; Shreepad Panth; Saurabh Sinha; Brian Cline; Greg Yeric; Sung Kyu Lim
In this paper, we present a comprehensive study of full-chip power, performance, and area metrics for monolithic 3D (M3D) IC designs at the 7nm technology node. We investigate the benefits of M3D designs using our predictive 7nm FinFET libraries. This paper outlines detailed iso-performance power comparisons between M3D and 2D full-chip GDSII designs using both 7nm high performance (HP) and low stand-by power (LSTP) library cells. We achieve significant wire-length and buffer reduction with 7nm HP M3D designs over their 2D counterparts, and thus greater power savings at a high iso-performance frequency. In addition, this power saving is also realized in 7nm LSTP M3D designs running at low iso-performance frequencies. We also study the impact of clock tree design on the clock power consumption in M3D designs. Lastly, we demonstrate the impact of clock tree partitioning on the total power of full-chip M3D designs. Our experiments show that 7nm HP and LSTP M3D designs outperform their 2D counterparts by 12% and 10% on average, respectively.
international symposium on low power electronics and design | 2016
Bon Woong Ku; Peter Debacker; Dragomir Milojevic; Praveen Raghavan; Diederik Verkest; Aaron Thean; Sung Kyu Lim
In this paper, we develop physical design tools and methodologies to tackle the inter-tier performance variations caused by low-temperature manufacturing in 2-tier gate-level monolithic 3D ICs (M3D). First, we model the top tier front-end-of-line (FEOL) device mobility degradation and its impact on cell delay/power values. Next, we quantify the impact of tungsten interconnect and cost-driven metal layer saving in the back-end-of-line (BEOL) of the bottom tier. These device and interconnect degradation models are used in our new full-chip M3D physical design flow named Derated 2D. This flow overcomes the well-known drawback of the state-of-the-art Shrunk 2D that requires shrinking of layout objects and RC parasitics. Also, Derated 2D performs low-temperature process-aware tier partitioning to effectively keep timing-critical components in the bottom tier. Moreover, Derated 2D conducts timing-driven monolithic inter-tier via (MIV) planning to cope with the resistivity increase in tungsten BEOL. Lastly, Derated 2D offers an effective timing closure solution through a post-route optimization. Experiments based on a foundry-grade 7nm FinFET process design kit (PDK) show that Derated 2D achieves up to 36% performance improvement and 10% energy saving compared with Shrunk 2D. Using a post-route optimization, Derated 2D further improves timing under various FEOL/BEOL degradation settings at minimal energy overhead.
international conference on computer aided design | 2016
Bon Woong Ku; Peter Debacker; Dragomir Milojevic; Praveen Raghavan; Sung Kyu Lim
In this paper, we study power, performance, and cost (PPC) tradeoffs for 2-tier, gate-level, full-chip GDS monolithic 3D ICs (M3D) built using a foundry-grade 7nm bulk FinFET technology. We first develop highly accurate wafer and die cost models for 2D and M3D to study PPC tradeoffs. In our study, both 2D and M3D designs are optimized in terms of the number of BEOL metal layers used for routing to obtain the best possible PPC values. We develop a new CAD methodology for 2-tier gate-level M3D, named Projected 2D Flow, that allows us to accurately compare RC parasitics of equivalent nets in both 2D and M3D designs. Our experiments based on two different circuit types (BEOL-dominant vs. FEOL-dominant) confirm that M3D designs indeed offer a significant footprint saving. However, to our surprise, the PPC quality of M3D turns out to be worse than that of 2D by 34% due to the high wafer cost of M3D. Our study also reveals that M3D wafer yield should be as high as 90% of 2D wafer yield, and the M3D device manufacturing cost should be less than 33% of that of 2D to justify the adoption of M3D technology in the 7nm era. Lastly, and counter-intuitively, our study shows that FEOL-dominant circuits show more PPC benefits from M3D technology than BEOL-dominant circuits.
design automation conference | 2015
Yarui Peng; Bon Woong Ku; Youn-sik Park; Kwang-Il Park; Seong-Jin Jang; Joo Sun Choi; Sung Kyu Lim
3D DRAM is the next-generation memory system targeting high bandwidth, low power, and small form factor. This paper presents a cross-domain CAD/architectural platform that addresses DC power noise issues in 3D DRAM targeting stacked DDR3, Wide I/O, and hybrid memory cube technologies. Our design and analysis include both individual DRAM dies and a host logic die that communicates with them in the same stack. Moreover, our comprehensive solutions encompass all major factors in design, packaging, and architecture domains, including power delivery network wire sizing, redistribution layer routing, distributed and dedicated TSV placement, die bonding style, backside wire bonding, and read policy optimization. We conduct regression analysis and optimization to obtain high-quality solutions under noise, cost, and performance tradeoffs. Compared with industry-standard baseline designs and policies, our methods achieve up to 68.2% IR-drop reduction and 30.6% performance enhancement.
international symposium on physical design | 2018
Bon Woong Ku; Kyungwook Chang; Sung Kyu Lim
The recent advancement of wafer bonding technology offers fine-grained and silicon-space overhead-free 3D interconnections in face-to-face (F2F) bonded 3D ICs. In this paper, we propose a full-chip RTL-to-GDSII physical design solution to build high-density and commercial-quality two-tier F2F-bonded 3D ICs. The state-of-the-art flow named Shrunk-2D (S2D) requires shrinking of standard cells and interconnects by a factor of 50% to fit into the target 3D footprint of a two-tier design. This, unfortunately, necessitates commercial place/route engines that handle geometries one node smaller, which can be challenging and costly. Our flow named Compact-2D (C2D) does not require any geometry shrinking. Instead, C2D implements a 2D IC with scaled interconnect RC parasitics, and contracts the layout to the F2F design footprint. In addition, C2D offers post-tier-partitioning optimization, a step completely missing in S2D, that is shown to be effective in fixing timing violations caused by inter-tier 3D routing. Lastly, we present a methodology to recycle the routing result of post-tier-partitioning optimization for final GDSII generation. Our experimental results show that at iso-performance, C2D offers up to 26.8% power reduction and 15.6% silicon area savings over commercial 2D ICs without any routing resource overhead.
design automation conference | 2018
Bon Woong Ku; Yu Liu; Yingyezhe Jin; Sandeep Kumar Samal; Peng Li; Sung Kyu Lim
A liquid state machine (LSM) is a powerful recurrent spiking neural network shown to be effective in various learning tasks including speech recognition. In this work, we investigate design and architectural co-optimization to further improve the area-energy efficiency of LSM-based speech recognition processors with monolithic 3D IC (M3D) technology. We conduct fine-grained tier partitioning, where individual neurons are folded, and explore the impact of shared memory architecture and synaptic model complexity on the power-performance-area-accuracy (PPAA) benefit of M3D LSM-based speech recognition. In training and classification tasks using spoken English letters, we obtain up to 70.0% PPAA savings over 2D ICs.
international symposium on low power electronics and design | 2017
Bon Woong Ku; Taigon Song; Arthur Nieuwoudt; Sung Kyu Lim
Existing transistor-level monolithic 3D (T-M3D) standard cell layouts are based on the folding scheme, in which the pull-down network is simply folded and placed on top of the pull-up network. In this paper, we propose a new layout method, the stitching scheme, targeted towards improved cell performance and power integrity. We perform extensive analysis on each layout scheme and evaluate the timing/power benefits of the stitching scheme. Since the ground and power rails overlap in the T-M3D layouts with the folding scheme, we also present a design methodology for the power delivery network of folding T-M3D ICs to evaluate the impact of the T-M3D cell layout scheme on static power integrity. Compared to 2D ICs at iso-performance, stitching T-M3D ICs show up to 6% power savings and 44% area savings with only 1% more static IR-drop in the 14nm technology node, while folding T-M3D ICs suffer serious degradation in static power integrity, causing a reliability issue.
Proceedings of the Second International Workshop on Post Moore's Era Supercomputing | 2017
Catherine D. Schuman; Raphael C. Pooser; Tiffany M. Mintz; Musabbir Adnan; Garrett S. Rose; Bon Woong Ku; Sung Kyu Lim
Neuromorphic computing represents one technology likely to be incorporated into future supercomputers. In this work, we present initial results on a potential neuromorphic co-processor, including a preliminary device design that includes memristors, estimates of energy usage for the co-processor, and performance of an on-line learning mechanism. We also present a high-level co-processor simulator used to estimate the performance of the neuromorphic co-processor on real applications. We discuss future use-cases of a potential neuromorphic co-processor in the supercomputing environment, including as an accelerator for supervised learning and for unsupervised, on-line learning tasks. Finally, we discuss plans for future work.
Proceedings of the Neuromorphic Computing Symposium | 2017
Austin Wyer; Musabbir Adnan; Bon Woong Ku; Sung Kyu Lim; Catherine D. Schuman; Raphael C. Pooser; Garrett S. Rose
In this work, we present an implementation of spike-timing-dependent plasticity (STDP) both in a high-level simulation and at the circuit level. We verify that the high-level simulation captures the behavior of the circuit implementation. We use the simulation to assess the effectiveness of STDP for online learning, and find that STDP enables networks to improve performance online after training.
international electron devices meeting | 2017
A. Mallik; Anne Vandooren; Liesbeth Witters; A. Walke; Jacopo Franco; Y. Sherazi; P. Weckx; D. Yakimets; Marie Garcia Bardon; B. Parvais; Peter Debacker; Bon Woong Ku; Sung Kyu Lim; Anda Mocuta; D. Mocuta; Julien Ryckaert; Nadine Collaert; Praveen Raghavan