Wonje Choi | Researchain

Publication

Featured research published by Wonje Choi.

IEEE Transactions on Computers | 2016

Wireless NoC for VFI-Enabled Multicore Chip Design: Performance Evaluation and Design Trade-Offs

Ryan Gary Kim; Wonje Choi; Guangshuo Liu; Ehsan Mohandesi; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

Multiple Voltage Frequency Island (VFI)-based designs can reduce the energy dissipation in multicore chips. Indeed, by tailoring the voltages and frequencies of each VFI domain, we can achieve significant energy savings subject to specific performance constraints. The achievable performance of VFI-based multicore platforms depends on the overall communication backbone, which relies predominantly on Networks-on-Chip (NoCs). Traditionally, mesh-based NoCs have been used in VFI-based systems. However, mesh-based NoCs have large latency and energy overheads due to their inherently long multihop paths. Emerging paradigms such as the millimeter (mm)-wave small-world wireless Networks-on-Chip (mSWNoCs) have lately been observed to help reduce the impact of the communication backbone on the performance of multicore chips. In this work, we demonstrate that not only do mSWNoC-enabled VFI designs mitigate some of the full-system performance degradation inherent in VFI-partitioned multicore designs, but they also help in eliminating it entirely for certain applications. We also demonstrate that the VFI-partitioned designs used in conjunction with a novel NoC architecture like mSWNoC can achieve significant energy savings while minimizing the impact on the performance for each application under consideration.

IEEE Transactions on Very Large Scale Integration Systems | 2016

Wireless NoC and Dynamic VFI Codesign: Energy Efficiency Without Performance Penalty

Ryan Gary Kim; Wonje Choi; Zhuo Chen; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

Multiple voltage frequency island (VFI)-based designs can reduce the energy dissipation in multicore platforms by taking advantage of the varying nature of the application workloads. Indeed, the voltage/frequency (V/F) levels of the VFIs can be dynamically tailored by considering the workload-driven variations in the application. Traditionally, mesh-based networks-on-chip (NoCs) have been used in VFI-based systems; however, they have large latency and energy overheads due to the inherently long multihop paths. Consequently, in this paper, we explore the emerging paradigm of wireless NoC (WiNoC) and demonstrate that by incorporating WiNoC, VFI, and dynamic V/F tuning in a synergistic manner, we can design energy-efficient multicore platforms without introducing noticeable performance penalty. Our experimental results show that for the benchmarks considered, the proposed approach can achieve between 5.7% and 46.6% energy-delay product (EDP) savings over the state-of-the-art system and between 26.8% and 60.5% EDP savings over a standard baseline non-VFI mesh-based system. This opens up a new class of codesign approaches that can make WiNoCs the communication technology of choice for future multicore platforms.

Compilers, Architecture, and Synthesis for Embedded Systems | 2016

Hybrid network-on-chip architectures for accelerating deep learning kernels on heterogeneous manycore platforms

Wonje Choi; Karthi Duraisamy; Ryan Gary Kim; Janardhan Rao Doppa; Partha Pratim Pande; Radu Marculescu; Diana Marculescu

In recent years, designing specialized manycore heterogeneous architectures for deep learning kernels has become an area of great interest. However, the typical on-chip communication infrastructures employed on conventional manycore platforms are unable to handle both CPU and GPU communication requirements efficiently. Hence, in this paper, our aim is to enhance the performance of heterogeneous manycore architectures through the design of a hybrid NoC consisting of both wireline and wireless links. To this end, we specifically target the resource-intensive backpropagation algorithm commonly used as the training method in deep learning. For backpropagation, the proposed hybrid NoC achieves 1.9× reduction in network latency and improves the network throughput by a factor of 2 with respect to a highly optimized mesh NoC. These network level improvements translate into 25% savings in full system energy-delay-product (EDP). This demonstrates the capability of the proposed hybrid and heterogeneous manycore architecture in accelerating deep learning kernels in an energy-efficient manner.

IEEE Transactions on Very Large Scale Integration Systems | 2017

Imitation Learning for Dynamic VFI Control in Large-Scale Manycore Systems

Ryan Gary Kim; Wonje Choi; Zhuo Chen; Janardhan Rao Doppa; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

Manycore chips are widely employed in high-performance computing and large-scale data analysis. However, the design of high-performance manycore chips is dominated by power and thermal constraints. In this respect, voltage–frequency island (VFI) is a promising design paradigm to create scalable energy-efficient platforms. By dynamically tailoring the voltage and frequency of each island, we can further improve the energy savings within given performance constraints. Inspired by the recent success of imitation learning (IL) in many application domains and its significant advantages over reinforcement learning (RL), we propose the first architecture-independent IL-based methodology for dynamic VFI (DVFI) control in manycore systems. Due to its popularity in the EDA community, we consider an RL-based DVFI control methodology as a strong baseline. Our experimental results demonstrate that IL is able to obtain higher quality policies than RL (on average, 5% less energy with the same level of performance) with significantly less computation time and hardware area overheads (3.1X and 8.8X, respectively).

Design Automation Conference | 2015

Energy efficient MapReduce with VFI-enabled multicore platforms

Karthi Duraisamy; Ryan Gary Kim; Wonje Choi; Guangshuo Liu; Partha Pratim Pande; Radu Marculescu; Diana Marculescu

In an era when power constraints and data movement are proving to be significant barriers for high-end computing, multicore architectures offer a low-power and highly scalable platform suitable for both data- and compute-intensive applications. MapReduce is a popular framework to facilitate the management and development of big-data workloads. In this work, we demonstrate that by using a wireless NoC-enabled Voltage Frequency Island (VFI)-based multicore platform it is possible to enhance the energy efficiency of MapReduce implementations without paying significant execution time penalties. Our experimental results show that for the benchmarks considered, the designed VFI system can achieve an average of 33.7% energy-delay product (EDP) savings over the standard baseline non-VFI mesh-based system while paying a maximum of 3.22% execution time penalty.

IEEE Transactions on Computers | 2018

On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems

Wonje Choi; Karthi Duraisamy; Ryan Gary Kim; Janardhan Rao Doppa; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

Convolutional Neural Networks (CNNs) have shown a great deal of success in diverse application domains including computer vision, speech recognition, and natural language processing. However, as the size of datasets and the depth of neural network architectures continue to grow, it is imperative to design high-performance and energy-efficient computing hardware for training CNNs. In this paper, we consider the problem of designing specialized CPU-GPU based heterogeneous manycore systems for energy-efficient training of CNNs. It has already been shown that the typical on-chip communication infrastructures employed in conventional CPU-GPU based heterogeneous manycore platforms are unable to handle both CPU and GPU communication requirements efficiently. To address this issue, we first analyze the on-chip traffic patterns that arise from the computational processes associated with training two deep CNN architectures, namely, LeNet and CDBNet, to perform image classification. By leveraging this knowledge, we design a hybrid Network-on-Chip (NoC) architecture, which consists of both wireline and wireless links, to improve the performance of CPU-GPU based heterogeneous manycore platforms running the above-mentioned CNN training workloads. The proposed NoC achieves 1.8× reduction in network latency and improves the network throughput by a factor of 2.2 for training CNNs, when compared to a highly-optimized wireline mesh NoC. For the considered CNN workloads, these network-level improvements translate into 25 percent savings in full-system energy-delay-product (EDP). This demonstrates that the proposed hybrid NoC for heterogeneous manycore architectures is capable of significantly accelerating training of CNNs while remaining energy-efficient.

Networks on Chips | 2017

3D NoC-Enabled Heterogeneous Manycore Architectures for Accelerating CNN Training: Performance and Thermal Trade-offs

Biresh Kumar Joardar; Wonje Choi; Ryan Gary Kim; Janardhan Rao Doppa; Partha Pratim Pande; Diana Marculescu; Radu Marculescu

As deep learning technology is increasingly employed in diverse application domains, the demand for computational power to enable these algorithms also increases. In this respect, high-performance three-dimensional (3D) heterogeneous manycore systems present a promising direction. However, deep learning on these systems poses several design challenges. First, the network-on-chip (NoC) must handle the traffic requirements of both CPU and GPU communications. Second, 3D system designs must address thermal issues resulting from high power density. In this work, we propose a design methodology for a heterogeneous 3D NoC architecture that not only satisfies the traffic requirements of both CPUs and GPUs, but also reduces thermal hotspots. To this end, we target the training of two widely employed convolutional neural networks (CNNs), namely, LeNet and CIFAR. By using our joint performance-thermal optimization methodology to create a 3D NoC for training CNNs, we reduce the maximum temperature by 22% while incurring only 5% full-system energy-delay-product degradation over a solely performance-optimized 3D NoC. This demonstrates that our design methodology achieves considerable temperature reduction with negligible loss in performance.

ACM Transactions on Design Automation of Electronic Systems | 2017

VFI-Based Power Management to Enhance the Lifetime of High-Performance 3D NoCs

Sourav Das; Dongjin Lee; Wonje Choi; Janardhan Rao Doppa; Partha Pratim Pande; Krishnendu Chakrabarty

The emergence of 3D network-on-chip (NoC) has revolutionized the design of high-performance and energy-efficient manycore chips. However, the anticipated performance gain can be compromised due to the degradation and failure of vertical links (VLs). The Through-Silicon-Via (TSV)-enabled VLs may fail due to workload-induced stress; the failure of a VL can affect the neighboring VLs, thereby causing a cascade of failures and reducing the lifetime of the chip. To enhance the reliability of 3D NoC-enabled manycore chips, we propose to incorporate a voltage-frequency island (VFI)-based power management strategy that helps to reduce the energy consumption and hence, the workload-induced stress of the highly utilized VLs. The adopted power-management strategy relies on control decisions about the voltage/frequency (V/F) levels on VLs. We demonstrate that compared to the well-known spare TSV allocation and adaptive routing strategies, power management is more effective in enhancing the reliability of a 3D NoC. VFI-based power management improves the reliability of the 3D NoC by one order of magnitude compared to both adaptive routing and spare allocation while running popular SPLASH-2 and PARSEC benchmarks. The principal benefit of power management is that it is capable of reducing the operating temperature of the system, which in turn enhances the Mean-Time-To-Failure (MTTF) of the VLs and reliability of the overall 3D NoC.

International Midwest Symposium on Circuits and Systems | 2015

Improving EDP in wireless NoC-enabled multicore chips via DVFS pruning

Wonje Choi; Shervin Hajiamin; Ryan Gary Kim; Armin Rahimi; Nillofar Hezarjaribi; Partha Pratim Pande; Behrooz A. Shirazi

The millimeter-wave small-world wireless NoC (mSWNoC) is shown to be capable of improving the overall latency and energy dissipation characteristics compared to the conventional wireline mesh-based counterpart. The mSWNoC helps in improving the energy dissipation even further in the presence of dynamic voltage and frequency scaling (DVFS). On-chip voltage regulators are required to tune the voltage depending on the workload. Though it is possible to have multiple voltage levels by designing suitable on-chip regulators, certain voltage levels are underutilized for specific applications. Hence, unnecessary voltage levels should be pruned, reducing the design complexity of the on-chip voltage regulators. In certain circumstances, the pruned DVFS method improves the energy-delay product (EDP) compared to the fine-grained DVFS while still remaining within an acceptable performance boundary.

International Conference on Computer Aided Design | 2015

The (Low) Power of Less Wiring: Enabling Energy Efficiency in Many-Core Platforms Through Wireless NoC

Partha Pratim Pande; Ryan Gary Kim; Wonje Choi; Zhuo Chen; Diana Marculescu; Radu Marculescu

During the last decade, we have witnessed a major transition from computation- to communication-centric design of integrated circuits and systems. In particular, the network-on-chip (NoC) approach has emerged as the major design paradigm for multicore systems-on-chip (SoC). The major challenges in traditional wire-based NoCs are the high latency and power consumption of the multi-hop links. By inserting single-hop long-range wireless links in place of multi-hop wired links, the overall system performance can be significantly improved. We should adopt novel architectures inspired by the on-chip wireless links to design high-performance multi-core chips. In this regard, the small-world network-inspired wireless NoC (WiNoC) has emerged as an enabling interconnection infrastructure to design high-bandwidth and energy-efficient multicore chips. In this paper we present the various challenges and possible solutions for designing energy-efficient massive multicore chips enabled by the WiNoC paradigm.
