
Publication


Featured research published by Duckhwan Kim.


International Symposium on Computer Architecture | 2016

Neurocube: a programmable digital neuromorphic architecture with high-density 3D memory

Duckhwan Kim; Jaeha Kung; Sek M. Chai; Sudhakar Yalamanchili; Saibal Mukhopadhyay

This paper presents a programmable and scalable digital neuromorphic architecture based on 3D high-density memory integrated with a logic tier for efficient neural computing. The proposed architecture consists of clusters of processing engines (PEs), connected by a 2D mesh network as a processing tier, which is integrated in 3D with multiple tiers of DRAM. The PE clusters access multiple memory channels (vaults) in parallel. The operating principle, referred to as memory-centric computing, embeds specialized state machines within the vault controllers of the Hybrid Memory Cube (HMC) to drive data into the PE clusters. The paper presents the basic architecture of the Neurocube and an analysis of the logic tier synthesized in 28nm and 15nm process technologies. The performance of the Neurocube is evaluated by mapping a convolutional neural network onto it and estimating the resulting power and performance for both training and inference.
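The memory-centric idea of streaming each vault's data into its own PE cluster can be sketched in Python. This is a minimal illustration, not the paper's mapping: the round-robin partition and the function names are my assumptions, and the per-vault loops that run sequentially here run in parallel in the hardware.

```python
import numpy as np

def map_rows_to_vaults(num_rows, num_vaults):
    # Round-robin partition of weight-matrix rows (output neurons) across
    # memory vaults, so each vault's PE cluster streams its own channel.
    return {v: list(range(v, num_rows, num_vaults)) for v in range(num_vaults)}

def memory_centric_mac(weights, x, num_vaults=4):
    """Each vault computes the dot products for its assigned neurons.

    In the 3D hardware the vaults operate concurrently; this sketch simply
    iterates over them to show which work lands where.
    """
    y = np.zeros(weights.shape[0])
    for vault, rows in map_rows_to_vaults(weights.shape[0], num_vaults).items():
        for r in rows:                      # work owned by this vault
            y[r] = weights[r] @ x
    return y
```

With 8 output neurons and 4 vaults, vault 0 owns rows 0 and 4, vault 1 owns rows 1 and 5, and so on, so all memory channels are kept busy.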


International Symposium on Low Power Electronics and Design | 2015

A power-aware digital feedforward neural network platform with backpropagation driven approximate synapses

Jaeha Kung; Duckhwan Kim; Saibal Mukhopadhyay

This paper proposes a power-aware digital feedforward neural network platform that uses the backpropagation algorithm during training to enable an energy-quality trade-off. Given a quality constraint, the proposed approach identifies a set of synaptic weights to approximate: it selects synapses whose impact on the output error, as estimated by the backpropagation algorithm, is small. The approximations are realized by a coupled software (reduced bit width) and hardware (approximate multiplication in the processing engine) design approach. Compared to an accurate baseline design, a full-chip design in 130nm CMOS shows that the proposed approach reduces system power by ~38% with only 0.4% lower recognition accuracy on a classification problem.
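The selection step can be sketched as follows. This is a hedged illustration, not the paper's exact method: the function names, the 4-bit/8-bit split, and the uniform symmetric quantizer are my assumptions, and the paper's hardware additionally uses approximate multipliers rather than pure bit-width reduction.

```python
import numpy as np

def quantize(w, bits, scale):
    # Uniform symmetric quantization to `bits` bits over [-scale, scale].
    levels = 2 ** (bits - 1) - 1
    return np.round(w / scale * levels) * scale / levels

def approximate_synapses(weights, sensitivities, frac, coarse_bits=4, fine_bits=8):
    """Quantize the `frac` least-sensitive synapses coarsely, the rest finely.

    `sensitivities` holds |dE/dw| per weight, the impact on output error
    estimated by backpropagation during training.
    """
    flat_w = weights.ravel().astype(float)
    order = np.argsort(sensitivities.ravel())      # least sensitive first
    k = int(frac * flat_w.size)
    scale = max(float(np.max(np.abs(flat_w))), 1e-12)
    out = quantize(flat_w, fine_bits, scale)       # default: fine precision
    out[order[:k]] = quantize(flat_w[order[:k]], coarse_bits, scale)
    return out.reshape(weights.shape)
```

High-sensitivity weights keep near-full precision, while low-sensitivity weights snap to a much coarser grid, which is where the energy saving comes from.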


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2015

On the Impact of Energy-Accuracy Tradeoff in a Digital Cellular Neural Network for Image Processing

Jaeha Kung; Duckhwan Kim; Saibal Mukhopadhyay

This paper studies the opportunities for energy-accuracy tradeoff in the cellular neural network (CNN). Algorithmic characteristics of the CNN are coupled with the hardware-induced error distribution of a digital CNN cell to evaluate the energy-accuracy tradeoff for simple image processing tasks as well as a complex application. The analysis shows that errors modulate the cell dynamics and propagate through the network, degrading output quality and increasing convergence time. Error propagation is determined by the task the CNN performs, specifically, by the strength of the feedback template. Controlling precision is observed to be a more effective approach to energy-accuracy tradeoff in the CNN than voltage overscaling.


International Symposium on Low Power Electronics and Design | 2016

Dynamic Approximation with Feedback Control for Energy-Efficient Recurrent Neural Network Hardware

Jaeha Kung; Duckhwan Kim; Saibal Mukhopadhyay

This paper presents a methodology of feedback-controlled dynamic approximation to enable an energy-accuracy trade-off in digital recurrent neural networks (RNNs). A low-power digital RNN engine is presented that employs the proposed dynamic approximation. The on-chip feedback controller is realized as either a hysteretic or a proportional controller, and dynamic adaptation of bit precision during the RNN computation is used as the approximation mechanism. Across various applications, the digital RNN engine designed in 28nm CMOS shows ~36% average energy saving compared to the baseline, with only ~4% accuracy degradation on average.
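A hysteretic controller for bit precision can be sketched as below. The threshold values, bit bounds, and step size of one bit are illustrative assumptions; the point is the hysteresis band between the two thresholds, which prevents the precision from oscillating when the observed error hovers near a single threshold.

```python
class HystereticPrecisionController:
    """Hysteretic feedback controller for dynamic bit-precision selection.

    Raises precision when the observed error exceeds `high`; lowers it when
    the error falls below `low`. Errors inside the (low, high) band leave
    the precision unchanged, giving hysteresis.
    """

    def __init__(self, low, high, min_bits=4, max_bits=16, start_bits=8):
        assert low < high
        self.low, self.high = low, high
        self.min_bits, self.max_bits = min_bits, max_bits
        self.bits = start_bits

    def update(self, error):
        if error > self.high and self.bits < self.max_bits:
            self.bits += 1          # quality too low: spend more bits
        elif error < self.low and self.bits > self.min_bits:
            self.bits -= 1          # quality margin: save energy
        return self.bits
```

Feeding the controller a per-step error estimate yields the bit precision for the next RNN computation step.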


Computational Intelligence in Robotics and Automation | 2009

Animal-Robot Interaction for pet caring

Jong-Hwan Kim; Seung-Hwan Choi; Duckhwan Kim; Joonwoo Kim; Minjoo Cho

Pets have long served as emotional companions to people. Nowadays, however, people are often too busy with everyday work to take care of their pets. This research explores whether a robot can take over the role of caring for pets on behalf of their owner, and whether conventional Human-Robot Interaction (HRI) can be extended to interaction between robots and animals. In this paper, the concept of Animal-Robot Interaction (ARI) and its characteristics are presented along with basic experiments. The experiments, carried out with a cat and mobile robots, clearly show that ARI can be implemented.


Design, Automation and Test in Europe | 2017

Adaptive weight compression for memory-efficient neural networks

Jong Hwan Ko; Duckhwan Kim; Taesik Na; Jaeha Kung; Saibal Mukhopadhyay

Neural networks generally require significant memory capacity and bandwidth to store and access a large number of synaptic weights. This paper presents an application of JPEG image encoding to compress the weights by exploiting the spatial locality and smoothness of the weight matrix. To minimize the loss of accuracy due to JPEG encoding, we propose to adaptively control the quantization factor of the JPEG algorithm depending on the error sensitivity (gradient) of each weight. With this adaptive compression technique, weight blocks with higher sensitivity are compressed less, preserving accuracy. The adaptive compression reduces the memory requirement, which in turn yields higher performance and lower energy in neural network hardware. Simulation of inference hardware for a multilayer perceptron on the MNIST dataset shows up to 42X compression with less than 1% loss of recognition accuracy, resulting in 3X higher effective memory bandwidth and ~19X lower system energy.
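The core of the scheme, blockwise DCT quantization with a sensitivity-dependent step size, can be sketched in Python. This is a simplified illustration: real JPEG adds zigzag ordering, per-frequency quantization tables, and entropy coding, and the `base_q` value and the scaling rule `base_q / (1 + sensitivity)` are my assumptions, not the paper's exact choices.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis, the transform used by JPEG on 8x8 blocks.
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

def compress_block(block, sensitivity, base_q=0.05):
    """Quantize an 8x8 weight block in the DCT domain and reconstruct it.

    `sensitivity` is the block's error sensitivity (e.g. mean |gradient|);
    a higher value shrinks the quantization step, so sensitive blocks are
    compressed less and reconstructed more accurately.
    """
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T                 # forward 2D DCT
    q = base_q / (1.0 + sensitivity)         # higher sensitivity -> finer step
    quantized = np.round(coeffs / q)         # these integers would be stored
    return d.T @ (quantized * q) @ d         # dequantize + inverse DCT
```

Smooth, low-sensitivity blocks concentrate energy in few DCT coefficients and quantize cheaply; high-sensitivity blocks get a finer step and stay close to the original weights.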


IEEE Transactions on Emerging Topics in Computing | 2017

A Power-Aware Digital Multilayer Perceptron Accelerator with On-Chip Training Based on Approximate Computing

Duckhwan Kim; Jaeha Kung; Saibal Mukhopadhyay

This paper shows that approximation, via reduced bit precision and inexact multipliers, can reduce the power consumption of a digital multilayer perceptron accelerator during MNIST classification (inference) with negligible accuracy degradation. Based on error sensitivities precomputed during training, synaptic weights with lower sensitivity are approximated. Given a set of bit-precision modes, the proposed algorithm determines the bit precision for every synapse to minimize power consumption under a target accuracy. Across the network, earlier layers can be approximated more aggressively because they have lower error sensitivity. The proposed algorithm saves 57.4 percent of power while accuracy degrades by about 1.7 percent. After approximation, retraining for a few iterations improves accuracy while maintaining the power savings. The impact of different training conditions on the approximation is also studied: training with small quantization error (lower bit precision) allows more power saving at inference, and a sufficient number of training iterations is important for approximation at inference. Networks with more layers are more sensitive to the approximation.


IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference | 2016

NeuroSensor: A 3D image sensor with integrated neural accelerator

Mohammad Faisal Amir; Duckhwan Kim; Jaeha Kung; D. Lie; Sudhakar Yalamanchili; Saibal Mukhopadhyay

3D integration provides opportunities to design high-bandwidth and low-power CMOS image sensors (CIS) [1–4]. The 3D stacking of pixel, peripheral, memory, and compute tiers enables a high degree of parallel processing. In addition, each tier can be designed in a different technology node (heterogeneous integration) to further improve power efficiency. This paper presents a case study of a smart 3D image sensor with integrated neuro-inspired computing for intelligent vision processing. Hardware acceleration of neuro-inspired computing has received much attention in recent years for recognition and classification [5]. We present the physical design of NeuroSensor, a 3D CIS with an integrated convolutional neural network (CNN) accelerator. The rationale for our approach is that 3D integration of sensor, memory, and computing can effectively harness the inherent parallelism of neural algorithms. We design the NeuroSensor considering different complexities of the CNN platform, ranging from feature extraction only to complete classification, and study the trade-offs between complexity, performance, and power.


International Symposium on Computer Architecture | 2017

A Programmable Hardware Accelerator for Simulating Dynamical Systems

Jaeha Kung; Yun Long; Duckhwan Kim; Saibal Mukhopadhyay

The fast and energy-efficient simulation of dynamical systems defined by coupled ordinary/partial differential equations (ODEs/PDEs) has emerged as an important problem. Accelerated simulation of coupled ODEs/PDEs is critical for the analysis of physical systems as well as for computing with dynamical systems. This paper presents a fast and programmable accelerator for simulating dynamical systems. The computing model of the proposed platform is based on a multilayer cellular nonlinear network (CeNN) augmented with nonlinear function evaluation engines. The platform can be programmed to accelerate wide classes of ODEs/PDEs by modulating the connectivity within the multilayer CeNN engine. An innovative hardware architecture including data reuse, a memory hierarchy, and near-memory processing is designed to accelerate the augmented multilayer CeNN. A dataflow model is presented, supported by an optimized memory hierarchy for efficient function evaluation. The proposed solver is designed and synthesized in 15nm technology for hardware analysis. Its performance is evaluated against GPU nodes when solving wide classes of differential equations, and the power consumption analysis shows orders-of-magnitude improvement in energy efficiency.
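The CeNN computing model can be illustrated with a single-layer forward-Euler update: each cell's state evolves from its own decay plus a templated sum over its neighbors' outputs. This is a generic textbook CeNN sketch under my own choices of template, step size, and boundary handling, not the accelerator's multilayer engine or its function-evaluation units.

```python
import numpy as np

def cenn_step(x, a_template, dt=0.1, bias=0.0):
    """One forward-Euler step of a single-layer CeNN: dx/dt = -x + A*y + z.

    `a_template` is a 3x3 feedback template applied over the output
    y = f(x), where f is the standard piecewise-linear CeNN activation.
    Edge cells replicate their boundary neighbors.
    """
    y = np.clip(x, -1.0, 1.0)                 # piecewise-linear output
    pad = np.pad(y, 1, mode="edge")
    conv = np.zeros_like(x)
    for i in range(3):                        # 3x3 neighborhood sum
        for j in range(3):
            conv += a_template[i, j] * pad[i:i + x.shape[0], j:j + x.shape[1]]
    return x + dt * (-x + conv + bias)
```

Programming the accelerator for a different ODE/PDE corresponds to changing the template (the cell connectivity) rather than the update loop itself.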


International Midwest Symposium on Circuits and Systems | 2017

Energy-efficient neural image processing for Internet-of-Things edge devices

Jong Hwan Ko; Yun Long; Mohammad Faisal Amir; Duckhwan Kim; Jaeha Kung; Taesik Na; Amit Ranjan Trivedi; Saibal Mukhopadhyay

Enhancing the energy and resource efficiency of neural networks is critical to support on-chip neural image processing in Internet-of-Things edge devices. This paper presents recent technology advancements toward energy-efficient neural image processing. 3D integration of the image sensor and neural network improves power efficiency while retaining programmability and scalability. The computation energy of feedforward and recurrent neural networks is reduced by dynamic control of approximation, and the storage demand is reduced by image-based adaptive weight compression. Emerging devices such as tunnel FETs and Resistive Random Access Memory are utilized to achieve higher computation efficiency than CMOS-based designs.

Collaboration


Dive into Duckhwan Kim's collaborations.

Top Co-Authors

Saibal Mukhopadhyay (Georgia Institute of Technology)
Jaeha Kung (Georgia Institute of Technology)
Taesik Na (Georgia Institute of Technology)
Jong Hwan Ko (Georgia Institute of Technology)
Mohammad Faisal Amir (Georgia Institute of Technology)
Sudhakar Yalamanchili (Georgia Institute of Technology)
Yun Long (Georgia Institute of Technology)
Amit Ranjan Trivedi (University of Illinois at Chicago)