Mingu Kang | Researchain

Publication

Featured research published by Mingu Kang.

International Conference on Acoustics, Speech, and Signal Processing | 2014

An energy-efficient VLSI architecture for pattern recognition via deep embedding of computation in SRAM

Mingu Kang; Min-Sun Keel; Naresh R. Shanbhag; Sean Eilert; Ken Curewitz

In this paper, we propose the concept of compute memory, where computation is deeply embedded into the memory (SRAM). This deep embedding enables multi-row read access and analog signal processing. Compute memory exploits the relaxed precision and linearity requirements of pattern recognition applications. System-level simulations incorporating various deterministic errors from the analog signal chain demonstrate that the limited accuracy of analog processing does not significantly degrade system performance; that is, the probability of pattern detection is minimally impacted. The estimated energy saving is 63% compared to a conventional system with standard embedded memory and a parallel processing architecture, for a 256×256 target image.

International Conference on Acoustics, Speech, and Signal Processing | 2015

An energy-efficient memory-based high-throughput VLSI architecture for convolutional networks

Mingu Kang; Sujan K. Gonugondla; Min Sun Keel; Naresh R. Shanbhag

In this paper, an energy-efficient, memory-intensive, and high-throughput VLSI architecture is proposed for convolutional networks (C-Net) by employing compute memory (CM) [1], where computation is deeply embedded into the memory (SRAM). Behavioral models incorporating the CM's circuit non-idealities and energy models in 45 nm SOI CMOS are presented. System-level simulations using these models demonstrate that a probability of handwritten digit recognition Pr > 0.99 can be achieved on the MNIST database [2], along with a 24.5× reduction in energy-delay product, a 5.0× reduction in energy, and a 4.9× higher throughput compared to the conventional system.
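The three figures quoted in this abstract are mutually consistent if the energy-delay product (EDP) is taken as energy per decision times delay per decision, and if delay per decision is assumed to scale as the inverse of throughput. A minimal Python check under that assumption (the function name `edp_reduction` is illustrative, not from the paper):

```python
def edp_reduction(energy_reduction: float, throughput_gain: float) -> float:
    """Return the EDP reduction factor implied by two improvement factors.

    Assumes EDP = energy * delay and delay ~ 1 / throughput, so the two
    improvement factors simply multiply.
    """
    return energy_reduction * throughput_gain


# Factors quoted in the abstract: 5.0x lower energy, 4.9x higher throughput.
factor = edp_reduction(5.0, 4.9)
print(f"implied EDP reduction: {factor:.1f}x")
```

The product matches the 24.5× EDP reduction reported in the abstract.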

International Symposium on Circuits and Systems | 2015

Energy-efficient and high throughput sparse distributed memory architecture

Mingu Kang; Eric P. Kim; Min Sun Keel; Naresh R. Shanbhag

This paper presents an energy-efficient VLSI implementation of Sparse Distributed Memory (SDM). A high-throughput and energy-efficient Hamming distance-based address decoder (CM-DEC) is proposed by employing compute memory [1], where computation is deeply embedded into a memory (SRAM). Hierarchical binary decision (HBD) is also proposed to enhance the area and energy efficiency of the read operation by minimizing data transfer. The SDM is employed as an auto-associative memory with four read iterations and a 16×16 binary noisy input image with input error rates of 15%, 25%, and 30%. The proposed SDM achieves a 39× smaller energy-delay product, with 14.5× and 2.7× reductions in delay and energy, respectively, compared to a conventional digital implementation of SDM in a 45 nm SOI CMOS process, with output error rate degradation of less than 0.4%.
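For readers unfamiliar with SDM, the following is a minimal software sketch of the textbook scheme: fixed random hard-location addresses, Hamming-distance address decoding, per-bit counters, and iterated auto-associative reads. It is not the paper's CM-DEC circuit or its HBD read path. The class name, the number of hard locations, the activation radius, and the seeds are all assumptions chosen for illustration; only the 16×16 (256-bit) word size, the 15% input error rate, and the four read iterations come from the abstract.

```python
import random


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


class SparseDistributedMemory:
    """Textbook SDM with bipolar counters (illustrative, software only)."""

    def __init__(self, n_bits, n_locations, radius, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random "hard location" addresses.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        # One signed counter per bit per hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        # Address decoding: select locations within the Hamming radius.
        return [i for i, a in enumerate(self.addresses)
                if hamming(a, address) <= self.radius]

    def write(self, address, data):
        # Increment counters for 1-bits, decrement for 0-bits.
        for i in self._active(address):
            row = self.counters[i]
            for j, bit in enumerate(data):
                row[j] += 1 if bit else -1

    def read(self, address):
        # Sum counters over active locations, then threshold at zero.
        # If no location is active, every sum is 0 and the result is all 0s.
        active = self._active(address)
        sums = [sum(self.counters[i][j] for i in active)
                for j in range(self.n_bits)]
        return [1 if s > 0 else 0 for s in sums]

    def recall(self, noisy, iterations=4):
        # Auto-associative recall: feed each read result back as the address.
        word = noisy
        for _ in range(iterations):
            word = self.read(word)
        return word


if __name__ == "__main__":
    rng = random.Random(1)
    pattern = [rng.randint(0, 1) for _ in range(256)]  # 16x16 binary image
    sdm = SparseDistributedMemory(n_bits=256, n_locations=1000, radius=120)
    sdm.write(pattern, pattern)  # auto-associative: address equals data

    noisy = list(pattern)
    for k in rng.sample(range(256), 38):  # flip about 15% of the pixels
        noisy[k] ^= 1

    restored = sdm.recall(noisy, iterations=4)
    print("input errors: ", hamming(noisy, pattern))
    print("output errors:", hamming(restored, pattern))
```

The address decoder (`_active`) is the step that compares the input against every hard-location address, which is why the abstract targets it for in-memory acceleration.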

IEEE Transactions on Biomedical Circuits and Systems | 2016

In-Memory Computing Architectures for Sparse Distributed Memory

Mingu Kang; Naresh R. Shanbhag

This paper presents an energy-efficient and high-throughput architecture for Sparse Distributed Memory (SDM), a computational model of the human brain [1]. The proposed SDM architecture is based on the recently proposed in-memory computing kernel for machine learning applications called Compute Memory (CM) [2], [3]. CM achieves energy and throughput efficiencies by deeply embedding computation into the memory array. SDM-specific techniques such as hierarchical binary decision (HBD) are employed to reduce the delay and energy further. The CM-based SDM (CM-SDM) is a mixed-signal circuit, and hence circuit-aware behavioral, energy, and delay models in a 65 nm CMOS process are developed in order to predict system performance of SDM architectures in the auto- and hetero-associative modes. The delay and energy models indicate that CM-SDM, in general, can achieve up to 25× and 12× delay and energy reduction, respectively, over conventional SDM. When classifying 16×16 binary images with high noise levels (input bad pixel ratios: 15%-25%) into nine classes, all SDM architectures are able to generate output bad pixel ratios (Bo) ≤ 2%. The CM-SDM exhibits negligible loss in accuracy, i.e., its Bo degradation is within 0.4% as compared to that of the conventional SDM.

International Symposium on Computer Architecture | 2018

PROMISE: an end-to-end design of a programmable mixed-signal accelerator for machine-learning algorithms

Prakalp Srivastava; Mingu Kang; Sujan K. Gonugondla; Sungmin Lim; Jungwook Choi; Vikram S. Adve; Nam Sung Kim; Naresh R. Shanbhag

Analog/mixed-signal machine learning (ML) accelerators exploit the unique computing capability of analog/mixed-signal circuits and inherent error tolerance of ML algorithms to obtain higher energy efficiencies than digital ML accelerators. Unfortunately, these analog/mixed-signal ML accelerators lack programmability, and even instruction set interfaces, to support diverse ML algorithms or to enable essential software control over the energy-vs-accuracy tradeoffs. We propose PROMISE, the first end-to-end design of a PROgrammable MIxed-Signal accElerator from Instruction Set Architecture (ISA) to high-level language compiler for acceleration of diverse ML algorithms. We first identify prevalent operations in widely used ML algorithms and key constraints in supporting these operations for a programmable mixed-signal accelerator. Second, based on that analysis, we propose an ISA with a PROMISE architecture built with silicon-validated components for mixed-signal operations. Third, we develop a compiler that can take an ML algorithm described in a high-level programming language (Julia) and generate PROMISE code, with an IR design that is both language-neutral and abstracts away unnecessary hardware details. Fourth, we show how the compiler can map an application-level error tolerance specification for neural network applications down to low-level hardware parameters (swing voltages for each application Task) to minimize energy consumption. Our experiments show that PROMISE can accelerate diverse ML algorithms with energy efficiency competitive even with fixed-function digital ASICs for specific ML algorithms, and the compiler optimization achieves significant additional energy savings even for only 1% extra errors.

European Solid State Circuits Conference | 2017

A 19.4 nJ/decision 364K decisions/s in-memory random forest classifier in 6T SRAM array

Mingu Kang; Sujan K. Gonugondla; Naresh R. Shanbhag

This paper presents an IC realization of a random forest (RF) machine learning classifier. The algorithm, architecture, and circuit are co-optimized to minimize the energy-delay product (EDP). Deterministic subsampling (DSS) and balanced decision trees result in reduced interconnect complexity and avoid irregular memory accesses. Low-swing analog in-memory computations embedded in a standard 6T SRAM enable massively parallel processing, thereby minimizing memory fetches and reducing the EDP further. The 65 nm CMOS prototype achieves a 6.8× lower EDP compared to a conventional design at the same accuracy (94%) for an 8-class traffic sign recognition problem.
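To illustrate why balanced trees make memory access regular, the sketch below stores a complete binary decision tree in heap order, so every inference visits exactly one node per level and computes child indices arithmetically rather than by following pointers. This is a generic software illustration, not the paper's circuit or its deterministic subsampling scheme; the function names, the tuple layout, and the example thresholds are all hypothetical.

```python
from collections import Counter


def classify(thresholds, feature_idx, leaves, x):
    """Walk a complete binary tree stored in heap order.

    Internal node i has children 2*i + 1 (left) and 2*i + 2 (right).
    Leaves follow the internal nodes in the same index space, so every
    root-to-leaf path has the same length.
    """
    n_internal = len(thresholds)
    i = 0
    while i < n_internal:
        go_right = x[feature_idx[i]] > thresholds[i]
        i = 2 * i + 1 + int(go_right)
    return leaves[i - n_internal]


def forest_vote(trees, x):
    """Majority vote over trees given as (thresholds, feature_idx, leaves)."""
    votes = Counter(classify(t, f, l, x) for t, f, l in trees)
    return votes.most_common(1)[0][0]


# A depth-2 tree: 3 internal nodes and 4 leaves (hypothetical values).
tree = ([0.5, 0.3, 0.7], [0, 1, 1], [0, 1, 2, 3])
print(classify(*tree, [0.2, 0.4]))
print(forest_vote([tree, tree, tree], [0.9, 0.9]))
```

Because the tree is complete, the node visited at each level depends only on the comparison outcomes so far, and the number of memory reads per decision is fixed by the depth.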

International Solid-State Circuits Conference | 2018

A 42pJ/decision 3.12TOPS/W robust in-memory machine learning classifier with on-chip training

Sujan K. Gonugondla; Mingu Kang; Naresh R. Shanbhag

arXiv: Hardware Architecture | 2016

Reducing the Energy Cost of Inference via In-sensor Information Processing

Sai Zhang; Mingu Kang; Charbel Sakr; Naresh R. Shanbhag

arXiv: Hardware Architecture | 2016

A 481pJ/decision 3.4M decision/s Multifunctional Deep In-memory Inference Processor using Standard 6T SRAM Array

Mingu Kang; Sujan K. Gonugondla; Ameya Patil; Naresh R. Shanbhag

International Symposium on Information Theory | 2018

SRAM Bit-line Swings Optimization using Generalized Waterfilling

Yongjune Kim; Mingu Kang; Lav R. Varshney; Naresh R. Shanbhag

Collaboration

Mingu Kang's collaborations.

Top Co-Authors

Sean Eilert, Micron Technology

Ken Curewitz, Micron Technology

Yongjune Kim, University of Illinois at Urbana–Champaign

Jungwook Choi, IBM
