2019 IEEE International Electron Devices Meeting (IEDM) | 2019

Optimal Design Methods to Transform 3D NAND Flash into a High-Density, High-Bandwidth and Low-Power Nonvolatile Computing in Memory (nvCIM) Accelerator for Deep-Learning Neural Networks (DNN)


Abstract


We propose optimal design methods for 3D NAND Flash to achieve a high-density, high-bandwidth and low-power nvCIM. By suitably engineering the device, we can produce an ultra-low ON current of 2 nA (mean) in the saturation region rather than the subthreshold region, while the OFF leakage current is well below 1 pA. Such a low Ion and large ON/OFF ratio provide the bandwidth to sum more than 10,000 cells in parallel, offering high efficiency for DNN computing. The three-dimensional summation in 3D NAND also allows effective multi-bit weight resolution without resorting to complex analog memory design. For the first time, we observe the power of the central limit theorem in 3D NAND nvCIM, where the large number of summed cells averages out the noise and provides high multiply-accumulate (MAC) accuracy. The effects of non-ideal cell variations, noise and shifts are studied systematically. With adequate calibration techniques, the 3D NAND nvCIM can provide accuracy close to the software limit, with reasonable tolerance to various device errors. The 3D NAND nvCIM is a promising energy-efficient (TOPS/W ~ 40) edge computing solution for large neural networks (>100 Mb of weights).
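
The averaging effect attributed to the central limit theorem can be illustrated with a small Monte Carlo sketch. It is not the authors' model. It assumes that each ON cell contributes an independent, Gaussian-distributed current with the 2 nA mean quoted in the abstract and a hypothetical 0.5 nA standard deviation (the abstract gives no spread). The function name `relative_sum_noise` and all of its parameters are illustrative. Under these assumptions the relative noise of the summed current should shrink roughly as 1/sqrt(N).

```python
import random
import statistics


def relative_sum_noise(n_cells, mean_na=2.0, sigma_na=0.5, trials=200, seed=0):
    """Return std/mean of the summed cell current over repeated trials.

    mean_na  : mean ON current per cell in nA (2 nA is taken from the abstract)
    sigma_na : per-cell standard deviation in nA (assumed value, not from the paper)
    """
    rng = random.Random(seed)
    sums = []
    for _ in range(trials):
        # Sum the currents of n_cells independent cells (assumed Gaussian).
        total = sum(rng.gauss(mean_na, sigma_na) for _ in range(n_cells))
        sums.append(total)
    return statistics.stdev(sums) / statistics.mean(sums)


for n in (100, 1_000, 10_000):
    measured = relative_sum_noise(n)
    # Expected scaling for independent cells: (sigma / mean) / sqrt(N).
    predicted = (0.5 / 2.0) / n ** 0.5
    print(f"N={n:>6}: measured {measured:.4f}, predicted {predicted:.4f}")
```

With a 2 nA mean, summing 10,000 ON cells corresponds to a total current of about 20 µA. In this toy model the cell-to-cell spread on that sum is a fraction of a percent. Real devices may show correlated variations or shifts that do not average out this way.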

Pages 38.1.1-38.1.4
DOI 10.1109/IEDM19573.2019.8993652
Language English
Journal 2019 IEEE International Electron Devices Meeting (IEDM)
