IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2021

DiReCtX: Dynamic Resource-Aware CNN Reconfiguration Framework for Real-Time Mobile Applications


Abstract


Although convolutional neural networks (CNNs) have been widely applied in various cognitive applications, they remain very computationally intensive for resource-constrained mobile systems. To reduce the resource consumption of CNN computation, many optimization works have been proposed for mobile CNN deployment. However, most of these works target only CNN model compression in terms of parameter size or model structure, ignoring the different resource constraints of mobile systems with respect to memory, energy, and real-time requirements. Moreover, previous works take accuracy as their primary consideration and require a time-consuming retraining process to compensate for the inference accuracy loss after compression. To address these issues, we propose DiReCtX, a dynamic resource-aware CNN model reconfiguration framework. DiReCtX is based on a set of accurate CNN profiling models that estimate the different types of resource consumption and the inference accuracy. With manageable consumption/accuracy tradeoffs, DiReCtX can reconfigure a CNN model to meet distinct resource constraint types and levels while maintaining the expected inference performance. To further achieve fast model reconfiguration in real time, improved CNN model pruning and corresponding accuracy tuning strategies are also proposed in DiReCtX. Experiments show that the proposed CNN profiling models achieve 94.6% and 97.1% accuracy for CNN model resource consumption and inference accuracy estimation, respectively. Meanwhile, the proposed reconfiguration scheme of DiReCtX achieves up to 44.44% computation acceleration, 31.69% memory reduction, and 32.39% energy saving. In field tests with state-of-the-art smartphones, DiReCtX can adapt CNN models to various resource constraints in mobile application scenarios with optimal real-time performance.
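
The abstract describes reconfiguration as choosing a model variant that satisfies a given resource constraint while preserving accuracy. The Python sketch below illustrates that general idea only; it is not the authors' algorithm. The names `Candidate`, `Budget`, `fits`, and `select` are hypothetical, and the numbers in `CANDIDATES` are invented placeholders rather than results from the paper. It assumes a profiling step has already produced per-candidate estimates of latency, memory, energy, and accuracy.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One hypothetical pruned configuration with profiled estimates."""
    prune_ratio: float   # fraction of filters removed (illustrative)
    latency_ms: float    # estimated inference latency
    memory_mb: float     # estimated memory footprint
    energy_mj: float     # estimated energy per inference
    accuracy: float      # estimated inference accuracy in [0, 1]


@dataclass(frozen=True)
class Budget:
    """Resource limits; None means that resource is unconstrained."""
    latency_ms: Optional[float] = None
    memory_mb: Optional[float] = None
    energy_mj: Optional[float] = None


def fits(c: Candidate, b: Budget) -> bool:
    """Return True if every constrained resource is within its limit."""
    pairs = [
        (c.latency_ms, b.latency_ms),
        (c.memory_mb, b.memory_mb),
        (c.energy_mj, b.energy_mj),
    ]
    return all(limit is None or value <= limit for value, limit in pairs)


def select(candidates: Sequence[Candidate], budget: Budget) -> Optional[Candidate]:
    """Pick the feasible candidate with the highest estimated accuracy."""
    feasible = [c for c in candidates if fits(c, budget)]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Invented placeholder estimates, NOT measurements from the paper.
CANDIDATES = [
    Candidate(0.0, 100.0, 50.0, 200.0, 0.92),
    Candidate(0.2, 80.0, 42.0, 165.0, 0.91),
    Candidate(0.4, 60.0, 35.0, 140.0, 0.88),
]

if __name__ == "__main__":
    choice = select(CANDIDATES, Budget(latency_ms=85.0))
    print(choice)
```

With a latency limit of 85 ms, the unpruned candidate is infeasible, so the sketch falls back to the most accurate candidate that still fits. A real system would replace the fixed table with estimates from its own profiling models.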

Volume 40
Pages 246-259
DOI 10.1109/TCAD.2020.2995813
Language English
Journal IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
