Expert Syst. Appl. | 2021

Data-free knowledge distillation in neural networks for regression

 
 

Abstract


Knowledge distillation has been used successfully to compress a large neural network (teacher) into a smaller neural network (student) by transferring the knowledge of the teacher network learned from its original training dataset. However, the original training dataset is not reusable in many real-world applications. To address this issue, data-free knowledge distillation, i.e., knowledge distillation in the absence of the original training dataset, has been studied. However, existing methods are limited to classification problems and cannot be directly applied to regression problems. In this study, we propose a novel data-free knowledge distillation method that is applicable to regression problems. Given a teacher network, we adopt a generator network to transfer the knowledge in the teacher network to a student network. We train the generator and student networks simultaneously in an adversarial manner: the generator network is trained to create synthetic data on which the teacher and student networks make different predictions, while the student network is trained to mimic the teacher network’s predictions. We demonstrate the effectiveness of the proposed method on benchmark datasets. Our results show that the student network emulates the prediction ability of the teacher network with little performance loss.
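
The abstract describes an adversarial loop in which a generator produces synthetic inputs that maximize the teacher–student prediction gap while the student is trained to close that gap. The following is a minimal PyTorch-style sketch of such a loop under assumptions of ours: the network architectures, the L1 discrepancy loss, the optimizers, and all hyperparameters are illustrative placeholders, not the configuration used in the paper, and the randomly initialized `teacher` stands in for a pretrained, frozen teacher network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

INPUT_DIM, LATENT_DIM = 8, 16  # assumed dimensions for illustration

# Stand-in for a pretrained regression teacher; frozen during distillation.
teacher = nn.Sequential(nn.Linear(INPUT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
teacher.eval()

# Smaller student network and a generator that maps noise to synthetic inputs.
student = nn.Sequential(nn.Linear(INPUT_DIM, 16), nn.ReLU(), nn.Linear(16, 1))
generator = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, INPUT_DIM))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(1000):
    z = torch.randn(128, LATENT_DIM)  # noise fed to the generator

    # Generator step: create synthetic data on which teacher and student disagree
    # (maximize the discrepancy, i.e. minimize its negative).
    x_fake = generator(z)
    with torch.no_grad():
        t_pred = teacher(x_fake)
    g_loss = -F.l1_loss(student(x_fake), t_pred)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    # Student step: mimic the teacher's predictions on the generated data
    # (minimize the discrepancy). Stale student gradients from the generator
    # step are cleared by zero_grad() before this update.
    x_fake = generator(z).detach()
    with torch.no_grad():
        t_pred = teacher(x_fake)
    s_loss = F.l1_loss(student(x_fake), t_pred)
    opt_s.zero_grad()
    s_loss.backward()
    opt_s.step()
```

In this sketch the two updates alternate within each iteration, so the generator continually searches for inputs where the student still deviates from the teacher, and the student is then corrected on exactly those inputs; the choice of discrepancy measure and update schedule here is an assumption rather than the paper's prescription.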

Volume 175
Pages 114813
DOI 10.1016/j.eswa.2021.114813
Language English
Journal Expert Syst. Appl.
