Peer-to-Peer Networking and Applications | 2019

Naive semi-supervised deep learning using pseudo-label

 
 
 

Abstract


To facilitate the use of large-scale unlabeled data, we propose a simple and effective method for semi-supervised deep learning that improves the performance of a deep learning model. First, we train a classifier and use its outputs on the unlabeled data as pseudo-labels. Then, we pre-train the deep learning model with the pseudo-labeled data and fine-tune it with the labeled data. We call the repetition of pseudo-labeling, pre-training, and fine-tuning naive semi-supervised deep learning. We apply this method to the MNIST, CIFAR-10, and IMDB data sets, each of which we divide into a small labeled set and a large unlabeled set. Our method achieves significant performance improvements over the deep learning model trained without pre-training. We further analyze the factors that affect our method to provide a better understanding of how to use naive semi-supervised deep learning in practical applications.
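
The loop described in the abstract (pseudo-label, pre-train, fine-tune, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny logistic regression written in plain Python stands in for the deep learning model, the data are synthetic two-dimensional blobs rather than MNIST, CIFAR-10, or IMDB, and all names (`sgd`, `naive_ssl`, `make_blobs`, `accuracy`) and hyperparameters (`rounds`, `epochs`, `lr`) are hypothetical. It also assumes that the initial classifier is trained on the labeled set and that the model is re-initialized before each pre-training pass; the abstract does not specify either.

```python
import math
import random


def predict_proba(w, x):
    """Probability of class 1 under a logistic model; w[-1] is the bias."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))


def sgd(w, data, epochs=20, lr=0.1):
    """Plain SGD on the logistic loss, starting from the given weights."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            grad = predict_proba(w, x) - y  # d(loss)/d(score)
            for i, xi in enumerate(x):
                w[i] -= lr * grad * xi
            w[-1] -= lr * grad
    return w


def naive_ssl(labeled, unlabeled, rounds=3, dim=2):
    """Repeat pseudo-labeling, pre-training, and fine-tuning (assumed setup)."""
    # Assumption: the first classifier is trained on the labeled data only.
    w = sgd([0.0] * (dim + 1), labeled)
    for _ in range(rounds):
        # 1. Pseudo-label the unlabeled inputs with the current classifier.
        pseudo = [(x, 1 if predict_proba(w, x) >= 0.5 else 0) for x in unlabeled]
        # 2. Pre-train a freshly initialized model on the pseudo-labeled data.
        w_pre = sgd([0.0] * (dim + 1), pseudo)
        # 3. Fine-tune the pre-trained model on the true labels.
        w = sgd(w_pre, labeled)
    return w


def make_blobs(n, rng):
    """Toy binary data: class 0 around (-2, -2), class 1 around (2, 2)."""
    data = []
    for i in range(n):
        y = i % 2
        center = 2.0 if y else -2.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], y))
    return data


def accuracy(w, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum((predict_proba(w, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
labeled = make_blobs(10, rng)                      # small labeled set
unlabeled = [x for x, _ in make_blobs(200, rng)]   # large unlabeled set
test_set = make_blobs(200, rng)
weights = naive_ssl(labeled, unlabeled)
print("test accuracy:", accuracy(weights, test_set))
```

In this sketch the unlabeled pool influences the final model only through the weights used to initialize fine-tuning, which mirrors the pre-train-then-fine-tune ordering in the abstract. A real deep model would replace `sgd` with its own training routine.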

Pages 1-11
DOI 10.1007/S12083-018-0702-9
Language English
Journal Peer-to-Peer Networking and Applications

Full Text