Information Systems Frontiers | 2021

Multilingual Interoperation in Cross-Country Industry 4.0 System for One Belt and One Road

Abstract


For multilingual interoperation in cross-country industrial systems, character recognition can greatly facilitate the automatic integration of information from an enormous number of forms, but it has not yet been well resolved as a research issue. Character recognition with deep convolutional neural networks depends on large-scale training data collection and labor-intensive labeling to train an effective model. Synthetic data generation and data augmentation are the typical means of compensating for the scarcity of labeled training data. However, the domain shift between synthetic and real data inevitably results in unsatisfactory recognition accuracy, which poses a significant challenge. To alleviate this issue, a recognition system with enhanced two-phase transfer learning is proposed to utilize the unlabeled real data in existing industrial forms. In the framework, massive training data are generated automatically from a configurable font and character library. A proposed convolutional neural network suited to character recognition is pre-trained on the generated training data to serve as the source model. In the first transfer phase, the source model is adapted into the target model in an unsupervised manner, using real samples of a specific writing style. In the second, supervised transfer phase, the target model is further optimized with the few labels available. A recognition application based on the target model is then described. The effectiveness of the proposed enhanced two-phase model transfer method is validated through systematic experiments that use a public dataset as the target domain data. Furthermore, a comparison with related work is provided to show the transferability and efficiency of the proposed framework.
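
The two-phase transfer flow summarized in the abstract can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: a nearest-centroid classifier over 2-D points stands in for the convolutional network, a constant feature offset stands in for the synthetic-to-real domain shift, pseudo-labeling stands in for the unsupervised adaptation step (the abstract does not say which technique the paper uses), and all names (`CENTERS`, `make_samples`, `adapt_unsupervised`, `fine_tune`) are hypothetical.

```python
import random

# Hypothetical 2-D "feature" centers for three character classes.
CENTERS = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (8.0, 0.0)}

def make_samples(shift, n_per_class, rng, noise=0.3):
    """Draw noisy feature points per class; `shift` models the domain gap."""
    samples = []
    for label, (cx, cy) in CENTERS.items():
        for _ in range(n_per_class):
            point = (cx + shift + rng.gauss(0, noise), cy + rng.gauss(0, noise))
            samples.append((point, label))
    return samples

def fit_centroids(samples):
    """Return the mean feature point of each label."""
    grouped = {}
    for point, label in samples:
        grouped.setdefault(label, []).append(point)
    return {
        label: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for label, pts in grouped.items()
    }

def predict(model, point):
    """Assign the label of the nearest centroid."""
    return min(
        model,
        key=lambda k: (model[k][0] - point[0]) ** 2 + (model[k][1] - point[1]) ** 2,
    )

def accuracy(model, samples):
    """Fraction of labeled samples the model classifies correctly."""
    return sum(predict(model, p) == y for p, y in samples) / len(samples)

def adapt_unsupervised(model, unlabeled_points, rounds=5):
    """Phase 1 (assumed technique): refit centroids on the model's own pseudo-labels."""
    for _ in range(rounds):
        pseudo = [(p, predict(model, p)) for p in unlabeled_points]
        # Classes that received no pseudo-labels keep their previous centroid.
        model = {**model, **fit_centroids(pseudo)}
    return model

def fine_tune(model, labeled, weight=0.5):
    """Phase 2: blend each centroid with the mean of the few labeled samples."""
    few = fit_centroids(labeled)
    tuned = dict(model)
    for label, (fx, fy) in few.items():
        mx, my = model[label]
        tuned[label] = ((1 - weight) * mx + weight * fx, (1 - weight) * my + weight * fy)
    return tuned

rng = random.Random(0)
synthetic = make_samples(shift=0.0, n_per_class=200, rng=rng)      # generated source data
real_unlabeled = [p for p, _ in make_samples(shift=1.8, n_per_class=200, rng=rng)]
real_few_labels = make_samples(shift=1.8, n_per_class=5, rng=rng)  # the few labels available
real_test = make_samples(shift=1.8, n_per_class=200, rng=rng)      # held-out target data

source_model = fit_centroids(synthetic)                            # "pre-training"
target_model = adapt_unsupervised(source_model, real_unlabeled)    # first transfer phase
final_model = fine_tune(target_model, real_few_labels)             # second transfer phase

acc_source = accuracy(source_model, real_test)
acc_adapted = accuracy(target_model, real_test)
acc_final = accuracy(final_model, real_test)
print(acc_source, acc_adapted, acc_final)
```

In this toy setting the source model loses accuracy on the shifted data because the offset pushes samples across its decision boundaries, and refitting on unlabeled target samples moves the boundaries back. The sketch shows only the order of the stages; it says nothing about the accuracy figures reported in the paper.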

DOI 10.1007/S10796-021-10159-Z
Language English
Journal Information Systems Frontiers

Full Text