2021 The 5th International Conference on Compute and Data Analysis | 2021

WordErrorSim: An Adversarial Examples Generation Method in Chinese by Erroneous Knowledge

Abstract

Deep learning is widely used in many kinds of network applications, but errors in text challenge model performance. To study the impact of erroneous knowledge on Chinese text classification models, we propose a word-level black-box adversarial example generation method called WordErrorSim. The algorithm uses SIGHAN Bake-off 2013 and ZDIC to construct an erroneous knowledge space. Adversarial examples are generated by replacing a character with a similar-sounding character, a similar-looking character, or another commonly confused character. The adversarial examples achieve the attack without changing the semantics or grammar of the original sentence. In addition, we design and implement the Chinese Pronunciation-Shape Similarity (CPSS) algorithm to measure the change between an adversarial example and the original text. We verify the effectiveness of the proposed method on a real dataset with Bi-LSTM and TextCNN models. The experimental results show that our method can significantly reduce the classification accuracy of the models.
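The substitution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how substitution positions or candidates are selected, so the sketch simply tries every single-character swap in order. `CONFUSION_SET`, `candidates`, `attack`, and `toy_predict` are hypothetical names, the three confusion entries are illustrative homophone pairs rather than data taken from SIGHAN or ZDIC, and the classifier is a placeholder standing in for a real black-box model.

```python
# Hypothetical confusion set: a toy stand-in for the erroneous knowledge space.
# These homophone pairs are illustrative only, not drawn from SIGHAN or ZDIC.
CONFUSION_SET = {
    "在": ["再"],
    "做": ["作"],
    "的": ["地", "得"],
}


def candidates(text):
    """Yield every string that differs from `text` by one confusable character."""
    for i, ch in enumerate(text):
        for sub in CONFUSION_SET.get(ch, []):
            yield text[:i] + sub + text[i + 1:]


def attack(text, predict):
    """Return the first one-character variant whose label differs, or None.

    `predict` is treated as a black box: only its output label is observed.
    """
    original_label = predict(text)
    for variant in candidates(text):
        if predict(variant) != original_label:
            return variant
    return None


def toy_predict(text):
    # Placeholder classifier keyed on a single character; not a real model.
    return "positive" if "做" in text else "negative"


adversarial = attack("我在做饭", toy_predict)
```

In this toy setup, the swap of 在 for 再 leaves the placeholder label unchanged, so the search moves on to the swap of 做 for 作, which flips it. A real attack would query a trained model and would also need a similarity measure such as CPSS to bound how far the adversarial text drifts from the original.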


DOI 10.1145/3456529.3456556

Language English

Journal 2021 The 5th International Conference on Compute and Data Analysis
