2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA) | 2021

Improving Named Entity Recognition of Chinese Legal Documents by Lexical Enhancement

 
 

Abstract


This paper studies named entity recognition (NER) in Chinese legal texts. NER plays a critical role in many natural language processing tasks and has been studied for many years. Unlike general-domain texts, Chinese legal texts have their own particularities: 1) they contain many professional terms; 2) entity names often contain abbreviations and pronouns; 3) nested combinations of words produce excessively long entity names. Moreover, no publicly available dataset with named entity annotations exists for the judicial field, which limits the development of judicial NER. This paper discusses a simpler and more effective method for introducing lexical information into a character-based NER system: fine-tuning the character representation layer and introducing explicit lexical boundary information. The method avoids designing a complex sequence model structure and is portable to any neural network model. Experimental results show that the method achieves an F-score of 95.35 on our annotated judicial dataset and also performs well on the annotated People’s Daily corpus of January 1998.
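
To make the idea of explicit lexical boundary information concrete, the following is a minimal sketch of one way such features could be attached to character representations. The abstract does not say how boundaries are obtained or encoded, so everything here is an assumption for illustration: the forward maximum matching segmenter, the B/M/E/S tag scheme, the one-hot encoding, and the names `max_match_segment`, `boundary_tags`, and `augment` are hypothetical and not taken from the paper.

```python
from typing import List, Set

# Assumed tag scheme: Begin, Middle, End of a lexicon word, or Single character.
TAG_INDEX = {"B": 0, "M": 1, "E": 2, "S": 3}


def max_match_segment(text: str, lexicon: Set[str], max_len: int = 6) -> List[str]:
    """Forward maximum matching: take the longest lexicon word at each position."""
    words: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character always matches.
        for j in range(min(len(text), i + max_len), i, -1):
            if j - i == 1 or text[i:j] in lexicon:
                words.append(text[i:j])
                i = j
                break
    return words


def boundary_tags(words: List[str]) -> List[str]:
    """Emit one boundary tag per character from a word segmentation."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def one_hot(tag: str) -> List[float]:
    """Encode a boundary tag as a 4-dimensional one-hot vector."""
    vec = [0.0] * len(TAG_INDEX)
    vec[TAG_INDEX[tag]] = 1.0
    return vec


def augment(char_vectors: List[List[float]], tags: List[str]) -> List[List[float]]:
    """Concatenate each character vector with its boundary one-hot vector."""
    return [vec + one_hot(tag) for vec, tag in zip(char_vectors, tags)]


if __name__ == "__main__":
    lexicon = {"被告人", "张三", "盗窃罪"}  # toy lexicon, illustrative only
    words = max_match_segment("被告人张三犯盗窃罪", lexicon)
    print(words)
    print(boundary_tags(words))
```

Because the boundary features are simply appended to each character vector, the downstream sequence model is unchanged, which is in the spirit of the portability claim above; in a real system the concatenated vectors would feed whatever tagger is already in use.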

Pages 999-1004
DOI 10.1109/ICAICA52286.2021.9498036
Language English
Journal 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA)
