IEEE Access | 2019

A Set Space Model to Capture Structural Information of a Sentence


Abstract


The context of a sentence is composed of a limited number of words. This leads to the feature sparsity problem, whereby the sentence’s meaning is easily influenced by language phenomena such as polysemy, ambiguity and puns. To resolve these problems, the set space model (SSM) uses language characteristics to group the features of a sentence into different sets. The proposed feature calculus is then used to capture the structural information of the sentence. Experiments have shown that this approach is effective for the relation recognition task. However, at least three weaknesses remain. First, because SSM lacks a probabilistic explanation, several of its aspects (e.g., filter selection) have not yet been covered. Second, existing studies have provided only an outline of SSM, and many issues remain unclear; understanding the approach requires a detailed discussion of a suitable example. Third, SSM has been applied only to relation recognition. Case studies on more typical tasks (e.g., named entity recognition) would help illustrate how SSM’s methodology manipulates features. This paper extends SSM to address these weaknesses and describes a systematic and novel approach to manipulating the features of a sentence. In the experiments, two typical information extraction tasks are used as case studies to demonstrate SSM’s capabilities. Favorable improvements are observed in both, and all of the obtained results surpass those of the compared approaches. The experiments also show the influence of sentence structural information on information extraction.

Volume 7
Pages 142515-142530
DOI 10.1109/ACCESS.2019.2944559
Language English
Journal IEEE Access
