ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) | 2019

Tempo-HindiWordNet


Abstract


Temporality has contributed significantly to various Natural Language Processing and Information Retrieval applications. In this article, we first create a lexical knowledge base in Hindi by identifying the temporal orientation of word senses from their definitions, and then use this resource to detect the underlying temporal orientation of sentences. To create the resource, we propose a semi-supervised learning framework in which each synset of the Hindi WordNet is classified into one of five categories: past, present, future, neutral, and atemporal. The algorithm starts learning from a set of seed synsets and then iterates using different expansion strategies, namely probabilistic expansion based on the classifier’s confidence and expansion based on semantic distance measures. We demonstrate the usefulness of the resulting resource on an external task, sentence-level temporal classification. The underlying idea is that a temporal knowledge base can help classify sentences according to their inherent temporal properties. Experiments on two different domains, general and Twitter, show interesting results.
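The confidence-based expansion described above can be illustrated with a minimal self-training sketch. This is not the article's implementation: the Naive Bayes scorer, the English toy glosses, the function names (`train`, `predict`, `self_train`), and the threshold of 0.6 are all assumptions made for illustration, and the semantic-distance expansion strategy is not shown.

```python
import math
from collections import Counter, defaultdict


def train(labeled):
    """Collect per-label token counts from (gloss, label) pairs."""
    counts = defaultdict(Counter)
    for gloss, label in labeled:
        counts[label].update(gloss.lower().split())
    return counts


def predict(counts, gloss):
    """Return (label, confidence) from add-one smoothed Naive Bayes.

    Class priors are taken as uniform (an assumption of this sketch).
    """
    vocab = {tok for c in counts.values() for tok in c}
    log_scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        log_scores[label] = sum(
            math.log((c[tok] + 1) / (total + len(vocab) + 1))
            for tok in gloss.lower().split()
        )
    # Normalise the log scores into a probability distribution.
    peak = max(log_scores.values())
    probs = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(probs.values())
    best = max(probs, key=probs.get)
    return best, probs[best] / norm


def self_train(seeds, unlabeled, threshold=0.6, max_iters=5):
    """Grow the labeled set with confidently classified glosses."""
    labeled, pool = list(seeds), list(unlabeled)
    for _ in range(max_iters):
        counts = train(labeled)
        remaining, added = [], 0
        for gloss in pool:
            label, conf = predict(counts, gloss)
            if conf >= threshold:
                labeled.append((gloss, label))  # accept confident guess
                added += 1
            else:
                remaining.append(gloss)  # retry in a later iteration
        pool = remaining
        if added == 0:
            break  # nothing new was learned, so stop iterating
    return labeled, pool


# Hypothetical toy data; real input would be Hindi WordNet glosses.
seeds = [
    ("happened earlier yesterday", "past"),
    ("happening now today", "present"),
    ("will happen tomorrow", "future"),
]
unlabeled = ["yesterday earlier", "tomorrow will", "table chair"]
labeled, pool = self_train(seeds, unlabeled)
```

Glosses that never reach the threshold stay in `pool`, which is where a second strategy such as the semantic-distance expansion mentioned in the abstract could take over.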

Volume 18
Pages 1 - 22
DOI 10.1145/3277504
Language English
Journal ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
