2021 6th International Conference on Computer Science and Engineering (UBMK) | 2021

Mevzuat Verisetinde Soru Cevaplama Uygulaması / Question Answering Application on Legalisation Dataset

 
 
 
 

Abstract


Question answering is a widely studied subfield of natural language processing (NLP). It studies information retrieval techniques that locate the answer to a given query in a corpus. Deep learning techniques have recently been widely employed in this field. This work applies transfer learning to Turkish tax legislation documents. Experts in the tax law domain created 355 question-answer pairs in SQuAD 1.1 (Stanford Question Answering Dataset) format using law documents in UYAP (National Judiciary Informatics System). BERT (Bidirectional Encoder Representations from Transformers) contextual word embedding vectors are used to create representations that can capture the different meanings a word takes in context. Using both these embeddings and the model obtained from the SQuAD 1.1 dataset, a system was deployed. In addition, the failing answers retrieved by this model were used to create a SQuAD 2.0 dataset that includes impossible-to-answer questions. New models were obtained by training on this dataset. We observe that the most successful SQuAD 2.0 model outperforms the most successful SQuAD 1.1 model by 11% in exact match and by 5% in F1.
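To make the data format and the two reported measures concrete, the following is a minimal sketch, not the authors' code. It builds one SQuAD 2.0 style record (with the `is_impossible` flag that distinguishes 2.0 from 1.1) and computes exact match and token-level F1 in the spirit of the standard SQuAD evaluation. The context, questions, and answers are invented placeholders rather than text from the paper's dataset, and the normalization is an assumption: it lowercases and strips ASCII punctuation but omits the English article removal of the official script, and it does not handle Turkish-specific casing of dotted and dotless i.

```python
import string
from collections import Counter

# Hypothetical record in SQuAD 2.0 layout; all text is an invented placeholder.
context = "The filing deadline is the last day of March."
answer = "the last day of March"
record = {
    "context": context,
    "qas": [
        {"id": "q1", "question": "When is the filing deadline?",
         "answers": [{"text": answer, "answer_start": context.index(answer)}],
         "is_impossible": False},
        # SQuAD 2.0 adds questions that the context cannot answer.
        {"id": "q2", "question": "Who signs the filing?",
         "answers": [], "is_impossible": True},
    ],
}

def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (assumed scheme)."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Unanswerable case: full credit only when both sides are empty.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    gold = record["qas"][0]["answers"][0]["text"]
    print(exact_match("The last day of March.", gold))
    print(f1_score("last day of March", gold))
```

For an impossible question the gold answer is the empty string, so a model is credited only when it also abstains; this is why training on SQuAD 2.0 style data can change both measures relative to a SQuAD 1.1 model.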

Pages 603-607
DOI 10.1109/UBMK52708.2021.9558981
Language English
Journal 2021 6th International Conference on Computer Science and Engineering (UBMK)

Full Text