2021 29th Conference of Open Innovations Association (FRUCT) | 2021

Towards a Toolbox for Mining QA-pairs and QAT-triplets from Conversational Data of Public Chats

 
 
 

Abstract


Communities of specialists use public groups on platforms such as Telegram or Slack to discuss domain-specific topics, for instance the Python programming language, the ClickHouse database management system, or even the peculiarities of filmmaking. Conversations in such chats often take the form of questions and answers: someone is looking for information, and someone else provides answers. Both sides work together to create a community-driven knowledge source. In a group chat, several parallel discussions on different topics can be held simultaneously, which mixes up messages from different dialogues and makes it difficult to extract individual dialogues from the chat. In this paper, we consider the problem of data preprocessing for the automatic formation of QA-pairs and QAT-triplets from dialogues of group QA-chats, which may be used to exploit the information and knowledge stored in conversations. To deal with this problem, we formulate a set of related tasks and consider approaches to solving them. In particular, we highlight two of the most important tasks: identifying the start of new discussions and assigning mixed-up messages to the dialogues they belong to. We perform a comparative study of the proposed methods on three large Telegram groups representing user communities that vary in communication style and patterns, moderation, and topics. The experiments allow us to highlight the influence of these variations on the performance of the proposed methods and to identify the best alternatives among them.

Pages 94-101
DOI 10.23919/FRUCT52173.2021.9435511
Language English
Journal 2021 29th Conference of Open Innovations Association (FRUCT)
