Proceedings of the 2021 Conference on Human Information Interaction and Retrieval | 2021

Classifying Speech Acts using Multi-channel Deep Attention Network for Task-oriented Conversational Search Agents


Abstract


Understanding human spoken dialogues in an information-seeking scenario is a significant challenge for IR researchers. Prior literature in intelligent systems suggests that the speech acts in a spoken dialogue reveal the user's search intent and information needs. Therefore, in this paper, we use speech acts to address the problem of natural language understanding in conversational search systems. First, we collected human-system interaction data through a Wizard-of-Oz study. Next, we developed a gold-standard dataset in which the human-system conversations are labeled with their corresponding speech acts. Finally, we built the Multi-channel Deep Attention Network (MDAN) to identify the speech acts in information-seeking dialogues. The results show that the best-performing model predicts speech acts with 90.2% accuracy. The MDAN architecture outperforms not only all traditional machine learning baselines but also the state-of-the-art single-channel BERT, by 3.3 absolute points. We performed an ablation analysis to measure the impact of the three channels of MDAN individually and in combination; the results indicate that the best performance is achieved when all three channels are used for speech act prediction.
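The abstract does not specify how the three channels are combined, so the following is only a minimal sketch of one plausible reading: each channel produces a fixed-size utterance vector, channel-level attention weights the vectors, and a linear softmax layer predicts the speech act. All names, dimensions, and the fusion scheme here are hypothetical, not the paper's published implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class MultiChannelAttentionClassifier:
    """Toy multi-channel fusion (hypothetical): attention over per-channel
    utterance vectors, then a linear layer over speech-act labels."""

    def __init__(self, dim, n_acts, seed=0):
        rng = np.random.default_rng(seed)
        self.att = rng.standard_normal(dim) * 0.1          # channel attention scorer
        self.W = rng.standard_normal((dim, n_acts)) * 0.1  # classifier weights
        self.b = np.zeros(n_acts)                          # classifier bias

    def forward(self, channels):
        # channels: (n_channels, dim), one vector per input channel.
        scores = channels @ self.att        # (n_channels,) attention scores
        alpha = softmax(scores)             # attention weights over channels
        fused = alpha @ channels            # (dim,) attention-weighted fusion
        return softmax(fused @ self.W + self.b)  # speech-act distribution

# Three hypothetical channels (e.g. lexical, contextual, prosodic), 8-dim each.
model = MultiChannelAttentionClassifier(dim=8, n_acts=5)
channels = np.random.default_rng(1).standard_normal((3, 8))
probs = model.forward(channels)
```

The sketch only illustrates the shape of attention-based fusion; a trained model would learn `att`, `W`, and `b` from the labeled dialogue data.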

DOI 10.1145/3406522.3446057
