2021 IEEE International Systems Conference (SysCon) | 2021

diaLogic: Interaction-Focused Speaker Diarization

 
 

Abstract


diaLogic is a user-friendly Python program that classifies social interactions through speaker diarization. Its main dependencies are Python’s PyQt5 and Keras APIs, Matplotlib, and the R statistical computing language. Speaker diarization is achieved with high consistency using a simple four-layer convolutional neural network (CNN) trained on the LibriSpeech ASR corpus. Speaker interactions are modeled by a custom R script. The data generated by the program allow speaker traits to be characterized within social experiments: group leaders, followers, and each speaker's level of contribution. These traits can be used to assess the performance of the group as a whole as well as that of individual members. The interface is designed to be simple and intuitive so that non-engineers, in particular users in the social sciences, can operate the program with minimal training. The backend is modular and hidden from the user, which allows the program to be extended easily with new algorithms. In future iterations, the collection of speaker interaction data will be fully automated through machine learning and/or logical constructs. Integrating voice-based emotion recognition is the next planned phase. Overall, diaLogic serves as a central workspace for social interaction characterization.
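To make the notions of speaker contribution and speaker interaction more concrete, the following is a minimal Python sketch of how such quantities could be derived from diarization output. It is not the authors' implementation (the abstract states that interactions are modeled in a custom R script). The segment format, the sample data, and the function names `contribution` and `transitions` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical diarization output: (speaker label, start time in s, end time in s).
# Labels and timings are invented for this example.
segments = [
    ("A", 0.0, 4.0),
    ("B", 4.0, 6.0),
    ("A", 6.0, 9.0),
    ("C", 9.0, 10.0),
    ("A", 10.0, 12.0),
]


def contribution(segs):
    """Return each speaker's share of the total speaking time."""
    totals = Counter()
    for speaker, start, end in segs:
        totals[speaker] += end - start
    overall = sum(totals.values())
    return {speaker: t / overall for speaker, t in totals.items()}


def transitions(segs):
    """Count how often one speaker is directly followed by a different one."""
    counts = Counter()
    for (prev, _, _), (nxt, _, _) in zip(segs, segs[1:]):
        if prev != nxt:
            counts[(prev, nxt)] += 1
    return counts


shares = contribution(segments)
turns = transitions(segments)
```

Under these assumptions, a speaker with a large share of speaking time who is also the target of many turn transitions might be a candidate group leader. How diaLogic actually defines leaders and followers is not specified in the abstract.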

Pages 1-8
DOI 10.1109/SysCon48628.2021.9447101
Language English
Journal 2021 IEEE International Systems Conference (SysCon)
