Maxim Sidorov
University of Ulm
Publication
Featured research published by Maxim Sidorov.
Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge | 2014
Maxim Sidorov; Wolfgang Minker
A system capable of recognizing human emotions has an enormous number of potential applications, e.g., improving Spoken Dialogue Systems (SDSs) or monitoring agents in call centres. Depression is another aspect of human state that is closely related to emotion, and a system that can automatically assess a patient's depression can support physicians' decisions and help avoid critical mistakes. The Affect and Depression Recognition Sub-Challenges (ASC and DSC, respectively) of the second combined open Audio/Visual Emotion and Depression recognition Challenge (AVEC 2014) therefore focus on estimating emotion and depression. This study presents the results of multimodal affect and depression recognition based on four different segmentation methods, using support vector regression. Furthermore, a speaker identification procedure has been introduced in order to build speaker-specific emotion/depression recognition systems.
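A minimal sketch of the segment-level regression idea described above: split a recording into segments, train a support vector regressor on per-segment features, and combine the per-segment predictions into one score. The feature extraction, window length, and aggregation rule below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def segment_features(signal: np.ndarray, sr: int, win_s: float = 3.0) -> np.ndarray:
    """Split a recording into fixed-length segments and compute toy features
    (mean, std, energy) per segment; real systems use large acoustic feature sets."""
    win = int(win_s * sr)
    segments = [signal[i:i + win] for i in range(0, len(signal) - win + 1, win)]
    return np.array([[s.mean(), s.std(), np.mean(s ** 2)] for s in segments])

# Synthetic training data: segment-level features with a continuous target score.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 3)), rng.uniform(0, 45, size=200)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

# Final estimate for a new recording: average the per-segment predictions.
new_signal = rng.normal(size=16000 * 30)          # 30 s of placeholder audio at 16 kHz
X_new = segment_features(new_signal, sr=16000)
print("Predicted score:", model.predict(X_new).mean())
```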
soft computing | 2016
Christina Brester; Eugene Semenkin; Maxim Sidorov
If conventional feature selection methods do not show sufficient effectiveness, alternative algorithmic schemes might be used. In this paper we propose an evolutionary feature selection technique based on a two-criterion optimization model. To diminish the drawbacks of genetic algorithms, which are applied as optimizers, we design a parallel multicriteria heuristic procedure based on an island model. The performance of the proposed approach was investigated on the speech-based emotion recognition problem, one of the essential problems in human-machine communication. Corpora in several languages (German, English and Japanese) were involved in the experiments. According to the results obtained, a high level of emotion recognition was achieved (up to a 12.97% relative improvement compared with the best F-score value on the full set of attributes).
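The following is a heavily simplified, single-population sketch of the two-criterion idea (maximise F-score, minimise the number of selected features); the paper uses a parallel island-model multi-objective genetic algorithm, which is omitted here for brevity, and the data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X, y = make_classification(n_samples=300, n_features=40, n_informative=8, random_state=1)

def objectives(mask: np.ndarray) -> tuple[float, int]:
    """Return (cross-validated macro F-score, number of selected features)."""
    if mask.sum() == 0:
        return 0.0, 0
    f1 = cross_val_score(SVC(), X[:, mask.astype(bool)], y, cv=3, scoring="f1_macro").mean()
    return f1, int(mask.sum())

def dominates(a, b):
    """True if solution a is at least as good as b on both criteria and better on one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

# Tiny evolutionary loop: mutate a random parent, then keep the non-dominated set.
population = [rng.integers(0, 2, X.shape[1]) for _ in range(10)]
for _ in range(20):
    child = population[rng.integers(len(population))].copy()
    child[rng.integers(X.shape[1])] ^= 1          # flip one feature on/off
    population.append(child)
    scores = [objectives(m) for m in population]
    population = [m for m, s in zip(population, scores)
                  if not any(dominates(t, s) for t in scores if t != s)]

for m in population:
    f1, k = objectives(m)
    print(f"{k:2d} features -> F-score {f1:.3f}")
```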
international conference on acoustics, speech, and signal processing | 2014
Maxim Sidorov; Stefan Ultes; Alexander Schmitt
In this paper, we present novel work on speech-based adaptive emotion recognition through the addition of speaker-specific information. We propose a two-stage approach of first determining the speaker and then using this information during the emotion recognition process. The proposed technique has been evaluated on five emotional speech databases in different languages, using both an artificial neural network-based speaker identifier and ground-truth speaker labels. The addition of speaker-specific information improves emotion recognition accuracy by up to 10.2%. Moreover, emotion recognition performance scores improve for all evaluated databases.
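A hedged sketch of the two-stage idea: identify the speaker first, then feed the speaker hypothesis as an extra input to the emotion classifier. The MLP architecture, features, and one-hot encoding below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n, d, n_speakers, n_emotions = 400, 20, 5, 4
X = rng.normal(size=(n, d))                       # acoustic features (placeholder)
speakers = rng.integers(0, n_speakers, size=n)
emotions = rng.integers(0, n_emotions, size=n)

# Stage 1: speaker identification from the acoustic features.
speaker_clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
speaker_clf.fit(X, speakers)

# Stage 2: emotion recognition, with the predicted speaker identity appended as a
# one-hot feature so the model can adapt to speaker-specific patterns.
speaker_onehot = np.eye(n_speakers)[speaker_clf.predict(X)]
emotion_clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
emotion_clf.fit(np.hstack([X, speaker_onehot]), emotions)

# At test time, the speaker hypothesis from stage 1 feeds stage 2.
x_new = rng.normal(size=(1, d))
s_hat = np.eye(n_speakers)[speaker_clf.predict(x_new)]
print("Predicted emotion:", emotion_clf.predict(np.hstack([x_new, s_hat])))
```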
international conference on informatics in control automation and robotics | 2014
Christina Brester; Maxim Sidorov; Eugene Semenkin
In this paper, the efficiency of feature selection techniques based on evolutionary multi-objective optimization algorithms is investigated on a set of speech-based emotion recognition problems (English and German). The benefits of the developed algorithmic schemes are demonstrated in comparison with Principal Component Analysis on the involved databases. The presented approaches not only reduce the number of features used by a classifier but also improve its performance. According to the obtained results, the proposed techniques increase emotion recognition accuracy by up to a 29.37% relative improvement and reduce the number of features from 384 to 64.8 for some of the corpora.
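For context, a minimal sketch of the Principal Component Analysis baseline that the evolutionary selectors are compared against: project a 384-dimensional feature set onto fewer components and measure classification accuracy. The data, classifier, and dimensionalities below are placeholders, not the paper's corpora or results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=384, n_informative=30, random_state=3)

for n_components in (384, 128, 65):
    # Full feature set versus PCA-reduced representations.
    model = SVC() if n_components == 384 else make_pipeline(PCA(n_components=n_components), SVC())
    acc = cross_val_score(model, X, y, cv=3).mean()
    print(f"{n_components:3d} dimensions -> accuracy {acc:.3f}")
```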
Proceedings of the 2014 Workshop on Mapping Personality Traits Challenge and Workshop | 2014
Maxim Sidorov; Stefan Ultes; Alexander Schmitt
A system capable of recognizing personality traits may be utilized in an enormous number of applications. Adding personality dependency may be useful for building speaker-adaptive models, e.g., to improve Spoken Dialogue Systems (SDSs) or to monitor agents in call centres. Therefore, the First Audio/Visual Mapping Personality Traits Challenge (MAPTRAITS 2014) focuses on estimating personality traits. In this context, this study presents results for multimodal recognition of personality traits using support vector machines. As only small portions of the data are used for personality estimation at a time (and later combined into a final estimate), different segmentation methods, and ways of deriving the final hypothesis, are analyzed, treating the task as both a regression and a classification problem.
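An illustrative sketch of the segment-and-combine scheme for trait estimation with support vector machines, treated both as regression (continuous trait score) and classification (discretised trait level). The data, trait scale, and aggregation rules are assumptions made for the example.

```python
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(4)
n_segments, d = 300, 15
X = rng.normal(size=(n_segments, d))              # one feature vector per segment
trait_score = rng.uniform(1, 7, size=n_segments)  # e.g. a trait rated on a 1-7 scale
trait_class = (trait_score > 4).astype(int)       # discretised "low"/"high" label

reg = SVR().fit(X, trait_score)
clf = SVC().fit(X, trait_class)

# A new clip yields several segments; combine segment predictions into a final
# hypothesis: mean for regression, majority vote for classification.
X_clip = rng.normal(size=(10, d))
print("Regression estimate:", reg.predict(X_clip).mean())
print("Classification estimate:", np.bincount(clf.predict(X_clip)).argmax())
```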
international conference on informatics in control automation and robotics | 2016
Anastasiia Spirina; Maxim Sidorov; Roman B. Sergienko; Alexander Schmitt
This work presents the first experimental results on Interaction Quality modelling for human-human conversation, as an adaptation of the Interaction Quality metric developed for human-computer spoken interaction. The prediction of an Interaction Quality score can be formulated as a classification problem. In this paper we describe the results of applying several classification algorithms (a Kernel Naive Bayes classifier, k-Nearest Neighbours, Logistic Regression, and Support Vector Machines) to a data set. Moreover, we compare the results of modelling for two Interaction Quality labelling approaches and consider the results for different emotion sets. The results of Interaction Quality modelling for human-human conversation may be used both for improving service quality in call centres and for improving Spoken Dialogue Systems in terms of flexibility, user-friendliness and human-likeness.
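A small sketch of the classifier comparison named above for predicting a discrete Interaction Quality score from interaction-level features. The feature content and the 1-5 IQ scale are placeholders, and a Gaussian Naive Bayes model stands in for the kernel variant mentioned in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 12))                 # interaction-level features (placeholder)
iq = rng.integers(1, 6, size=400)              # IQ labels on a 1-5 scale (assumption)

models = {
    "Naive Bayes": GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
}
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    score = cross_val_score(pipe, X, iq, cv=5, scoring="f1_macro").mean()
    print(f"{name:20s} macro-F1 = {score:.3f}")
```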
intelligent environments | 2013
Maxim Sidorov; Alexander Schmitt; Sergey Zablotskiy; Wolfgang Minker
In this paper we present an overview of state-of-the-art approaches to speaker identification. Owing to the growing number of dialogue system applications, interest in this field has increased significantly in recent years. Nevertheless, many issues in automatic speaker identification remain open, among them the choice of appropriate speech signal features and machine learning algorithms. We review modern methods designed for the speaker identification problem and describe our directions for possible improvements to automated speaker identification.
international conference on speech and computer | 2016
Anastasiia Spirina; Olesia Vaskovskaia; Maxim Sidorov; Alexander Schmitt
Spoken dialogue systems (SDSs), which are designed to replace employees in various services, need indicators that show what has happened in the ongoing dialogue and what the system's next step should be. Some of these indicators come from the field of call centre quality evaluation. In turn, metrics such as Interaction Quality (IQ), which was designed for human-computer spoken interaction, can be applied to human-human conversations. Such experience may be used to improve service quality both in call centres and in SDSs. This paper presents the results of IQ modelling for human-human task-oriented conversations using several classification algorithms.
text speech and dialogue | 2017
Anastasiia Spirina; Wolfgang Minker; Maxim Sidorov
Different metrics are used in call centres and Spoken Dialogue Systems (SDSs) as indicators for detecting problems during a dialogue. One such metric is the emotional state, and measurements of emotion can be a powerful indicator in various task-oriented services. Besides emotional state, another widely used metric is customer satisfaction (CS), which has a modification called Interaction Quality (IQ). Models of both CS and IQ may include emotional state as a feature. However, is it actually a necessary feature? Some users or customers are very emotional, while others show little emotion across different satisfaction categories, so emotional state may not be an informative feature for IQ/CS modelling. Our research is dedicated to determining the role of emotion measurements in the IQ modelling task.
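A small ablation sketch of the question raised above: does an emotional-state feature add anything to IQ prediction? Train the same model with and without it and compare. All feature names, the IQ scale, and the data are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n = 500
base_features = rng.normal(size=(n, 10))           # e.g. ASR confidence, turn length, ...
emotion_feature = rng.integers(0, 3, size=(n, 1))  # coarse emotional state (placeholder)
iq = rng.integers(1, 6, size=n)                    # IQ labels on a 1-5 scale (assumption)

X_without = base_features
X_with = np.hstack([base_features, emotion_feature])

# Compare cross-validated accuracy with and without the emotion feature.
for name, X in [("without emotion", X_without), ("with emotion", X_with)]:
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, iq, cv=5).mean()
    print(f"{name:16s}: accuracy {acc:.3f}")
```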
international conference on informatics in control automation and robotics | 2015
Christina Brester; Eugene Semenkin; Maxim Sidorov; Olga Semenkina
In this paper we introduce a two-criterion optimization model for designing multilayer perceptrons that takes into account two objectives: classification accuracy and computational complexity. Using this technique, it is possible to simplify the structure of neural network classifiers while keeping classification accuracy high. The main benefits of the proposed approach are the automatic choice of activation functions, the possibility of generating an ensemble of classifiers, and the embedded feature selection procedure. A cooperative multi-objective genetic algorithm is used as the optimizer to determine the Pareto set approximation for the two-criterion problem. The effectiveness of this approach is investigated on the speech-based emotion recognition problem. According to the results obtained, the proposed technique can produce classifiers with fewer neurons in the input and hidden layers than conventional models and increase emotion recognition accuracy by up to a 4.25% relative improvement through the application of the ensemble of classifiers.
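A rough sketch of the two objectives discussed above for MLP design: classification accuracy versus structural complexity (here, the total number of hidden neurons). A real system would search this space with a cooperative multi-objective genetic algorithm and also choose activation functions and input features; this example simply enumerates a few candidate architectures and keeps the non-dominated ones.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=30, n_informative=10, random_state=7)

candidates = [(4,), (8,), (16,), (8, 4), (16, 8), (32, 16)]
evaluated = []
for hidden in candidates:
    acc = cross_val_score(
        MLPClassifier(hidden_layer_sizes=hidden, max_iter=300, random_state=0),
        X, y, cv=3).mean()
    evaluated.append((hidden, acc, sum(hidden)))   # (architecture, accuracy, complexity)

# Keep the Pareto-optimal architectures: no other candidate is both more accurate
# and smaller.
pareto = [a for a in evaluated
          if not any(b[1] >= a[1] and b[2] <= a[2] and (b[1] > a[1] or b[2] < a[2])
                     for b in evaluated)]
for hidden, acc, size in pareto:
    print(f"hidden={hidden}, neurons={size}, accuracy={acc:.3f}")
```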