Rok Gajšek
University of Ljubljana
Publication
Featured research published by Rok Gajšek.
International Conference on Biometrics: Theory, Applications and Systems | 2009
Vitomir Struc; Rok Gajšek; Nikola Pavesic
Gabor filters have proven themselves to be a powerful tool for facial feature extraction. An abundance of recognition techniques presented in the literature exploit these filters to achieve robust face recognition. However, while exhibiting desirable properties, such as orientational selectivity or spatial locality, Gabor filters also have some shortcomings which crucially affect the characteristics and size of the Gabor representation of a given face pattern. Amongst these shortcomings, the fact that the filters are not orthogonal to one another and are, hence, correlated is probably the most important. This makes the information contained in the Gabor face representation redundant and also affects the size of the representation. To overcome this problem, we propose in this paper to employ orthonormal linear combinations of the original Gabor filters, rather than the filters themselves, for deriving the Gabor face representation. The filters, named principal Gabor filters because they are computed by means of principal component analysis, are assessed in face recognition experiments performed on the XM2VTS and YaleB databases, where encouraging results are achieved.
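The idea of replacing correlated Gabor filters with orthonormal linear combinations can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel parameterisation, the bank of 5 wavelengths by 8 orientations, the choice `sigma = wavelength`, the absence of mean-centering, and the function names `gabor_kernel` and `principal_filters` are all assumptions made for illustration. The combinations are obtained from the eigendecomposition of the filters' Gram (correlation) matrix, which is a PCA-style construction.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Complex 2-D Gabor kernel (hypothetical parameterisation)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate system by the filter orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(2j * np.pi * xr / wavelength)
    return envelope * carrier

def principal_filters(filters, keep):
    """Return `keep` orthonormal linear combinations of the input filters."""
    F = np.stack([f.ravel() for f in filters])   # (n_filters, n_pixels)
    gram = F @ F.conj().T                        # Hermitian correlation matrix
    eigvals, eigvecs = np.linalg.eigh(gram)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:keep]     # indices of the largest ones
    vals, vecs = eigvals[order], eigvecs[:, order]
    # Each row is a combination of the original filters, scaled to unit norm.
    P = (vecs.conj().T @ F) / np.sqrt(vals)[:, None]
    return P.reshape((keep,) + filters[0].shape), vals

# Assumed bank: 5 wavelengths x 8 orientations on a 33 x 33 window.
wavelengths = [4.0, 4.0 * np.sqrt(2.0), 8.0, 8.0 * np.sqrt(2.0), 16.0]
bank = [gabor_kernel(33, w, k * np.pi / 8.0, sigma=w)
        for w in wavelengths for k in range(8)]

principal, eigvals = principal_filters(bank, keep=10)
```

Keeping only the combinations with the largest eigenvalues reduces the number of filter responses that have to be computed and stored, which is the size reduction the abstract refers to.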
International Conference on Pattern Recognition | 2010
Rok Gajšek; Vitomir Struc
Information about the psycho-physical state of the subject is becoming a valuable addition to modern audio or video recognition systems. As well as enabling a better user experience, it can also help improve the recognition accuracy of the base system. In the article, we present our approach to a multi-modal (audio-video) emotion recognition system. For the audio sub-system, a feature set comprising prosodic, spectral and cepstral features is selected, and a support vector classifier is used to produce the scores for each emotional category. For the video sub-system, a novel approach is presented that does not rely on tracking specific facial landmarks and thus eliminates the problems usually caused when the tracking algorithm fails to detect the correct area. The system is evaluated on the eNTERFACE database, and the recognition accuracy of our audio-video fusion is compared to published results in the literature.
Text, Speech and Dialogue | 2009
Rok Gajšek; Vitomir Struc; Bostjan Vesnicer; Anja Podlesek; Luka Komidar
The paper deals with the recording and evaluation of a multimodal (audio/video) database of spontaneous emotions. First, the motivation for this work is given and the different recording strategies used are described. Special attention is given to the process of evaluating the emotional database. Different kappa statistics normally used to measure the agreement between annotators are discussed. To address the problems of standard kappa coefficients when used in emotional database assessment, a new time-weighted free-marginal kappa is presented. It differs from the other kappa statistics in that it weights each utterance's agreement score by the duration of the utterance. The new method is evaluated, and its superiority over the standard kappa when dealing with a database of spontaneous emotions is demonstrated.
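A duration-weighted free-marginal kappa can be sketched as follows. The abstract does not give the exact formula, so this is an assumption: the per-utterance agreement is the usual multi-rater pairwise agreement, chance agreement is fixed at 1/k for k categories (the free-marginal convention), and each utterance's agreement is weighted in proportion to its duration. The function names and the toy labels are hypothetical.

```python
from collections import Counter

def item_agreement(labels):
    """Fraction of annotator pairs that agree on one utterance."""
    n = len(labels)
    counts = Counter(labels)
    return (sum(c * c for c in counts.values()) - n) / (n * (n - 1))

def free_marginal_kappa(ratings, n_categories, durations=None):
    """Free-marginal kappa; optional durations weight each utterance."""
    if durations is None:
        durations = [1.0] * len(ratings)   # unweighted case
    total = sum(durations)
    p_obs = sum(d * item_agreement(r)
                for r, d in zip(ratings, durations)) / total
    p_chance = 1.0 / n_categories          # free-marginal chance agreement
    return (p_obs - p_chance) / (1.0 - p_chance)

# Three utterances, three annotators each (toy data, not from the paper).
ratings = [
    ["anger", "anger", "anger"],
    ["joy", "neutral", "joy"],
    ["neutral", "neutral", "neutral"],
]
durations = [0.5, 6.0, 1.5]  # seconds

unweighted = free_marginal_kappa(ratings, n_categories=3)
weighted = free_marginal_kappa(ratings, n_categories=3, durations=durations)
```

In this toy example the single disputed utterance is also the longest one, so the weighted value (0.25) is considerably lower than the unweighted value (about 0.67).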
BioID_MultiComm'09 Proceedings of the 2009 joint COST 2101 and 2102 international conference on Biometric ID management and multimodal communication | 2009
Rok Gajšek; Vitomir Struc; Simon Dobrisek; Janez Žibert; Nikola Pavesic
The paper presents our initial attempts at building an audio-video emotion recognition system. Both the audio and video sub-systems are discussed, and a description of the database of spontaneous emotions is given. The task of labelling the recordings from the database according to different emotions is discussed, and the measured agreement between multiple annotators is presented. Instead of focusing on prosody in audio emotion recognition, we evaluate the possibility of using linear transformations (CMLLR) as features. The classification results from the audio and video sub-systems are combined using sum rule fusion, and the improvement in recognition results when using both modalities is presented.
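Sum rule fusion of two sub-systems can be illustrated with a short sketch. The abstract does not say how the scores are normalised before summing, so the min-max normalisation here is an assumption, as are the function names and the example scores.

```python
def min_max_normalize(scores):
    """Map one sub-system's per-class scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {label: 0.0 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}

def sum_rule_fusion(audio_scores, video_scores):
    """Add the normalised scores per class and pick the best class."""
    audio = min_max_normalize(audio_scores)
    video = min_max_normalize(video_scores)
    fused = {label: audio[label] + video[label] for label in audio}
    return max(fused, key=fused.get), fused

# Hypothetical scores for one recording.
audio_scores = {"anger": 1.2, "joy": 0.9, "neutral": -0.5}
video_scores = {"anger": 0.10, "joy": 0.70, "neutral": 0.20}

label, fused = sum_rule_fusion(audio_scores, video_scores)
```

Here the audio sub-system alone would choose "anger" and the video sub-system "joy"; after fusion the decision is "joy", because the audio score for "joy" is close to the audio maximum.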
Text, Speech and Dialogue | 2012
Rok Gajšek; Simon Dobrisek
In the article we evaluate the importance of different HMM states in an HMM-based feature extraction method used to model paralinguistic information. Specifically, we evaluate the distribution of the paralinguistic information across different states of the HMM in two different classification tasks: emotion recognition and alcoholization detection. In the task of recognizing emotions, we found that the majority of emotion-related information is incorporated in the first and third states of a 3-state HMM. Surprisingly, in the alcoholization detection task we observed a roughly equal distribution of task-specific information across all three states, which leads to consistently better results when more states are used.
Text, Speech and Dialogue | 2008
Rok Gajšek; Janez Žibert
In the article we evaluate different techniques of acoustic modeling for speech recognition in the case of limited audio resources. The objective was to build different sets of acoustic models: the first was trained on a small set of telephone speech recordings, and the other was trained on a larger database of broadband speech recordings and later adapted to a different audio environment. Different adaptation methods (MLLR, MAP) were examined in combination with different parameterization features (MFCC, PLP, RPLP). We show that adaptation methods, which are mainly used for speaker adaptation, can increase the robustness of speech recognition when the training and working acoustic environments are mismatched.
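As an illustration of MAP adaptation, the standard update of a single Gaussian mean can be sketched as below. This is the textbook formula rather than the configuration used in the paper: the prior weight `tau`, the function name and the toy frames are assumptions, and a full system would apply the update to every Gaussian of every HMM state using occupation probabilities from forced alignment.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean.

    prior_mean : mean from the model trained on the large database
    frames     : adaptation feature vectors from the target environment
    posteriors : occupation probability of this Gaussian for each frame
    tau        : prior weight (assumed value); larger keeps the mean
                 closer to the prior
    """
    occupancy = sum(posteriors)
    dim = len(prior_mean)
    weighted_sum = [sum(g * x[d] for g, x in zip(posteriors, frames))
                    for d in range(dim)]
    return [(tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
            for d in range(dim)]

# Toy example: two adaptation frames fully assigned to the Gaussian.
adapted = map_adapt_mean(prior_mean=[0.0, 0.0],
                         frames=[[1.0, 2.0], [3.0, 2.0]],
                         posteriors=[1.0, 1.0],
                         tau=2.0)
```

With little adaptation data the result stays near the prior mean, and with more data it moves towards the mean of the adaptation frames, which is why MAP suits cases with limited recordings from the target environment.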
International Journal of Advanced Robotic Systems | 2013
Simon Dobrisek; Rok Gajšek; Nikola Pavesic; Vitomir Struc
Informatica (Lithuanian Academy of Sciences) | 2009
Rok Gajšek; Vitomir Struc; F. Mihelič; Anja Podlesek; Luka Komidar; G. Sočan; B. Bajec
Conference of the International Speech Communication Association | 2010
Rok Gajšek; Janez Žibert; Tadej Justin; Vitomir Struc; Bostjan Vesnicer
The 33rd International Convention MIPRO | 2010
Tadej Justin; Rok Gajšek; Vitomir Struc; Simon Dobrisek