Matus Pleva
Technical University of Košice
Publications
Featured research published by Matus Pleva.
international conference on multimedia communications | 2013
Eva Kiktova; Martin Lojka; Matus Pleva; Jozef Juhár; Anton Cizmar
With the increasing use of audio sensors in surveillance and monitoring applications, the detection of acoustic events under real conditions has emerged as an important research problem. This paper compares different feature extraction algorithms used for the parametric representation of foreground and background sounds in a noisy environment. Our aim was to automatically detect shots and sounds of breaking glass under different SNR conditions. Well-known features such as Mel-frequency cepstral coefficients (MFCC), together with other effective spectral features such as logarithmic Mel-filter bank coefficients (FBANK) and Mel-filter bank coefficients (MELSPEC), were extracted from the input sound. A hidden Markov model (HMM) based learning technique classifies the sound categories.
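The relationship between the three feature types named above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the names follow the HTK convention, in which FBANK is the logarithm of the Mel filter-bank energies (MELSPEC) and MFCC is a discrete cosine transform of FBANK. The function names, the energy floor and the DCT scaling are assumptions made for the example.

```python
import math

def fbank_from_melspec(melspec, floor=1e-10):
    """Log Mel filter-bank coefficients from linear filter-bank energies.

    `floor` is a hypothetical safeguard against log(0).
    """
    return [math.log(max(energy, floor)) for energy in melspec]

def mfcc_from_fbank(fbank, num_ceps):
    """Cepstral coefficients c_1..c_num_ceps via a DCT-II of the log energies.

    The sqrt(2/N) scaling is an assumed, HTK-style normalisation.
    """
    n = len(fbank)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(
            m * math.cos(math.pi * i / n * (j - 0.5))
            for j, m in enumerate(fbank, start=1)
        )
        for i in range(1, num_ceps + 1)
    ]
```

A flat filter-bank output carries no spectral shape, so every cepstral coefficient from `c_1` upwards is zero in that case; this is one reason MFCCs are often described as a decorrelated summary of the spectral envelope.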
International Journal of Advanced Robotic Systems | 2013
Stanislav Ondáš; Jozef Juhár; Matus Pleva; Anton Cizmar; Roland Holcer
The SCORPIO is a small-size mini-teleoperator mobile service robot for booby-trap disposal. It can be manually controlled by an operator through a portable briefcase remote control device using a joystick, keyboard and buttons. In this paper, the speech interface is described. As an auxiliary function, the remote interface allows a human operator to concentrate sight and/or hands on other, more important operation activities. The developed speech interface is based on HMM-based acoustic models trained using the SpeechDatE-SK database, a small-vocabulary language model based on fixed connected words, grammar, and a speech recognition setup adapted for low-resource devices. To improve the robustness of the speech interface in an outdoor environment, which is the working area of the SCORPIO service robot, a speech enhancement based on the spectral subtraction method, as well as a unique combination of an iterative approach and a modified LIMA framework, were researched, developed and tested on simulated and real outdoor recordings.
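For orientation, the basic spectral subtraction step can be sketched as below. This shows only the textbook form applied to one frame of power-spectrum values; the iterative approach and the modified LIMA framework mentioned in the abstract are not reproduced. The parameters `alpha` (over-subtraction factor) and `beta` (spectral floor) are hypothetical names chosen for the example.

```python
def spectral_subtraction(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Subtract a noise power estimate from a noisy power spectrum, bin by bin.

    alpha scales the noise estimate; beta keeps a small fraction of the
    noisy power so that no bin becomes negative (both are assumed values).
    """
    return [
        max(p - alpha * n, beta * p)
        for p, n in zip(noisy_power, noise_power)
    ]
```

The floor matters in practice: without it, bins where the noise estimate exceeds the observed power would go negative, which is not a valid power value.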
international conference on telecommunications | 2012
Jozef Vavrek; Eva Vozarikova; Matus Pleva; Jozef Juhár
Audio classification is one of the most important tasks in content-based analysis and can be used in many audio applications, such as indexing and retrieval. This paper addresses the problem of broadcast news audio classification, using a support vector machine binary tree (SVM-BT) architecture, into five classes: pure speech, speech with music, speech with environment sound, pure music and environment sound. One of the most substantial steps in creating such a classification architecture is the selection of an optimal feature set for each binary SVM classifier. We therefore implement the F-score feature selection algorithm as an effective search algorithm within a space of characteristic features that are mostly used for speech/non-speech discrimination.
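A minimal sketch of F-score ranking for one binary classifier follows. It assumes the commonly used two-class F-score (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances); the abstract does not state the exact variant used, and the function names are illustrative.

```python
from statistics import mean, variance

def f_score(pos, neg):
    """F-score of a single feature from its values in the two classes."""
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    return between / within if within > 0 else float("inf")

def rank_features(pos_rows, neg_rows):
    """Return feature indices ordered from most to least discriminative."""
    n_features = len(pos_rows[0])
    scores = []
    for i in range(n_features):
        pos = [row[i] for row in pos_rows]
        neg = [row[i] for row in neg_rows]
        scores.append((f_score(pos, neg), i))
    return [i for _, i in sorted(scores, reverse=True)]
```

In a binary-tree setup, a ranking like this would be computed separately at each node, since the two groups of classes being separated differ from node to node.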
language and technology conference | 2011
Milan Rusko; Jozef Juhár; Marián Trnka; Ján Staš; Sakhia Darjaa; Daniel Hládek; Róbert Sabo; Matus Pleva; Marian Ritomský; Martin Lojka
This paper describes the design, development and evaluation of the Slovak dictation system for the judicial domain. The speech is recorded using a close-talk microphone and the dictation system is used for on-line or off-line automatic transcription. The system provides an automatic dictation tool in Slovak for the employees of the Ministry of Justice of the Slovak Republic and all the courts in Slovakia. The system is designed for on-line dictation and off-line transcription of legal texts recorded in the acoustic conditions of a typical office. Details of the technical solution are given and the evaluation of different versions of the system is presented.
Proceedings of the Third COST 2102 international training school conference on Toward autonomous, adaptive, and context-aware multimodal interfaces: theoretical and practical issues | 2010
Ján Staš; Daniel Hládek; Matus Pleva; Jozef Juhár
An automatic speech recognition system is one of the parts of a multimodal dialogue system. For this purpose it is necessary to create a correct vocabulary and to generate a suitable language model. The main aim of this article is to describe the process of building large-vocabulary statistical models of the Slovak language trained on text data gathered mainly from Internet sources. Several smoothing techniques for different vocabulary sizes were used in order to obtain an optimal model of the Slovak language. We also employed a pruning technique based on relative entropy to reduce the size of the language model and to find the maximum pruning threshold with minimum degradation in recognition accuracy. Tests were performed with a decoder based on the HTK Toolkit.
international conference on systems signals and image processing | 2013
Jan Krekan; Matus Pleva; Lubomir Dobos
The purpose of this article is to describe a new method, proposed as best practice, for creating very effective password candidate lists for a specified language, which can then be used to test the security level of wireless networks protected by the WPA/WPA2 PSK standards. The main principle of this technique is to create a statistical model of the target language which can be used to generate password candidates in a controlled order for a security audit of the wireless network. The list starts with the more probable combinations and proceeds to the less probable ones, so this approach can be described as sorting the brute-force candidates according to the specified language, or as predicting the usage of letter combinations from the statistics of that language. The tests have shown that generating the more probable combinations first could improve the procedure: it is about 15 times faster in finding about 70% of passwords than a common brute-force attack, compared with the roughly 20% effectiveness of older dictionary attacks.
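The idea of ordering brute-force candidates by language statistics can be illustrated with a small sketch. It assumes a first-order character model with add-one smoothing trained on a word list; the abstract does not specify the model order or the smoothing, so these are assumptions, as are the function names and the `^` start symbol (which must not occur in the alphabet). Enumerating and sorting every combination is only practical for tiny alphabets and lengths and is used here for clarity.

```python
import itertools
import math
from collections import Counter

START = "^"  # hypothetical start-of-word symbol, must not be in the alphabet

def train_bigram_model(words, alphabet):
    """Return a function giving the smoothed log-probability of ch after prev."""
    counts, contexts = Counter(), Counter()
    for word in words:
        prev = START
        for ch in word:
            counts[(prev, ch)] += 1
            contexts[prev] += 1
            prev = ch
    size = len(alphabet)

    def logp(prev, ch):
        # Add-one smoothing so unseen letter pairs keep a non-zero probability.
        return math.log((counts[(prev, ch)] + 1) / (contexts[prev] + size))

    return logp

def ordered_candidates(logp, alphabet, length):
    """All strings of the given length, most probable first."""
    scored = []
    for combo in itertools.product(alphabet, repeat=length):
        prev, total = START, 0.0
        for ch in combo:
            total += logp(prev, ch)
            prev = ch
        scored.append((-total, "".join(combo)))
    scored.sort()
    return [candidate for _, candidate in scored]
```

The candidate set is the same as in a plain brute-force search; only the order changes, which is what lets likely passwords be reached earlier.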
international conference radioelektronika | 2015
Matus Pleva; Stanislav Ondáš; Jozef Juhár
Dialogue act (DA) classification plays an important role in advanced dialogue management systems. Dialogue acts represent the intention of a dialogue participant in a particular part of the dialogue interaction. Dialogue act classification consists of classifying spoken utterances according to their discourse function. It relies mainly on classical machine learning techniques, similar to those used in natural language processing (NLP) tasks. An HMM-based approach was applied to perform DA classification of utterances in the Slovak language. Episodes of a Slovak TV talk show were used to create a dialogue corpus with DA labels. A new simplified annotation schema was designed and used to label the corpus with 12 DA classes. Bigram models of the DA classes and a dialogue grammar were trained on the training part of the corpus. Test utterances were decoded by comparing probabilities of occurrence and perplexity over the trained bigrams. The results obtained are comparable with similar techniques and data sets.
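Decoding by perplexity over per-class bigram models can be sketched as follows. This is a simplified illustration under stated assumptions: one word-bigram model per DA class with add-one smoothing, and the class with the lowest perplexity wins. The dialogue grammar and the actual smoothing used in the paper are not reproduced, and the class and function names are hypothetical.

```python
import math
from collections import Counter

class BigramModel:
    """Word-bigram model for one dialogue-act class (add-one smoothing assumed)."""

    def __init__(self, sentences, vocab_size):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.vocab_size = vocab_size
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def perplexity(self, tokens):
        padded = ["<s>"] + tokens + ["</s>"]
        log_sum = 0.0
        for prev, cur in zip(padded, padded[1:]):
            p = (self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + self.vocab_size)
            log_sum += math.log(p)
        # Perplexity is the inverse geometric mean of the bigram probabilities.
        return math.exp(-log_sum / (len(padded) - 1))

def classify(models, tokens):
    """Pick the class whose model finds the utterance least surprising."""
    return min(models, key=lambda label: models[label].perplexity(tokens))
```

Usage would consist of building one `BigramModel` per DA label from the labelled training utterances and passing the resulting dictionary to `classify` for each test utterance.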
international conference on systems signals and image processing | 2013
Peter Viszlay; Jozef Juhár; Matus Pleva
Linear discriminant analysis (LDA) is a popular supervised feature transformation applied in current automatic speech recognition (ASR). Generally, the parameters of LDA are computed from the training data partitioned into classes. If the number of classes is smaller than the dimension of the supervectors (typically in phoneme-based LDA), then the between-class covariance matrix can become singular or close to singular (the singularity problem in classical LDA). In this paper, we present a modification of the standard between-class covariance matrix estimation, which represents one of the possible approaches to solving the singularity problem. Our method works directly with the supervectors instead of the class mean vectors. The number of estimation cycles is much larger because more data are used during the computation. Thus, the matrix structure can be significantly refined. This implies that larger context lengths can be used while the singularity problem is efficiently eliminated. The effectiveness of the proposed estimation is evaluated in a Slovak phoneme-based and triphone-based large vocabulary continuous speech recognition (LVCSR) task. The method is compared to the state-of-the-art MFCCs and to LDA trained in the standard way. The experimental results confirm that the modified LDA considerably outperforms the MFCCs and consistently improves on the conventional LDA.
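The singularity problem can be made concrete with the standard between-class estimate, written here in assumed notation (K classes, N_k samples and mean vector \mu_k in class k, global mean \mu, supervector dimension d). The modified estimator proposed in the paper is not reproduced.

```latex
S_B = \sum_{k=1}^{K} N_k \, (\mu_k - \mu)(\mu_k - \mu)^{\top},
\qquad
\operatorname{rank}(S_B) \le \min(K - 1,\; d)
```

Each class contributes one rank-one term, and the weighted deviations of the class means from the global mean sum to zero, so at most K - 1 of the terms are linearly independent. Whenever K - 1 < d, the d-by-d matrix S_B therefore cannot have full rank.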
international symposium on applied machine intelligence and informatics | 2017
Matus Pleva; Jozef Juhár; Anton Cizmar; Christopher R. Hudson; Daniel W. Carruth; Cindy L. Bethel
This paper describes the development of a specialized application for voice command recognition for the Jaguar V4 robot in conjunction with the Starkville, MS, USA Special Weapons and Tactics (SWAT) team during training. This training took place at The Center for Advanced Vehicular Systems (CAVS), which provides a specialized environment for police SWAT training. This reconfigurable space, set up during this study as a two-bedroom apartment, includes video monitoring of the space, sound playback and capture, reconfigurable lighting, etc. This training environment is used for testing different kinds of human-robot interfaces in SWAT training operations. The results of the voice integration evaluation indicated that voice commands could be successfully used for controlling additional functions of the robot after a short introductory training session with a few of the police officers. These preliminary observations were encouraging and provide support for further investigation into the usefulness of this technology.
Multimedia Tools and Applications | 2017
Matus Pleva; Patrick Bours; Stanislav Ondáš; Jozef Juhár
In this paper we investigate the capacity of sound and timing information captured while a password is typed for the user identification and authentication task. The novelty of this paper lies in the comparison of performance between improved timing-based and audio-based keystroke dynamics analysis, and their fusion for keystroke authentication. We collected data from 50 people typing the same given password 100 times, divided into 4 sessions of 25 typings, and tested how well the system could recognize the correct typist. Using the fusion of timing (9.73%) and audio calibration scores (8.99%) described in the paper, we achieved 4.65% EER (Equal Error Rate) for the authentication task. The results show the potential of using Audio Keystroke Dynamics information as a way to authenticate or identify users during log-on.
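For readers unfamiliar with the metric, the Equal Error Rate can be estimated from two lists of match scores as sketched below. This is a generic illustration, not the evaluation code of the paper: it assumes that higher scores mean a better match, sweeps every observed score as a threshold, and reports the average of the false accept and false reject rates at the threshold where they are closest.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from genuine-user and impostor match scores."""
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False reject: a genuine attempt scoring below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False accept: an impostor attempt scoring at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With few scores the two error curves may never cross exactly, which is why the sketch averages the two rates at the closest point rather than interpolating.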