Róbert Sabo
Slovak Academy of Sciences
Publication
Featured research published by Róbert Sabo.
text, speech and dialogue | 2011
Sakhia Darjaa; Miloš Cerňak; Štefan Beňuš; Milan Rusko; Róbert Sabo; Marián Trnka
This paper presents rule-based triphone mapping for acoustic model training in automatic speech recognition. We test whether incorporating expanded knowledge at the level of parameter tying in acoustic modeling improves the performance of automatic speech recognition in Slovak. We propose a novel technique of knowledge-based triphone tying, which allows the synthesis of unseen triphones. The proposed technique is compared with decision tree-based state tying, and it is shown that for bigger acoustic models, at a size of 3000 states and more, a triphone-mapped HMM system achieves better performance than a tree-based state tying system on a large vocabulary continuous speech transcription task. Experiments performed using 350 hours of a Slovak audio database of mixed read and spontaneous speech are presented. The relative decrease in word error rate was 4.23% for models with 7500 states, and 4.13% at 11500 states.
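The relative figures quoted above follow the usual definition of relative word error rate (WER) reduction. A minimal sketch of that arithmetic is given here; the function name and the example WER values are hypothetical, since the abstract reports only the relative decrease and not the absolute WERs.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER decrease in percent.

    Both arguments are absolute WERs in percent (e.g. 20.0 for 20 %).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical absolute WERs, chosen only to illustrate the formula;
    # they are NOT taken from the paper.
    print(relative_wer_reduction(20.0, 19.0))
```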
language and technology conference | 2011
Milan Rusko; Jozef Juhár; Marián Trnka; Ján Staš; Sakhia Darjaa; Daniel Hládek; Róbert Sabo; Matus Pleva; Marian Ritomský; Martin Lojka
This paper describes the design, development and evaluation of the Slovak dictation system for the judicial domain. The speech is recorded using a close-talk microphone and the dictation system is used for on-line or off-line automatic transcription. The system provides an automatic dictation tool in Slovak for the employees of the Ministry of Justice of the Slovak Republic and all the courts in Slovakia. The system is designed for on-line dictation and off-line transcription of legal texts recorded in the acoustic conditions of a typical office. Details of the technical solution are given and the evaluation of different versions of the system is presented.
text, speech and dialogue | 2007
Milan Rusko; Róbert Sabo; Martin Dzúr
Research and development in speech synthesis and recognition calls for a phonological intonation annotation scheme for the particular language. Inspired by the successful ToBI (Tones and Break Indices) for American English [1] and GToBI [2] for German, this paper introduces a new intonation annotation scheme for Slovak, Sk-ToBI. Although Slovak prosodic rules differ from those of English or German, we decided to follow the main principles of ToBI and to define a special Slovak version of the Tones and Break Indices annotation scheme. The speech material belonging to different styles, which was used for the preliminary study of accents in Slovak, is briefly described and the conventions of Sk-ToBI annotation are presented.
international conference on speech and computer | 2016
Róbert Sabo; Milan Rusko; Andrej Ridzik; Jakub Rajčáni
This paper reports on initial experiments with the creation of a suitable database for training and testing systems for stress detection in speech, and presents the first experimental results. Based on the psychological understanding of the concepts of stress and emotion, we operationalized stress as a level of arousal, which can be detected in speech. We describe here a speech database with three levels of “acted stress” and three levels of soothing. In the very first experiment performed on the database, we detect different levels of stress using Gaussian mixture models. The accuracy of detecting three levels of stress was 89 % for speakers included in the training database and 73 % for speakers whose recordings were not used during the adaptation of the GMM models.
language and technology conference | 2013
Milan Rusko; Jozef Juhár; Marián Trnka; Ján Staš; Sakhia Darjaa; Daniel Hládek; Róbert Sabo; Matus Pleva; Marian Ritomský; Stanislav Ondáš
This paper describes the evaluation of, and recent advances in, the application of a speech dictation system for the judicial domain. The dictation system incorporates Slovak speech recognition and uses a plugin for a widely used office suite. It was introduced recently after preliminary user evaluation in the Slovak courts. Compared to the previous version, the system was improved significantly using new acoustic databases for evaluation and acoustic modeling. The speaker adaptation procedure and gender-dependent models significantly improve the overall accuracy, bringing the WER below 5 % for a domain-specific test set. The language resources were extended and the language modeling techniques were improved, as described in the paper. An end-user questionnaire about the user interface was evaluated and new functionalities were introduced. According to the available feedback, it can be concluded that the dictation system is able to speed up court proceedings significantly for each user willing to cooperate with new technologies.
Journal of Linguistics/Jazykovedný casopis | 2017
Róbert Sabo; Jakub Rajčáni
This study describes the methodology used for designing a database of speech under real stress. Based on the limits of existing stress databases, we used a communication task via a computer game to collect speech data. To validate the presence of stress, known psychophysiological indicators such as heart rate and electrodermal activity, as well as subjective self-assessment, were used. This paper presents the data from the first 5 speakers (3 men, 2 women) who participated in initial tests of the proposed design. In 4 out of 5 speakers, increases in fundamental frequency and intensity of speech were registered. Similarly, in 4 out of 5 speakers heart rate was significantly increased during the task when compared with the reference measurement taken before the task. These first results show that the proposed design might be appropriate for building a speech under stress database. However, there are still considerations that need to be addressed.
text, speech and dialogue | 2014
Róbert Sabo; Štefan Beňuš
This paper reports on initial experiments with automatic comma recovery in legal texts. In deciding whether to insert a comma or not, we propose to use the probability of a bigram of two words without a comma and the probability of a trigram of the same words with the comma between them. The probabilities are determined by a language model trained on sentences in which commas are labeled as separate words. In the training database one sentence corresponds to one line. The thresholds for the bigram and trigram probabilities were experimentally determined to achieve the best balance of precision and recall. The advantage of the proposed method is its high precision (95%) at a relatively satisfactory recall (49%). For judges as potential users of an ASR system with an automatic comma insertion function, precision is particularly important.
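The decision described above can be illustrated with a small sketch. It is not the authors' implementation: the unsmoothed count-based estimates, the exact decision rule (insert a comma when the with-comma trigram estimate is at least `t_tri` and the no-comma bigram estimate is at most `t_bi`), the function names, the threshold values and the toy training lines are all assumptions made for illustration.

```python
from collections import Counter

COMMA = ","


def train_counts(lines):
    """Count n-grams from one-sentence-per-line text, commas as tokens."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for line in lines:
        tokens = line.replace(",", " , ").split()
        uni.update(tokens)
        bi.update(zip(tokens, tokens[1:]))
        tri.update(zip(tokens, tokens[1:], tokens[2:]))
    return uni, bi, tri


def p_plain(w1, w2, uni, bi):
    """Estimate of w1 being directly followed by w2 (no comma)."""
    return bi[(w1, w2)] / uni[w1] if uni[w1] else 0.0


def p_comma(w1, w2, uni, tri):
    """Estimate of w1 being followed by a comma and then w2."""
    return tri[(w1, COMMA, w2)] / uni[w1] if uni[w1] else 0.0


def insert_commas(words, model, t_bi, t_tri):
    """Insert a comma between adjacent words when the assumed rule fires."""
    uni, bi, tri = model
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i + 1 < len(words):
            nxt = words[i + 1]
            if (p_comma(word, nxt, uni, tri) >= t_tri
                    and p_plain(word, nxt, uni, bi) <= t_bi):
                out.append(COMMA)
    return out


if __name__ == "__main__":
    # Toy training lines (hypothetical, not from the legal corpus).
    model = train_counts([
        "the court ruled , however the appeal failed",
        "the court ruled , however the case closed",
        "the court ruled today",
    ])
    text = "the court ruled however the appeal failed".split()
    print(" ".join(insert_commas(text, model, t_bi=0.1, t_tri=0.5)))
```

Raising `t_tri` or lowering `t_bi` in this sketch makes insertion more conservative, which is the direction that trades recall for precision.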
conference of the international speech communication association | 2011
Sakhia Darjaa; Milos Cernak; Marián Trnka; Milan Rusko; Róbert Sabo
2018 World Symposium on Digital Intelligence for Systems and Machines (DISA) | 2018
Róbert Sabo; Jakub Rajčáni; Marian Ritomsky
2018 World Symposium on Digital Intelligence for Systems and Machines (DISA) | 2018
Sakhia Darjaa; Róbert Sabo; Marián Trnka; Milan Rusko; Gabriela Mucskova