Róbert Sabo | Researchain

Publication

Featured research published by Róbert Sabo.

text, speech and dialogue | 2011

Rule-based triphone mapping for acoustic modeling in automatic speech recognition

Sakhia Darjaa; Miloš Cerňak; Štefan Beňuš; Milan Rusko; Róbert Sabo; Marián Trnka

This paper presents rule-based triphone mapping for acoustic model training in automatic speech recognition. We test whether incorporating expanded knowledge at the level of parameter tying in acoustic modeling improves the performance of automatic speech recognition in Slovak. We propose a novel technique of knowledge-based triphone tying, which allows the synthesis of unseen triphones. The proposed technique is compared with decision tree-based state tying, and it is shown that for larger acoustic models, with 3000 states or more, a triphone-mapped HMM system achieves better performance than a tree-based state tying system on a large-vocabulary continuous speech transcription task. Experiments performed using 350 hours of a Slovak audio database of mixed read and spontaneous speech are presented. The relative decrease in word error rate was 4.23% for models with 7500 states and 4.13% for models with 11500 states.
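The relative decreases quoted above are ratios against a baseline word error rate, not absolute percentage-point differences. A minimal sketch of that arithmetic follows; the absolute WER values in the usage lines are hypothetical, since the abstract does not report them, and the function name is an illustrative choice.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER decrease in percent.

    Both arguments are absolute word error rates in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Hypothetical numbers for illustration only (not taken from the paper):
# a drop from 20.0 % to 19.0 % WER is 1 point absolute, 5 % relative.
example = relative_wer_reduction(20.0, 19.0)
```

Under this definition, a 4.23% relative decrease means the new WER is about 0.958 times the baseline WER, whatever the baseline was.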

language and technology conference | 2011

Slovak Automatic Dictation System for Judicial Domain

Milan Rusko; Jozef Juhár; Marián Trnka; Ján Staš; Sakhia Darjaa; Daniel Hládek; Róbert Sabo; Matus Pleva; Marian Ritomský; Martin Lojka

This paper describes the design, development and evaluation of the Slovak dictation system for the judicial domain. The speech is recorded using a close-talk microphone and the dictation system is used for on-line or off-line automatic transcription. The system provides an automatic dictation tool in Slovak for the employees of the Ministry of Justice of the Slovak Republic and all the courts in Slovakia. The system is designed for on-line dictation and off-line transcription of legal texts recorded in the acoustic conditions of a typical office. Details of the technical solution are given and an evaluation of different versions of the system is presented.

text, speech and dialogue | 2007

Sk-ToBI scheme for phonological prosody annotation in Slovak

Milan Rusko; Róbert Sabo; Martin Dzúr

Research and development in speech synthesis and recognition calls for a phonological intonation annotation scheme for the particular language. Inspired by the successful ToBI (Tones and Break Indices) for American English [1] and GToBI [2] for German, this paper introduces a new intonation annotation scheme for Slovak, Sk-ToBI. Although Slovak prosodic rules differ from those of English or German, we decided to follow the main principles of ToBI and to define a special Slovak version of the Tones and Break Indices annotation scheme. The speech material belonging to different styles, which was used for the preliminary study of accents in Slovak, is briefly described and the conventions of Sk-ToBI annotation are presented.

international conference on speech and computer | 2016

Stress, Arousal, and Stress Detector Trained on Acted Speech Database

Róbert Sabo; Milan Rusko; Andrej Ridzik; Jakub Rajčáni

This paper reports on initial experiments with the creation of a suitable database for training and testing systems for stress detection in speech and first experimental results. Based on the psychological understanding of the concepts of stress and emotion, we operationalized stress as a level of arousal, which can be detected in speech. We describe here a speech database with three levels of “acted stress” and three levels of soothing. For the very first experiment performed on the database we detect different levels of stress using Gaussian mixture models. The accuracy of detecting three levels of stress was 89 % for speakers included in the training database and 73 % for speakers whose recordings were not used during the adaptation of the GMM models.

language and technology conference | 2013

Advances in the Slovak Judicial Domain Dictation System

Milan Rusko; Jozef Juhár; Marián Trnka; Ján Staš; Sakhia Darjaa; Daniel Hládek; Róbert Sabo; Matus Pleva; Marian Ritomský; Stanislav Ondáš

This paper describes the evaluation of and recent advances in the application of a speech dictation system for the judicial domain. The dictation system incorporates Slovak speech recognition and uses a plugin for a widely used office suite. It was introduced recently after preliminary user evaluation in the Slovak courts. Compared to the previous version, the system was improved significantly using new acoustic databases for evaluation and acoustic modeling. The speaker adaptation procedure and gender-dependent models significantly improve the overall accuracy, bringing the WER below 5 % on a domain-specific test set. The language resources were extended and the language modeling techniques were improved, as described in the paper. An end-user questionnaire about the user interface was evaluated and new functionalities were introduced. According to the available feedback, it can be concluded that the dictation system is able to speed up court proceedings significantly for each user willing to cooperate with new technologies.

Journal of Linguistics/Jazykovedný casopis | 2017

Designing the Database of Speech Under Stress

Róbert Sabo; Jakub Rajčáni

This study describes the methodology used for designing a database of speech under real stress. Based on the limits of existing stress databases, we used a communication task via a computer game to collect speech data. To validate the presence of stress, known psychophysiological indicators such as heart rate and electrodermal activity, as well as subjective self-assessment, were used. This paper presents the data from the first 5 speakers (3 men, 2 women) who participated in initial tests of the proposed design. In 4 out of 5 speakers, increases in fundamental frequency and intensity of speech were registered. Similarly, in 4 out of 5 speakers, heart rate was significantly increased during the task when compared with a reference measurement from before the task. These first results show that the proposed design might be appropriate for building a speech under stress database. However, there are still considerations that need to be addressed.

text, speech and dialogue | 2014

Detecting Commas in Slovak Legal Texts

Róbert Sabo; Štefan Beňuš

This paper reports on initial experiments with automatic comma recovery in legal texts. In deciding whether to insert a comma or not, we propose to use the probability of the bigram of two words without a comma and the probability of the trigram of the same words with the comma between them. The probabilities are determined by a language model trained on sentences with commas labeled as separate words. In the training database, one sentence corresponds to one line. The bigram and trigram probability thresholds were experimentally determined to achieve the best balance of precision and recall. The advantage of the proposed method is its high precision (95%) at a relatively satisfactory recall (49%). For judges as potential users of an ASR system with an automatic comma insertion function, precision is particularly important.
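The comma decision described above can be sketched roughly as follows. This is not the authors' implementation: the abstract does not give the exact decision rule, the language model type, or the threshold values. The sketch assumes plain maximum-likelihood n-gram estimates, a rule that inserts a comma when the trigram probability is at least `tri_min` and the bigram probability is at most `bi_max`, and an invented English toy corpus. All function names and thresholds are hypothetical.

```python
from collections import Counter

COMMA = ","


def tokenize(line):
    # Commas become separate tokens, as in the training data the abstract describes.
    return line.replace(",", " , ").split()


def train(lines):
    # One sentence per line; collect unigram, bigram and trigram counts.
    uni, bi, tri = Counter(), Counter(), Counter()
    for line in lines:
        toks = tokenize(line)
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
        tri.update(zip(toks, toks[1:], toks[2:]))
    return uni, bi, tri


def p_bigram(model, w1, w2):
    # Assumed estimate: P(w2 | w1), the two words without a comma.
    uni, bi, _ = model
    return bi[(w1, w2)] / uni[w1] if uni[w1] else 0.0


def p_trigram_comma(model, w1, w2):
    # Assumed estimate: P(w2 | w1 ,), the two words with a comma between them.
    _, bi, tri = model
    context = bi[(w1, COMMA)]
    return tri[(w1, COMMA, w2)] / context if context else 0.0


def insert_commas(model, words, tri_min=0.5, bi_max=0.5):
    # Hypothetical rule and thresholds; the paper tunes its thresholds experimentally.
    out = []
    for w1, w2 in zip(words, words[1:]):
        out.append(w1)
        if (p_trigram_comma(model, w1, w2) >= tri_min
                and p_bigram(model, w1, w2) <= bi_max):
            out.append(COMMA)
    out.extend(words[-1:])
    return out


# Invented toy corpus, for illustration only.
model = train([
    "the court finds , that the claim is valid",
    "the judge ruled , that the appeal fails",
    "the court finds the defendant liable",
])
restored = insert_commas(model, "the court finds that the claim fails".split())
```

Raising `tri_min` or lowering `bi_max` makes the rule more conservative, which is the direction that favors precision over recall, in line with the preference the abstract states for judges as users.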

conference of the international speech communication association | 2011