Stevan Ostrogonac | Researchain

Publication

Featured research published by Stevan Ostrogonac.

international conference on speech and computer | 2015

Deep Neural Network Based Continuous Speech Recognition for Serbian Using the Kaldi Toolkit

Branislav M. Popovic; Stevan Ostrogonac; Edvin Pakoci; Niksa Jakovljevic; Vlado Delić

This paper presents a deep neural network (DNN) based large vocabulary continuous speech recognition (LVCSR) system for Serbian, developed using the open-source Kaldi speech recognition toolkit. The DNNs are initialized using stacked restricted Boltzmann machines (RBMs) and trained using cross-entropy as the objective function and the standard error backpropagation procedure in order to provide posterior probability estimates for the hidden Markov model (HMM) states. Emission densities of HMM states are represented as Gaussian mixture models (GMMs). The recipes were modified based on the particularities of the Serbian language in order to achieve optimal results. A corpus of approximately 90 hours of speech (21000 utterances) was used for training. Performance is compared on two different sets of utterances between the baseline GMM-HMM algorithm and various DNN settings.

The Scientific World Journal | 2015

Educational Applications for Blind and Partially Sighted Pupils Based on Speech Technologies for Serbian

Branko Lučić; Stevan Ostrogonac; Nataša Vujnović Sedlar; Milan Sečujski

The inclusion of persons with disabilities has always represented an important issue. Advancements within the field of computer science have enabled the development of different types of aids, which have significantly improved the quality of life of the disabled. However, for some disabilities, such as visual impairment, the purpose of these aids is to establish an alternative communication channel and thus overcome the user's disability. Speech technologies play a crucial role in this process. This paper presents the ongoing efforts to create a set of educational applications based on speech technologies for Serbian for the early stages of education of blind and partially sighted children. Two educational applications dealing with memory exercises and comprehension of geometrical shapes are presented, along with the initial test results obtained from research including visually impaired pupils.

international symposium on intelligent systems and informatics | 2012

A language model for highly inflective non-agglutinative languages

Stevan Ostrogonac; Dragiša Mišković; Milan Sečujski; Darko Pekar; Vlado Delić

This paper proposes a method of creating language models for highly inflective non-agglutinative languages. Three types of language models were considered: a common n-gram model, an n-gram model of lemmas and a class n-gram model. The last two types were specially designed for the Serbian language, reflecting its unique grammar structure. All the language models were trained on a carefully collected data set incorporating several literary styles and a great variety of domain-specific textual documents in Serbian. Language models of the three types were created for different sets of textual corpora and evaluated by the perplexity values they yielded on the test data. A log-linear combination of the common, lemma-based and class n-gram models was also created and shows promising results in overcoming the data sparsity problem. However, the evaluation of this combined model in the context of a large vocabulary continuous speech recognition (LVCSR) system is yet to be done in order to establish the improvement in terms of word error rate (WER).
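
The two ideas named in this abstract, perplexity and log-linear combination, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy distributions, the weights and the function names are assumptions made for illustration, and it assumes each component model has already been mapped to a probability over the same set of candidate words.

```python
import math

def log_linear_combine(dists, weights):
    """Combine distributions over the same vocabulary:
    p(w) is proportional to the product of p_i(w) ** lambda_i."""
    vocab = dists[0].keys()
    scores = {
        w: math.exp(sum(lam * math.log(d[w]) for d, lam in zip(dists, weights)))
        for w in vocab
    }
    z = sum(scores.values())  # normalization constant
    return {w: s / z for w, s in scores.items()}

def perplexity(token_probs):
    """Perplexity from the probability a model assigned to each test token."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

# Invented next-word distributions from two hypothetical component models.
word_model = {"kuća": 0.7, "kuće": 0.2, "kući": 0.1}
lemma_model = {"kuća": 0.4, "kuće": 0.4, "kući": 0.2}

# Equal weights are an arbitrary choice; in practice they would be tuned.
combined = log_linear_combine([word_model, lemma_model], [0.5, 0.5])
```

Lower perplexity on held-out text means the model found that text less surprising, which is why it serves as an evaluation measure before a full recognition experiment is run.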

telecommunications forum | 2012

Impact of training corpus size on the quality of different types of language models for Serbian

Stevan Ostrogonac; Milan Sečujski; Dragiša Mišković

This paper describes a study on the correspondence between language model quality and the size of the textual corpus used in the training process. Three types of n-gram models developed for the Serbian language were included in the study: word-based, lemma-based and class-based models. They were created in order to deal with the data sparsity problem, which is very pronounced because of the high degree of inflection of the Serbian language. The three model types were trained on corpora of different sizes and evaluated by perplexity on authentic text and on text with random word order in order to obtain the discrimination coefficient values. These values show different degrees of robustness of the three model types to the data sparsity problem and indicate a way of combining these models in order to achieve the best language representation for a given training corpus.
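
The abstract does not define the discrimination coefficient. One plausible reading, assumed here purely for illustration, is the ratio of a model's perplexity on word-shuffled text to its perplexity on the authentic text. The sketch below uses a toy add-one-smoothed bigram model on English words; the model, the corpus and the function names are all hypothetical.

```python
import math
import random
from collections import Counter

def train_bigram(tokens):
    """Return (bigram counts, left-context counts, vocabulary size)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])
    return bigrams, contexts, len(set(tokens))

def perplexity(tokens, model):
    """Per-bigram perplexity under add-one smoothing."""
    bigrams, contexts, vocab_size = model
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

def discrimination_ratio(authentic, scrambled, model):
    """Assumed definition: perplexity(scrambled) / perplexity(authentic)."""
    return perplexity(scrambled, model) / perplexity(authentic, model)

train = "the cat sat on the mat the dog sat on the rug".split()
model = train_bigram(train)

authentic = "the cat sat on the mat".split()
scrambled = authentic[:]
random.Random(0).shuffle(scrambled)  # seeded so the run is repeatable
ratio = discrimination_ratio(authentic, scrambled, model)
```

Under this reading, a ratio well above 1 indicates that the model has learned word-order regularities, while a ratio near 1 suggests that it barely distinguishes real text from shuffled text. On a corpus this small a single shuffle can land on a likely word order by chance, so a real experiment would average over much more text.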

telecommunications forum | 2013

Speech resources for a Serbian LVCSR system

Stevan Ostrogonac; Siniša Suzić; Milana Bojanić; Edvin Pakoci

This paper describes the whole procedure of speech database collection and processing required for building a good large vocabulary speech recognition system for the Serbian language. The speech database consists of speech recordings from audio books, radio programs and talk shows, as well as read utterances from an array of male and female speakers. To date, around 200 hours of read speech have been collected, as well as about 10 hours of radio recordings.

telecommunications forum | 2012

Subjective assessment of text to speech synthesis systems for the Serbian language

Edvin Pakoci; Robert Mak; Stevan Ostrogonac

This paper gives a short overview of contemporary text-to-speech (TTS) systems available for the Serbian language and then presents the results of subjective assessment tests of the quality of synthesized speech generated by these systems. Its main goal is to show the improvement in speech quality obtained using the new hidden Markov model (HMM) based speech synthesis system for Serbian.

telecommunications forum | 2011

DRT and SUS intelligibility tests for synthesized speech in the Serbian language

Stevan Ostrogonac; Milan Sečujski

This paper describes synthesized speech intelligibility testing using DRT (Diagnostic Rhyme Test) and SUS (Semantically Unpredictable Sentences) for the Serbian language. A description of SUSmaker, a program for creating semantically unpredictable sentences for five basic sentence structures in Serbian, is also given. An overview of the AlfaNum speech synthesizer evaluation results is presented, along with a discussion of the course of further research.
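
The general idea behind generating semantically unpredictable sentences can be sketched as filling a fixed grammatical template with randomly chosen words. The Python sketch below is not SUSmaker: the English lexicon, the two templates and the function name are invented for illustration, and the five Serbian sentence structures mentioned in the abstract are not reproduced here.

```python
import random

# Hypothetical mini-lexicon, grouped by part of speech.
LEXICON = {
    "DET": ["the", "a"],
    "ADJ": ["green", "loud", "square"],
    "NOUN": ["table", "river", "idea"],
    "VERB": ["eats", "paints", "follows"],
}

# Hypothetical templates: each is a sequence of part-of-speech slots.
TEMPLATES = [
    ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"],
]

def make_sus(rng, template):
    """Fill each slot with a random word of the required part of speech."""
    return " ".join(rng.choice(LEXICON[slot]) for slot in template)

rng = random.Random(0)  # seeded so the run is repeatable
sentences = [make_sus(rng, template) for template in TEMPLATES]
```

Because the words are drawn independently, the sentences are grammatical but carry no predictable meaning, so listeners cannot guess a word from its context and must rely on what they actually hear.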

international conference on speech and computer | 2013

Speech and Language Resources within Speech Recognition and Synthesis Systems for Serbian and Kindred South Slavic Languages

Vlado Delić; Milan Sečujski; Niksa Jakovljevic; Darko Pekar; Dragiša Mišković; Branislav M. Popovic; Stevan Ostrogonac; Milana Bojanić; Dragan Knežević

ieee international conference on cognitive infocommunications | 2013