Ovidiu Buza
Technical University of Cluj-Napoca
Publication
Featured research published by Ovidiu Buza.
International Conference on Communications | 2010
Ovidiu Buza; Gavril Toderean; József Domokos
We present our approach to building a text-to-speech system for Romanian. The main stages of this work were voice signal analysis, region segmentation, construction of the acoustic database, text analysis, unit and prosody detection, unit matching, concatenation, and speech synthesis. We take word syllables as the basic units and stress as the indicator of intrasegmental prosody. A distinguishing characteristic of the approach is rule-based processing in both the speech signal analysis and text analysis stages.
IEEE International Conference on Automation, Quality and Testing, Robotics | 2006
Alina Nica; Alexandru Caruntu; Gavril Toderean; Ovidiu Buza
We propose a software environment in Matlab for extracting the main features of Romanian vowels and for synthesizing the vowels. The analysis techniques used to estimate the parameters are time-domain analysis for energy and zero-crossing rate (ZCR), cepstral analysis for fundamental frequency, and linear predictive coding (LPC) for formants. The vowels are also synthesized with the LPC method.
IEEE International Conference on Automation, Quality and Testing, Robotics | 2006
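The two time-domain features named in this abstract can be sketched in a few lines. This is a minimal Python illustration, not the authors' Matlab environment: the function names, the frame length, the hop size, and the synthetic 100 Hz tone standing in for a vowel are all assumptions made for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Synthetic stand-in for a vowel (assumption): 100 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(800)]

# 30 ms frames with a 10 ms hop (hypothetical but typical values).
frames = frame_signal(tone, frame_len=240, hop=80)
energies = [short_time_energy(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
```

A voiced sound such as a vowel typically shows high energy and a low ZCR, which is the pattern this tone produces; noise-like unvoiced sounds tend to show the opposite.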
Ovidiu Buza; Gavril Toderean; Alina Nica; Alexandru Caruntu
We present a software application that can manipulate and analyse the speech signal, extract the characteristic parameters needed for speech synthesis, and enhance speech quality. We also present the main signal parameters used in speech synthesis, the facilities of the application, and the experimental results obtained.
Language Resources and Evaluation | 2015
József Domokos; Ovidiu Buza; Gavril Toderean
This paper presents a machine-readable Romanian pronunciation dictionary called NaviRo. The dictionary contains 138,500 unique words from the DexOnline dictionary together with their phonetic transcriptions in the Speech Assessment Methods Phonetic Alphabet (SAMPA). The development of the pronunciation dictionary and the validation tests performed are also described. The NaviRo pronunciation dictionary is freely available on the project website (http://users.utcluj.ro/~jdomokos/naviro) in plain text, Hidden Markov Model Toolkit, and Festival speech synthesis system dictionary formats. The grapheme and phoneme sets and the audio samples for the phonemes are also available for download. Use of these resources is completely unrestricted for any research purpose, in order to speed up Romanian speech technology research.
2009 Proceedings of the 5-th Conference on Speech Technology and Human-Computer Dialogue | 2009
Arpad Zsolt Bodo; Ovidiu Buza; Gavril Toderean
Prosody prediction and generation is one of the most critical phases in the development of a text-to-speech (TTS) synthesis system. Prosody is composed of the melody, intensity, and speed of speech. This paper focuses on the intonation of the Romanian language and on its prediction and generation within a TTS system. It presents several experimental results obtained during the development of different prosodic modules. The process is followed from text pre-processing and the determination of sentence types and subtypes, through automatic labeling of sentences, to the labeling and transposing of the intonation curves. The creation and storage of relative intonation curves and the different XML rule systems developed are also covered.
IEEE International Conference on Automation, Quality and Testing, Robotics | 2006
Alexandru Caruntu; Alina Nica; Gavril Toderean; Emanuel Puschita; Ovidiu Buza
In this paper we present a novel method for silence/unvoiced/voiced (SUV) classification of speech signals. The well-known algorithm for locating endpoints in an utterance, based on zero-crossing rate and energy, was our starting point. We added a few supplementary decision criteria to it and tested it using features such as Teager energy and entropy. Our experiments showed that these features performed better than the traditional energy measure for clean speech, but none of them produced a significant improvement in a noisy environment.
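The Teager energy feature mentioned in this abstract is commonly computed with the discrete Teager energy operator, ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below shows only that feature, under the assumption that this standard discrete form is the one intended; it does not reproduce the paper's decision criteria, and the function names and test tones are hypothetical.

```python
import math

def teager_energy(samples):
    """Discrete Teager energy operator applied to every interior sample."""
    return [samples[n] ** 2 - samples[n - 1] * samples[n + 1]
            for n in range(1, len(samples) - 1)]

def mean_teager_energy(frame):
    """Average Teager energy of one frame, usable as a per-frame feature."""
    values = teager_energy(frame)
    return sum(values) / len(values)

# Two synthetic tones of equal amplitude (assumption): 100 Hz and 1 kHz at 8 kHz.
SAMPLE_RATE = 8000
low = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(240)]
high = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(240)]

low_te = mean_teager_energy(low)
high_te = mean_teager_energy(high)
```

For a pure sinusoid A·sin(ωn) the operator yields A²·sin²(ω), so unlike the plain squared-amplitude energy it grows with frequency as well as amplitude. Here `high_te` is therefore much larger than `low_te` even though both tones have the same amplitude.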
2015 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) | 2015
Gavril Toderean; Ovidiu Buza; József Domokos
This article presents some of the voice synthesis methods designed and implemented at the research center of the Technical University of Cluj-Napoca: phoneme-based and diphone-based LPC synthesis, multipulse MPE synthesis, the NSM synthesis method, the RR_PSOLA variant of TD-PSOLA, a method based on syllable concatenation, and a corpus-based method. It also presents several voice synthesis systems that were built: the ROMVOX system, the SprintVox system, the LIGHTVOX system, and the HTS system.
Interdisciplinary Research in Engineering: Steps towards Breakthrough Innovation for Sustainable Development | 2013
József Domokos; László Sándor; Ovidiu Buza; Gavril Toderean
The aim of this article is to present a demonstration Web application with a multimodal interface based on Romanian continuous speech recognition. The paper also presents and tests the capabilities of a context-dependent, grapheme-based acoustic model for Romanian. The article describes the system architecture, the Web application development, and the speech database used for acoustic feature vector construction and acoustic model training. The task grammar is then presented. Finally, recognition results are presented for both offline and online operating modes. The speech corpora used, together with their transcriptions, are freely available for academic use on the NaviRo project website: http://users.utcluj.ro/~jdomokos/naviro/.
2013 7th Conference on Speech Technology and Human - Computer Dialogue (SpeD) | 2013
Ovidiu Buza; Gavril Toderean; Andras Balogh; József Domokos
This article presents an original algorithm for detecting the periodicity of the voice signal. Its main characteristics are precise determination of each period within a voiced segment of speech, accurate detection of pitch interval boundaries, and marking of the glottal peak of each period. The algorithm uses time-domain analysis of the signal, which makes it fast and efficient.
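The abstract does not give the algorithm's steps, so the sketch below is not the authors' method. It substitutes a generic time-domain technique, picking the lag that maximizes the autocorrelation, to show how a pitch period can be estimated without leaving the time domain. The function name, the lag search range, and the synthetic tone are assumptions.

```python
import math

def estimate_period(samples, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation score."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the segment at this lag.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic voiced segment (assumption): 100 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
segment = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(800)]

# Search lags corresponding to roughly 67 Hz to 400 Hz (hypothetical range).
period = estimate_period(segment, min_lag=20, max_lag=120)
f0 = SAMPLE_RATE / period
```

This gives one average period per segment. Marking each individual period and its glottal peak, as the abstract describes, would require additional per-cycle peak detection that is not attempted here.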
2009 Proceedings of the 5-th Conference on Speech Technology and Human-Computer Dialogue | 2009
József Domokos; Gavril Toderean; Ovidiu Buza
In this paper we present a synthesis of the theoretical fundamentals and some practical aspects of statistical (n-gram) language modeling, a main component of a large-vocabulary statistical speech recognition system. We present the unigram, bigram, and trigram language models as well as the Katz back-off smoothing algorithm based on the Good-Turing estimator. We also describe the perplexity measure used to evaluate a language model. The practical experiments were carried out on a Romanian Constitution corpus, and the text normalization steps applied before language model generation are presented. The results are ARPA-MIT format language models for Romanian. The models were tested and compared using the perplexity measure. Finally, some comparisons are made between Romanian and English language modeling, and conclusions are drawn.
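A bigram model and its perplexity can be illustrated with a small sketch. To stay short, it uses add-one (Laplace) smoothing in place of the Katz back-off scheme described in the paper, and a toy English corpus in place of the Romanian Constitution corpus. All function names are hypothetical, and the example assumes a closed vocabulary (every test word also appears in training).

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their histories over whitespace-tokenized sentences."""
    history_counts, bigram_counts = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])            # every token that can be predicted
        history_counts.update(tokens[:-1])  # every token that acts as a history
        bigram_counts.update(zip(tokens, tokens[1:]))
    return history_counts, bigram_counts, vocab

def bigram_prob(model, history, word):
    """Add-one smoothed P(word | history); a stand-in for Katz back-off."""
    history_counts, bigram_counts, vocab = model
    return ((bigram_counts[(history, word)] + 1)
            / (history_counts[history] + len(vocab)))

def perplexity(model, sentences):
    """exp of the average negative log-probability per predicted token."""
    log_sum, n_tokens = 0.0, 0
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for history, word in zip(tokens, tokens[1:]):
            log_sum += math.log(bigram_prob(model, history, word))
            n_tokens += 1
    return math.exp(-log_sum / n_tokens)

# Toy corpus (assumption), used only to exercise the functions.
model = train_bigram(["a b", "a b"])
seen_pp = perplexity(model, ["a b"])
unseen_pp = perplexity(model, ["b a"])
```

Lower perplexity means the model finds the text less surprising: the word order seen in training scores lower than the reversed order, which is the kind of comparison the paper uses to rank its language models.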