
Publication


Featured research published by Ivan Kraljevski.


Computers & Electrical Engineering | 2014

Joint variable frame rate and length analysis for speech recognition under adverse conditions

Zheng-Hua Tan; Ivan Kraljevski

Highlights: A computationally efficient variable frame length and rate method is proposed. The method relies on an a posteriori signal-to-noise ratio weighted energy distance and improves speech recognition accuracy in noisy environments. Setting a proper range of allowed frame lengths is important.

This paper presents a method that combines variable frame length and rate analysis for speech recognition in noisy environments, together with an investigation of the effect of different frame lengths on speech recognition performance. The method adopts frame selection using an a posteriori signal-to-noise ratio (SNR) weighted energy distance and increases the length of the selected frames according to the number of non-selected preceding frames. It assigns a higher frame rate and a normal frame length to rapidly changing, high-SNR regions of a speech signal, and a lower frame rate and an increased frame length to steady or low-SNR regions. The speech recognition results show that the proposed variable frame rate and length method outperforms fixed frame rate and length analysis, as well as standalone variable frame rate analysis, in terms of noise robustness.
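The frame-selection idea from the abstract can be illustrated with a toy sketch: accumulate an energy distance between consecutive frames, weight it by an a posteriori SNR estimate, keep a frame once the accumulated distance crosses a threshold, and grow that frame's analysis length by the number of frames skipped before it. All names, the threshold value, and the exact distance are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def select_frames(frames, noise_energy, threshold=1.0):
    """Toy a posteriori-SNR-weighted energy-distance frame selection.

    frames       : 2-D array (num_frames x frame_len) of windowed samples.
    noise_energy : per-frame estimate of the noise energy.
    Returns the indices of selected frames and, for each, a frame-length
    multiplier that grows with the number of skipped preceding frames.
    """
    energy = np.sum(frames ** 2, axis=1) + 1e-12
    # A posteriori SNR: frame energy over estimated noise energy.
    snr = energy / (noise_energy + 1e-12)
    log_e = np.log(energy)

    selected = [0]          # always keep the first frame
    lengths = [1]           # frame-length multiplier per kept frame
    acc = 0.0
    skipped = 0
    for t in range(1, len(frames)):
        # Log-energy distance to the previous frame, weighted by SNR.
        acc += snr[t] * abs(log_e[t] - log_e[t - 1])
        if acc >= threshold:
            selected.append(t)
            lengths.append(1 + skipped)   # longer window after a steady region
            acc = 0.0
            skipped = 0
        else:
            skipped += 1
    return selected, lengths
```

On a signal with a long steady stretch followed by a sudden energy change, this keeps few frames in the steady part and assigns them longer windows, which mirrors the behaviour the abstract describes.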


International Conference on Speech and Computer | 2014

Analysis and Synthesis of Glottalization Phenomena in German-Accented English

Ivan Kraljevski; Maria Paola Bissiri; Guntram Strecha; Rüdiger Hoffmann

The present paper investigates the analysis and synthesis of glottalization phenomena in German-accented English. Word-initial glottalization was manually annotated in a subset of a German-accented English speech corpus. For each glottalized segment, time-normalized F0 and log-energy contours were produced and principal component analysis was performed on the contour sets in order to reduce their dimensionality. Centroid contours of the PC clusters were used for contour reconstruction in the resynthesis experiments. The prototype intonation and intensity contours were superimposed over non-glottalized word-initial vowels in order to resynthesize creaky voice. This procedure allows the automatic creation of speech stimuli which could be used in perceptual experiments for basic research on glottalizations.
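The dimensionality-reduction step described above, i.e. PCA over a set of time-normalized contours followed by reconstruction, can be sketched with plain numpy. The function names, the component count, and the use of SVD are illustrative assumptions; the paper's centroid-based resynthesis step is simplified here to reconstruction from PC scores.

```python
import numpy as np

def contour_pca(contours, n_components=3):
    """Reduce time-normalized F0 (or log-energy) contours with PCA.

    contours : 2-D array (num_segments x num_points), one contour per row.
    Returns the mean contour, the principal components, and the
    low-dimensional scores of each contour.
    """
    mean = contours.mean(axis=0)
    centered = contours - mean
    # Rows of vt are the principal directions of the centered contour set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return mean, components, scores

def reconstruct(mean, components, scores):
    """Rebuild contours from their PC scores."""
    return mean + scores @ components
```

With all components retained the reconstruction is exact; truncating to a few components yields the smoothed prototype contours that can then be superimposed on other segments.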


Conference on Computer as a Tool | 2013

Analysis-by-synthesis approach for acoustic model adaptation

Ivan Kraljevski; Frank Duckhorn; Guntram Strecha; Yitagessu Birhanu Gebremedhin; Matthias Wolff; Rüdiger Hoffmann

This paper presents an analysis-by-synthesis approach for acoustic model adaptation. Using artificial speech data to adapt speech recognition systems has the potential to address the problem of data sparseness, to avoid speech recordings in real conditions, and to allow a large number of development cycles for Automatic Speech Recognition (ASR) systems in a shorter time. The proposed adaptation framework uses a unified ASR and synthesis system to produce artificial adaptation speech signals. To confirm the usability of the proposed approach, several experiments were performed in which the artificial speech data was coded and decoded by different speech and waveform coders, and the acoustic model used for synthesis was adapted for each coder. The recognition results show that the proposed method can be used successfully to assess and improve the performance of speech recognition systems, not only for evaluating and adapting to coded-speech effects, but also for other environmental conditions.


Conference on Computer as a Tool | 2013

A new approach to develop a syllable based, continuous Amharic speech recognizer

Yitagessu Birhanu Gebremedhin; Frank Duckhorn; Rüdiger Hoffmann; Ivan Kraljevski

All previous syllable-based Automatic Speech Recognizers (ASRs) for the Amharic language were built by training a separate acoustic model for each of the 196 distinctly pronounced Consonant-Vowel (CV) syllables. In this paper, we demonstrate that a smaller number of acoustic models is sufficient to build a syllable-based, speaker-independent, continuous Amharic ASR. It is built for weather forecast and business report applications using the UASR (Unified Approach to Speech Synthesis and Recognition) toolkit. A new speech corpus of more than 35 hours' duration is used for training. It is a collection of corpora recorded in three different environments in order to make the recognizer less sensitive to changes in recording environment and microphone. The grammar is based on a finite state transducer, and the lexical model consists of thousands of words. Although acoustic models for only 93 syllables are trained, a recognition accuracy of 93.26% is achieved on a test set of 4,000 words collected from 10 speakers.


International Conference on ICT Innovations | 2012

Cross-Language Acoustic Modeling for Macedonian Speech Technology Applications

Ivan Kraljevski; Guntram Strecha; Matthias Wolff; Oliver Jokisch; Slavcho Chungurski; Rüdiger Hoffmann

This paper presents a cross-language development method for speech recognition and synthesis applications for the Macedonian language. A unified system for speech recognition and synthesis trained on German language data was used for acoustic model bootstrapping and adaptation. Both knowledge-based and data-driven approaches to source-to-target language phoneme mapping were used for the initial transcription and labeling of a small amount of recorded speech. Recognition experiments with the source-language acoustic model on the target-language dataset showed significant degradation of recognition performance. Acceptable performance was achieved after Maximum a Posteriori (MAP) model adaptation with a limited amount of target-language data, making the model suitable for small- to medium-vocabulary speech recognition applications. The same unified system was used again to train a new, separate acoustic model for HMM-based synthesis. Qualitative analysis showed that, despite the low quality of the available recordings and sub-optimal phoneme mapping, HMM synthesis produces perceptually good and intelligible synthetic speech.
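The core of MAP adaptation with limited target-language data can be illustrated on a single Gaussian mean: the adapted mean interpolates between the prior (source-language) mean and the sample mean of the adaptation data, weighted by the amount of data observed. The function name, the prior weight `tau`, and the single-Gaussian setting are simplifying assumptions; real systems apply this update per HMM/GMM component.

```python
import numpy as np

def map_adapt_mean(prior_mean, data, tau=10.0):
    """Toy MAP update of a Gaussian mean.

    prior_mean : mean from the source-language model.
    data       : 2-D array (num_frames x dim) of adaptation features.
    tau        : prior weight; larger values trust the prior more.
    With little data the result stays near the prior; with much data it
    approaches the sample mean of the adaptation data.
    """
    n = len(data)
    return (tau * prior_mean + data.sum(axis=0)) / (tau + n)
```

This interpolation is why a limited amount of target-language speech already yields acceptable performance: components with adaptation data move toward the new language, while sparsely observed ones stay close to the bootstrapped source-language model.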


Wearable and Implantable Body Sensor Networks | 2012

Adaptation of Models for Food Intake Sound Recognition Using Maximum a Posteriori Estimation Algorithm

Sebastian Päßler; Wolf-Joachim Fischer; Ivan Kraljevski


Conference of the International Speech Communication Association | 2018

Classification of Correction Turns in Multilingual Dialogue Corpus

Ivan Kraljevski; Diane Hirschfeld


Conference of the International Speech Communication Association | 2017

Hyperarticulation of Corrections in Multilingual Dialogue Systems

Ivan Kraljevski; Diane Hirschfeld


Conference of the International Speech Communication Association | 2015

Comparison of forced-alignment speech recognition and humans for generating reference VAD

Ivan Kraljevski; Zheng-Hua Tan; Maria Paola Bissiri


Speech Communication; 10th ITG Symposium | 2012

Acoustic Model Adaptation on Speech and Audio Coding Distortion

Ivan Kraljevski; Frank Duckhorn; Rüdiger Hoffmann

Collaboration

Top co-authors of Ivan Kraljevski:

Rüdiger Hoffmann, Dresden University of Technology
Guntram Strecha, Dresden University of Technology
Diane Hirschfeld, Dresden University of Technology
Matthias Wolff, Dresden University of Technology
Oliver Jokisch, Dresden University of Technology