Zdenek Hanzlícek
University of West Bohemia
Publication
Featured research published by Zdenek Hanzlícek.
international conference on signal processing | 2008
Zdenek Hanzlícek; Jindrich Matousek; Daniel Tihelka
In this paper, the project "Elimination of the Language Barriers Faced by the Handicapped Watchers of the Czech Television", aimed at making Czech TV broadcasting available to a broader group of TV watchers, is introduced. More specifically, the problems of automatic audio track generation within the project are discussed. As the audio track will be produced from subtitles, text-to-speech (TTS) technology will be utilised. Several versions of a TTS system planned to produce the audio track are described. The main attention is paid to the analysis of synchronicity between subtitles and the synthetic speech. Problems with fitting synthetic speech into the predefined subtitle slots were revealed: for more than 44% of all subtitles, the synthetic speech overlapped the slots. So, great care will have to be taken to produce speech at faster rates when customising our TTS system for the task of generating audio tracks from subtitles.
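The synchronicity analysis described in this abstract can be illustrated with a minimal sketch. It assumes each subtitle is available with its slot boundaries and the measured duration of the synthesized speech; the `Subtitle` class, its field names, and the helper functions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    start: float            # slot start time in seconds (assumed input)
    end: float              # slot end time in seconds (assumed input)
    speech_duration: float  # duration of the synthesized speech in seconds


def overlaps_slot(sub: Subtitle) -> bool:
    """True when the synthetic speech is longer than its subtitle slot."""
    return sub.speech_duration > (sub.end - sub.start)


def overlap_ratio(subs: list[Subtitle]) -> float:
    """Fraction of subtitles whose synthetic speech does not fit the slot."""
    if not subs:
        return 0.0
    return sum(overlaps_slot(s) for s in subs) / len(subs)


def required_speedup(sub: Subtitle) -> float:
    """Speaking-rate factor needed to fit the slot (1.0 means no change)."""
    slot = sub.end - sub.start
    return max(1.0, sub.speech_duration / slot)


subs = [
    Subtitle(0.0, 2.0, 1.5),
    Subtitle(2.0, 4.0, 3.0),
    Subtitle(4.0, 5.0, 1.0),
    Subtitle(5.0, 8.0, 4.5),
]
ratio = overlap_ratio(subs)
speedup = required_speedup(subs[1])
```

With real data, `overlap_ratio` would correspond to the kind of figure the abstract reports (more than 44%), and `required_speedup` gives a rough idea of how much faster the speech would have to be for a given subtitle.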
text, speech and dialogue | 2011
Jindřich Matoušek; Zdenek Hanzlícek; Michal Campr; Z. Krňoul; Pavel Campr; Martin Grůber
A web-based system for automatic reading of technical documents focused on vision-impaired primary-school students is presented in the paper. An overview of the system, both its backend (used by teachers to create and manage the documents) and frontend (used by students for viewing and reading the documents), is given. Text-to-speech synthesis utilised for the automatic reading and, especially, the automatic processing of mathematical and physical formulas are described as well.
international conference on signal processing | 2014
Martin Gruber; Jindrich Matousek; Daniel Tihelka; Zdenek Hanzlícek
This paper is focused on reducing the size of speech corpora used in unit-selection-based TTS systems. The size of a speech corpus influences system requirements such as storage and memory demands and computational complexity. For high-quality speech synthesis, the speech corpus usually consists of several thousand sentences. Thus, an appropriate reduction of the corpus size is likely to lead to a decrease in the system requirements. In this work, the impact on synthetic speech quality of removing specific instances of different linguistic segment types from the original corpus is compared. The following segment types are removed and compared with each other: whole sentences, phrases, words, and diphones. Only segments with rarely selected units are removed from the corpus so that the resulting footprint size reaches a predefined value. Results confirm that synthetic speech generated by the TTS systems using the reduced corpora is of slightly worse quality than speech produced by the system employing the original full corpus. A comparison of the reductions based on the different linguistic segments is also presented.
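The reduction idea (dropping segments whose units are rarely selected until a target footprint is reached) might be sketched as follows. This is only an illustration under assumptions: the tuple layout, the `reduce_corpus` name, and the use of a single per-segment selection count are hypothetical, and the abstract does not say how the selection statistics were gathered.

```python
def reduce_corpus(segments, target_size):
    """Greedily drop the least-selected segments until the corpus fits.

    segments: list of (segment_id, size, selection_count) tuples, where
    selection_count is an assumed statistic of how often units from the
    segment were chosen during synthesis.
    Returns the ids of the kept segments and the resulting total size.
    """
    # Most frequently used segments first, so the rarest sit at the end.
    kept = sorted(segments, key=lambda s: s[2], reverse=True)
    total = sum(size for _, size, _ in kept)
    while kept and total > target_size:
        _, size, _ = kept.pop()  # remove the least-selected segment
        total -= size
    return [seg_id for seg_id, _, _ in kept], total


segments = [("s1", 10, 50), ("s2", 20, 3), ("s3", 15, 0), ("s4", 5, 12)]
kept_ids, final_size = reduce_corpus(segments, target_size=30)
```

The same routine applies whether a segment is a sentence, phrase, word, or diphone; only the granularity of the input list changes.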
international conference on signal processing | 2008
Zdenek Hanzlícek; Jindrich Matousek
In this paper, a new approach to line spectral frequencies (LSF) transformation is introduced and employed in the voice conversion framework. This approach stems from the fact that LSFs are specific points on the frequency axis and their positions determine the shape of the spectral envelope. Thus, they can be transformed directly by frequency axis warping. Two warping functions were designed specifically for LSFs and compared with the traditional GMM-based conversion function. Listening tests and mathematical evaluation revealed that speech transformed by using the proposed warping functions is of higher quality and does not suffer from oversmoothing, which is common for GMM-based transformation. On the other hand, the speaker identity is slightly better transformed by GMM-based conversion. However, it is possible to combine these two approaches to obtain a compromise between quality and speaker identity.
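To make the idea of transforming LSFs by frequency axis warping concrete, here is a minimal sketch using a generic piecewise-linear warping function. The abstract does not specify the two warping functions proposed in the paper, so the anchor-point formulation and the `warp_lsfs` name below are assumptions chosen purely for illustration.

```python
import math
from bisect import bisect_right


def warp_lsfs(lsfs, src_anchors, tgt_anchors):
    """Map LSFs (radians in (0, pi)) through a piecewise-linear warp.

    src_anchors and tgt_anchors are increasing lists of equal length that
    define where source frequencies should land on the target axis. The
    end points 0 and pi are always mapped to themselves.
    """
    xs = [0.0, *src_anchors, math.pi]
    ys = [0.0, *tgt_anchors, math.pi]
    warped = []
    for f in lsfs:
        # Find the warping segment that contains f.
        i = min(max(bisect_right(xs, f) - 1, 0), len(xs) - 2)
        t = (f - xs[i]) / (xs[i + 1] - xs[i])
        warped.append(ys[i] + t * (ys[i + 1] - ys[i]))
    return warped


lsfs = [0.5, 1.0, 2.0]
warped = warp_lsfs(lsfs, src_anchors=[1.0], tgt_anchors=[1.2])
```

Because the warping function is monotonically increasing, the warped LSFs keep their ascending order, which is the property that keeps the corresponding synthesis filter stable.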
international symposium on signal processing and information technology | 2012
Daniel Tihelka; Zdenek Hanzlícek; Pavel Machač; Radek Skarnitzl; Jindrich Matousek
This paper presents a study on coarticulatory labialization and the significance of respecting or violating it during the selection and concatenation of speech units in unit selection speech synthesis. The aim of this study is to improve the overall speech quality, especially to make the joins between concatenated units perceptually less conspicuous. The importance of labialization was verified by two listening tests, one for phonetic laymen and one for specialists. To suppress the influence of other factors, both tests contained utterances with specially selected phones in specific contexts with respected and violated labialization. The preference for items with correct labialization was evident, which confirms the benefit of considering coarticulatory labialization in unit selection speech synthesis.
conference of the international speech communication association | 2005
Jindrich Matousek; Zdenek Hanzlícek; Daniel Tihelka
conference of the international speech communication association | 2017
Markéta Juzová; Daniel Tihelka; Jindrich Matousek; Zdenek Hanzlícek
SPECOM | 2018
Daniel Tihelka; Zdenek Hanzlícek; Markéta Juzová; Jindrich Matousek
conference of the international speech communication association | 2017
Martin Gruber; Jindrich Matousek; Zdenek Hanzlícek; Jakub Vít; Daniel Tihelka