Zdenek Hanzlícek | Researchain

Publication

Featured research published by Zdenek Hanzlícek.

international conference on signal processing | 2008

Towards automatic audio track generation for Czech TV broadcasting: Initial experiments with subtitles-to-speech synthesis

Zdenek Hanzlícek; Jindrich Matousek; Daniel Tihelka

In this paper, the project "Elimination of the Language Barriers Faced by the Handicapped Watchers of the Czech Television", aimed at making Czech TV broadcasting available to a broader group of TV watchers, is introduced. More specifically, the problems of automatic audio track generation within the project are discussed. As the audio track will be produced from subtitles, text-to-speech (TTS) technology will be utilised. Several versions of a TTS system planned to produce the audio track are described. The main attention is paid to the analysis of synchronicity between subtitles and the synthetic speech. Problems with fitting synthetic speech into the predefined subtitle slots were revealed: for more than 44% of all subtitles, the synthetic speech overlapped the slots. Great care will therefore have to be taken to produce speech at faster rates when customising the TTS system for the task of generating audio tracks from subtitles.
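The synchronicity analysis described in this abstract can be illustrated with a minimal sketch. It assumes each subtitle is represented by its slot boundaries and the duration of the speech synthesized for it; the `Subtitle` class and both function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    start: float            # slot start time in seconds (assumed representation)
    end: float              # slot end time in seconds
    speech_duration: float  # duration of the synthesized speech in seconds


def overlap_ratio(subtitles):
    """Return the fraction of subtitles whose speech is longer than its slot."""
    if not subtitles:
        return 0.0
    overlapping = sum(
        1 for s in subtitles if s.speech_duration > (s.end - s.start)
    )
    return overlapping / len(subtitles)


def required_rate_factor(sub):
    """Return the speed-up factor needed to fit the slot (1.0 means no change)."""
    slot = sub.end - sub.start
    return max(1.0, sub.speech_duration / slot)


subs = [
    Subtitle(0.0, 2.0, 1.5),
    Subtitle(2.0, 4.0, 3.0),
    Subtitle(4.0, 5.0, 1.0),
    Subtitle(5.0, 8.0, 4.5),
]
ratio = overlap_ratio(subs)
factor = required_rate_factor(subs[1])
```

A speech duration exactly equal to the slot length is treated here as fitting; whether the paper counts that case as an overlap is not stated in the abstract.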

text, speech and dialogue | 2011

Web-based system for automatic reading of technical documents for vision impaired students

Jindřich Matoušek; Zdenek Hanzlícek; Michal Campr; Z. Krňoul; Pavel Campr; Martin Grůber

A web-based system for automatic reading of technical documents focused on vision-impaired primary-school students is presented in the paper. An overview of the system, both its backend (used by teachers to create and manage the documents) and frontend (used by students for viewing and reading the documents), is given. Text-to-speech synthesis utilised for the automatic reading and, especially, the automatic processing of mathematical and physical formulas are described as well.

international conference on signal processing | 2014

Reducing footprint of unit selection TTS system by removing linguistic segments with rarely selected units

Martin Gruber; Jindrich Matousek; Daniel Tihelka; Zdenek Hanzlícek

This paper is focused on reducing the size of speech corpora that are used in unit-selection-based TTS systems. The size of a speech corpus influences system requirements such as storage and memory demands and computational complexity. For high-quality speech synthesis, the speech corpus usually consists of several thousand sentences. Thus an appropriate reduction of the corpus size is likely to lead to a decrease in the system requirements. In this work, the impact on synthetic speech quality is compared when specific instances of different linguistic segment types are removed from the original corpus. Removal of the following segment types is compared: whole sentences, phrases, words, and diphones. Only segments with rarely selected units are removed from the corpus, so that the resulting footprint size reaches a predefined value. Results confirm that synthetic speech generated by the TTS systems using the reduced corpora is of slightly worse quality than speech produced by the system employing the original full corpus. A comparison of the reductions based on different linguistic segments is also presented.
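The idea of removing segments with rarely selected units until a target footprint is reached can be sketched as a simple greedy procedure. This is only an illustration under stated assumptions: each segment is given as a tuple of an identifier, a size, and the selection counts of its units, and segments are ranked by the mean selection count. The ranking criterion and the function name `reduce_corpus` are hypothetical; the abstract does not specify how the paper scores segments.

```python
def reduce_corpus(segments, target_size):
    """Greedily drop the least-selected segments until the size target is met.

    segments: list of (segment_id, size, unit_selection_counts) tuples,
              where unit_selection_counts is a non-empty list of integers.
    Returns (kept_ids, removed_ids).
    """
    def mean_count(segment):
        counts = segment[2]
        return sum(counts) / len(counts)

    # Most frequently selected segments first, so pop() yields the rarest.
    kept = sorted(segments, key=mean_count, reverse=True)
    total = sum(size for _, size, _ in kept)
    removed = []
    while kept and total > target_size:
        seg_id, size, _ = kept.pop()
        total -= size
        removed.append(seg_id)
    return [seg_id for seg_id, _, _ in kept], removed


corpus = [
    ("s1", 10, [5, 7]),
    ("s2", 20, [0, 1]),
    ("s3", 15, [3, 3]),
    ("s4", 5, [0, 0]),
]
kept_ids, removed_ids = reduce_corpus(corpus, target_size=30)
```

The same routine applies whether a segment is a sentence, phrase, word, or diphone; only the granularity of the input tuples changes.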

international conference on signal processing | 2008

On using warping function for LSFs transformation in a voice conversion system

Zdenek Hanzlícek; Jindrich Matousek

In this paper, a new approach to line spectral frequencies (LSFs) transformation is introduced and employed in the voice conversion framework. This approach stems from the fact that LSFs are specific points on the frequency axis and their positions determine the shape of the spectral envelope. Thus, they can be transformed directly by frequency axis warping. Two warping functions were designed specially for LSFs and compared with the traditional GMM-based conversion function. Listening tests and mathematical evaluation revealed that speech transformed by using the proposed warping functions is of higher quality and does not suffer from oversmoothing, which is common for GMM-based transformation. On the other hand, the speaker identity is slightly better transformed by GMM-based conversion. However, it is possible to combine these two approaches to obtain a compromise between quality and speaker identity.
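To illustrate what transforming LSFs by frequency axis warping means, the sketch below applies a generic piecewise-linear warping function to a vector of LSFs expressed in radians on the interval from 0 to pi. It is not one of the two warping functions proposed in the paper, whose form the abstract does not give; the knot positions and the name `warp_lsfs` are assumptions made for the example.

```python
import math
from bisect import bisect_right


def warp_lsfs(lsfs, src_knots, tgt_knots):
    """Map each LSF through a piecewise-linear warping of the frequency axis.

    src_knots and tgt_knots are strictly increasing lists of equal length with
    values inside (0, pi); the endpoints 0 and pi are always mapped to
    themselves, so the warping is monotone and keeps the LSFs ordered.
    """
    xs = [0.0] + list(src_knots) + [math.pi]
    ys = [0.0] + list(tgt_knots) + [math.pi]
    warped = []
    for f in lsfs:
        # Find the linear piece containing f, clamped to a valid index.
        i = min(max(bisect_right(xs, f) - 1, 0), len(xs) - 2)
        t = (f - xs[i]) / (xs[i + 1] - xs[i])
        warped.append(ys[i] + t * (ys[i + 1] - ys[i]))
    return warped


warped = warp_lsfs([0.5, 1.0, 2.0], src_knots=[1.0], tgt_knots=[1.5])
```

Because the mapping is monotone, the warped LSFs stay in ascending order, which is the property that keeps the corresponding synthesis filter stable.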

international symposium on signal processing and information technology | 2012

On the impact of labialization contexts on unit selection speech synthesis

Daniel Tihelka; Zdenek Hanzlícek; Pavel Machač; Radek Skarnitzl; Jindrich Matousek

This paper presents a study on coarticulatory labialization and the significance of respecting or violating it during the selection and concatenation of speech units in unit selection speech synthesis. The aim of this study is to improve the overall speech quality, especially to make the joins between concatenated units less perceptible. The importance of labialization was verified by two listening tests, one for phonetic laymen and one for specialists. To suppress the influence of other factors, both tests contained utterances with specially selected phones in specific contexts with respected and violated labialization. The preference for items with correct labialization was evident, which confirms the benefit of considering coarticulatory labialization in unit selection speech synthesis.

conference of the international speech communication association | 2005