Publication


Featured research published by Vlasta Radová.


international conference on acoustics, speech, and signal processing | 1997

An approach to speaker identification using multiple classifiers

Vlasta Radová; Josef Psutka

This paper addresses the speaker identification problem. The attributes representing the voice of a particular speaker are obtained from very short segments of the speech waveform, each corresponding to a single pitch period of a vowel. The patterns formed from the samples of a pitch period waveform are either matched in the time domain using a nonlinear time warping method, known as dynamic time warping (DTW), or converted into cepstral coefficients and compared using the cepstral distance measure. Since an uttered speech signal usually contains many vowels, techniques that combine both various classifiers and multiple classifier outputs are considered in the decision-making process. Experiments performed with a hundred speakers are described.
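As an illustration of the time-domain matching mentioned in the abstract, the following is a minimal sketch of the textbook DTW recurrence, not the authors' implementation. The function names `dtw_distance` and `identify`, the absolute-difference local cost, and the nearest-reference decision rule are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Assumption: local cost is the absolute sample difference, and the
    allowed steps are insertion, deletion, and match.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already used b sample
                cost[i][j - 1],      # b[j-1] aligned to an already used a sample
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def identify(pattern, references):
    """Return the (hypothetical) speaker label whose reference pattern is
    closest to `pattern` under DTW. `references` maps label -> sequence."""
    return min(references, key=lambda label: dtw_distance(pattern, references[label]))
```

For example, `identify([1, 2, 2, 3], {"A": [1, 2, 3], "B": [3, 2, 1]})` picks the reference that can be warped onto the pattern at the lowest cost.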


text speech and dialogue | 1999

Methods of Sentences Selection for Read-Speech Corpus Design

Vlasta Radová; Petr Vopálka

In this paper, methods are proposed for selecting a set of phonetically balanced sentences. The principle of the methods is presented and some experimental results are given. At the end of the paper, the use of the proposed methods for the design of the Czech read-speech corpus is described in detail and the structure of the corpus is explained.


text speech and dialogue | 2002

Automatic Transcription of Czech Language Oral History in the MALACH Project: Resources and Initial Experiments

Josef Psutka; Pavel Ircing; Vlasta Radová; William Byrne; Jan Hajic; Samuel Gustman; Bhuvana Ramabhadran

In this paper we describe the initial stages of the ASR component of the MALACH (Multilingual Access to Large Spoken Archives) project. This project will attempt to provide improved access to the large multilingual spoken archives collected by the Survivors of the Shoah Visual History Foundation (VHF) by advancing the state of the art in automated speech recognition. In order to train the ASR system, it is necessary to manually transcribe a large amount of speech data, identify the appropriate vocabulary, and obtain relevant text for language modeling. We give a detailed description of the speech annotation process, show the specific properties of the spontaneous speech contained in the archives, and present baseline speech recognition results.


text speech and dialogue | 1999

Speaker Identification Based on Vector Quantization

Vlasta Radová; Zdenek Svenda

In this paper a method of text-independent speaker recognition using discrete vector quantization is presented. The identification experiments were performed in a closed set of 599 speakers, and two different types of features were tested: cepstral mean subtraction coefficients and mel-frequency cepstral coefficients. The effect of codebook size on speaker identification performance was investigated.


international conference on speech and computer | 2016

Investigation of Segmentation in i-Vector Based Speaker Diarization of Telephone Speech

Zbyněk Zajíc; Marie Kunešová; Vlasta Radová

The goal of this paper is to evaluate the contribution of speaker change detection (SCD) to the performance of a speaker diarization system in the telephone domain. We compare the overall performance of an i-vector based system using both SCD-based segmentation and a naive constant length segmentation with overlapping segments. The diarization system performs K-means clustering of i-vectors which represent the individual segments, followed by a resegmentation step. Experiments were done on the English part of the CallHome corpus. The final results indicate that the use of speaker change detection is beneficial, but the differences between the two segmentation approaches are diminished by the use of resegmentation.
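The naive constant-length segmentation with overlapping segments can be pictured with a short sketch. The window length and shift below are illustrative defaults only, and the function name `constant_length_segments` is hypothetical; the abstract does not state the values the authors used.

```python
def constant_length_segments(duration, length=2.0, shift=1.0):
    """Cut [0, duration] seconds into fixed-length windows that overlap
    whenever shift < length. The last window is truncated at `duration`.

    Assumption: `length` and `shift` are placeholder values, not taken
    from the paper.
    """
    if length <= 0 or shift <= 0:
        raise ValueError("length and shift must be positive")
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + length, duration)
        segments.append((start, end))
        if end >= duration:
            break  # the recording is fully covered
        start += shift
    return segments
```

In a system of the kind described, each returned `(start, end)` pair would then be represented by one i-vector before clustering.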


text speech and dialogue | 2000

Recording and Annotation of the Czech Speech Corpus

Vlasta Radová; Josef Psutka

This paper follows on from our papers presented at the previous TSD workshops [2, 3] and concerns the Czech speech corpus which is being developed at the Department of Cybernetics, University of West Bohemia in Pilsen. It describes the procedures of corpus recording and annotation.


text speech and dialogue | 2004

On the Background Model Construction for Speaker Verification Using GMM

Aleš Padrta; Vlasta Radová

A method of speaker verification based on Gaussian mixture models is presented in this paper. The method works with a background model which is composed of several submodels. Several different approaches for construction of the background model from the submodels are introduced here: the log likelihood of the background model is determined either as the average of the log likelihoods of the particular submodels, or a maximum from the log likelihoods of the particular submodels is selected. A large number of experiments was performed in order to find which of the approaches gives the best result. All experiments show that procedures which use a maximum of the log likelihoods of the background submodels have better performance than the procedure which uses the average log likelihood.
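The two ways of combining submodel scores described in the abstract can be sketched as follows. The function names are hypothetical, and the log-likelihood-ratio score in `verification_score` is a common convention in GMM-based verification that is assumed here; the abstract itself does not spell out the final decision rule.

```python
def background_log_likelihood(submodel_log_likelihoods, mode="max"):
    """Combine the log likelihoods of the background submodels.

    mode="average": mean of the submodel log likelihoods
    mode="max":     largest submodel log likelihood
    """
    if not submodel_log_likelihoods:
        raise ValueError("at least one submodel log likelihood is required")
    if mode == "average":
        return sum(submodel_log_likelihoods) / len(submodel_log_likelihoods)
    if mode == "max":
        return max(submodel_log_likelihoods)
    raise ValueError(f"unknown mode: {mode!r}")


def verification_score(speaker_log_likelihood, submodel_log_likelihoods, mode="max"):
    """Assumed scoring rule: claimed-speaker log likelihood minus the
    combined background log likelihood (a log-likelihood ratio)."""
    return speaker_log_likelihood - background_log_likelihood(
        submodel_log_likelihoods, mode
    )
```

Because the maximum is never smaller than the average, the "max" variant always yields a background score at least as high, and so a score that is never higher, for the same input.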


text speech and dialogue | 2017

Experiments with Segmentation in an Online Speaker Diarization System

Marie Kunešová; Zbyněk Zajíc; Vlasta Radová

In offline speaker diarization systems, particularly those aimed at telephone speech, the accuracy of the initial segmentation of a conversation is often a secondary concern. Imprecise segment boundaries are typically corrected during resegmentation, which is performed as the final step of the diarization process. However, such resegmentation is generally not possible in online systems, where past decisions are usually unchangeable. In such situations, correct segmentation becomes critical. In this paper, we evaluate several different segmentation approaches in the context of online diarization by comparing the overall performance of an i-vector-based diarization system set to operate in a sequential manner.


text speech and dialogue | 2014

Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights

Josef Psutka; Aleš Pražák; Vlasta Radová

In this paper, we describe our effort and some interesting insights obtained while captioning more than 70 hours of live TV broadcasts from the Olympic Games in Sochi. The closed captioning was prepared for CT Sport, the sports channel of the public service broadcaster in the Czech Republic. We briefly discuss our distributed captioning architecture for live TV programs using the re-speaking approach, several modifications of the existing live captioning application (especially the LVCSR system), and the way a real TV commentary is re-spoken for individual sports. We show that, after intensive training, a re-speaker can achieve accuracy (more than 98%) and readability of captions that clearly outperform captions created by automatic recognition of the TV soundtrack.


text, speech and dialogue | 2018

Recurrent Neural Network Based Speaker Change Detection from Text Transcription Applied in Telephone Speaker Diarization System

Zbyněk Zajíc; Daniel Soutner; Marek Hrúz; Luděk Müller; Vlasta Radová

In this paper, we propose a speaker change detection system based on lexical information from the transcribed speech. For this purpose, we applied a recurrent neural network to decide whether there is an end of an utterance at the end of a spoken word. Our motivation is to use the transcription of the conversation as an additional feature for a speaker diarization system, refining the segmentation step to achieve better accuracy of the whole diarization system. We compare the proposed speaker change detection system based on transcription (text) with our previous system based on information from the spectrogram (audio), and combine these two modalities to improve the results of diarization. We cut the conversation into segments according to the detected changes and represent each segment by an i-vector. We conducted experiments on the English part of the CallHome corpus. The results indicate a relative improvement in speaker change detection (by 0.5%) and also in speaker diarization (by 1%) when both modalities are used.

Collaboration


Dive into Vlasta Radová's collaboration.

Top Co-Authors

Josef Psutka

University of West Bohemia

Pavel Ircing

University of West Bohemia

Luděk Müller

University of West Bohemia

Josef V. Psutka

University of West Bohemia

Aleš Padrta

University of West Bohemia

Jan Hajic

Charles University in Prague

Petra Zochová

University of West Bohemia

Marie Kunešová

University of West Bohemia
