Publication


Featured research published by Luboš Šmídl.


EURASIP Journal on Audio, Speech, and Music Processing | 2011

System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive

Josef Psutka; Jan Švec; Jan Vaněk; Aleš Pražák; Luboš Šmídl; Pavel Ircing

The main objective of the work presented in this paper was to develop a complete system that would accomplish the original visions of the MALACH project. Those goals were to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive containing recorded testimonies of Holocaust survivors. The system has so far been developed for the Czech part of the archive only. It takes advantage of a state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech and emotionally loaded content) and its close coupling with the actual search engine. The design of the algorithm adopting the spoken term detection approach is focused on the speed of retrieval. The resulting system is able to search through the 1,000 h of video constituting the Czech portion of the archive and find query word occurrences in a matter of seconds. The phonetic search implemented alongside the search based on lexicon words makes it possible to find even words outside the ASR system lexicon, such as names, geographic locations or Jewish slang.
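The combination of lexicon-based and phonetic search described above can be illustrated with a toy index. This is only a minimal sketch under stated assumptions, not the system from the paper: the class `TermIndex`, its methods, the `(recording, start time, confidence)` hit format and the caller-supplied `pronounce` function are all hypothetical, and the phonetic fallback uses exact matching on a single phoneme string rather than any lattice-based or approximate search.

```python
from collections import defaultdict


class TermIndex:
    """Toy spoken-term index: a word lookup table plus a phoneme-string fallback."""

    def __init__(self):
        # word -> list of (recording_id, start_sec, confidence)
        self.words = defaultdict(list)
        # recording_id -> list of (phoneme, start_sec)
        self.phones = {}

    def add_word(self, rec, word, start, conf):
        self.words[word.lower()].append((rec, start, conf))

    def add_phonemes(self, rec, timed_phonemes):
        self.phones[rec] = list(timed_phonemes)

    def search(self, term, pronounce=None, min_conf=0.0):
        # Lexical search: one dictionary lookup, independent of archive size.
        hits = [h for h in self.words.get(term.lower(), []) if h[2] >= min_conf]
        if hits or pronounce is None:
            return sorted(hits, key=lambda h: -h[2])
        # Fallback for words with no lexical hit: scan the phoneme strings
        # for the query pronunciation (exact match; an assumption).
        target = pronounce(term)
        if not target:
            return []
        found = []
        for rec, seq in self.phones.items():
            labels = [p for p, _ in seq]
            for i in range(len(labels) - len(target) + 1):
                if labels[i:i + len(target)] == target:
                    found.append((rec, seq[i][1], None))
        return found
```

The lexical path returns hits ordered by confidence, while the fallback returns hits without a confidence value, which keeps the two kinds of evidence distinguishable for the caller.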


Text, Speech and Dialogue | 2000

Design of Speech Recognition Engine

Luděk Müller; Josef Psutka; Luboš Šmídl

This paper concerns a speaker-independent recognition engine for continuous Czech speech designed for Czech telephone applications. It describes the recognition module as an important component of a telephone dialogue system being designed and constructed at the Department of Cybernetics, University of West Bohemia. The recognition is based on a statistical approach. Left-to-right three-state HMMs with output probability density functions expressed as multivariate Gaussian mixtures are used to model triphones as the basic units in acoustic modelling, and stochastic regular grammars are implemented to reduce task perplexity. Real-time recognition is supported by a computation cost reduction approach for estimating the log-likelihood scores of Gaussian mixtures and by beam pruning during Viterbi decoding. The present paper concerns the main part of the engine: a speaker-independent recognition engine for continuous Czech speech.
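Viterbi decoding with beam pruning over a left-to-right HMM with Gaussian mixture outputs can be sketched as follows. This is a minimal illustration, not the engine described in the paper: it assumes diagonal covariances, a single shared pair of stay/advance transition log-probabilities, and a non-empty frame sequence, and the function names `log_gmm` and `viterbi_beam` are hypothetical. The paper's cost-reduced likelihood estimation is not reproduced here.

```python
import math


def log_gmm(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian mixture."""
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comps.append(ll)
    m = max(comps)
    # Log-sum-exp over the mixture components for numerical stability.
    return m + math.log(sum(math.exp(c - m) for c in comps))


def viterbi_beam(frames, states, log_self, log_next, beam):
    """Best path through a left-to-right HMM, pruning weak hypotheses.

    states:   list of (weights, means, variances), one tuple per HMM state
    log_self: log-probability of staying in a state (shared; an assumption)
    log_next: log-probability of advancing to the next state (shared)
    beam:     hypotheses more than `beam` below the frame's best are dropped
    Returns (log score, state path), or (None, []) if the final state
    did not survive pruning.
    """
    n = len(states)
    # Active hypotheses: state index -> (score, path). Decoding starts in state 0.
    active = {0: (log_gmm(frames[0], *states[0]), [0])}
    for x in frames[1:]:
        new = {}
        for s, (score, path) in active.items():
            for t, trans in ((s, log_self), (s + 1, log_next)):
                if t >= n:
                    continue
                cand = score + trans + log_gmm(x, *states[t])
                if t not in new or cand > new[t][0]:
                    new[t] = (cand, path + [t])
        best = max(v[0] for v in new.values())
        # Beam pruning: keep only hypotheses close to the best one.
        active = {s: v for s, v in new.items() if v[0] >= best - beam}
    return active.get(n - 1, (None, []))
```

A narrower beam evaluates fewer state likelihoods per frame at the risk of discarding the path that would have won later, which is the usual speed and accuracy trade-off of this technique.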


International Conference on Acoustics, Speech, and Signal Processing | 2013

Hierarchical discriminative model for spoken language understanding

Jan Švec; Luboš Šmídl; Pavel Ircing

The paper presents a new discriminative model for statistical spoken language understanding designed for use in spoken dialog systems. The parsing algorithm uses lexicalized grammar derived from unaligned training data with probability estimates generated by multiclass classifiers. The generated semantic trees are partially aligned with the input sentence to provide lexical realisation of semantic concepts. The model was evaluated on two semantically annotated corpora and in both tasks it outperforms the baseline Hidden Vector State parser and Semantic Tuple Classifiers model. The experiments were performed using both transcribed data and recognized lattices. The innovative aspect of using phoneme lattices in the understanding process instead of word lattices is examined and described.


Methods of Information in Medicine | 2009

Voice-supported Electronic Health Record for Temporomandibular Joint Disorders

Radek Hippmann; Tatjana Dostalova; Jana Zvárová; Miroslav Nagy; Michaela Seydlova; Petr Hanzlícek; Pavel Kriz; Luboš Šmídl; Jan Trmal

OBJECTIVES: To identify support of structured data entry for an electronic health record application in temporomandibular joint disorders. METHODS: The methods of structuring information in dentistry are described and the interactive DentCross component is introduced. Structured voice-supported data entry in an electronic health record is demonstrated on several real cases in the field of dentistry. The connection of this component to the MUDRLite electronic health record is described. RESULTS: The use of DentVoice, an application which consists of the electronic health record MUDRLite and the voice-controlled interactive component DentCross, to collect the dental information required for temporomandibular joint disorders is shown. CONCLUSIONS: The DentVoice application with the DentCross component demonstrated its practical ability to support the treatment of temporomandibular joint disorders.


Text, Speech and Dialogue | 2012

On the Impact of Annotation Errors on Unit-Selection Speech Synthesis

Jindřich Matoušek; Daniel Tihelka; Luboš Šmídl

Unit selection is a very popular approach to speech synthesis. It is known for its ability to produce nearly natural-sounding synthetic speech, but, at the same time, also for its need for very large speech corpora. In addition, unit selection is also known to be very sensitive to the quality of the source speech corpus the speech is synthesised from and its textual, phonetic and prosodic annotations and indexation. Given the enormous size of current speech corpora, manual annotation of the corpora is a lengthy process. Despite this fact, human annotators do make errors. In this paper, the impact of annotation errors on the quality of unit-selection-based synthetic speech is analysed. Firstly, an analysis and categorisation of annotation errors is presented. Then, a speech synthesis experiment, in which the same utterances were synthesised by unit-selection systems with and without annotation errors, is described. Results of the experiment and the options for fixing the annotation errors are discussed as well.


IEEE Automatic Speech Recognition and Understanding Workshop | 2013

Semantic entity detection from multiple ASR hypotheses within the WFST framework

Jan Švec; Pavel Ircing; Luboš Šmídl

The paper presents a novel approach to named entity detection from ASR lattices. Since the described method not only detects the named entities but also assigns a detailed semantic interpretation to them, we call our approach semantic entity detection. All the algorithms are designed to use automata operations defined within the framework of weighted finite state transducers (WFST); ASR lattices are nowadays frequently represented as weighted acceptors. The expert knowledge about the semantics of the task at hand can first be expressed in the form of a context-free grammar and then converted to the FST form. We use WFST optimization to obtain a compact representation of the ASR lattice. The WFST framework also allows the use of word confusion networks as another representation of multiple ASR hypotheses. That way we can use the full power of the composition and optimization operations implemented in the OpenFST toolkit for our semantic entity detection algorithm. The devised method also employs the concept of a factor automaton; this approach allows us to overcome the need for a filler model and consequently makes the method more general. The paper includes an experimental evaluation of the proposed algorithm and compares the performance obtained by using the one-best word hypothesis, optimized lattices and word confusion networks.


Systems, Man and Cybernetics | 2007

An Intelligent Telephony Interface of Multiagent Decision Support Systems

Petr Becvár; Luboš Šmídl; Josef Psutka; Michal Pechoucek

ExtraPlanT is a multiagent production planning system designed for small and medium-sized enterprises with project-oriented production. In order to make the results of the system available even to users who are located away from the enterprise, it has been equipped with remote access: a Web and telephony interface. The multiagent design of ExtraPlanT makes the integration of these interfaces robust and simple. The telephony interface uses VoiceXML technology so that it can be built without extensive knowledge of speech processing. The interface also uses innovative techniques to overcome the common disadvantages of speech as a medium for machine output.


Text, Speech and Dialogue | 2012

Spoken Dialogue System Design in 3 Weeks

Tomáš Valenta; Jan Švec; Luboš Šmídl

This article describes knowledge-based spoken dialogue system design from scratch. It covers all the stages performed over a period of three weeks: definition of semantic goals and entities, data collection and recording of sample dialogues, data annotation, parser and grammar design, and dialogue manager design and testing. The work focused mainly on rapid development of such a dialogue system. The final implementation was written in dynamically generated VoiceXML. A large-vocabulary continuous speech recognition system was used, and the language understanding module was implemented using non-recursive probabilistic context-free grammars, which were converted to finite-state transducers. The design and implementation have been verified on a railway information service task with a real large-scale database. The paper describes an innovative combination of data, expert knowledge and state-of-the-art methods which allows fast spoken dialogue system design.


Text, Speech and Dialogue | 2013

On the Use of Phoneme Lattices in Spoken Language Understanding

Jan Švec; Luboš Šmídl

This paper presents a novel approach to spoken language understanding in dialogue systems. Unlike prevalent methods that use only the word lattices, the presented approach works with phoneme lattices generated by a phoneme recognizer. The hierarchical discriminative model for speech understanding was used together with modifications proposed in this paper. The method was experimentally evaluated using two semantic corpora and the results are presented.


International Conference on Industrial Informatics | 2004

Decision support framework ExtraPlanT with remote access and telephony interface

Petr Becvár; Michal Pechoucek; Luboš Šmídl

The ExtraPlanT system is a multi-agent production planning system designed for small factories, which need to react quickly to market changes. To meet this requirement, ExtraPlanT has been equipped with an extra-enterprise access feature that allows managers to access and use the system whenever and wherever they need. One form of extra-enterprise access is the telephony interface, which uses computer-based speech recognition and synthesis. The interface is built on VoiceXML technology and uses DTMF and speech input and synthesized speech output. VoiceXML documents are generated by Java servlets running on a Tomcat server. To overcome the main disadvantages of telephony interfaces (sequential, transient and slow presentation of information), two techniques have been developed for the ExtraPlanT telephony interface. The first technique is a two-level communication model based on an analytical module, a knowledge-based system that transforms data into a short summary. On user request, each summary can be followed by a detailed explanation. The second technique is dynamic selection of prompt wording, which chooses the wording of the prompts according to the estimated user experience in order to find an optimal dialogue length and descriptiveness.
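The two techniques named in this abstract can be sketched roughly as below. This is an illustration only and makes several assumptions: the prompt texts, the `UserProfile` fields, the thresholds in `experience_level` and the function names are all hypothetical and are not taken from ExtraPlanT, and the sketch is written in Python rather than the Java servlets and VoiceXML used by the actual interface.

```python
from dataclasses import dataclass

# Hypothetical wordings for one dialogue state, from most to least verbose.
PROMPTS = {
    "novice": "Please say the name of the order you want to check, "
              "or press 1 to hear the list of orders.",
    "regular": "Say an order name, or press 1 for the list.",
    "expert": "Order?",
}


@dataclass
class UserProfile:
    completed_calls: int = 0  # calls finished successfully (assumed signal)
    recent_errors: int = 0    # misrecognitions or help requests in this call


def experience_level(user):
    """Estimate user experience from usage statistics (thresholds are assumptions)."""
    if user.recent_errors >= 2 or user.completed_calls < 3:
        return "novice"
    if user.completed_calls < 10:
        return "regular"
    return "expert"


def select_prompt(user, prompts=PROMPTS):
    """Dynamic prompt wording: shorter prompts for more experienced callers."""
    return prompts[experience_level(user)]


def present(summary, detail, wants_detail):
    """Two-level output: always the short summary, the detail only on request."""
    return [summary, detail] if wants_detail else [summary]
```

Falling back to the verbose wording after repeated errors means an experienced caller who runs into trouble still gets full guidance, at the cost of a longer dialogue for that call.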

Collaboration


Dive into Luboš Šmídl's collaborations.

Top Co-Authors

Jan Švec | University of West Bohemia
Josef Psutka | University of West Bohemia
Aleš Pražák | University of West Bohemia
Pavel Ircing | University of West Bohemia
Tomáš Valenta | University of West Bohemia
Jan Trmal | University of West Bohemia
Adam Chýlek | University of West Bohemia
J. Zahradil | University of West Bohemia
Luděk Müller | University of West Bohemia
Jana Zvárová | Academy of Sciences of the Czech Republic