Pavel Ircing
University of West Bohemia
Publication
Featured research published by Pavel Ircing.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2005
Arnab Ghoshal; Pavel Ircing; Sanjeev Khudanpur
This paper introduces a novel method for automatic annotation of images with keywords from a generic vocabulary of concepts or objects for the purpose of content-based image retrieval. An image, represented as a sequence of feature vectors characterizing low-level visual features such as color, texture or oriented edges, is modeled as having been stochastically generated by a hidden Markov model whose states represent concepts. The parameters of the model are estimated from a set of manually annotated (training) images. Each image in a large test collection is then automatically annotated with the a posteriori probability of concepts present in it. This annotation supports content-based search of the image collection via keywords. Various aspects of model parameterization, parameter estimation, and image annotation are discussed. Empirical retrieval results are presented on two image collections: COREL and key-frames from TRECVID. Comparisons are made with two other recently developed techniques on the same datasets.
ACM Multimedia | 2005
Giridharan Iyengar; Pinar Duygulu; Shaolei Feng; Pavel Ircing; Sanjeev Khudanpur; Dietrich Klakow; M. R. Krause; R. Manmatha; Harriet J. Nock; D. Petkova; Brock Pytlik; Paola Virga
In this paper we describe a novel approach for jointly modeling the text and the visual components of multimedia documents for the purpose of information retrieval (IR). We propose a novel framework where individual components are developed to model different relationships between documents and queries and then combined into a joint retrieval framework. In state-of-the-art systems, the norm is a late combination of two independent systems, one analyzing just the text part of such documents and the other analyzing the visual part without leveraging any knowledge acquired in the text processing. Such systems rarely exceed the performance of any single modality (i.e. text or video) in information retrieval tasks. Our experiments indicate that allowing a rich interaction between the modalities results in significant improvement in performance over any single modality. We demonstrate these results using the TRECVID03 corpus, which comprises 120 hours of broadcast news videos. Our results demonstrate an improvement of over 14% in IR performance over the best reported text-only baseline and rank amongst the best results reported on this corpus.
Eurasip Journal on Audio, Speech, and Music Processing | 2011
Josef Psutka; Jan Švec; Jan Vaněk; Aleš Pražák; Luboš Šmídl; Pavel Ircing
The main objective of the work presented in this paper was to develop a complete system that would accomplish the original visions of the MALACH project. Those goals were to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive containing recorded testimonies of Holocaust survivors. The system has so far been developed for the Czech part of the archive only. It takes advantage of a state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech and emotionally loaded content) and its close coupling with the actual search engine. The design of the algorithm adopting the spoken term detection approach is focused on the speed of the retrieval. The resulting system is able to search through the 1,000 h of video constituting the Czech portion of the archive and find query word occurrences in a matter of seconds. The phonetic search implemented alongside the search based on the lexicon words makes it possible to find even words outside the ASR system lexicon, such as names, geographic locations or Jewish slang.
Text, Speech and Dialogue | 2011
Lucie Skorkovská; Pavel Ircing; Aleš Pražák; Jan Lehečka
The paper presents a module for topic identification that is embedded into a complex system for acquiring and storing large volumes of text data from the Web. The module processes each of the acquired data items and assigns keywords to them from a defined topic hierarchy that was developed for this purpose and is also described in the paper. The quality of the topic identification is evaluated in two ways: using classic precision-recall measures and also indirectly, by measuring the ASR performance of the topic-specific language models that are built using the automatically filtered data.
Cross-Language Evaluation Forum | 2006
Pavel Ircing; Luděk Müller
The paper describes the system built by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We have decided to concentrate only on monolingual searching in the Czech test collection and to investigate the effect of proper language processing on the retrieval performance. We have employed a Czech morphological analyser and tagger for that purpose. For the actual search system, we have used the classical tf.idf approach with blind relevance feedback as implemented in the Lemur toolkit. The results indicate that suitable linguistic preprocessing is indeed crucial for Czech IR performance.
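As an illustration of the retrieval scheme named in the abstract above, here is a minimal Python sketch of tf.idf ranking with blind (pseudo) relevance feedback. It is not the Lemur toolkit's implementation: the weighting formula, the cosine scoring, the Rocchio-style expansion, and the names `search_with_brf`, `top_docs`, `top_terms` and `beta` are all assumptions made for the example. The toy documents stand in for already lemmatized text.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one tf.idf vector (term -> weight) per tokenized document, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # assumed smoothing, not Lemur's
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query_vec, doc_vecs):
    """Return (doc_index, score) pairs, best first; ties are broken by index."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def search_with_brf(query, docs, top_docs=2, top_terms=4, beta=0.5):
    """Rank docs, then re-rank with a query expanded from the top-ranked docs.

    'Blind' means the top_docs results of the first pass are assumed to be
    relevant without any user judgement. top_docs=0 disables the feedback.
    """
    doc_vecs, idf = tfidf_vectors(docs)
    q = {t: tf * idf[t] for t, tf in Counter(query).items() if t in idf}
    first_pass = rank(q, doc_vecs)

    # Average the vectors of the assumed-relevant documents.
    centroid = Counter()
    for i, _ in first_pass[:top_docs]:
        for t, w in doc_vecs[i].items():
            centroid[t] += w / top_docs

    # Add the strongest feedback terms to the query, scaled by beta.
    expanded = dict(q)
    for t, w in centroid.most_common(top_terms):
        expanded[t] = expanded.get(t, 0.0) + beta * w
    return rank(expanded, doc_vecs)


docs = [
    ["praha", "ghetto", "transport"],
    ["ghetto", "terezin", "transport"],
    ["terezin", "tabor"],
    ["skola", "rodina"],
]
baseline = dict(search_with_brf(["ghetto"], docs, top_docs=0))
feedback = dict(search_with_brf(["ghetto"], docs))
```

In this toy collection the third document does not contain the query term, so its first-pass score is zero; after feedback it receives a positive score because "terezin" is pulled into the query from a top-ranked document.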
IEEE Transactions on Audio, Speech, and Language Processing | 2009
Pavel Ircing; Josef Psutka
Automatic speech recognition, or more precisely language modeling, of the Czech language has to face challenges that are not present in the language modeling of English. Those include mainly the rapid vocabulary growth and the closely connected unreliable estimates of the language model parameters. These phenomena are caused mostly by the highly inflectional nature of the Czech language. On the other hand, the rich morphology together with the well-developed automatic systems for morphological tagging can be exploited to reinforce the language model probability estimates. This paper shows that using rich morphological tags within a class-based n-gram language model with a many-to-many word-to-class mapping, and combining this model with the standard word-based n-gram model, can improve the recognition accuracy over the word-based baseline on the task of automatic transcription of unconstrained spontaneous Czech interviews.
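One common way to write such a combination, given here only as a hedged sketch, is linear interpolation of the two models. The abstract does not state the exact form, so the interpolation weight \lambda, the bigram order, and the assumption that the tag c_{i-1} of the preceding word has already been chosen by a tagger are all illustrative; C(w_i) denotes the set of morphological tags that the word w_i can carry under the many-to-many mapping.

```latex
P(w_i \mid w_{i-1})
  = \lambda \, P_{\mathrm{word}}(w_i \mid w_{i-1})
  + (1 - \lambda) \sum_{c_i \in C(w_i)} P(w_i \mid c_i) \, P(c_i \mid c_{i-1}),
  \qquad 0 \le \lambda \le 1
```

The sum over c_i is what distinguishes a many-to-many mapping from the usual case where each word belongs to exactly one class.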
International Conference on Acoustics, Speech, and Signal Processing | 2013
Jan Švec; Luboš Šmídl; Pavel Ircing
The paper presents a new discriminative model for statistical spoken language understanding designed for use in spoken dialog systems. The parsing algorithm uses a lexicalized grammar derived from unaligned training data, with probability estimates generated by multiclass classifiers. The generated semantic trees are partially aligned with the input sentence to provide a lexical realisation of semantic concepts. The model was evaluated on two semantically annotated corpora, and in both tasks it outperforms the baseline Hidden Vector State parser and the Semantic Tuple Classifiers model. The experiments were performed using both transcribed data and recognized lattices. The innovative aspect of using phoneme lattices in the understanding process instead of word lattices is examined and described.
Text, Speech and Dialogue | 1999
William Byrne; Jan Hajic; Pavel Ircing; Frederick Jelinek; Sanjeev Khudanpur; Jerome McDonough; Nino Peterek; Josef Psutka
We describe read speech and broadcast news corpora collected as part of a multi-year international collaboration for the development of large vocabulary speech recognition systems in the Czech language. Initial investigations into language modeling for Czech automatic speech recognition are described and preliminary recognition results on the read speech corpus are presented.
Text, Speech and Dialogue | 2002
Josef Psutka; Pavel Ircing; Vlasta Radová; William Byrne; Jan Hajic; Samuel Gustman; Bhuvana Ramabhadran
In this paper we describe the initial stages of the ASR component of the MALACH (Multilingual Access to Large Spoken Archives) project. This project will attempt to provide improved access to the large multilingual spoken archives collected by the Survivors of the Shoah Visual History Foundation (VHF) by advancing the state of the art in automated speech recognition. In order to train the ASR system, it is necessary to manually transcribe a large amount of speech data, identify the appropriate vocabulary, and obtain relevant text for language modeling. We give a detailed description of the speech annotation process, show the specific properties of the spontaneous speech contained in the archives, and present baseline speech recognition results.
Cross-Language Evaluation Forum | 2008
Pavel Ircing; Josef Psutka; Jan Vavruška
The paper presents an overview of the system built and the experiments performed for the CLEF 2007 CL-SR track by the University of West Bohemia. We have concentrated on monolingual experiments using the Czech collection only. The approach that was successfully employed by our team in last year's campaign (a simple tf.idf model with blind relevance feedback, accompanied by solid linguistic preprocessing) was used again, but the set of performed experiments was broadened and a more detailed analysis of the results is provided.