Janez Zibert | Researchain

Publication

Featured research published by Janez Zibert.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

An Edit-Distance Model for the Approximate Matching of Timed Strings

Simon Dobrisek; Janez Zibert; Nikola Pavesic; F. Mihelic

An edit-distance model that can be used for the approximate matching of contiguous and noncontiguous timed strings is presented. The model extends the concept of the weighted string-edit distance by introducing timed edit operations and by making the edit costs time dependent. Special attention is paid to the timed null symbols that are associated with the timed insertions and deletions. The usefulness of the presented model is demonstrated on the classification of phone-recognition errors using the TIMIT speech database.
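
The abstract describes extending the weighted string-edit distance with time-dependent costs. As a rough illustration only, the sketch below computes a standard dynamic-programming edit distance over symbols that carry start and end times. The cost functions here are assumptions made for the example (a label mismatch penalty plus a weighted gap between segment midpoints, and a constant insertion/deletion cost); they are not the cost model from the paper, which also treats timed null symbols in a way this sketch does not attempt to reproduce.

```python
from typing import List, Tuple

# A timed symbol: (label, start time, end time). Hypothetical representation.
Timed = Tuple[str, float, float]


def sub_cost(a: Timed, b: Timed, time_weight: float = 1.0) -> float:
    """Assumed substitution cost: label mismatch plus weighted midpoint gap."""
    label_cost = 0.0 if a[0] == b[0] else 1.0
    midpoint_gap = abs((a[1] + a[2]) / 2 - (b[1] + b[2]) / 2)
    return label_cost + time_weight * midpoint_gap


def timed_edit_distance(
    x: List[Timed],
    y: List[Timed],
    indel_cost: float = 1.0,
    time_weight: float = 1.0,
) -> float:
    """Weighted edit distance between two timed strings (illustrative costs)."""
    n, m = len(x), len(y)
    # d[i][j] holds the cheapest way to turn x[:i] into y[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + indel_cost
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,  # delete x[i-1]
                d[i][j - 1] + indel_cost,  # insert y[j-1]
                d[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1], time_weight),
            )
    return d[n][m]
```

With these assumed costs, two strings with the same labels but shifted timings get a small non-zero distance, which is the kind of behaviour a time-dependent edit cost is meant to capture.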

eurasip conference focused on video image processing and multimedia communications | 2003

Bilingual speech recognition of Slovenian and Croatian weather forecasts

Janez Zibert; Sanda Martinčić-Ipšić; Ivo Ipšić; F. Mihelic

We present some results of a joint project on speech data collection and speech recognition of Slovenian and Croatian weather forecasts. We describe the procedures used to obtain domain-specific speech databases from broadcast programmes, followed by the language-identification experiments and the recognition experiments on monolingual and bilingual speech.

Archive | 2007

Novel Approaches to Speech Detection in the Processing of Continuous Audio Streams

Janez Zibert; Bostjan Vesnicer

With the increasing amount of information stored in various audio-data documents there is a growing need for the efficient and effective processing, archiving and accessing of this information. One of the largest sources of such information is spoken audio documents, including broadcast-news (BN) shows, voice mails, recorded meetings, telephone conversations, etc. In these documents the information is mainly relayed through speech, which needs to be appropriately processed and analysed by applying automatic speech and language technologies. Spoken audio documents are produced by a wide range of people in a variety of situations, and are derived from various multimedia applications. They are usually collected as continuous audio streams and consist of multiple audio sources. These audio sources may be different speakers, music segments, types of noise, etc. For example, a BN show typically consists of speech from different speakers as well as music segments, commercials and various types of noises that are present in the background of the reports. In order to efficiently process or extract the required information from such documents the appropriate audio data need to be selected and properly prepared for further processing. In the case of speech-processing applications this means detecting just the speech parts in the audio data and delivering them as inputs in a suitable format for further speech processing. The detection of such speech segments in continuous audio streams and the segmentation of audio streams into either detected speech or non-speech data is known as the speech/nonspeech (SNS) segmentation problem. In this chapter we present an overview of the existing approaches to SNS segmentation in continuous audio streams and propose a new representation of audio signals that is more suitable for robust speech detection in SNS-segmentation systems.
Since speech detection is usually applied as a pre-processing step in various speech-processing applications we have also explored the impact of different SNS-segmentation approaches on a speaker-diarisation task in BN data. This chapter is organized as follows: In Section 2 a new high-level representation of audio signals based on phoneme-recognition features is introduced. First of all we give a short overview of the existing audio representations used for speech detection and provide the basic ideas and motivations for introducing a new representation of audio signals for SNS segmentation. In the remainder of the section we define four features based on consonant-vowel pairs and the voiced-unvoiced regions of signals, which are automatically detected by

text speech and dialogue | 2002

Speech Features Extraction Using Cone-Shaped Kernel Distribution

Janez Zibert; Nikola Pavesic

The paper reviews two basic time-frequency distributions: the spectrogram and the cone-shaped kernel distribution. We study, analyze and compare the properties and performance of these quadratic representations on speech signals. The cone-shaped kernel distribution was successfully applied to speech feature extraction owing to several properties that are useful in the time-frequency analysis of speech signals.

text speech and dialogue | 1999

Language Model Representations for the GOPOLIS Database

Janez Zibert; Jerneja Gros; Simon Dobrisek

The formation of a domain-oriented sentence corpus by sentence pattern rules is described. The same rules were transformed into word networks to serve as a language model within an HTK-based speech recognition system. The performance of the word-network language model was compared with that of a bigram model.
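
For readers unfamiliar with the bigram baseline mentioned in the abstract, the following is a minimal sketch of a maximum-likelihood bigram model estimated from a sentence corpus. It is a generic illustration, not the authors' implementation: the function names, the `<s>`/`</s>` boundary markers and the toy sentences are all assumptions made for the example, and no smoothing is applied.

```python
from collections import Counter
from typing import Iterable, Tuple


def train_bigram(sentences: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count history words and word pairs, with sentence boundary markers."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Only words that can act as a history are counted as unigrams.
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return unigrams, bigrams


def bigram_prob(unigrams: Counter, bigrams: Counter, prev: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | prev); 0.0 for an unseen history."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


# Toy corpus invented for the example.
corpus = ["fly to ljubljana", "fly to maribor"]
uni, bi = train_bigram(corpus)
print(bigram_prob(uni, bi, "to", "ljubljana"))
```

A word network built from sentence pattern rules accepts only word sequences that the rules generate, whereas a bigram model of this kind assigns a probability to any sequence whose adjacent word pairs were seen in training.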

text speech and dialogue | 2003

Bilingual speech recognition for a weather information retrieval dialogue system

Sanda Martinčić-Ipšić; Janez Zibert; Ivo Ipšić; Nikola Pavesic

We present current activities and some preliminary results of a joint project on designing a spoken dialogue system for Slovenian and Croatian weather information retrieval. We give a brief description of the system design, of the procedures used to obtain domain-specific speech databases, and of the monolingual and bilingual speech recognition experiments. Recognition results for Croatian and Slovenian speech are presented, as well as bilingual speech recognition results obtained with common acoustic models. We propose two different approaches to the language identification problem and show recognition results for two acoustically similar languages, Slovenian and Croatian.

language resources and evaluation | 2004

The COST278 Pan-European Broadcast News Database.

An Vandecatseye; Jean-Pierre Martens; João Paulo Neto; Hugo Meinedo; Carmen García-Mateo; Javier Dieguez-Tirado; Janez Zibert; Jan Nouza; Petr David; Matus Pleva; Anton Cizmar; Harris Papageorgiou; Christina Alexandris

WSEAS Transactions on Information Science and Applications archive | 2009

Histogram remapping as a preprocessing step for robust face recognition

Vitomir Struc; Janez Zibert; Nikola Pavesic

conference of the international speech communication association | 2005

The COST278 Broadcast News Segmentation and Speaker Clustering Evaluation - Overview, Methodology, Systems, Results

Janez Zibert; Jean-Pierre Martens; Hugo Meinedo; João Paulo Neto; Laura Docío Fernández; Carmen García-Mateo; Petr David; Jindrich Zdánský; Matus Pleva; Anton Cizmar; Andrej Zgank; Zdravko Kacic; Csaba Teleki; Klára Vicsi

conference of the international speech communication association | 2010