Publication


Featured research published by Ivo Ipšić.


Expert Systems With Applications | 2015

A knowledge-based multi-layered image annotation system

Marina Ivašić-Kos; Ivo Ipšić; Slobodan Ribaric

A fuzzy-knowledge based intelligent system for multilayered image annotation. Novel merged statistical and knowledge-based approach for image interpretation. Automatic acquisition of facts and rules about the concepts, and their reliability. Inconsistency checking of image segment classification. Automatic knowledge-based scene recognition and inference of more abstract classes. A major challenge in automatic image annotation is bridging the semantic gap between the computable low-level image features and the human-like interpretation of images. The interpretation includes concepts on different levels of abstraction that cannot be simply mapped to features but require additional reasoning with general and domain-specific knowledge. The problem is even more complex since knowledge in the context of image interpretation is often incomplete, imprecise, uncertain and ambiguous in nature. Thus, in this paper we propose a fuzzy-knowledge based intelligent system for image annotation, which is able to deal with uncertain and ambiguous knowledge and can annotate images with concepts on different levels of abstraction in a more human-like way. The main contributions are associated with an original approach of using a fuzzy knowledge-representation scheme based on the Fuzzy Petri Net (KRFPN) formalism. Knowledge acquisition is facilitated in that, besides the general knowledge provided by the expert, the computable facts and rules about the concepts, as well as their reliability, are produced automatically from data. The reasoning capability of the fuzzy inference engine of the KRFPN is used in a novel way for inconsistency checking of the classified image segments, automatic scene recognition, and the inference of generalized and derived classes. The results of image interpretation of Corel images belonging to the domain of outdoor scenes achieved by the proposed system outperform the published results obtained on the same image base in terms of average precision and recall. Owing to the fuzzy-knowledge representation scheme, the obtained image interpretation is enriched with new, more general and abstract concepts that are close to concepts people use to interpret these images.
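The abstract reports results in terms of average precision and recall. As a rough illustration of how such scores are commonly computed for annotation tasks, the sketch below averages per-image precision and recall over a set of images. It is not the paper's evaluation protocol: the function names and example labels are hypothetical, and averaging per concept instead of per image is an equally common convention.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for one image's predicted vs. reference labels."""
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    # Guard against empty label sets to avoid division by zero.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


def average_precision_recall(pairs):
    """Average per-image precision and recall over (predicted, reference) pairs."""
    scores = [precision_recall(p, r) for p, r in pairs]
    if not scores:
        return 0.0, 0.0
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


if __name__ == "__main__":
    # Hypothetical annotations for two outdoor images.
    pairs = [
        (["sky"], ["sky", "water"]),
        (["tree", "car"], ["tree"]),
    ]
    print(average_precision_recall(pairs))
```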


International Convention on Information and Communication Technology, Electronics and Microelectronics | 2014

Online speaker de-identification using voice transformation

Miran Pobar; Ivo Ipšić

Speaker de-identification is the process by which speech is transformed so that the speaker's identity is masked, while the transformed speech still preserves the acoustic information that contributes to intelligibility, naturalness and clarity. Systems that perform speech de-identification could be used in voice-driven applications (for example in call centres) where the speaker's identity has to be hidden. The paper describes the experiments we have performed in order to de-identify speech using GMM-based voice transformation techniques and speaker identification using freely available tools. We propose a method by which speakers whose speech has not been used to build voice transformations (for training) can be efficiently de-identified online. The proposed method is evaluated using a database of read speech and a small set of speakers. The results show that the proposed de-identification method performs similarly to a closed-set de-identification procedure that requires previous enrolment, and that it can be used efficiently for online speaker de-identification.


Text, Speech and Dialogue | 2012

A Bilingual HMM-Based Speech Synthesis System for Closely Related Languages

Tadej Justin; Miran Pobar; Ivo Ipšić; Janez Žibert

In this paper we investigate a bilingual HMM-based speech synthesis system developed for the Slovenian and Croatian languages. The primary goals of this research are to investigate the performance of an HMM-based synthesis system built from two similar languages and to compare it with standard monolingual speaker-dependent HMM-based synthesis. The bilingual HMM synthesis is built by joining all the speech material from both languages, by defining a proper mapping of Slovenian and Croatian phonemes, and by adapting acoustic models of Slovenian and Croatian speakers. The adapted acoustic models then serve as basic building blocks for speech synthesis in both languages. In this way we are able to obtain synthesized speech in both languages with the same speaker voice. We make a quantitative comparison of this kind of synthesis with its monolingual counterparts and study the performance of the synthesis in relation to the amount of data used for building the synthesis system.


IEEE International Conference on Automatic Face and Gesture Recognition | 2015

Speaker de-identification using diphone recognition and speech synthesis

Tadej Justin; Vitomir Struc; Simon Dobrisek; Bostjan Vesnicer; Ivo Ipšić

The paper addresses the problem of speaker (or voice) de-identification by presenting a novel approach for concealing the identity of speakers in their speech. The proposed technique first recognizes the input speech with a diphone recognition system and then transforms the obtained phonetic transcription into the speech of another speaker with a speech synthesis system. Because a Diphone RecOgnition step and a sPeech SYnthesis step are used during the de-identification, we refer to the developed technique as DROPSY. With this approach the acoustical models of the recognition and synthesis modules are completely independent from each other, which ensures the highest level of input speaker de-identification. The proposed DROPSY-based de-identification approach is language dependent, text independent and capable of running in real time due to the relatively simple computing methods used. When designing speaker de-identification technology, two requirements are typically imposed on the de-identification techniques: i) it should not be possible to establish the identity of the speakers based on the de-identified speech, and ii) the processed speech should still sound natural and be intelligible. This paper, therefore, implements the proposed DROPSY-based approach with two different speech synthesis techniques (i.e., the HMM-based and the diphone TD-PSOLA-based technique). The obtained de-identified speech is evaluated for intelligibility and in speaker verification experiments with a state-of-the-art (i-vector/PLDA) speaker recognition system. The comparison of both speech synthesis modules integrated in the proposed method reveals that both can efficiently de-identify the input speakers while still producing intelligible speech.


EURASIP Conference Focused on Video/Image Processing and Multimedia Communications | 2003

Bilingual speech recognition of Slovenian and Croatian weather forecasts

Janez Zibert; Sanda Martinčić-Ipšić; Ivo Ipšić; F. Mihelic

In the paper we present some results of a joint project on speech data collection and speech recognition of Slovenian and Croatian weather forecasts. We describe the procedures we have performed in order to obtain domain-specific speech databases from broadcast programmes. We further describe the speech recognition experiments for language identification and the recognition experiments on monolingual and bilingual speech.


International Symposium on Industrial Electronics | 1999

Multilingual spoken dialog system

Ivo Ipšić; Nikola Pavesic; F. Mihelic; Elmar Nöth

We present the architecture of a multilingual spoken dialog system. The system was developed within the Copernicus project, Spoken Queries in European Languages (SQEL). Such a system is capable of handling dialogs with users in Czech, German, Slovak and Slovenian languages. The modules of the multilingual system are presented. The parts of the system which are used for Slovenian dialogs are described in more detail, and some results with respect to word accuracy, semantic accuracy and dialog success rate are shown.


Text, Speech and Dialogue | 2013

A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis

Miran Pobar; Tadej Justin; Janez Žibert; Ivo Ipšić

We compare the performance of two approaches when using cross-lingual data from different speakers to build bilingual speech synthesis systems capable of producing speech with the same speaker identity. One approach treats data from both languages as monolingual, by labeling all data with a manually joined phoneme set. A speaker-independent voice is trained using the joined data and adapted to the target speaker using CMLLR adaptation.


International Symposium ELMAR | 2007

Croatian speech technologies

Ivo Ipšić; Maja Matetic; Sanda Martinčić-Ipšić; Ana Meštrović; Marija Brkić

Speech technologies deal with designing computer systems that can recognize spoken words, comprehend human language and generate intelligible speech. Speech technology systems have been successfully implemented in a wide range of applications. One of the most complex applications in speech technology is a spoken dialog system, which can be used for information inquiry services. In the paper we present the work on the development of a spoken dialog system for the Croatian language. We propose an approach to developing the modules of a limited-domain spoken dialog system which uses the same acoustic model for speech recognition and speech synthesis. For the linguistic analysis we propose an approach based on qualitative language modelling, while the development of the dialog module is based on the formalism of the object-oriented frame logic language. Some experimental results for Croatian speech recognition and understanding are presented and discussed.


Applied Intelligence | 2004

Qualitative Modelling and Analysis of Animal Behaviour

Maja Matetic; Slobodan Ribaric; Ivo Ipšić

The paper presents the LABAQM system developed for the analysis of laboratory animal behaviours. It is based on qualitative modelling of animal motions. We deal with the cognitive phase of laboratory animal behaviour analysis as a part of pharmacological experiments. The system is based on the quantitative data from the tracking application and on incomplete domain background knowledge. The LABAQM system operates in two main phases: behaviour learning and behaviour analysis. Both phases are based on symbol sequences obtained by transforming the quantitative data. The behaviour learning phase includes a supervised learning procedure, an unsupervised learning procedure and their combination. We have shown that the qualitative model of behaviour can be modelled by hidden Markov models. The fusion of supervised and unsupervised learning procedures produces more robust models of characteristic behaviours, which are used in the behaviour analysis phase.
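The abstract states that both phases operate on symbol sequences obtained by transforming quantitative tracking data. The sketch below shows one simple way such a transformation could look: each step between consecutive tracked positions is mapped to a qualitative motion symbol. The symbol alphabet, the threshold and the function name are assumptions made for illustration; the abstract does not specify the encoding used by LABAQM.

```python
import math


def to_symbols(track, still_threshold=1.0):
    """Map consecutive (x, y) positions to a string of qualitative motion symbols.

    Hypothetical alphabet: "0" = (nearly) stationary, "R"/"L" = mostly
    rightward/leftward motion, "U"/"D" = mostly upward/downward motion.
    """
    symbols = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) < still_threshold:
            symbols.append("0")  # displacement too small to count as motion
        elif abs(dx) >= abs(dy):
            symbols.append("R" if dx > 0 else "L")
        else:
            symbols.append("U" if dy > 0 else "D")
    return "".join(symbols)


if __name__ == "__main__":
    # Hypothetical tracked positions of one animal.
    track = [(0, 0), (0.2, 0.1), (3, 0), (3, 4), (0, 4), (0, 0)]
    print(to_symbols(track))
```

A sequence of this kind could then serve as the observation sequence for a discrete hidden Markov model, which is the modelling approach the abstract names.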


Information Technology Interfaces | 2003

VEPRAD: a Croatian speech database of weather forecasts

Sanda Martinčić-Ipšić; Ivo Ipšić

We present some results of a project on Croatian speech data collection and speech recognition of Croatian weather forecasts. We describe the procedures we have performed in order to obtain a domain-specific speech database from broadcast news of national programmes. The speech signal acquisition, transcription and segmentation process is described. We present the database structure and give some database statistics. Preliminary results of Croatian speech recognition experiments using context-independent and context-dependent acoustic models are presented.
