Publication


Featured research published by Claudio Vair.


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Compensation of Nuisance Factors for Speaker and Language Recognition

Fabio Castaldo; Daniele Colibro; Emanuele Dalmasso; Pietro Laface; Claudio Vair

The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model based. Most of them have been proposed for Gaussian mixture models, while in the feature domain blind channel compensation is usually performed. The aim of this work is to explore techniques that allow more accurate intersession compensation in the feature domain. Compensating the features rather than the models has the advantage that the transformed parameters can be used with models of a different nature and complexity and for different tasks. In this paper, we evaluate the effects of the compensation of the intersession variability obtained by means of the channel factors approach. In particular, we compare channel variability modeling in the usual Gaussian mixture model domain with our proposed feature domain compensation technique. We show that the two approaches lead to similar results on the NIST 2005 Speaker Recognition Evaluation data with a reduced computation cost. We also report the results of a system based on the intersession compensation technique in the feature space, which was among the best participants in the NIST 2006 Speaker Recognition Evaluation. Moreover, we show how we obtained a significant performance improvement in language recognition by estimating and compensating, in the feature domain, the distortions due to interspeaker variability within the same language.


2006 IEEE Odyssey - The Speaker and Language Recognition Workshop | 2006

Channel Factors Compensation in Model and Feature Domain for Speaker Recognition

Claudio Vair; Daniele Colibro; Fabio Castaldo; Emanuele Dalmasso; Pietro Laface

The variability of the channel and environment is one of the most important factors affecting the performance of text-independent speaker verification systems. The best techniques for channel compensation are model based. Most of them have been proposed for Gaussian mixture models, while in the feature domain blind channel compensation is typically performed. The aim of this work is to explore techniques that allow more accurate channel compensation in the feature domain. Compensating the features rather than the models has the advantage that the transformed parameters can be used with models of a different nature and complexity, and also for different tasks. In this paper, we evaluate the effects of the compensation of the channel variability obtained by means of the channel factors approach. In particular, we compare channel variability modeling in the usual Gaussian mixture model domain with our proposed feature domain compensation technique. We show that the two approaches lead to similar results on the NIST 2005 speaker recognition evaluation data. Moreover, the quality of the transformed features is also assessed in the support vector machines framework for speaker recognition on the same data, and in preliminary experiments on language identification.


International Conference on Acoustics, Speech, and Signal Processing | 1995

A fast segmental Viterbi algorithm for large vocabulary recognition

Pietro Laface; Claudio Vair; Luciano Fissore

The paper presents a fast segmental Viterbi algorithm, a new search strategy that is particularly effective for very large vocabulary word recognition. It performs a tree-based, time-synchronous, left-to-right beam search that develops time-dependent acoustic and phonetic hypotheses. At any given time, it activates a sub-word unit associated with an arc of a lexical tree only if that time is likely to be the boundary between the current and the next unit. This new technique, tested with a vocabulary of 188,892 directory entries, achieves the same results as the Viterbi algorithm with a 35% speedup. Results are also presented for a 718-word, speaker-independent continuous speech recognition task.


International Conference on Acoustics, Speech, and Signal Processing | 2008

Stream-based speaker segmentation using speaker factors and eigenvoices

Fabio Castaldo; Daniele Colibro; Emanuele Dalmasso; Pietro Laface; Claudio Vair

This paper presents a stream-based approach for unsupervised multi-speaker conversational speech segmentation. The main idea of this work is to exploit prior knowledge about the speaker space to find a low-dimensional vector of speaker factors that summarize the salient speaker characteristics. This new approach produces segmentation error rates that are better than the state-of-the-art ones reported in our previous work on the segmentation task in the NIST 2000 Speaker Recognition Evaluation (SRE). We also show how the performance of a speaker recognition system in the core test of the 2006 NIST SRE is affected, comparing the results obtained using single-speaker and automatically segmented test data.


International Conference on Acoustics, Speech, and Signal Processing | 2007

Language Identification using Acoustic Models and Speaker Compensated Cepstral-Time Matrices

Fabio Castaldo; Emanuele Dalmasso; Pietro Laface; Daniele Colibro; Claudio Vair

This work presents two contributions to language identification. The first contribution is the definition of a set of properly selected time-frequency features that are a valid alternative to the commonly used shifted delta cepstral features. As a second contribution, we show that a significant performance improvement in language recognition can be obtained by estimating a subspace that represents the distortions due to inter-speaker variability within the same language, and compensating for these distortions in the feature domain. Experiments on the NIST 1996 and 2003 Language Recognition Evaluation data validate the effectiveness of the proposed techniques.


International Conference on Acoustics, Speech, and Signal Processing | 2009

Loquendo - Politecnico di Torino's 2008 NIST speaker recognition evaluation system

Emanuele Dalmasso; Fabio Castaldo; Pietro Laface; Daniele Colibro; Claudio Vair

This paper describes the improvements introduced in the Loquendo-Politecnico di Torino (LPT) speaker recognition system submitted to the NIST SRE08 evaluation campaign. This system, which was among the best participants in this evaluation, combines the results of three core acoustic systems, two based on Gaussian Mixture Models (GMMs) and one on Phonetic GMMs. We discuss the results of the experiments performed for the 10sec-10sec condition and for the core condition, including the challenging tasks involving a target speaker and an interviewer. The error rate reduction of our SRE08 system compared to the SRE06 system ranges from 25% in the telephone-interview condition to 57% in the interview-interview condition. On the test with telephone and microphone conversations, the improvements range from 9% to 32%.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Loquendo-Politecnico di Torino system for the 2009 NIST Language Recognition Evaluation

Fabio Castaldo; Daniele Colibro; Sandro Cumani; Emanuele Dalmasso; Pietro Laface; Claudio Vair

This paper describes the system submitted by Loquendo and Politecnico di Torino (LPT) for the 2009 NIST Language Recognition Evaluation. The system is a combination of classifiers based on two core acoustic models and on two core phone tokenizers. It exploits several state-of-the-art techniques that have been successfully applied in recent years both in speaker and in language recognition.


International Conference on Acoustics, Speech, and Signal Processing | 2002

Learning new user formulations in automatic Directory Assistance

Cosmin Popovici; M. Andorno; Pietro Laface; L. Fissore; Mario Nigra; Claudio Vair

Since the beginning of 2001, Telecom Italia has deployed a nationwide automatic Directory Assistance (DA) system that routinely serves customers asking for residential and business listings.


International Conference on Acoustics, Speech, and Signal Processing | 2005

Learning pronunciation and formulation variants in continuous speech applications

Daniele Colibro; Luciano Fissore; Cosmin Popovici; Claudio Vair; Pietro Laface

Most voice-driven applications are based on recognition grammars. In complex applications it is difficult to predict exactly how users will formulate their requests, even if a careful study of the users' behavior has been performed. Moreover, a speaker's pronunciation of a word may not match the phonetic transcription of the system, mainly in the case of foreign words. Loquendo has developed a tool that collects field data, detects the most significant weaknesses of the application due to pronunciation or formulation mismatches, and filters the collected field corpora. This permits the application designers to perform their analysis only on a reasonable amount of preprocessed and automatically labeled data. This paper presents the approaches that have been devised to detect pronunciation variants of vocabulary words and linguistic formulations not covered by the recognition grammar. Results showing the improvements obtained by including automatically detected formulations in three grammars for two languages are also detailed.


Archive | 2005

Automatic Text-Independent, Language-Independent Speaker Voice-Print Creation and Speaker Recognition

Claudio Vair; Daniele Colibro; Luciano Fissore
