
Publication


Featured research published by Piergiorgio Svaizer.


IEEE Transactions on Speech and Audio Processing | 1997

Use of the crosspower-spectrum phase in acoustic event location

Maurizio Omologo; Piergiorgio Svaizer

The article reports on the use of crosspower-spectrum phase (CSP) analysis as an accurate time delay estimation (TDE) technique. It is used in a microphone array system for the location of acoustic events in noisy and reverberant environments. A corresponding coherence measure (CM) and its graphical representation are introduced to show the TDE accuracy. Using an array of two microphone pairs, real experiments show an average location error of less than 10 cm in a 6 m × 6 m area.


International Conference on Acoustics, Speech, and Signal Processing | 1996

Acoustic source location in noisy and reverberant environment using CSP analysis

Maurizio Omologo; Piergiorgio Svaizer

A linear four-microphone array can be employed for acoustic event location in a real environment using accurate time delay estimation. This paper refers to the use of a specific technique, based on crosspower-spectrum phase (CSP) analysis, that yielded accurate location performance. The behavior of this technique is investigated under different noise and reverberation conditions. Real experiments as well as simulations were conducted to analyze a wide variety of situations. Results show that the system remains robust under quite critical environmental conditions.
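Going from time delays to a source position can be done by bearing-line crossing: under a far-field assumption each microphone pair yields a direction of arrival, and two pairs locate the source where their bearing lines intersect. A minimal 2-D sketch (the geometry, function names, and the far-field delays below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

C = 343.0  # speed of sound (m/s), assumed

def doa_from_delay(tau, spacing):
    """Far-field direction of arrival for one microphone pair:
    sin(theta) = C * tau / spacing, with theta measured from broadside."""
    return np.arcsin(np.clip(C * tau / spacing, -1.0, 1.0))

def locate_two_pairs(c0, tau0, c1, tau1, spacing):
    """Cross the bearing lines of two pairs centered at c0 and c1 (both on
    the x axis; the source is assumed in the y > 0 half-plane)."""
    th0 = doa_from_delay(tau0, spacing)
    th1 = doa_from_delay(tau1, spacing)
    u0 = np.array([np.sin(th0), np.cos(th0)])     # bearing direction, pair 0
    u1 = np.array([np.sin(th1), np.cos(th1)])     # bearing direction, pair 1
    # Solve c0 + t0*u0 = c1 + t1*u1 for the intersection point.
    t = np.linalg.solve(np.column_stack((u0, -u1)), c1 - c0)
    return c0 + t[0] * u0

# Delays generated from the same far-field model, so recovery is exact here.
src = np.array([0.5, 2.0])
c0, c1, spacing = np.array([-1.0, 0.0]), np.array([1.0, 0.0]), 0.4
tau0 = spacing * (src - c0)[0] / (C * np.linalg.norm(src - c0))
tau1 = spacing * (src - c1)[0] / (C * np.linalg.norm(src - c1))
print(locate_two_pairs(c0, tau0, c1, tau1, spacing))  # ≈ [0.5, 2.0]
```

With real CSP-estimated delays the two lines do not intersect exactly, and a least-squares crossing point would be used instead of an exact solve.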


International Conference on Acoustics, Speech, and Signal Processing | 1997

Microphone array based speech recognition with different talker-array positions

Maurizio Omologo; Marco Matassoni; Piergiorgio Svaizer; Diego Giuliani

The use of a microphone array for hands-free continuous speech recognition in a noisy and reverberant environment is investigated. An array of eight omnidirectional microphones was placed at different angles and distances from the talker. A time delay compensation module was used to provide a beamformed signal as input to a hidden Markov model (HMM) based recognizer. Phone HMM adaptation, based on a small set of phonetically rich sentences, further improved the recognition rate obtained by beamforming alone. These results were confirmed both by experiments conducted in a noisy and reverberant environment and by simulations. In the latter case, different conditions were recreated by using the image method to reproduce synthetic versions of the array microphone signals.
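The time delay compensation step amounts to a delay-and-sum beamformer: each channel is advanced by its estimated delay and the channels are averaged, so the talker's signal adds coherently while uncorrelated noise does not. A minimal integer-sample sketch (illustrative, not the paper's module):

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Advance each microphone signal by its (integer-sample) estimated
    delay, then average: the desired source adds coherently, noise does not."""
    n = max(len(s) - d for s, d in zip(signals, delays))
    out = np.zeros(n)
    for s, d in zip(signals, delays):
        out[:len(s) - d] += s[d:]             # compensate the propagation delay
    return out / len(signals)

# Four microphones: the same source with different propagation delays plus noise.
rng = np.random.default_rng(1)
clean = rng.standard_normal(4000)
delays = [0, 3, 7, 12]
mics = [np.concatenate((np.zeros(d), clean)) + 0.5 * rng.standard_normal(4000 + d)
        for d in delays]
beam = delay_and_sum(mics, delays)
```

Averaging four channels with uncorrelated noise reduces the noise power by a factor of four (6 dB), which is why the beamformed signal is a better recognizer input than any single microphone.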


Speech Communication | 1998

Environmental conditions and acoustic transduction in hands-free speech recognition

Maurizio Omologo; Piergiorgio Svaizer; Marco Matassoni

Hands-free interaction is a key point for increasing the flexibility of present applications and for developing new speech recognition applications in which the user cannot be encumbered by hand-held or head-mounted microphones. When the microphone is far from the speaker, the transduced signal is affected by degradations of various natures, which are often unpredictable. Special microphones and multi-microphone acquisition systems represent a way of reducing some environmental noise effects. Robust processing and adaptation techniques can further be used to compensate for the different kinds of variability that may be present in the recognizer input. The purpose of this paper is to revisit some of the assumptions about the different sources of this variability and to discuss both special transducer systems and the compensation/adaptation techniques that can be adopted. In particular, the paper refers to the use of multi-microphone systems to overcome some undesired effects caused by room acoustics (e.g. reverberation) and by coherent/incoherent noise (e.g. competing talkers, computer fans). The paper concludes with the description of some experiments that were conducted on both real and simulated speech data.


International Conference on Acoustics, Speech, and Signal Processing | 1999

Training of HMM with filtered speech material for hands-free recognition

Diego Giuliani; Marco Matassoni; Maurizio Omologo; Piergiorgio Svaizer

This paper addresses the problem of hands-free speech recognition in a noisy office environment. An array of six omnidirectional microphones and a corresponding time delay compensation module are used to provide a beamformed signal as input to an HMM-based recognizer. Training of the HMMs is performed either on a clean speech database or on a filtered version of the same database. Filtering consists of convolution with the acoustic impulse response between the speaker and the microphone, to reproduce the reverberation effect; background noise is then added to obtain the desired SNR. The paper shows that the new models trained on these data outperform the baseline ones. Furthermore, the paper investigates maximum likelihood linear regression (MLLR) adaptation of the new models. A further performance improvement is obtained, reaching a 98.7% word recognition rate (WRR) in a connected digit recognition task with the talker at a distance of 1.5 m from the array.
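The data preparation described above, convolving clean speech with a speaker-to-microphone impulse response and adding noise at a target SNR, can be sketched as follows (the synthetic impulse response below is an illustrative stand-in for the measured one used in the paper):

```python
import numpy as np

def make_filtered_utterance(clean, rir, noise, snr_db):
    """Convolve clean speech with a room impulse response, then add
    background noise scaled so the reverberant-signal-to-noise ratio
    equals snr_db."""
    reverberant = np.convolve(clean, rir)[:len(clean)]
    gain = np.sqrt(np.mean(reverberant ** 2) /
                   (np.mean(noise[:len(reverberant)] ** 2) * 10 ** (snr_db / 10)))
    return reverberant + gain * noise[:len(reverberant)]

# Illustrative synthetic impulse response: direct path plus a decaying tail.
rng = np.random.default_rng(2)
rir = np.zeros(400)
rir[0] = 1.0
rir[50:] = 0.3 * np.exp(-np.arange(350) / 80.0) * rng.standard_normal(350)
clean = rng.standard_normal(8000)
noise = rng.standard_normal(8000)
noisy = make_filtered_utterance(clean, rir, noise, snr_db=10.0)
```

Training on such filtered data exposes the HMMs to the same reverberation and noise statistics seen at recognition time, which is the source of the reported improvement over clean-trained models.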


International Symposium on Circuits and Systems | 1995

Matched-filter processing of microphone array for spatial volume selectivity

Ea-Ee Jan; Piergiorgio Svaizer; James L. Flanagan

Performance of a delay-and-sum beamformer is typically degraded in a reverberant enclosure because the beam captures not only the desired source signal (direct path) but also all images along the beam axis. Matched-filter processing of microphone arrays is shown to improve the quality of sound capture in reverberant environments. This paper also reports the performance of matched-filter array processing in a real room. Significant improvements are shown over single microphones and the traditional delay-and-sum beamformer.
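Matched-filter processing replaces each channel's simple delay with a filter matched to that channel's full impulse response (its time reverse), so the direct path and the reflections all add coherently at one output lag. A toy sketch assuming the impulse responses are known:

```python
import numpy as np

def matched_filter_array(mic_signals, impulse_responses):
    """Filter each microphone signal with the time-reversed
    source-to-microphone impulse response, then sum across channels:
    every echo then contributes coherently at a common lag."""
    out = None
    for x, h in zip(mic_signals, impulse_responses):
        y = np.convolve(x, h[::-1])           # correlate with its own channel
        out = y if out is None else out + y
    return out

# Toy example: an impulsive source through two channels with one echo each.
s = np.zeros(64)
s[0] = 1.0
h1 = np.zeros(32); h1[0] = 1.0; h1[10] = 0.5
h2 = np.zeros(32); h2[0] = 1.0; h2[17] = 0.4
mics = [np.convolve(s, h) for h in (h1, h2)]
out = matched_filter_array(mics, [h1, h2])
print(np.argmax(out), round(out[np.argmax(out)], 2))  # → 31 2.41
```

The peak value equals the total energy of both impulse responses (1 + 0.5² + 1 + 0.4² = 2.41), showing that the echoes reinforce rather than smear the output, whereas a delay-and-sum beam aligned on the direct path would leave them incoherent.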


International Conference on Spoken Language Processing | 1996

Experiments of speech recognition in a noisy and reverberant environment using a microphone array and HMM adaptation

Diego Giuliani; Maurizio Omologo; Piergiorgio Svaizer

The use of a microphone array for hands-free continuous speech recognition in a noisy and reverberant environment is investigated. An array of four omnidirectional microphones is placed at a distance of 1.5 m from the talker. Given the array signals, a time delay compensation (TDC) module provides a beamformed signal that is shown to be effective as input to a hidden Markov model (HMM) based recognizer. Given a small number of sentences collected from a new speaker in a real environment, HMM adaptation further improves the recognition rate. These results are confirmed both by experiments conducted in a noisy office environment and by simulations. In the latter case, different SNR and reverberation conditions were recreated by using the image method to reproduce synthetic array microphone signals.


Archive | 2001

Speech Recognition with Microphone Arrays

Maurizio Omologo; Marco Matassoni; Piergiorgio Svaizer

Microphone arrays can be advantageously employed in Automatic Speech Recognition (ASR) systems to allow distant-talking interaction. Their beamforming capabilities are used to enhance the speech message while attenuating the undesired contributions of environmental noise and reverberation. In the first part of this chapter the state of the art of ASR systems is briefly reviewed, with particular attention to robustness in distant-talker applications. The objective is the reduction of the mismatch between the real noisy data and the acoustic models used by the recognizer. Beamforming, speech enhancement, feature compensation, and model adaptation are the techniques adopted to this end. The second part of the chapter is dedicated to the description of a microphone-array-based speech recognition system developed at ITC-IRST. It includes a linear array beamformer, an acoustic front-end for speech activity detection and feature extraction, a recognition engine based on Hidden Markov Models, and modules for training and adaptation of the acoustic models. Finally, the performance of this system on a typical recognition task is reported.


International Conference on Acoustics, Speech, and Signal Processing | 2007

Classification of Acoustic Maps to Determine Speaker Position and Orientation from a Distributed Microphone Network

Alessio Brutti; Maurizio Omologo; Piergiorgio Svaizer; Christian Zieger

Acoustic maps created from the signals acquired by distributed networks of microphones make it possible to identify the position and orientation of an active talker in an enclosure. In adverse situations of high background noise, high reverberation, or unavailability of direct paths to the microphones, localization may fail. This paper proposes a novel approach to talker localization and head-orientation estimation based on the classification of global coherence field (GCF) or oriented GCF maps. Preliminary experiments with data obtained by simulated propagation, as well as with data acquired in a real room, show that matching against precalculated map models provides robust behavior in adverse conditions.


CLEaR | 2006

A generative approach to audio-visual person tracking

Roberto Brunelli; Alessio Brutti; Paul Chippendale; Oswald Lanz; Maurizio Omologo; Piergiorgio Svaizer; Francesco Tobia

This paper focuses on the integration of acoustic and visual information for people tracking. The system presented relies on a probabilistic framework within which information from multiple sources is integrated at an intermediate stage. An advantage of the proposed method is its generative approach, which supports easy and robust integration of multi-source information by means of sampled projection instead of triangulation. The system described was developed within the research activities of the EU-funded CHIL project. Experimental results from the CLEAR evaluation workshop are reported.

Collaboration

Piergiorgio Svaizer's top co-authors:

- Marco Matassoni (Center for Information Technology)
- Alessio Brutti (Fondazione Bruno Kessler)
- Diego Giuliani (Fondazione Bruno Kessler)
- John W. McDonough (Carnegie Mellon University)
- Climent Nadeu (Polytechnic University of Catalonia)