Robert Boman
Panasonic
Publication
Featured research published by Robert Boman.
international conference on acoustics, speech, and signal processing | 1999
Roland Kuhn; Patrick Nguyen; Jean-Claude Junqua; Robert Boman; Nancy Niedzielski; Steven Fincke; Kenneth L. Field; Matteo Contolini
Previously, we presented a radically new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies a dimensionality reduction technique to T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, the eigenvoices. We constrain the model for new speaker S to be located in the space spanned by the first K eigenvoices. Speaker adaptation involves estimating K eigenvoice coefficients for the new speaker; typically, K is very small compared to the original dimension D. Here, we review how to find the eigenvoices, give a maximum-likelihood estimator for the new speaker's eigenvoice coefficients, and summarize mean adaptation experiments carried out on the Isolet database. We present new results that assess the impact on performance of changes in training of the SD models. Finally, we interpret the first few eigenvoices obtained.
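The two steps in this abstract (an offline dimensionality reduction over T speaker-dependent vectors, then estimation of K coefficients for a new speaker) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: PCA via SVD is used as the dimensionality reduction technique, the training mean is kept as a fixed offset, and a plain least-squares projection stands in for the maximum-likelihood estimator, which in a real recognizer would work from HMM observation statistics. The function names are hypothetical.

```python
import numpy as np


def find_eigenvoices(supervectors, k):
    """Offline step: derive k basis vectors from T speaker-dependent vectors.

    supervectors: array of shape (T, D), one row per SD model (assumed layout).
    Returns the training mean (D,) and the first k principal directions (k, D).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def estimate_coefficients(observed, mean, eigenvoices):
    """Adaptation step: estimate the k coefficients for a new speaker.

    Assumption: a least-squares projection onto the orthonormal basis is used
    here as a simple stand-in for the maximum-likelihood estimator.
    """
    return eigenvoices @ (observed - mean)


def adapted_model(mean, eigenvoices, coefficients):
    """Rebuild the D-dimensional model constrained to the eigenvoice space."""
    return mean + coefficients @ eigenvoices
```

The point of the constraint is that only K numbers are estimated per speaker instead of D, which is why very little adaptation data is needed when K is much smaller than D.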
Journal of the Acoustical Society of America | 2004
Roland Kuhn; Matteo Contolini; Robert Boman
The call screener employs a telephone system interface connected between a telephone network and a telephone device of a user. The interface selectively routes calls (and refrains from routing calls) based on the results from the dialogue system. The dialogue system elicits speech from an incoming caller and causes the telephone system interface to route calls from the incoming caller based on a comparison of the elicited speech with a set of stored speaker models. The stored speaker models may be maintained automatically by the system, using either a passive mode, in which callers whose calls exceed a predetermined duration are assumed to be “acceptable” callers, or a proactive mode, in which the system prompts the user at the end of the call to elect whether to save the speech models developed during that call in the acceptable user database. If desired, the user can attach other attributes or special tags to the stored models, indicating special handling or call routing rules to be applied when that caller calls again.
international conference on acoustics, speech, and signal processing | 2001
Christophe Cerisara; Luca Rigazio; Robert Boman; Jean-Claude Junqua
We propose an algorithm that compensates for both additive and convolutional noise. The goal of this method is to adapt to realistic environments efficiently in terms of both computation time and memory. The algorithm described in this paper is an extension of an additive noise adaptation algorithm. Experimental results are given on a realistic database recorded in a car. This database is further filtered by a low-pass filter to combine additive and channel noise. The proposed adaptation algorithm reduces the error rate by 75% on this database, when compared to our baseline system without environmental adaptation.
Journal of the Acoustical Society of America | 2000
Hector R. Javkin; Michael Galler; Nancy Niedzielski; Robert Boman
The present invention eliminates injection noise in speech produced by esophageal speakers. A speech input signal is digitized. One copy of the digitized signal is used for analysis and the other is passed through a gain switch to an amplifier as output. A Fast Fourier Transform (FFT) and a mean value of the digitized speech input signal are calculated. The FFT is passed through a morphological filter to produce a filtered spectrum. An occurrence of injection noise is detected by calculating a derivative of the filtered spectrum and determining from the mean value and the derivative a location and value of a largest peak and a second largest peak in the filtered spectrum. If the largest peak is lower in frequency than the second largest peak, and if all points above 2 kHz are less than the mean, then an occurrence of injection noise has been detected. An occurrence of silence is detected by center-clipping the filtered spectrum and determining whether there is any energy within a sliding 10 millisecond window for a predetermined amount of time. If no energy is detected within a sliding 10 millisecond window for a predetermined amount of time, then an occurrence of silence has been detected. The output speech signal is passed after an occurrence of injection noise has been detected, and is blocked following an occurrence of silence.
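The injection-noise decision rule described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the input is taken to be an already filtered magnitude spectrum (the morphological filter is not shown), peaks are located where the first difference changes from positive to non-positive, the mean is used only for the check above 2 kHz, and the function names and the `cutoff_hz` parameter are hypothetical.

```python
def find_peaks(spectrum):
    """Return indices of local maxima, found from the first difference.

    A point counts as a peak when the difference before it is positive and
    the difference after it is zero or negative (assumed peak definition).
    """
    peaks = []
    for i in range(1, len(spectrum) - 1):
        rising = spectrum[i] - spectrum[i - 1] > 0
        falling = spectrum[i + 1] - spectrum[i] <= 0
        if rising and falling:
            peaks.append(i)
    return peaks


def is_injection_noise(spectrum, freqs_hz, mean_value, cutoff_hz=2000.0):
    """Apply the two conditions from the abstract to one filtered spectrum.

    spectrum:   filtered magnitudes, one per frequency bin
    freqs_hz:   frequency of each bin in Hz, same length as spectrum
    mean_value: the mean value computed for the frame (supplied by the caller)
    """
    peaks = find_peaks(spectrum)
    if len(peaks) < 2:
        # The rule compares two peaks, so it cannot fire with fewer.
        return False
    ranked = sorted(peaks, key=lambda i: spectrum[i], reverse=True)
    largest, second = ranked[0], ranked[1]
    # Condition 1: the largest peak lies lower in frequency than the second.
    largest_is_lower = freqs_hz[largest] < freqs_hz[second]
    # Condition 2: every point above the cutoff is below the mean.
    quiet_above_cutoff = all(
        m < mean_value for f, m in zip(freqs_hz, spectrum) if f > cutoff_hz
    )
    return largest_is_lower and quiet_above_cutoff
```

In the scheme described, a positive detection would open the gain switch so that the speech following the injection noise is passed, and the separate silence detector would close it again.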
international conference on acoustics, speech, and signal processing | 2004
Luca Giulio Brayda; Luca Rigazio; Robert Boman; Jean-Claude Junqua
The paper addresses the problem of noise robustness from the standpoint of the sensitivity to noise estimation errors. Since the noise is usually estimated in the power-spectral domain, we show that the implied error in the cepstral domain has interesting properties. These properties allow us to compare two key methods used in noise robust speech recognition: spectral subtraction and parallel model combination. We show that parallel model combination has an advantage over spectral subtraction because it is less sensitive to noise estimation errors. Experimental results on the Aurora2 database confirm our theoretical findings, with parallel model combination clearly outperforming spectral subtraction and other well-known signal-based robustness methods. Our Aurora2 results, with parallel model combination, a basic MFCC front-end and a simple noise estimation, are close to the best results obtained on this database with very complex signal processing schemes.
Archive | 2002
Robert Boman; Kirill Stoimenov; Roland Kuhn; Jean-Claude Junqua
international conference on acoustics, speech, and signal processing | 1999
Roland Kuhn; Patrick Nguyen; Jean Claude Junqua; Robert Boman; Nancy Niedzielski; Steven Fincke; Ken Field; Matteo Contolini
Journal of the Acoustical Society of America | 1998
Roland Kuhn; Patrick Nguyen; Jean-Claude Junqua; Robert Boman
Archive | 2004
Robert Boman; Brian A. Hanson
Archive | 2003
Luca Rigazio; Patrick Nguyen; Jean-Claude Junqua; Robert Boman