Nancy Niedzielski | Researchain

Publication

Featured research published by Nancy Niedzielski.

Journal of Language and Social Psychology | 1999

The Effect of Social Information on the Perception of Sociolinguistic Variables

Nancy Niedzielski

Forty-one Detroit-area residents were given perceptual tests in which they were asked to choose from a set of resynthesized vowels the tokens that they felt best matched the vowels they heard in the speech of a fellow Detroiter. Half of the respondents were told that the speaker was from Detroit, whereas half were told that she was from Canada. Respondents given the Canadian label chose raised-diphthong tokens as those present in the dialect of the speaker, whereas those given the Michigan label did not. Respondents given the Michigan label chose vowels that were quite different from the Northern Cities Chain-Shifted variety present in the speaker’s dialect. Because the “speaker’s” perceived nationality was the only aspect that varied between the two groups of respondents, this label alone must have caused the difference in the selection of tokens. This indicates that listeners use social information in speech perception.

International Conference on Acoustics, Speech, and Signal Processing | 1999

Fast speaker adaptation using a priori knowledge

Roland Kuhn; Patrick Nguyen; Jean-Claude Junqua; Robert Boman; Nancy Niedzielski; Steven Fincke; Kenneth L. Field; Matteo Contolini

Previously, we presented a radically new class of fast adaptation techniques for speech recognition, based on prior knowledge of speaker variation. To obtain this prior knowledge, one applies a dimensionality reduction technique to T vectors of dimension D derived from T speaker-dependent (SD) models. This offline step yields T basis vectors, the eigenvoices. We constrain the model for new speaker S to be located in the space spanned by the first K eigenvoices. Speaker adaptation involves estimating K eigenvoice coefficients for the new speaker; typically, K is very small compared to the original dimension D. Here, we review how to find the eigenvoices, give a maximum-likelihood estimator for the new speaker's eigenvoice coefficients, and summarize mean adaptation experiments carried out on the Isolet database. We present new results which assess the impact on performance of changes in training of the SD models. Finally, we interpret the first few eigenvoices obtained.
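As a rough illustration of the two steps this abstract describes (an offline dimensionality reduction over T speaker-dependent vectors, then estimation of K coefficients for a new speaker), here is a minimal sketch. It assumes PCA via SVD as the dimensionality reduction technique and uses a plain least-squares projection as a stand-in for the maximum-likelihood estimator, which the abstract mentions but does not spell out. The function names and the synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np


def train_eigenvoices(sd_vectors, k):
    """Offline step: PCA over T speaker-dependent vectors (array of shape T x D).

    Returns the mean vector (D,) and the first k eigenvoices (k x D),
    which are orthonormal rows ordered by decreasing variance.
    """
    mean = sd_vectors.mean(axis=0)
    centered = sd_vectors - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def estimate_coefficients(mean, eigenvoices, observed):
    """Least-squares projection of an observed vector onto the eigenspace.

    ASSUMPTION: this replaces the paper's maximum-likelihood estimator,
    which needs HMM statistics that are not available in this sketch.
    """
    return eigenvoices @ (observed - mean)


def adapted_model(mean, eigenvoices, coeffs):
    """Rebuild a D-dimensional model from K eigenvoice coefficients."""
    return mean + coeffs @ eigenvoices


# Synthetic data (hypothetical): T = 20 speakers, D = 50, true variation of rank 3.
rng = np.random.default_rng(0)
true_basis = rng.normal(size=(3, 50))
sd_vectors = 1.0 + rng.normal(size=(20, 3)) @ true_basis

mean, eigenvoices = train_eigenvoices(sd_vectors, k=3)

# A new speaker is described by K = 3 numbers instead of D = 50.
new_speaker = 1.0 + np.array([0.4, -1.2, 0.7]) @ true_basis
coeffs = estimate_coefficients(mean, eigenvoices, new_speaker)
reconstruction = adapted_model(mean, eigenvoices, coeffs)
print(coeffs.shape, np.allclose(reconstruction, new_speaker))
```

Because the synthetic speakers vary along only three directions, three eigenvoices are enough to reconstruct the new speaker in this toy setting; real speaker-dependent models would only be approximated.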

International Conference on Acoustics, Speech, and Signal Processing | 1997

Enhancement of esophageal speech by injection noise rejection

Hector R. Javkin; Michael Galler; Nancy Niedzielski

Esophageal speakers, who produce a voice source by bringing about a vibration of the esophageal superior sphincter, must insufflate the esophagus with an air injection gesture before every utterance, thus creating an air reservoir to drive the vibration. The resulting noise is generally undesired by the speakers. This paper describes a method for the automatic recognition and rejection of the injection noise which occurs in esophageal speech.

Annales Des Télécommunications | 2000

Eigenvoices: A compact representation of speakers in model space

Patrick Nguyen; Roland Kuhn; Jean-Claude Junqua; Nancy Niedzielski; Christian Wellekens

In this article, we present a new approach to modeling speaker-dependent systems. The approach was inspired by the eigenface technique used in face recognition. We build a linear vector space of low dimensionality, called the eigenspace, in which speakers are located. The basis vectors of this space are called eigenvoices. Each eigenvoice models a direction of inter-speaker variability. The eigenspace is built during the training phase. Then, any speaker model can be expressed as a linear combination of eigenvoices. The main benefit of this technique, as set forth in this article, is the reduction in the number of parameters that describe a model. We are thereby able to reduce the number of parameters to estimate, as well as computation and/or storage costs. We apply the approach to speaker adaptation and speaker recognition. Some experimental results are supplied.

French abstract (translated): This article presents a new approach inspired by image recognition, adapted and applied to speech. A vector space of reduced dimension, called the eigenspace, is constructed, within which speakers are confined. The basis vectors of this space are called eigenvoices. Each eigenvoice models a component of inter-speaker variability. The eigenspace is built during the conventional training phase of speech systems. A speaker model is then associated with a linear combination of the vectors of the reduced speaker space. The advantage of this method, highlighted in the article, is the reduction in the number of parameters characterizing a model. As a result, the number of parameters to estimate is reduced, as are computation time and/or storage. The technique is applied here to speaker adaptation for an automatic speaker recognition system and to automatic speaker recognition. Some experimental results are presented.
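The linear-combination statement in this abstract can be written compactly as follows. The notation is an assumption introduced here for illustration and is not taken from the article: $\hat{\mu}_S$ is the D-dimensional model of speaker S, $\bar{\mu}$ is the mean over the training speakers, $e_k$ are the eigenvoices, and $w_k$ are the speaker's coefficients.

```latex
\hat{\mu}_S \;=\; \bar{\mu} + \sum_{k=1}^{K} w_k \, e_k ,
\qquad K \ll D
```

Under this reading, a speaker is described by the K weights $w_1, \dots, w_K$ rather than by all D model parameters, which is where the savings in estimation, computation, and storage would come from.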

Journal of the Acoustical Society of America | 2000

Esophageal speech injection noise detection and rejection

Hector R. Javkin; Michael Galler; Nancy Niedzielski; Robert Boman

The present invention eliminates injection noise in speech produced by esophageal speakers. A speech input signal is digitized. One copy of the digitized signal is used for analysis and the other is passed through a gain switch to an amplifier as output. A Fast Fourier Transform (FFT) and a mean value of the digitized speech input signal are calculated. The FFT is passed through a morphological filter to produce a filtered spectrum. An occurrence of injection noise is detected by calculating a derivative of the filtered spectrum and determining from the mean value and the derivative the location and value of the largest peak and the second largest peak in the filtered spectrum. If the largest peak is lower in frequency than the second largest peak, and if all points above 2 kHz are less than the mean, then an occurrence of injection noise has been detected. An occurrence of silence is detected by center-clipping the filtered spectrum and determining whether there is any energy within a sliding 10 millisecond window for a predetermined amount of time. If no energy is detected within a sliding 10 millisecond window for a predetermined amount of time, then an occurrence of silence has been detected. The output speech signal is passed after the occurrence of injection noise has been detected, and is blocked following an occurrence of silence.

Journal of the Acoustical Society of America | 1997

Characteristics of roughness in esophageal speech

Hector R. Javkin; Nancy Niedzielski; James Reed

Esophageal voice, produced by the vibration of the superior sphincter of the esophagus by persons who have been laryngectomized, is almost invariably associated with voice source roughness. Using inverse filtering of speech recorded to minimize phase distortion, the source characteristics of six esophageal speakers with different levels of proficiency were examined. With our three less proficient speakers, the roughness is clearly evident in the volume velocity waveform. Highly proficient esophageal speakers, however, although they have a voice quality which is perceived as rough, nevertheless produce a volume velocity waveform surprisingly similar to that of laryngeal speakers. The characteristics that produce this perceived roughness, other than jitter and shimmer, are difficult to detect in the volume velocity waveform. They are noise components of low amplitude and relatively high frequency, which are attenuated in the (integrated) volume velocity. They are best observed in the differentiated volume v...

IEEE Transactions on Speech and Audio Processing | 2000