Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Douglas A. Reynolds is active.

Publication


Featured researches published by Douglas A. Reynolds.


Digital Signal Processing | 2000

Speaker Verification Using Adapted Gaussian Mixture Models

Douglas A. Reynolds; Thomas F. Quatieri; Robert B. Dunn

Reynolds, Douglas A., Quatieri, Thomas F., and Dunn, Robert B., Speaker Verification Using Adapted Gaussian Mixture Models, Digital Signal Processing10(2000), 19Â?41.In this paper we describe the major elements of MIT Lincoln Laboratorys Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs). The system is built around the likelihood ratio test for verification, using simple but effective GMMs for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. The development and use of a handset detector and score normalization to greatly improve verification performance is also described and discussed. Finally, representative performance benchmarks and system behavior experiments on NIST SRE corpora are presented.


IEEE Transactions on Speech and Audio Processing | 1995

Robust text-independent speaker identification using Gaussian mixture speaker models

Douglas A. Reynolds; Richard C. Rose

This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance from unconstrained conversational speech and robustness to degradations produced by transmission over a telephone channel. A complete experimental evaluation of the Gaussian mixture speaker model is conducted on a 49 speaker, conversational telephone speech database. The experiments examine algorithmic issues (initialization, variance limiting, model order selection), spectral variability robustness techniques, large population performance, and comparisons to other speaker modeling techniques (uni-modal Gaussian, VQ codebook, tied Gaussian mixture, and radial basis functions). The Gaussian mixture speaker model attains 96.8% identification accuracy using 5 second clean speech utterances and 80.8% accuracy using 15 second telephone speech utterances with a 49 speaker population and is shown to outperform the other speaker modeling techniques on an identical 16 speaker telephone speech task. >


Speech Communication | 1995

Speaker identification and verification using Gaussian mixture speaker models

Douglas A. Reynolds

Abstract This paper presents high performance speaker identification and verification systems based on Gaussian mixture speaker models: robust, statistically based representations of speaker identity. The identification system is a maximum likelihood classifier and the verification system is a likelihood ratio hypothesis tester using background speaker normalization. The systems are evaluated on four publically available speech databases: TIMIT, NTIMIT, Switchboard and YOHO. The different levels of degradations and variabilities found in these databases allow the examination of system performance for different task domains. Constraints on the speech range from vocabulary-dependent to extemporaneous and speech quality varies from near-ideal, clean speech to noisy, telephone speech. Closed set identification accuracies on the 630 speaker TIMIT and NTIMIT databases were 99.5% and 60.7%, respectively. On a 113 speaker population from the Switchboard database the identification accuracy was 82.8%. Global threshold equal error rates of 0.24%, 7.19%, 5.15% and 0.51% were obtained in verification experiments on the TIMIT, NTIMIT, Switchboard and YOHO databases, respectively.


IEEE Signal Processing Letters | 2006

Support vector machines using GMM supervectors for speaker verification

William M. Campbell; Douglas E. Sturim; Douglas A. Reynolds

Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea of stacking the means of the GMM model to form a GMM mean supervector. We examine the idea of using the GMM supervector in a support vector machine (SVM) classifier. We propose two new SVM kernels based on distance metrics between GMM models. We show that these SVM kernels produce excellent classification accuracy in a NIST speaker recognition evaluation task.


EURASIP Journal on Advances in Signal Processing | 2004

A tutorial on text-independent speaker verification

Frédéric Bimbot; Jean-François Bonastre; Corinne Fredouille; Guillaume Gravier; Ivan Magrin-Chagnolleau; Sylvain Meignier; Teva Merlin; Javier Ortega-Garcia; Dijana Petrovska-Delacrétaz; Douglas A. Reynolds

This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly speech parameterization used in speaker verification, namely, cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely, neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step to deal with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications relative to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. This paper concludes by giving a few research trends in speaker verification for the next couple of years.


international conference on acoustics, speech, and signal processing | 2006

SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation

William M. Campbell; Douglas E. Sturim; Douglas A. Reynolds; Alex Solomonoff

Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent factor analysis of this GMM supervector is an effective method for variability compensation. We consider this GMM supervector in the context of support vector machines. We construct a support vector machine kernel using the GMM supervector. We show similarities based on this kernel between the method of SVM nuisance attribute projection (NAP) and the recent results in latent factor analysis. Experiments on a NIST SRE 2005 corpus demonstrate the effectiveness of the new technique


Computer Speech & Language | 2006

Support vector machines for speaker and language recognition

William M. Campbell; Joseph P. Campbell; Douglas A. Reynolds; Elliot Singer; Pedro A. Torres-Carrasquillo

Abstract Support vector machines (SVMs) have proven to be a powerful technique for pattern classification. SVMs map inputs into a high-dimensional space and then separate classes with a hyperplane. A critical aspect of using SVMs successfully is the design of the inner product, the kernel, induced by the high dimensional mapping. We consider the application of SVMs to speaker and language recognition. A key part of our approach is the use of a kernel that compares sequences of feature vectors and produces a measure of similarity. Our sequence kernel is based upon generalized linear discriminants. We show that this strategy has several important properties. First, the kernel uses an explicit expansion into SVM feature space—this property makes it possible to collapse all support vectors into a single model vector and have low computational complexity. Second, the SVM builds upon a simpler mean-squared error classifier to produce a more accurate system. Finally, the system is competitive and complimentary to other approaches, such as Gaussian mixture models (GMMs). We give results for the 2003 NIST speaker and language evaluations of the system and also show fusion with the traditional GMM approach.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

An overview of automatic speaker diarization systems

S. E. Tranter; Douglas A. Reynolds

Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization can be used for helping speech recognition, facilitating the searching and indexing of audio archives, and increasing the richness of automatic transcriptions, making them more readable. In this paper, we provide an overview of the approaches currently used in a key area of audio diarization, namely speaker diarization, and discuss their relative merits and limitations. Performances using the different techniques are compared within the framework of the speaker diarization task in the DARPA EARS Rich Transcription evaluations. We also look at how the techniques are being introduced into real broadcast news systems and their portability to other domains and tasks such as meetings and speaker verification


international conference on acoustics, speech, and signal processing | 2002

An overview of automatic speaker recognition technology

Douglas A. Reynolds

In this paper we provide a brief overview of the area of speaker recognition, describing applications, underlying techniques and some indications of performance. Following this overview we will discuss some of the strengths and weaknesses of current speaker recognition technologies and outline some potential future trends in research, development and applications.


international conference on acoustics, speech, and signal processing | 2003

Channel robust speaker verification via feature mapping

Douglas A. Reynolds

In speaker recognition applications, channel variability is a major cause of errors. Techniques in the feature, model and score domains have been applied to mitigate channel effects. In this paper we present a new feature mapping technique that maps feature vectors into a channel independent space. The feature mapping learns mapping parameters from a set of channel-dependent models derived from a channel-independent model via MAP adaptation. The technique is developed primarily for speaker verification, but can be applied for feature normalization in speech recognition applications. Results are presented on NIST landline and cellular telephone speech corpora where it is shown that feature mapping provides significant performance improvements over baseline systems and similar performance to Hnorm and speaker-model-synthesis (SMS).

Collaboration


Dive into the Douglas A. Reynolds's collaboration.

Top Co-Authors

Avatar

William M. Campbell

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Elliot Singer

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Joseph P. Campbell

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Douglas E. Sturim

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Thomas F. Quatieri

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Pedro A. Torres-Carrasquillo

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Robert B. Dunn

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Alan McCree

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Wade Shen

Massachusetts Institute of Technology

View shared research outputs
Top Co-Authors

Avatar

Fred Richardson

Massachusetts Institute of Technology

View shared research outputs
Researchain Logo
Decentralizing Knowledge