Publication


Featured research published by Alvin F. Martin.


IEEE Computer | 2000

An introduction to evaluating biometric systems

P. J. Phillips; Alvin F. Martin; Charles L. Wilson; Mark A. Przybocki

On the basis of media hype alone, you might conclude that biometric passwords will soon replace their alphanumeric counterparts with versions that cannot be stolen, forgotten, lost, or given to another person. But what if the actual performance of these systems falls short of the estimates? The authors designed this article to provide sufficient information to know what questions to ask when evaluating a biometric system, and to assist in determining whether performance levels meet the requirements of an application. For example, a low-performance biometric is probably sufficient for reducing, as opposed to eliminating, fraud. Likewise, completely replacing an existing security system with a biometric-based one may require a high-performance biometric system, or the required performance may be beyond what current technology can provide. Of the biometrics that give the user some control over data acquisition, voice, face, and fingerprint systems have undergone the most study and testing, and therefore occupy the bulk of this discussion. This article also covers the tools and techniques of biometric testing.
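The questions the authors raise about whether a biometric's performance meets an application's requirements come down to measured error rates. As a minimal sketch, assuming hypothetical genuine and impostor match scores and an arbitrary threshold (none of which come from the article), false accept and false reject rates can be computed like this:

```python
# Illustrative only: hypothetical scores and thresholds, not data from the article.

def false_accept_rate(impostor_scores, threshold):
    """Fraction of impostor attempts wrongly accepted (score >= threshold)."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)

def false_reject_rate(genuine_scores, threshold):
    """Fraction of genuine attempts wrongly rejected (score < threshold)."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)

genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.88]    # hypothetical genuine-user scores
impostor = [0.12, 0.40, 0.55, 0.30, 0.72, 0.20]   # hypothetical impostor scores

for t in (0.5, 0.6, 0.7):
    print(f"threshold={t:.1f}  FAR={false_accept_rate(impostor, t):.2f}  "
          f"FRR={false_reject_rate(genuine, t):.2f}")
```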


Archive | 2005

The NIST speaker recognition evaluation program

Alvin F. Martin; Mark A. Przybocki; Joseph P. Campbell

The National Institute of Standards and Technology (NIST) has coordinated annual scientific evaluations of text-independent speaker recognition since 1996. These evaluations aim to provide important contributions to the direction of research efforts and the calibration of technical capabilities. They are intended to be of interest to all researchers working on the general problem of text-independent speaker recognition. To this end, the evaluations are designed to be simple, fully supported, accessible, and focused on core technology issues. The evaluations have focused primarily on speaker detection in the context of conversational telephone speech. More recent evaluations have also included related tasks, such as speaker segmentation, and have used data in addition to conversational telephone speech. The evaluations are designed to foster research progress.
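Speaker detection performance in the NIST evaluations is summarized with a detection cost function that weights misses and false alarms under an assumed target prior. A hedged sketch follows; the parameter values (C_miss = 10, C_fa = 1, P_target = 0.01) are those commonly cited for the earlier evaluations and are stated here as an assumption, not taken from this chapter.

```python
# Hedged sketch of a NIST-style detection cost function. The cost parameters
# below are assumed illustrative values, not quoted from this chapter.

def detection_cost(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Expected cost of a detector operating at miss rate p_miss and
    false-alarm rate p_fa, for a given target prior and error costs."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)

# Example: a system missing 8% of target trials and falsely accepting
# 2% of non-target trials.
print(detection_cost(p_miss=0.08, p_fa=0.02))   # 0.0278
```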


Digital Signal Processing | 2000

The NIST 1999 Speaker Recognition Evaluation - An Overview

Alvin F. Martin; Mark A. Przybocki

Martin, Alvin, and Przybocki, Mark, The NIST 1999 Speaker Recognition Evaluation - An Overview, Digital Signal Processing 10 (2000), 1-18. This article summarizes the 1999 NIST Speaker Recognition Evaluation. It discusses the overall research objectives, the three task definitions, the development and evaluation data sets, the specified performance measures and their manner of presentation, and the overall quality of the results. More than a dozen sites from the United States, Europe, and Asia participated in this evaluation. There were three primary tasks for which automatic systems could be designed: one-speaker detection, two-speaker detection, and speaker tracking. All three tasks were performed in the context of mu-law encoded conversational telephone speech. The one-speaker detection task used single-channel data, while the other two tasks used summed two-channel data. About 500 target speakers were specified, with 2 min of training speech data provided for each. Both multiple- and single-speaker test segments were selected from about 2000 conversations that were not used for training material. The duration of the multiple-speaker test data was nominally 1 min, while the duration of the single-speaker test segments varied from near zero up to 60 s. For each task, systems had to make independent decisions for selected combinations of a test segment and a hypothesized target speaker. The data sets for each task were designed to be large enough to provide statistically meaningful results on test subsets of interest. Results were analyzed with respect to various conditions including duration, pitch differences, and handset types.
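The performance presentation referred to above is the detection error trade-off (DET) curve, in which miss and false-alarm probabilities are plotted on normal-deviate axes. The following sketch uses invented scores rather than evaluation data and simply shows how such a curve is constructed:

```python
# Sketch of DET-curve construction from invented per-trial scores: sweep a
# threshold, compute miss and false-alarm probabilities, and map both to
# normal deviates (the probit transform).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
target_scores = rng.normal(1.0, 1.0, 1000)      # hypothetical target-trial scores
nontarget_scores = rng.normal(-1.0, 1.0, 5000)  # hypothetical non-target scores

thresholds = np.linspace(-4, 4, 200)
p_miss = np.array([(target_scores < t).mean() for t in thresholds])
p_fa = np.array([(nontarget_scores >= t).mean() for t in thresholds])

# DET axes: the inverse standard normal CDF of each error rate.
# Clip to avoid infinities at error rates of exactly 0 or 1.
eps = 1e-6
det_x = norm.ppf(np.clip(p_fa, eps, 1 - eps))
det_y = norm.ppf(np.clip(p_miss, eps, 1 - eps))

# det_x, det_y can now be plotted to give the familiar near-linear DET trace.
print(det_x[:3], det_y[:3])
```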


Computer Speech & Language | 2006

NIST and NFI-TNO evaluations of automatic speaker recognition

David A. van Leeuwen; Alvin F. Martin; Mark A. Przybocki; Jos S. Bouten

In the past years, several text-independent speaker recognition evaluation campaigns have taken place. This paper reports on results of the NIST evaluation of 2004 and the NFI-TNO forensic speaker recognition evaluation held in 2003, and reflects on the history of the evaluation campaigns. The effects of speech duration, training handsets, transmission type, and gender mix show expected behaviour on the DET curves. New results on the influence of language show an interesting dependence of the DET curves on the accent of speakers. We also report on a number of statistical analysis techniques that have recently been introduced in the speaker recognition community, as well as a new application of the analysis of deviance. These techniques are used to determine that the two evaluations held in 2003, by NIST and NFI-TNO, are of statistically different difficulty to the speaker recognition systems.
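The paper's formal comparison of the two campaigns relies on analysis of deviance. As a much simpler, hypothetical stand-in for the idea of testing whether two evaluations differ in difficulty, one can compare pooled error counts with a chi-square test; the counts below are invented.

```python
# Simplified illustration, not the paper's analysis-of-deviance procedure:
# test whether pooled error counts from two evaluations differ significantly.
# All counts are invented for the example.
from scipy.stats import chi2_contingency

# rows: evaluation A, evaluation B; columns: errors, correct decisions
table = [[180, 1820],    # hypothetical: 180 errors in 2000 trials
         [250, 1750]]    # hypothetical: 250 errors in 2000 trials

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
# A small p-value suggests the two evaluations were of different difficulty
# for this (hypothetical) system.
```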


IEEE Transactions on Audio, Speech, and Language Processing | 2007

NIST Speaker Recognition Evaluations Utilizing the Mixer Corpora—2004, 2005, 2006

Mark A. Przybocki; Alvin F. Martin; Audrey N. Le

NIST has coordinated annual evaluations of text-independent speaker recognition from 1996 to 2006. This paper discusses the last three of these, which utilized conversational speech data from the Mixer Corpora recently collected by the Linguistic Data Consortium. We review the evaluation procedures, the matrix of test conditions included, and the performance trends observed. While most of the data is collected over telephone channels, one multichannel test condition utilizes a subset of Mixer conversations recorded simultaneously over multiple microphone channels and a telephone line. The corpus also includes some non-English conversations involving bilingual speakers, allowing an examination of the effect of language on performance results. On the various test conditions involving English language conversational telephone data, considerable performance gains are observed over the past three years.


2006 IEEE Odyssey - The Speaker and Language Recognition Workshop | 2006

NIST Speaker Recognition Evaluation Chronicles - Part 2

Mark A. Przybocki; Alvin F. Martin; Audrey N. Le

NIST has coordinated annual evaluations of text-independent speaker recognition since 1996. This update to an Odyssey 2004 paper concentrates on the past two years of the NIST evaluations. We discuss in particular the results of the 2004 and 2005 evaluations, and how they compare to earlier evaluation results. We also discuss the preparation and planning for the 2006 evaluation, which concludes with the evaluation workshop in San Juan, Puerto Rico, in June 2006.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2007

Performance Generalization in Biometric Authentication Using Joint User-Specific and Sample Bootstraps

Norman Poh; Alvin F. Martin; Samy Bengio

Biometric authentication performance is often depicted by a detection error trade-off (DET) curve. We show that this curve is dependent on the choice of samples available, the demographic composition, and the number of users specific to a database. We propose a two-step bootstrap procedure to take into account the three mentioned sources of variability. This is an extension to Bolle et al.'s bootstrap subset technique. Preliminary experiments on the NIST2005 and XM2VTS benchmark databases are encouraging, e.g., the average result across all 24 systems evaluated on NIST2005 indicates that one can predict, with more than 75 percent of DET coverage, an unseen DET curve with eight times more users. Furthermore, our finding suggests that with more data available, the confidence intervals become smaller and, hence, more useful.
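The core idea of a two-step (user, then sample) bootstrap can be illustrated with a simplified sketch: resample users with replacement, then resample each selected user's trials, and read a confidence interval off the resulting error-rate estimates. This is our illustration of the concept under invented data, not the authors' exact procedure.

```python
# Simplified two-step bootstrap for an error rate; data are invented.
import random

# Per-user lists of trial outcomes (1 = error, 0 = correct), hypothetical.
scores_by_user = {
    "u1": [0, 0, 1, 0], "u2": [1, 0, 0, 0, 0], "u3": [0, 1, 1, 0],
    "u4": [0, 0, 0], "u5": [1, 0, 0, 0],
}

def two_step_bootstrap_error(scores_by_user, n_boot=2000, seed=0):
    rng = random.Random(seed)
    users = list(scores_by_user)
    estimates = []
    for _ in range(n_boot):
        # Step 1: resample users with replacement.
        chosen_users = [rng.choice(users) for _ in users]
        outcomes = []
        for u in chosen_users:
            trials = scores_by_user[u]
            # Step 2: resample that user's trials with replacement.
            outcomes += [rng.choice(trials) for _ in trials]
        estimates.append(sum(outcomes) / len(outcomes))
    estimates.sort()
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[int(0.975 * n_boot)]
    return lo, hi

print(two_step_bootstrap_error(scores_by_user))  # approx. 95% interval for the error rate
```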


Proceedings of SPIE | 2009

NIST Speaker Recognition Evaluations 1996-2008

Craig S. Greenberg; Alvin F. Martin

From 1996 through 2008, the NIST Speaker Recognition Evaluations have focused on the task of automatic speaker detection based on recorded segments of spontaneous conversational speech. Earlier evaluations were limited to English language telephone speech. More recent evaluations (2004-2008) have included some conversational telephone speech in multiple languages, with the 2008 evaluation including 24 different languages. These recent evaluations have also explored cross channel effects by including phone conversations recorded over multiple microphone channels, and the 2008 evaluation also examined interview type speech recorded over multiple microphone channels. The considerable progress observed over the period of these evaluations has made the technology potentially useful for detecting individuals of interest in certain applications. Performance capability is measurably affected by a number of situational factors, including the number and duration of the training speech segments available, the durations of the test speech segments available, the language(s) spoken in these segments, and the types and variability of the recording channels involved.


2006 IEEE Odyssey - The Speaker and Language Recognition Workshop | 2006

The Current State of Language Recognition: NIST 2005 Evaluation Results

Alvin F. Martin; Audrey N. Le

In 2005, the National Institute of Standards and Technology (NIST) coordinated an evaluation of the language recognition capabilities of research systems developed by twelve participating sites. This evaluation followed fairly similar evaluations in 1996 and 2003. We describe here the protocols of the 2005 evaluation, including the data used, the evaluation rules, and the scoring metric. We present the overall performance results and compare these to results for the previous evaluations. We also discuss how the results varied across languages and the results of limited dialect recognition tests involving English and Mandarin speech data.


Journal of Research of the National Institute of Standards and Technology | 2011

Measures, Uncertainties, and Significance Test in Operational ROC Analysis

Jin Chu Wu; Alvin F. Martin; Raghu N. Kacker

In receiver operating characteristic (ROC) analysis, sampling variability can result in uncertainties of performance measures. Thus, while evaluating and comparing the performances of algorithms, the measurement uncertainties must be taken into account. The key issue is how to calculate the uncertainties of performance measures in ROC analysis. Our ultimate goal is to perform the significance test in evaluation and comparison using the computed standard errors. From the operational perspective, based on fingerprint-image matching algorithms on large datasets, the measures and their uncertainties are investigated in three scenarios: 1) the true accept rate (TAR) of genuine scores at a specified false accept rate (FAR) of impostor scores, 2) the TAR and FAR at a given threshold, and 3) the equal error rate. The uncertainties of measures are calculated using the nonparametric two-sample bootstrap, based on our extensive studies of bootstrap variability on large datasets. The significance test is carried out to determine whether the difference between the performance of one algorithm and a hypothesized value, or the difference between the performances of two algorithms (where the correlation is taken into account), is statistically significant. Examples are provided.
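As a minimal illustration of two of the quantities above, the equal error rate and a two-sample bootstrap standard error for it, the following sketch uses synthetic scores rather than the operational fingerprint data studied in the paper:

```python
# Illustration with synthetic scores: threshold-sweep EER estimate plus a
# nonparametric two-sample bootstrap standard error.
import numpy as np

def equal_error_rate(genuine, impostor):
    """Threshold-sweep estimate of the EER, where FAR and FRR cross."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0

rng = np.random.default_rng(1)
genuine = rng.normal(2.0, 1.0, 400)    # hypothetical genuine scores
impostor = rng.normal(0.0, 1.0, 4000)  # hypothetical impostor scores

# Two-sample bootstrap: resample genuine and impostor scores independently.
boot = [equal_error_rate(rng.choice(genuine, genuine.size, replace=True),
                         rng.choice(impostor, impostor.size, replace=True))
        for _ in range(200)]
print(f"EER={equal_error_rate(genuine, impostor):.3f}  SE={np.std(boot):.4f}")
```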

Collaboration


Alvin F. Martin's most frequent co-authors and their institutions.

Mark A. Przybocki (National Institute of Standards and Technology)
Craig S. Greenberg (National Institute of Standards and Technology)
Jin Chu Wu (National Institute of Standards and Technology)
Raghu N. Kacker (National Institute of Standards and Technology)
Audrey N. Le (National Institute of Standards and Technology)
Douglas A. Reynolds (Massachusetts Institute of Technology)
Joseph P. Campbell (Massachusetts Institute of Technology)
David S. Pallett (National Institute of Standards and Technology)
John S. Garofolo (National Institute of Standards and Technology)