Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Craig S. Greenberg is active.

Publication


Featured research published by Craig S. Greenberg.


Proceedings of SPIE | 2009

NIST Speaker Recognition Evaluations 1996-2008

Craig S. Greenberg; Alvin F. Martin

From 1996 through 2008, the NIST Speaker Recognition Evaluations have focused on the task of automatic speaker detection based on recorded segments of spontaneous conversational speech. Earlier evaluations were limited to English language telephone speech. More recent evaluations (2004-2008) have included some conversational telephone speech in multiple languages, with the 2008 evaluation including 24 different languages. These recent evaluations have also explored cross channel effects by including phone conversations recorded over multiple microphone channels, and the 2008 evaluation also examined interview type speech recorded over multiple microphone channels. The considerable progress observed over the period of these evaluations has made the technology potentially useful for detecting individuals of interest in certain applications. Performance capability is measurably affected by a number of situational factors, including the number and duration of the training speech segments available, the durations of the test speech segments available, the language(s) spoken in these segments, and the types and variability of the recording channels involved.


International Conference on Acoustics, Speech, and Signal Processing | 2011

Including human expertise in speaker recognition systems: report on a pilot evaluation

Craig S. Greenberg; Alvin F. Martin; George R. Doddington; John J. Godfrey

The 2010 NIST Speaker Recognition Evaluation (SRE10) included a test of Human Assisted Speaker Recognition (HASR) in which systems based in whole or in part on human expertise were evaluated on limited sets of trials. Participation in HASR was optional, and sites could participate in it without participating in the main evaluation of fully automatic systems. Two HASR trial sets were offered, with HASR1 including 15 trials, and HASR2 a superset of 150 trials. Results were submitted for 20 systems from 15 sites from 6 countries. The trial sets were carefully selected, by a process that combined automatic processing and human listening, to include particularly challenging trials. The performance results suggest that the chosen trials were indeed difficult, and the HASR systems did not appear to perform as well as the best fully automatic systems on these trials.


IEEE International Conference on Data Science and Advanced Analytics | 2015

The NIST data science initiative

Bonnie J. Dorr; Craig S. Greenberg; Peter C. Fontana; Mark A. Przybocki; Marion Le Bras; Cathryn A. Ploehn; Oleg Aulov; Martial Michel; E. Jim Golden; Wo Chang

We examine foundational issues in data science, including current challenges, basic research questions, and expected advances, as the basis for a new Data Science Initiative and evaluation series introduced by the National Institute of Standards and Technology (NIST) in the fall of 2015. The evaluations will facilitate research efforts and collaboration, leverage shared infrastructure, and effectively address cross-cutting challenges faced by diverse data science communities. The evaluations will have multiple research tracks championed by members of the data science community and will enable rigorous comparison of approaches through common tasks, datasets, metrics, and shared research challenges. The tracks will measure several different data science technologies in a wide range of fields, starting with a pre-pilot. In addition to developing data science evaluation methods and metrics, the initiative will address computing infrastructure, standards for an interoperability framework, and domain-specific examples.


Proceedings of SPIE | 2013

Significance test with data dependency in speaker recognition evaluation

Jin Chu Wu; Alvin F. Martin; Craig S. Greenberg; Raghu N. Kacker; Vincent M. Stanford

To evaluate the performance of speaker recognition systems, a detection cost function defined as a weighted sum of the probabilities of type I and type II errors is employed. The speaker datasets may have data dependency due to multiple uses of the same subjects. Using the standard errors of the detection cost function computed by means of the two-layer nonparametric two-sample bootstrap method, a significance test is performed to determine whether the difference between the measured performance levels of two speaker recognition algorithms is statistically significant. While conducting the significance test, the correlation coefficient between two systems’ detection cost functions is taken into account. Examples are provided.
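
The abstract does not reproduce its formulas, so the following is a minimal Python sketch of the kind of computation described, assuming the standard DCF form (a weighted sum of the miss and false-alarm probabilities) and a normal-approximation test on the difference of two correlated DCF estimates. The cost weights, target prior, correlation value, and function names are illustrative assumptions, not values taken from the paper.

    import math

    def dcf(p_miss, p_fa, p_target=0.01, c_miss=10.0, c_fa=1.0):
        # Detection cost function: weighted sum of the miss (type I) and
        # false-alarm (type II) error probabilities; weights and prior are
        # illustrative, not the paper's operating point.
        return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)

    def significance_z(dcf_a, dcf_b, se_a, se_b, rho):
        # Normal-approximation statistic for the difference of two DCFs,
        # where rho is the correlation coefficient between the two systems'
        # DCF estimates (e.g., across bootstrap replicates).
        se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * rho * se_a * se_b)
        return (dcf_a - dcf_b) / se_diff

    # Declare the difference significant at the 95% level if |z| > 1.96.
    z = significance_z(dcf_a=0.42, dcf_b=0.37, se_a=0.015, se_b=0.014, rho=0.6)
    print(abs(z) > 1.96)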


Proceedings of SPIE | 2012

Data Dependency on Measurement Uncertainties in Speaker Recognition Evaluation

Jin Chu Wu; Alvin F. Martin; Craig S. Greenberg; Raghu N. Kacker

The National Institute of Standards and Technology conducts an ongoing series of Speaker Recognition Evaluations (SRE). Speaker detection performance is measured using a detection cost function defined as a weighted sum of the probabilities of type I and type II errors. Sampling variability can result in measurement uncertainties. In our prior study, data independence was assumed when using the nonparametric two-sample bootstrap method to compute the standard errors (SE) of the detection cost function, based on our extensive bootstrap variability studies in ROC analysis on large datasets. In this article, the data dependency caused by multiple uses of the same subjects is taken into account. The data are grouped into target sets and non-target sets, and each set contains multiple scores. One-layer and two-layer bootstrap methods are proposed based on whether the two-sample bootstrap resampling takes place only on target sets and non-target sets, or subsequently also on target scores and non-target scores within the sets. The SEs of the detection cost function computed using these two methods, along with those computed under the assumption of data independence, are compared. It is found that data dependency increases both the estimated SEs and the variation of the SEs. Some suggestions regarding test design are provided.
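
As a rough illustration of the one-layer versus two-layer distinction drawn here, the sketch below implements a generic two-sample bootstrap estimate of the standard error of a DCF. The grouping of scores into per-subject sets, the helper dcf_fn, and the number of replicates are assumptions made for the example and are not taken from the paper.

    import random

    def bootstrap_se(target_sets, nontarget_sets, dcf_fn, n_boot=1000, two_layer=True):
        # target_sets / nontarget_sets: lists of score lists, one list per
        # subject, i.e. the grouping that models the data dependence.
        # One-layer: resample whole sets only. Two-layer: resample sets,
        # then resample scores within each chosen set.
        values = []
        for _ in range(n_boot):
            tgt, non = [], []
            for pool, out in ((target_sets, tgt), (nontarget_sets, non)):
                chosen = [random.choice(pool) for _ in pool]        # layer 1: sets
                for s in chosen:
                    if two_layer:
                        out.extend(random.choice(s) for _ in s)     # layer 2: scores
                    else:
                        out.extend(s)
            values.append(dcf_fn(tgt, non))
        mean = sum(values) / n_boot
        return (sum((v - mean) ** 2 for v in values) / (n_boot - 1)) ** 0.5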


IEEE Transactions on Audio, Speech, and Language Processing | 2017

The Impact of Data Dependence on Speaker Recognition Evaluation

Jin Chu Wu; Alvin F. Martin; Craig S. Greenberg; Raghu N. Kacker

The data dependence due to multiple uses of the same subjects has an impact on the standard error (SE) of the detection cost function (DCF) in speaker recognition evaluation. The DCF is defined as a weighted sum of the probabilities of type I and type II errors at a given threshold. A two-layer data structure is constructed: target scores are grouped into target sets based on the dependence, and likewise for non-target scores. Because scores must have equal probabilities of being selected when resampling, target sets must contain the same number of target scores, and so must non-target sets. In addition to the bootstrap method with an i.i.d. assumption, the nonparametric two-sample one-layer and two-layer bootstrap methods are carried out based on whether the resampling takes place only on sets, or subsequently also on scores within the sets. Due to the stochastic nature of the bootstrap, the distributions of the SEs of the DCF estimated using the three different bootstrap methods are created and compared. After performing hypothesis testing, it is found that data dependence increases not only the SE but also the variation of the SE, and that the two-layer bootstrap is more conservative than the one-layer bootstrap. The rationale behind the different impacts of the three bootstrap methods on the estimated SEs is investigated.


Odyssey 2016: The Speaker and Language Recognition Workshop | 2016

Summary of the 2015 NIST Language Recognition i-Vector Machine Learning Challenge

Audrey Tong; Craig S. Greenberg; Alvin F. Martin; Désiré Bansé; John M. Howard; Hui Zhao; George R. Doddington; Daniel Garcia-Romero; Alan McCree; Douglas A. Reynolds; Elliot Singer; Jaime Hernandez-Cordero; Lisa P. Mason

In 2015 NIST coordinated the first language recognition evaluation (LRE) that used i-vectors as input, with the goals of attracting researchers outside of the speech processing community to tackle the language recognition problem, exploring new ideas in machine learning for use in language recognition, and improving recognition accuracy. The Language Recognition i-Vector Machine Learning Challenge, taking place over a period of four months, was well-received with 56 participants from 44 unique sites and over 3700 submissions, surpassing the participation levels of all previous traditional track LREs. The results of 46 of the 56 participants were better than the provided baseline system, with the best system achieving approximately 55% relative improvement over the baseline.


Journal of Data Science | 2016

A new data science research program: evaluation, metrology, standards, and community outreach

Bonnie J. Dorr; Craig S. Greenberg; Peter C. Fontana; Mark A. Przybocki; Marion Le Bras; Cathryn A. Ploehn; Oleg Aulov; Martial Michel; E. Jim Golden; Wo Chang

This article examines foundational issues in data science including current challenges, basic research questions, and expected advances, as the basis for a new data science research program (DSRP) and associated data science evaluation (DSE) series, introduced by the National Institute of Standards and Technology (NIST) in the fall of 2015. The DSRP is designed to facilitate and accelerate research progress in the field of data science and consists of four components: evaluation and metrology, standards, compute infrastructure, and community outreach. A key part of the evaluation and measurement component is the DSE. The DSE series aims to address logistical and evaluation design challenges while providing rigorous measurement methods and an emphasis on generalizability rather than domain- and application-specific approaches. Toward that end, each year the DSE will consist of multiple research tracks and will encourage the application of tasks that span these tracks. The evaluations are intended to facilitate research efforts and collaboration, leverage shared infrastructure, and effectively address crosscutting challenges faced by diverse data science communities. Multiple research tracks will be championed by members of the data science community with the goal of enabling rigorous comparison of approaches through common tasks, datasets, metrics, and shared research challenges. The tracks will permit us to measure several different data science technologies in a wide range of fields and will address computing infrastructure, standards for an interoperability framework, and domain-specific examples. This article also summarizes lessons learned from the data science evaluation series pre-pilot that was held in fall of 2015.


Proceedings of SPIE | 2011

Uncertainties of Measures in Speaker Recognition Evaluation

Jin Chu Wu; Alvin F. Martin; Craig S. Greenberg; Raghu N. Kacker

The Speaker Recognition Evaluations (SRE) are an ongoing series of evaluations conducted by the National Institute of Standards and Technology (NIST). In the NIST SRE, speaker detection performance is measured using a detection cost function, defined as a weighted sum of the probabilities of type I and type II errors. Sampling variability can result in measurement uncertainties of the detection cost function. Hence, when evaluating and comparing the performance of speaker recognition systems, the uncertainties of the measures must be taken into account. In this article, the uncertainties of detection cost functions, in terms of standard errors (SE) and confidence intervals, are computed using nonparametric two-sample bootstrap methods, based on our previous extensive bootstrap variability studies on large datasets. Data independence is assumed because the bootstrap estimates of the SEs matched very well with the analytical SEs obtained using the Mann-Whitney statistic for independent and identically distributed samples when the metric of area under a receiver operating characteristic curve is employed. Examples are provided.
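
For the independence case described here, a minimal sketch of the nonparametric two-sample bootstrap is given below, returning both a standard error and a percentile confidence interval. The helper dcf_fn, the number of replicates, and the confidence level are illustrative assumptions rather than the paper's settings.

    import random

    def bootstrap_se_ci(target_scores, nontarget_scores, dcf_fn, n_boot=2000, alpha=0.05):
        # Under an i.i.d. assumption, resample target and non-target scores
        # independently, recompute the DCF on each replicate, and report the
        # standard error plus a percentile confidence interval.
        reps = []
        for _ in range(n_boot):
            t = [random.choice(target_scores) for _ in target_scores]
            n = [random.choice(nontarget_scores) for _ in nontarget_scores]
            reps.append(dcf_fn(t, n))
        reps.sort()
        mean = sum(reps) / n_boot
        se = (sum((r - mean) ** 2 for r in reps) / (n_boot - 1)) ** 0.5
        lo = reps[int(n_boot * alpha / 2)]
        hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
        return se, (lo, hi)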


International Conference on Big Data | 2016

Evaluation-driven research in data science: Leveraging cross-field methodologies

Bonnie J. Dorr; Peter C. Fontana; Craig S. Greenberg; Marion Le Bras; Mark A. Przybocki

While prior evaluation methodologies for data-science research have focused on efficient and effective teamwork on independent data science problems within given fields [1], this paper argues that an enriched notion of evaluation-driven research (EDR) supports methodologies and effective solutions to data-science problems across multiple fields. We adopt the view that progress in data-science research is enriched through the examination of a range of problems in many different areas (traffic, healthcare, finance, sports, etc.) and through the development of methodologies and evaluation paradigms that span diverse disciplines, domains, problems, and tasks. A number of questions arise when one considers the multiplicity of data science fields and the potential for cross-disciplinary “sharing” of methodologies, for example: the feasibility of generalizing problems, tasks, and metrics across domains; ground-truth considerations for different types of problems; issues related to data uncertainty in different fields; and the feasibility of enabling cross-field cooperation to encourage diversity of solutions. We posit that addressing the problems inherent in such questions provides a foundation for EDR across diverse fields. We ground our conclusions and insights in a brief preliminary study developed within the Information Access Division of the National Institute of Standards and Technology as a part of a new Data Science Research Program (DSRP). The DSRP focuses on this cross-disciplinary notion of EDR and includes a new Data Science Evaluation series to facilitate research collaboration, to leverage shared technology and infrastructure, and to further build and strengthen the data-science community.

Collaboration


Dive into Craig S. Greenberg's collaborations.

Top Co-Authors

Alvin F. Martin, National Institute of Standards and Technology
Mark A. Przybocki, National Institute of Standards and Technology
Marion Le Bras, National Institute of Standards and Technology
Peter C. Fontana, National Institute of Standards and Technology
Jin Chu Wu, National Institute of Standards and Technology
Raghu N. Kacker, National Institute of Standards and Technology
Cathryn A. Ploehn, National Institute of Standards and Technology
Oleg Aulov, University of Maryland
Douglas A. Reynolds, Massachusetts Institute of Technology