David Snyder | Researchain

Publication

Featured research published by David Snyder.

spoken language technology workshop | 2016

Deep neural network-based speaker embeddings for end-to-end speaker verification

David Snyder; Pegah Ghahremani; Daniel Povey; Daniel Garcia-Romero; Yishay Carmiel; Sanjeev Khudanpur

In this study, we investigate an end-to-end text-independent speaker verification system. The architecture consists of a deep neural network that takes a variable-length speech segment and maps it to a speaker embedding. The objective function separates same-speaker and different-speaker pairs, and is reused during verification. Similar systems have recently shown promise for text-dependent verification, but we believe that this is unexplored for the text-independent task. We show that, given a large number of training speakers, the proposed system outperforms an i-vector baseline in equal error rate (EER) and at low miss rates. Relative to the baseline, the end-to-end system reduces EER by 13% on average and 29% pooled across test conditions. The fused system achieves a reduction of 32% on average and 38% pooled.
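The abstract reports results as EER and as relative EER reductions. As a rough illustration of how those two quantities are usually computed from trial scores, here is a minimal sketch. The function names, the threshold sweep, and the toy scores are assumptions made for this example and are not taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of same-speaker trials.
    nontarget_scores: scores of different-speaker trials.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a same-speaker trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a different-speaker trial scored at or above it.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (miss + fa) / 2
    return eer


def relative_reduction(baseline, system):
    """Relative improvement of `system` over `baseline` (0.13 means 13%)."""
    return (baseline - system) / baseline


# Toy scores, invented for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores the miss and false-alarm rates meet at a threshold of 0.6, where both equal 0.25. A "pooled" figure would be computed by merging the trials of all test conditions before calling `equal_error_rate`, whereas an "average" figure would average per-condition values.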

ieee automatic speech recognition and understanding workshop | 2015

Time delay deep neural network-based universal background models for speaker recognition

David Snyder; Daniel Garcia-Romero; Daniel Povey

Recently, deep neural networks (DNNs) have been incorporated into i-vector-based speaker recognition systems, where they have significantly improved state-of-the-art performance. In these systems, a DNN is used to collect sufficient statistics for i-vector extraction. In this study, the DNN is a recently developed time delay deep neural network (TDNN) that has achieved promising results in LVCSR tasks. We believe that the TDNN-based system achieves the best reported results on SRE10, and it obtains a 50% relative improvement over our GMM baseline in terms of equal error rate (EER). For some applications, the computational cost of a DNN is high. Therefore, we also investigate a lightweight alternative in which a supervised GMM is derived from the TDNN posteriors. This method maintains the speed of the traditional unsupervised GMM, but achieves a 20% relative improvement in EER.
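To make "collecting sufficient statistics" from frame posteriors more concrete, the sketch below accumulates zeroth- and first-order statistics and derives per-class means from them, which is one ingredient of a GMM estimated from DNN posteriors. It is an illustrative assumption about the general technique, not the authors' recipe: the function names and toy data are hypothetical, and covariances, weights, and the TDNN itself are omitted.

```python
def accumulate_stats(posteriors, features):
    """Accumulate zeroth- and first-order statistics.

    posteriors: one list of class posteriors per frame (each sums to 1).
    features: one feature vector (list of floats) per frame.
    Returns (zeroth, first), where zeroth[c] is the soft frame count of
    class c and first[c] is the posterior-weighted sum of its features.
    """
    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes
    first = [[0.0] * dim for _ in range(num_classes)]
    for gamma, x in zip(posteriors, features):
        for c, g in enumerate(gamma):
            zeroth[c] += g
            for d, value in enumerate(x):
                first[c][d] += g * value
    return zeroth, first


def component_means(zeroth, first, floor=1e-10):
    """Mean of each class: first-order stats divided by the soft count."""
    return [[f / max(n, floor) for f in row] for n, row in zip(zeroth, first)]


# Toy example: three frames, two classes, one-dimensional features.
posteriors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
features = [[2.0], [4.0], [6.0]]
zeroth, first = accumulate_stats(posteriors, features)
means = component_means(zeroth, first)
```

In a DNN-based system the posteriors would come from the network; in a conventional system they would come from an unsupervised GMM. The accumulation step is the same in both cases.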

international conference on acoustics, speech, and signal processing | 2017

Speaker diarization using deep neural network embeddings

Daniel Garcia-Romero; David Snyder; Gregory Sell; Daniel Povey; Alan McCree

Speaker diarization is an important front-end for many speech technologies in the presence of multiple speakers, but current methods that employ i-vector clustering for short segments of speech are potentially too cumbersome and costly for the front-end role. In this work, we propose an alternative approach for learning representations via deep neural networks to remove the i-vector extraction process from the pipeline entirely. The proposed architecture simultaneously learns a fixed-dimensional embedding for acoustic segments of variable length and a scoring function for measuring the likelihood that the segments originated from the same or different speakers. Through tests on the CALLHOME conversational telephone speech corpus, we demonstrate that, in addition to streamlining the diarization architecture, the proposed system matches or exceeds the performance of state-of-the-art baselines. We also show that, though this approach does not respond as well to unsupervised calibration strategies as previous systems, the incorporation of well-founded speaker priors sufficiently mitigates this shortcoming.
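The pipeline described above (embed each segment, score pairs of segments, then group segments by speaker) can be sketched as follows. This is only a toy illustration under stated assumptions: cosine similarity stands in for the learned scoring function, average-linkage agglomerative clustering with a fixed stopping threshold is an assumed clustering step, and the two-dimensional embeddings are invented.

```python
import math


def cosine(a, b):
    """Cosine similarity, used here as a stand-in for a learned scorer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_segments(embeddings, threshold):
    """Average-linkage agglomerative clustering of segment embeddings.

    Repeatedly merges the two most similar clusters and stops once the
    best available score falls below `threshold`.
    Returns one integer speaker label per segment.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def average_score(c1, c2):
        scores = [cosine(embeddings[i], embeddings[j]) for i in c1 for j in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = average_score(clusters[a], clusters[b])
                if best is None or score > best[0]:
                    best = (score, a, b)
        if best[0] < threshold:
            break  # No remaining pair is similar enough to merge.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy embeddings for four segments, invented for illustration.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = cluster_segments(segments, threshold=0.5)
```

The stopping threshold decides how many speakers are found, which is why calibration of the scores matters in a system of this kind.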

conference of the international speech communication association | 2017

Compressed Time Delay Neural Network for Small-Footprint Keyword Spotting

Ming Sun; David Snyder; Yixin Gao; Varun Nagaraja; Mike Rodehorst; Sankaran Panchapagesan; Nikko Strom; Spyros Matsoukas; Shiv Vitaladevuni

arXiv: Sound | 2015

MUSAN: A Music, Speech, and Noise Corpus

David Snyder; Guoguo Chen; Daniel Povey

conference of the international speech communication association | 2017

Deep Neural Network Embeddings for Text-Independent Speaker Verification

David Snyder; Daniel Garcia-Romero; Daniel Povey; Sanjeev Khudanpur

international conference on acoustics, speech, and signal processing | 2018

X-Vectors: Robust DNN Embeddings for Speaker Recognition

David Snyder; Daniel Garcia-Romero; Gregory Sell; Daniel Povey; Sanjeev Khudanpur

international conference on acoustics, speech, and signal processing | 2018

Characterizing Performance of Speaker Diarization Systems on Far-Field Speech Using Standard Methods

Matthew Maciejewski; David Snyder; Vimal Manohar; Najim Dehak; Sanjeev Khudanpur

international conference on acoustics, speech, and signal processing | 2018

Audio-Visual Person Recognition in Multimedia Data from the IARPA Janus Program

Gregory Sell; Kevin Duh; David Snyder; Dave Etter; Daniel Garcia-Romero

conference of the international speech communication association | 2018

Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge

Gregory Sell; David Snyder; Alan McCree; Daniel Garcia-Romero; Jesús Villalba; Matthew Maciejewski; Vimal Manohar; Najim Dehak; Daniel Povey; Shinji Watanabe; Sanjeev Khudanpur
