Gregory Sell | Researchain

Publication

Featured research published by Gregory Sell.

International Conference on Acoustics, Speech, and Signal Processing | 2011

Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop

Geoffrey Zweig; Patrick Nguyen; D. Van Compernolle; Kris Demuynck; L. Atlas; Pascal Clark; Gregory Sell; M. Wang; Fei Sha; Hynek Hermansky; Damianos Karakos; Aren Jansen; Samuel Thomas; S. Bowman; Justine T. Kao

This paper summarizes the 2010 CLSP Summer Workshop on speech recognition at Johns Hopkins University. The key theme of the workshop was to improve on state-of-the-art speech recognition systems by using Segmental Conditional Random Fields (SCRFs) to integrate multiple types of information. This approach uses a state-of-the-art baseline as a springboard from which to add a suite of novel features including ones derived from acoustic templates, deep neural net phoneme detections, duration models, modulation features, and whole word point-process models. The SCRF framework is able to appropriately weight these different information sources to produce significant gains on both the Broadcast News and Wall Street Journal tasks.

IEEE Transactions on Audio, Speech, and Language Processing | 2010

Solving Demodulation as an Optimization Problem

Gregory Sell; Malcolm Slaney

We introduce two new methods for the demodulation of acoustic signals by posing the problem in a convex optimization framework. This allows the parameters of the modulator and carrier to be explicitly defined as constraints in an optimization problem. We first show the theory used to define the demodulation relationship within the rules of convex programming. Then, for the two approaches introduced, we derive specific cost functions and constraints to solve for modulators specifically motivated by perceptual rules. The methods described here perform well with simple, harmonic, and stochastic carriers, and also in the presence of noise.

Conference of the International Speech Communication Association | 2016

Priors for Speaker Counting and Diarization with AHC

Gregory Sell; Alan McCree; Daniel Garcia-Romero

Estimating the number of speakers in an audio segment is a necessary step in the process of speaker diarization, but current diarization algorithms do not explicitly define a prior probability on this estimation. This work proposes a process for including priors in speaker diarization with agglomerative hierarchical clustering (AHC). It is also shown that the exclusion of a prior with AHC is itself implicitly a prior, which is found to be geometric growth in the number of speakers. By using more sensible priors, we are able to demonstrate significantly improved robustness to calibration error for speaker counting and speaker diarization.
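As a rough illustration of the setting, the sketch below runs plain average-linkage AHC with a stopping threshold and reports how many clusters remain. It does not implement the priors proposed in the paper. The function name `ahc_clusters`, the one-dimensional toy "embeddings", and the absolute-difference distance are all assumptions made for the example.

```python
def ahc_clusters(points, stop_threshold):
    """Average-linkage AHC on 1-D points (hypothetical toy embeddings).

    Merging stops once the closest pair of clusters is farther apart than
    stop_threshold; the number of clusters left is the speaker-count estimate.
    """
    clusters = [[p] for p in points]

    def avg_distance(a, b):
        # Mean absolute difference over all cross-cluster pairs.
        return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = avg_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_threshold:
            break  # no pair is close enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9, 10.0]
count_tight = len(ahc_clusters(points, 1.0))  # 3 clusters
count_loose = len(ahc_clusters(points, 6.0))  # 2 clusters
```

In this toy case the estimated count changes with the stopping threshold alone, which gives a sense of why a poorly calibrated threshold can distort speaker counting.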

International Conference on Acoustics, Speech, and Signal Processing | 2010

The information content of demodulated speech

Gregory Sell; Malcolm Slaney

In this paper we describe the effect of demodulation on speech signals. We compare two different algorithms for demodulating audio: the classic approach based on the Hilbert transform and a new approach based on solving a convex optimization problem. We show that convex demodulation better separates the speech information between the modulator and the carrier. We demonstrate this advantage by measuring the speech-information content using a speech-recognition experiment. Finally, we explore the effect of subband filtering on the demodulation process and the shift of information from the modulator to the carrier as the subbands become wider.
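The classic Hilbert-transform approach mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it builds the analytic signal with a naive DFT, takes its magnitude as the modulator (envelope), and takes the cosine of its phase as the carrier. The function names and the amplitude-modulated test tone are assumptions chosen for the example, and the convex approach is not shown.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def hilbert_demodulate(signal):
    """Split a real signal into a Hilbert envelope and a unit-magnitude carrier."""
    n = len(signal)
    spectrum = dft(signal)
    # Analytic signal: keep DC (and Nyquist when n is even),
    # double the positive frequencies, zero the negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spectrum, weights)], inverse=True)
    envelope = [abs(z) for z in analytic]
    # Carrier is cos(phase); guard against division by a vanishing envelope.
    carrier = [z.real / e if e > 1e-12 else 0.0 for z, e in zip(analytic, envelope)]
    return envelope, carrier


# Hypothetical test tone: a slow modulator multiplying a faster cosine carrier.
n = 64
true_env = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
signal = [m * math.cos(2 * math.pi * 16 * t / n) for t, m in enumerate(true_env)]
envelope, carrier = hilbert_demodulate(signal)
max_error = max(abs(e - m) for e, m in zip(envelope, true_env))
```

For a narrowband signal like this one, the recovered envelope matches the modulator closely, and the product of envelope and carrier reproduces the input. Wider subbands or noisier carriers are where the abstract reports the Hilbert approach separating information less cleanly.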

International Conference on Acoustics, Speech, and Signal Processing | 2011

A novel approach using modulation features for multiphone-based speech recognition

Pascal Clark; Gregory Sell; Les E. Atlas

Recent advances in coherent and convex demodulation have proven useful for analyzing and modifying the low-frequency envelope structure of speech. This paper reports the application of both methods, referred to here as bandwidth-constrained demodulation, to large-scale speech recognition in the form of new feature representations. Modulation-based features yielded measurable improvement when included as complementary sources of information with a baseline recognizer. Furthermore, both sets of demodulation features showed promise for outperforming the conventional Hilbert envelope method which underlies most modern speech recognition features. These experimental results show the potential for further development in feature representations based on recently-developed bandwidth-constrained modulation signal models.

International Conference on Acoustics, Speech, and Signal Processing | 2013

Optimizing coherent demodulation for improved separation of overlapping sources

Gregory Sell

The complex modulators of coherent demodulation make the algorithm a natural fit for source separation, but the overlapping bands typically found in real audio mixtures present interference problems for the algorithm. This paper proposes reframing coherent demodulation as an optimization problem that distributes the energy in overlapping bands according to an optimally low-frequency strategy. The extension is shown to improve separation for sinewave mixtures and for mixtures of speech and music.

Conference of the International Speech Communication Association | 2015