Publication


Featured research published by Douglas J. Nelson.


IEEE Transactions on Speech and Audio Processing | 1999

Scale transform in speech analysis

Srinivasan Umesh; Leon Cohen; Nenad M. Marinovic; Douglas J. Nelson

In this paper, we study the scale transform of the spectral envelope of speech utterances by different speakers. This study is motivated by the hypothesis that, for a given vowel, the formant frequencies of different speakers are approximately related by a scaling constant. The scale transform has the fundamental property that the magnitude of the scale transform of a function $X(f)$ and that of its scaled version $\sqrt{\alpha}\,X(\alpha f)$ are the same. The methods presented here are useful in reducing variations in acoustic features. We show that F-ratio tests indicate better separability of vowels with scale-transform-based features than with mel-transform-based features. The data used to compare the features consist of 200 utterances of four vowels extracted from the TIMIT database.
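The magnitude invariance stated in the abstract follows from a change of variables. The derivation below is a short sketch that assumes the standard definition of the scale transform; the symbols $D_X(c)$, $D_g(c)$, and the scale variable $c$ are notation introduced here for illustration and do not appear in the abstract.

```latex
\begin{align*}
D_X(c) &= \frac{1}{\sqrt{2\pi}} \int_0^\infty X(f)\, f^{-jc - 1/2}\, df
  && \text{(scale transform of } X\text{)} \\
g(f) &= \sqrt{\alpha}\, X(\alpha f), \quad \alpha > 0
  && \text{(scaled version)} \\
D_g(c) &= \frac{1}{\sqrt{2\pi}} \int_0^\infty \sqrt{\alpha}\, X(\alpha f)\, f^{-jc - 1/2}\, df \\
       &= \frac{\sqrt{\alpha}}{\sqrt{2\pi}} \int_0^\infty X(u)
          \left(\frac{u}{\alpha}\right)^{-jc - 1/2} \frac{du}{\alpha}
  && (u = \alpha f) \\
       &= \alpha^{jc}\, D_X(c) = e^{jc \ln \alpha}\, D_X(c) \\
|D_g(c)| &= |D_X(c)|
\end{align*}
```

The scaling constant therefore appears only as a phase factor, which is why magnitude-based features would be insensitive to it under the scaling hypothesis.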


International Conference on Acoustics, Speech, and Signal Processing | 1993

Special purpose correlation functions for improved signal detection and parameter estimation

Douglas J. Nelson

A compilation of several correlation functions which were developed by the author and have been used for several years in signal analysis applications is given. The affine invariant pseudometric is a correlation function normalized to be independent of power, DC bias, and phase rotation. It was developed to track radar video sync pulses. It has recently been successfully used to track glottal pulses in voiced speech. The cross-power spectrum represents a significant improvement over standard power-spectral methods for recovering weak stationary tones in noise. The harmonic rejecting correlation function is a variant of the Wigner transform which resolves fundamentals from harmonic and subharmonic features produced by periodic waveforms. Each of these algorithms has been tested and used on a variety of data. Most importantly, for each of the methods described here, closed form solutions are derived, which enable easy implementations.


IEEE Signal Processing Letters | 2002

Frequency warping and the Mel scale

Srinivasan Umesh; Leon Cohen; Douglas J. Nelson

We present experimental results showing that the scale factor relating the formant frequencies of different speakers increases with decreasing values of formant frequency. Based on these results, we experimentally obtain a frequency warping function aimed at separating speaker dependencies from the inherent characterization of the sound. We find that the frequency warping function is similar to the Mel scale, and we believe that this is the first time that a Mel-like scale has been obtained using only speech. Our results and methods may therefore explain, from a speech point of view, the Mel scale, which was obtained historically from hearing-based experiments.


Digital Signal Processing | 2002

Instantaneous Higher Order Phase Derivatives

Douglas J. Nelson

Nelson, D. J., Instantaneous Higher Order Phase Derivatives, Digital Signal Processing 12 (2002) 416–428.

We present methods, based on the short time Fourier transform, which may be used to analyze the structure of multicomponent FM modulated signals instantaneously in time and frequency. The methods build on previously presented cross-spectral methods. In this paper, we introduce the concept of higher order short time Fourier transform phase derivatives, which may be used to estimate signal trajectories instantaneously in both time and frequency and to determine convergence of the remapped time–frequency surface. The methods are applied to synthesized data and speech signals.


International Conference on Acoustics, Speech, and Signal Processing | 1996

Computationally efficient estimation of sinusoidal frequency at low SNR

Srinivasan Umesh; Douglas J. Nelson

We propose a computationally efficient method for estimating the frequency of a single complex sinusoid at low SNR. This method is motivated by the cross-power spectrum method of Nelson (1993) and the weighted phase averager (WPA) methods of Tretter (1985), Kay (1988), and Lovell et al. (1991). We demonstrate that, with a simple preprocessing step, we can extend the threshold SNR of the WPA significantly. Further, unlike the WPA, the proposed method can be easily extended to estimate the frequencies of multiple sinusoids that are well separated in frequency. We also derive the variance of the proposed estimator and provide simulation results comparing the proposed method with the WPA.
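For orientation, the WPA baseline that the abstract compares against can be sketched in a few lines. This is a minimal illustration of the weighted phase averager with the parabolic weights commonly attributed to Kay (1988), not the method proposed in the paper: the abstract does not specify the preprocessing step, so none is modeled here. The function names `kay_weights` and `wpa_frequency` and the test signal are assumptions made for this example.

```python
import cmath

def kay_weights(n):
    """Parabolic weights for n samples (n - 1 phase differences); they sum to 1."""
    scale = 1.5 * n / (n * n - 1)
    return [scale * (1.0 - ((t - (n / 2 - 1)) / (n / 2)) ** 2)
            for t in range(n - 1)]

def wpa_frequency(x):
    """Estimate the angular frequency (rad/sample) of one complex sinusoid."""
    weights = kay_weights(len(x))
    # The phase of x[t+1] * conj(x[t]) is the per-sample phase advance,
    # wrapped to (-pi, pi] by cmath.phase.
    steps = [cmath.phase(x[t + 1] * x[t].conjugate())
             for t in range(len(x) - 1)]
    return sum(w * p for w, p in zip(weights, steps))

# Hypothetical noise-free test signal: frequency 0.7 rad/sample, phase 0.3 rad.
omega = 0.7
signal = [cmath.exp(1j * (omega * t + 0.3)) for t in range(32)]
estimate = wpa_frequency(signal)
```

Because each phase difference is wrapped individually, an estimator of this kind degrades quickly once noise causes phase wraps, which is consistent with the SNR threshold the abstract refers to.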


IEEE Transactions on Signal Processing | 2006

A linear model for TF distribution of signals

Douglas J. Nelson; David C. Smith

We describe a new linear time-frequency model in which the instantaneous value of each signal component is mapped to the curve functionally representing its instantaneous frequency. This transform is linear, uniquely defined by the signal decomposition, and satisfies linear marginal-like distribution properties. We further demonstrate that the transform-generated surface may be estimated from the short-time Fourier transform (STFT) by a concentration process based on the phase of the STFT, differentiated with respect to time. Interference may be identified on the concentrated STFT surface, and the signal with the interference removed may be estimated by applying the linear time marginal to the concentrated STFT surface from which the interference components have been removed.


International Conference on Spoken Language Processing | 1996

Frequency-warping in speech

Srinivasan Umesh; Leon Cohen; Nenad M. Marinovic; Douglas J. Nelson

We present results indicating that the formant frequencies of different speakers scale differently at different frequencies. Based on our experiments on speech data, we then numerically compute a universal frequency-warping function that makes the scale factor independent of frequency in the warped domain. The proposed warping function is found to be similar to the mel scale, which has previously been derived from purely psychoacoustic experiments. The motivation for the present experiments stems from our proposed use of scale-transform-based cepstral coefficients (Umesh et al., 1996) as acoustic features, since they provide better separability of vowels than mel-cepstral coefficients.


Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations | 1999

Cross-spectral methods with an application to speech processing

Douglas J. Nelson; Wayne G. Wysocki

We present a discussion of methods based on the complex cross-spectrum and the application of these methods to the analysis of speech. The cross-spectral methods developed here are an extension of methods developed in the 1980s by one of the authors for accurately estimating stationary and cyclo-stationary parameters of signals buried deep in the noise. Since speech is non-stationary and therefore supports very little integration, the methods have been re-developed to address issues such as non-stationarity, harmonic structures, and rapidly changing resonance. Cross-spectral methods are presented as complex-valued time-frequency surface methods which provide signal parameter estimation by taking advantage of signal structure. These methods have proven to be very powerful.


International Conference on Acoustics, Speech, and Signal Processing | 1997

Frequency-warping and speaker-normalization

Srinivasan Umesh; Leon Cohen; Douglas J. Nelson

We have proposed the use of scale-cepstral coefficients as features in speech recognition. We have developed a corresponding frequency-warping function such that, in the warped domain, the formant envelopes of different speakers are approximately translated versions of one another for any given vowel. These methods were motivated by a desire to achieve speaker normalization. In this paper, we point out very interesting parallels between the various steps in computing the scale-cepstrum and those observed in computing features based on physiological models of the auditory system or psychoacoustic experiments. It may therefore be useful to have a better understanding of the need for the various signal-processing steps, which may result in the development of more robust recognizers.


International Conference on Acoustics, Speech, and Signal Processing | 1995

The NP speech activity detection algorithm

Joseph Pencak; Douglas J. Nelson

This paper describes a new algorithm, the NP algorithm, for detecting speech signals of varying quality and gain levels. NP operates in the frequency domain and renders speech/no-speech decisions based on a signal-to-noise ratio (SNR) derived from a sorted power spectrum. In addition to the SNR estimates, a spectral whitening process and an estimate of the variance in the ratio of the signal power to total energy are also used to identify and reject signals that are stationary or nearly stationary. The key features of this algorithm are: the detection is based on a single FFT; decisions are independent of the signal gain; the process has 3 dB/octave processing gain from the transform; and frequency domain processing permits the exploitation of the signal structure.
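The idea of a gain-independent SNR taken from a sorted power spectrum can be illustrated with a small sketch. The abstract does not say how the sorted bins are split, so the `noise_fraction` and `signal_fraction` parameters, the function names, and the test frames below are all hypothetical choices made for this example. The whitening and variance tests that the abstract describes for rejecting stationary signals are not modeled, and a naive DFT stands in for the FFT to keep the sketch dependency-free.

```python
import cmath
import math
import random

def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (stand-in for a single FFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def sorted_spectrum_snr_db(frame, noise_fraction=0.5, signal_fraction=0.25):
    """SNR in dB: mean of the strongest bins over mean of the weakest bins.

    Both fractions are illustrative assumptions, not values from the paper.
    """
    ranked = sorted(power_spectrum(frame))
    n_noise = max(1, int(len(ranked) * noise_fraction))
    n_signal = max(1, int(len(ranked) * signal_fraction))
    noise = sum(ranked[:n_noise]) / n_noise
    signal = sum(ranked[-n_signal:]) / n_signal
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

# Hypothetical test frames: white noise, and the same noise plus a narrowband component.
rng = random.Random(0)
noise_frame = [rng.gauss(0.0, 0.1) for _ in range(64)]
tone_frame = [v + math.sin(2 * math.pi * 8 * t / 64)
              for t, v in enumerate(noise_frame)]

snr_noise = sorted_spectrum_snr_db(noise_frame)
snr_tone = sorted_spectrum_snr_db(tone_frame)
snr_scaled = sorted_spectrum_snr_db([10.0 * v for v in tone_frame])
```

Because the estimate is a ratio of bin powers from the same frame, multiplying the frame by a constant leaves it unchanged, which matches the gain independence the abstract lists as a key feature.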

Collaboration


Top Co-Authors

Leon Cohen, City University of New York
Srinivasan Umesh, Indian Institute of Technology Kanpur
David C. Smith, National Security Agency
Nenad M. Marinovic, City University of New York
Owen P. Kenny, Defence Science and Technology Organization
John S. Bodenschatz, University of Southern California
Fehret Cakrak, University of Pittsburgh
J. R. Hopkins, National Security Agency