Krishna Sundaram Nathan
Brown University
Publications
Featured research published by Krishna Sundaram Nathan.
IEEE Transactions on Speech and Audio Processing | 1994
Krishna Sundaram Nathan; Harvey F. Silverman
A feature set that captures the dynamics of formant transitions prior to closure in a VCV environment is used to characterize and classify the unvoiced stop consonants. The feature set is derived from a time-varying, data-selective model for the speech signal. Its performance is compared with that of comparable formant data from a standard delta-LPC-based model. The different feature sets are evaluated on a database composed of eight talkers. A 40% reduction in classification error rate is obtained by means of the time-varying model. The performance of three different classifiers is discussed. A novel adaptive algorithm, termed the learning vector classifier (LVC), is compared with standard K-means and LVQ2 classifiers. LVC is a supervised learning classifier that improves performance by increasing the resolution of the decision boundaries. Error rates obtained for the three-way (p, t, and k) classification task using LVC and the time-varying analysis are comparable to those of techniques that make use of additional discriminating information contained in the burst. Further improvements are expected when an expanded time-varying feature set is utilized, coupled with information from the burst.
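The details of LVC itself are not given in the abstract, so it is not reproduced here; as a point of reference, the prototype-based baselines it is compared against (K-means and LVQ-family classifiers) work along the following lines. This is a generic LVQ1 sketch with hypothetical data, not the authors' method:

```python
import numpy as np

def lvq1_train(X, y, protos, labels, lr=0.05, epochs=20):
    """Basic LVQ1: move the nearest prototype toward a sample when
    their labels agree, away when they disagree (a standard baseline
    akin to the classifiers compared in the abstract)."""
    protos = protos.astype(float).copy()
    for _ in range(epochs):
        for x, t in zip(X, y):
            k = int(np.argmin(np.linalg.norm(protos - x, axis=1)))
            step = lr * (x - protos[k])
            protos[k] += step if labels[k] == t else -step
    return protos

def lvq_predict(X, protos, labels):
    """Classify each sample by the label of its nearest prototype."""
    return np.array([labels[int(np.argmin(np.linalg.norm(protos - x, axis=1)))]
                     for x in X])
```

For two well-separated feature clusters, the trained prototypes settle near the class means and nearest-prototype classification recovers the labels.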
IEEE Transactions on Signal Processing | 1991
Krishna Sundaram Nathan; Yi-Teh Lee; Harvey F. Silverman
A linear predictive coding (LPC) model based on time-dependent poles which has yielded promising results when applied to synthetic data is applied to real speech data. The data are processed pitch-synchronously using a simple procedure to identify regions of the data that best fit the model. The maximum-likelihood technique, which has been found to be robust in the presence of noise, is used to estimate the parameters. Resulting formant estimates for several diphthongs are presented. The algorithm tracks the formants well, both in stable regions and in regions of transition. This ability to track formant variation within analysis intervals is a definite advantage over traditional LPC. Results from speech data involving final stop consonants are presented. Rapid changes, particularly in the first and second formants, in the region immediately prior to the stop are detected. Such abrupt transitions are often not detected by traditional time-invariant methods.
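The core idea of a time-varying LPC model is that the predictor coefficients are allowed to change within the analysis interval, typically by expanding each coefficient over a set of time basis functions and fitting the expansion weights. The sketch below uses a simple polynomial basis and least squares; the paper's pitch-synchronous, maximum-likelihood procedure is not reproduced, and the function name and basis choice are illustrative assumptions:

```python
import numpy as np

def tv_ar_fit(x, p=4, q=2):
    """Fit a time-varying AR(p) model whose coefficient for each lag is
    a polynomial of degree q-1 in (normalized) time, by least squares.
    Returns the (p, q) coefficient matrix and the prediction RMS error."""
    N = len(x)
    t = np.arange(N) / N  # normalized time basis variable
    rows, targets = [], []
    for n in range(p, N):
        # regressor: past samples modulated by the time basis t**j
        rows.append([x[n - k] * t[n] ** j
                     for k in range(1, p + 1) for j in range(q)])
        targets.append(x[n])
    A, b = np.array(rows), np.array(targets)
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    resid = b - A @ c
    return c.reshape(p, q), float(np.sqrt(np.mean(resid ** 2)))
```

On a signal that a fixed-coefficient AR model already explains exactly, the time-varying fit leaves essentially zero residual; its advantage appears when the true poles drift within the window, as in formant transitions.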
international conference on acoustics, speech, and signal processing | 1990
Krishna Sundaram Nathan; Harvey F. Silverman
Two data-selective techniques that facilitate formant tracking with greater temporal resolution are presented: a time-varying autoregressive (AR) model and a short window AR model. They are used to analyze the regions of voicing immediately preceding closure in vowel-consonant transitions. Results on real speech data show that these methods detect the rapid change in the formants prior to closure better than standard linear predictive coding methods; hence, they provide important cues as to the nature of the stop. When these cues are used in conjunction with the commonly used ones from the burst, far more accurate labeling of the stops should result.
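In the short-window AR approach, formant frequencies are read off from the angles of the poles of an LPC polynomial fitted to a brief frame. A minimal sketch using the standard autocorrelation method is below; it is a generic illustration, not the paper's data-selective procedure, and the sampling rate and model order are placeholder values:

```python
import numpy as np

def lpc_formants(frame, order=8, fs=10000.0):
    """Estimate formant frequencies (Hz) from the pole angles of an
    LPC model fitted to one short frame (autocorrelation method)."""
    frame = frame * np.hamming(len(frame))
    # autocorrelation at lags 0..order
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:][: order + 1]
    # solve the Toeplitz normal equations R a = r
    R = np.array([[ac[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, ac[1: order + 1])
    poles = np.roots(np.concatenate(([1.0], -a)))
    poles = poles[np.imag(poles) > 0]       # one of each conjugate pair
    freqs = np.angle(poles) * fs / (2 * np.pi)
    return np.sort(freqs[freqs > 0])
```

Tracking formants with fine temporal resolution then amounts to sliding a very short window and repeating this fit, which is where rapid pre-closure transitions become visible.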
international conference on acoustics, speech, and signal processing | 1991
Krishna Sundaram Nathan; Harvey F. Silverman
A feature set that captures the dynamics of formant transitions is utilized to classify the unvoiced stop consonants. The second formant and its slope are used to characterize the transition between the vowel and the closure in a VCV (vowel-consonant-vowel) environment. The performance of a feature set obtained by means of a time-varying, closed-glottis model for the signal is compared with that of a standard LPC (linear predictive coding) model. The different feature sets are evaluated on a database consisting of eight speakers. A fourfold reduction in the error rate is obtained by means of the more sophisticated model. The performance of three different classifiers is presented. A novel adaptive algorithm, the learning vector classifier, is compared with standard K-means and LVQ2 (learning vector quantization-2) classifiers. Error rates of 5% are obtained for the three-way classification.
Journal of the Acoustical Society of America | 1991
Krishna Sundaram Nathan; Harvey F. Silverman
It is well known that the first formant is maximally excited at the instant of glottal closure. Therefore, it is natural to utilize the energy in a band containing the first formant as a cue to the GCI. In practice, however, the actual GCI lies a few samples prior to where this energy signal attains a local maximum. Moreover, such an estimate makes no use of any period information regarding the GCIs. Consequently, secondary excitations within a period can lead to spurious GCIs. It is therefore proposed to augment the information contained in the first formant with the linear prediction error. Although prediction error has been widely used for pitch determination, it is not sufficient to locate the GCI reliably because of ambiguities arising from multiple peaks, especially for vowels like /u/ (as in foot). Interestingly, these experiments have shown that secondary excitations tend to result in peaks in the residual error signal at locations different from those in the formant energy signal. Furthermore,...
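The observation that spurious excitations produce peaks in the residual at locations different from those in the formant-band energy suggests a simple combination rule: accept a GCI candidate only where the two signals peak together. The helper below is a schematic of that coincidence idea under assumed inputs (precomputed energy and residual envelopes), not the authors' algorithm:

```python
import numpy as np

def coincident_peaks(energy, residual, tol=3):
    """Keep candidate glottal-closure instants only where a local peak
    in the formant-band energy lies within tol samples of a local peak
    in the LPC residual, suppressing spurious secondary excitations."""
    def peaks(s):
        return [i for i in range(1, len(s) - 1)
                if s[i] > s[i - 1] and s[i] >= s[i + 1]]
    e_peaks, r_peaks = peaks(energy), peaks(residual)
    return [i for i in e_peaks
            if any(abs(i - j) <= tol for j in r_peaks)]
```

An energy peak with no nearby residual peak (or vice versa) is rejected, which is exactly how the two cues compensate for each other's ambiguities.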
Archive | 1992
Krishna Sundaram Nathan
international conference on acoustics, speech, and signal processing | 1994
Jerome R. Bellegarda; David Nahamoo; Krishna Sundaram Nathan; Eveline Jeannine Bellegarda
Archive | 1994
Eveline Jeannine Bellegarda; Jerome R. Bellegarda; David Nahamoo; Krishna Sundaram Nathan
Archive | 1994
Eveline Jeannine Bellegarda; Jerome R. Bellegarda; David Nahamoo; Krishna Sundaram Nathan
Archive | 1994
Jerome R. Bellegarda; David Nahamoo; Krishna Sundaram Nathan