
Publication


Featured research published by V. Viswanathan.


IEEE Transactions on Communications | 1982

Variable Frame Rate Transmission: A Review of Methodology and Application to Narrow-Band LPC Speech Coding

V. Viswanathan; John Makhoul; Richard M. Schwartz; A. W. F. Huggins

We review the variable frame rate (VFR) transmission methodology that we developed, implemented, and tested during the period 1973-1978 for efficiently transmitting LPC vocoder parameters extracted from the input speech at a fixed frame rate. In the VFR method, parameters are transmitted only when their values have changed sufficiently over the interval since their preceding transmission. We explored two distinct approaches to automatic implementation of the VFR method. The first approach bases the transmission decisions on comparisons of the parameter values of the present frame and the last transmitted frame. The second approach, which is based on a functional perceptual model of speech, compares the parameter values of all the frames that lie in the interval between the present frame and the last transmitted frame against a linear model of parameter variation over that interval. The application of VFR transmission to the design of narrow-band LPC speech coders with average bit rates of 2000-2400 bits/s is also considered. The transmission decisions are made separately for the three sets of LPC parameters, pitch, gain, and spectral parameters, using separate VFR schemes. A formal subjective speech quality test of six selected LPC coders is described, and the results are presented and analyzed in detail. It is shown that a 2075 bit/s VFR coder produces speech quality equal to or better than that of a 5700 bit/s fixed frame rate coder.
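The two VFR decision rules described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the distance function and single threshold are placeholders for the parameter-specific criteria the coder actually uses.

```python
# Illustrative reconstruction of the two VFR decision rules; the
# distance measure and threshold are placeholders, not the paper's
# parameter-specific criteria.

def dist(a, b):
    """Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def interpolate(a, b, t):
    """Linear interpolation between parameter vectors a and b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def vfr_simple(frames, threshold):
    """Approach 1: transmit a frame when it differs enough from the
    last transmitted frame."""
    sent = [0]                      # the first frame is always transmitted
    for i in range(1, len(frames)):
        if dist(frames[i], frames[sent[-1]]) > threshold:
            sent.append(i)
    return sent

def vfr_linear_model(frames, threshold):
    """Approach 2: extend the untransmitted interval while every frame
    inside it stays close to a linear model between the last transmitted
    frame and the candidate end frame; on failure, transmit the frame
    just before the candidate."""
    sent = [0]
    last, end = 0, 1
    while end < len(frames):
        fits = all(
            dist(frames[k],
                 interpolate(frames[last], frames[end],
                             (k - last) / (end - last))) <= threshold
            for k in range(last + 1, end))
        if fits:
            end += 1
        else:
            sent.append(end - 1)
            last = end - 1
    if sent[-1] != len(frames) - 1:
        sent.append(len(frames) - 1)    # flush the final frame
    return sent
```

On a parameter track that is exactly linear, the second rule transmits only the endpoints, which is why it can reach lower average frame rates than the frame-to-frame comparison.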


IEEE Transactions on Communications | 1982

Design of a Robust Baseband LPC Coder for Speech Transmission Over 9.6 Kbit/s Noisy Channels

V. Viswanathan; Alan L. Higgins; William H. Russell

This paper describes the design of a baseband LPC coder that transmits speech over 9.6 kbit/s (kilobit/second) synchronous channels with random bit errors of up to 1 percent. Presented are the results of our investigation of a number of aspects of the baseband LPC coder with the goal of maximizing the quality of the transmitted speech. Important among these aspects are: bandwidth of the baseband, coding of the baseband residual, high-frequency regeneration, and error protection of important transmission parameters. The paper discusses these and other issues, presents the results of speech-quality tests conducted during the various stages of optimization, and describes the details of the optimized speech coder. This optimized speech coding algorithm has been implemented as a real-time full-duplex system on an array processor. Informal listening tests of the real-time coder have shown that the coder produces good speech quality in the absence of channel bit errors and introduces only a slight degradation in quality for channel bit error rates of up to 1 percent.
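The abstract mentions error protection of the important transmission parameters against random bit errors. The paper does not specify the code it uses, so the sketch below is only an illustration of one classic option for protecting short parameter fields: a single-error-correcting Hamming (7,4) code.

```python
# Hamming (7,4) encoder/decoder as an illustrative error-protection
# scheme; this is an assumption, not the code used in the paper.

def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword. Positions 1, 2, 4
    (1-indexed) carry parity; positions 3, 5, 6, 7 carry data."""
    c = [0] * 8                       # index 0 unused
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    return c[1:]

def hamming74_decode(received):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = [0] + list(received)
    syndrome = ((c[1] ^ c[3] ^ c[5] ^ c[7])
                + 2 * (c[2] ^ c[3] ^ c[6] ^ c[7])
                + 4 * (c[4] ^ c[5] ^ c[6] ^ c[7]))
    if syndrome:                      # syndrome is the flipped bit position
        c[syndrome] ^= 1
    return [c[3], c[5], c[6], c[7]]
```

Any single bit error in the 7-bit codeword is corrected, which is the kind of robustness needed for the 1 percent channel error rates quoted above.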


International Conference on Acoustics, Speech, and Signal Processing | 1986

Evaluation of multisensor speech input for speech recognition in high ambient noise

V. Viswanathan; C. Henry; Richard M. Schwartz; S. Roucos

In this paper, we report the results of isolated-word speech recognition tests performed with the Verbex 4000 recognizer on speech data collected 1) in 95 dB and 115 dB SPL broad-band acoustic noise typical in a fighter aircraft cockpit and 2) using several single sensors and two-sensor configurations that we reported in ICASSP-85. The two-sensor systems and the gradient microphones we tested produced about the same recognition performance in 95 dB noise. In 115 dB noise, the recognizer failed to train for all but the accelerometer, which produced nearly constant performance in both noise levels. Also, we demonstrate the feasibility of a new speech recognition methodology that uses a parallel system of multiple input signals transduced simultaneously using different sensors. In selected phonetic discrimination tests involving a feature-based approach, a parallel-input multisensor system reduced the discrimination errors to between one-half and one-twelfth of the number produced by a gradient microphone alone.


International Conference on Acoustics, Speech, and Signal Processing | 1986

Word recognition using multisensor speech input in high ambient noise

S. Roucos; V. Viswanathan; C. Henry; Richard M. Schwartz

We present a method for word recognition with input speech transduced simultaneously by several sensors in high levels of broadband acoustic background noise. In prior work on single-input multisensor systems, limited success in machine recognition was achieved by linearly combining multiple sensor signals to yield a robust estimate of the speech signal in the presence of noise. In this paper, we demonstrate that improved recognition results are obtained by using all available sensor signals jointly as a vector, which preserves information from all sensors, as input to the decision process. We report on multisensor configurations using close-talking pressure-gradient microphones and accelerometers placed at the throat and nose of the speaker. The recognition error rates obtained by using the joint output vector are 45% lower than the error rates obtained with the best constituent sensor in the multisensor system; single-input multisensor systems, on the other hand, produce error rates that are about equal to the error rates obtained with the best constituent sensor.
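The contrast drawn above, between collapsing the sensors into a single input by linear combination and keeping all sensor features jointly as one vector, can be illustrated with a minimal sketch. Feature extraction is omitted and the combination weights are placeholders.

```python
# Minimal sketch of the two input strategies compared in the paper;
# per-sensor feature vectors and weights here are placeholders.

def combined_input(features_per_sensor, weights):
    """Single-input system: a weighted sum collapses the sensors into
    one feature vector, discarding per-sensor information."""
    dim = len(features_per_sensor[0])
    return [sum(w * f[i] for w, f in zip(weights, features_per_sensor))
            for i in range(dim)]

def joint_input(features_per_sensor):
    """Joint approach: concatenate the per-sensor vectors so the
    recognizer's decision process sees every sensor's information."""
    return [x for f in features_per_sensor for x in f]
```

The joint vector has dimension (number of sensors) x (features per sensor), so the recognizer can weight sensors differently per phonetic context instead of committing to one fixed linear combination up front.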


International Conference on Acoustics, Speech, and Signal Processing | 1984

Multisensor speech input for enhanced immunity to acoustic background noise

V. Viswanathan; Kenneth F. Karnofsky; Kenneth N. Stevens; Michael N. Alakel

The aim of this work is to develop configurations of multiple sensors for transducing speech in order to achieve enhanced immunity to acoustic background noise. We performed detailed measurements of the sound field in the vicinity of the mouth and neck during speech using pressure and pressure gradient microphones and an accelerometer. We investigated the properties of the measured signals from the various sensor types and positions through long-term and short-term spectral analyses and from articulation index scores computed assuming ambient noise typical of that in a fighter aircraft cockpit. From the results of this investigation, we developed a two-sensor configuration involving an accelerometer and a gradient microphone. Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show clearly that each of the two-sensor signals under test outperforms the signal from the gradient microphone alone and that the performance improvement generally increases with the noise level.


International Conference on Acoustics, Speech, and Signal Processing | 1985

Noise-immune speech transduction using multiple sensors

V. Viswanathan; C. Henry; A. Derr

In this ongoing work, we have developed and tested configurations of multiple sensors for transducing speech with enhanced immunity to acoustic background noise. In this paper, we describe our work to improve upon our previously reported two-sensor configuration consisting of an accelerometer and a gradient microphone, and also present several additional multisensor configurations including one that uses a second-order gradient microphone for low frequencies and a first-order gradient microphone for high frequencies. Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show that each of the multisensor configurations tested outperforms the constituent individual sensors.


International Conference on Acoustics, Speech, and Signal Processing | 1982

A harmonic deviations linear prediction vocoder for improved narrowband speech transmission

V. Viswanathan; Michael G. Berouti; A. Higgins; William E. Russell

This paper presents the development, study, and experimental results of a narrowband speech coder called the Harmonic Deviations (HDV) vocoder. The HDV coder is based on the LPC vocoder, and it transmits additional information over and above the data transmitted by the LPC vocoder, in the form of deviations between the speech spectrum and the LPC all-pole model spectrum at a selected set of frequencies. At the receiver, the spectral deviations are used to generate the excitation signal for the all-pole synthesis filter. The paper describes and compares several methods for extracting the spectral deviations from the speech signal and for encoding them. To limit the bit rate of the HDV coder to 2.4 kb/s, the paper discusses several methods, including orthogonal transformation and minimum-mean-square-error scalar quantization of log area ratios, two-stage vector-scalar quantization, and variable frame rate transmission. The paper also presents the results of speech-quality optimization of the HDV coder at 2.4 kb/s. Both noise-free and noisy channel applications are considered. The resulting HDV coders yield noticeable improvement in speech quality over LPC vocoders.
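The deviation measurement described above can be sketched in a minimal form, assuming the deviations are taken as log-magnitude differences (in dB) between the measured speech spectrum and the LPC all-pole model spectrum at the selected frequencies. The paper compares several extraction and encoding methods that this sketch does not reproduce.

```python
import cmath
import math

# Illustrative sketch only: deviations are computed here as dB
# log-magnitude differences against the LPC all-pole model spectrum.

def lpc_model_spectrum(a, gain, freq_hz, fs_hz):
    """Magnitude of the LPC all-pole model G/|A(e^jw)| at freq_hz,
    where a = [a1..ap] and A(z) = 1 - sum_k a_k z^-k."""
    w = 2 * math.pi * freq_hz / fs_hz
    A = 1 - sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
    return gain / abs(A)

def harmonic_deviations(speech_mag, a, gain, freqs_hz, fs_hz):
    """Deviation (dB) between the measured speech spectrum magnitude
    and the all-pole model spectrum at each selected frequency."""
    return [20 * math.log10(s / lpc_model_spectrum(a, gain, f, fs_hz))
            for s, f in zip(speech_mag, freqs_hz)]
```

Positive deviations mark frequencies where the all-pole model underestimates the speech spectrum; transmitting them lets the receiver shape the excitation signal accordingly.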


International Conference on Acoustics, Speech, and Signal Processing | 1985

New objective measures for the evaluation of pitch extractors

V. Viswanathan; W. Russell

In this research, we developed and tested a variety of new objective measures for automatically evaluating pitch extractors for use in narrowband speech coders. For automatic computation of an accurate reference pitch contour required by the objective measures, we modified an existing algorithm that uses the subglottal signal recorded simultaneously with the speech signal. To test and validate the objective measures, we conducted formal subjective speech quality tests, which involved five pitch extractors; a specially selected speech database of sentences that are likely to cause pitch and voicing errors; and two 2.4 kbit/s speech coders, the LPC coder and the harmonic deviations coder, each operating both in the clear (no acoustic noise) and in Airborne Command Post noise. Results obtained by correlating the objective scores with the mean subjective rating scores show that several of our objective measures produce consistently excellent correlation (0.9–0.995) for both coders and under both acoustic background conditions.
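The abstract does not give the paper's specific objective measures. As an illustration of the kind of computation involved, the sketch below implements a simple gross-pitch-error measure against a reference contour, plus the Pearson correlation used to compare objective scores with mean subjective ratings; both are assumptions, not the paper's measures.

```python
# Illustrative measures only; the paper's actual objective measures
# are not given in the abstract.

def gross_pitch_error(est, ref, tol=0.2):
    """Fraction of voiced reference frames (ref > 0) whose estimated
    pitch deviates from the reference by more than `tol` (relative)."""
    voiced = [(e, r) for e, r in zip(est, ref) if r > 0]
    if not voiced:
        return 0.0
    gross = sum(1 for e, r in voiced if abs(e - r) / r > tol)
    return gross / len(voiced)

def pearson(x, y):
    """Pearson correlation between objective scores x and mean
    subjective rating scores y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)
```

Validating an objective measure then amounts to checking that `pearson(objective_scores, subjective_scores)` stays high across coders and noise conditions, as in the 0.9–0.995 range reported above.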


International Conference on Acoustics, Speech, and Signal Processing | 1983

Objective speech quality evaluation of mediumband and narrowband real-time speech coders

V. Viswanathan; William E. Russell; A. W. F. Huggins

We consider in this paper the problem of developing and testing a procedure for objective evaluation of the speech quality of real-time speech coders. We validate this evaluation procedure by computing the objective quality scores over a test bed of five mediumband and narrowband real-time speech coders and correlating the scores with subjective judgments generated by the Diagnostic Acceptability Measure test. We report on the performance of several published objective measures. Also, we describe and suggest solution approaches to two related problems: synchronizing a real-time coder's output in time with its input, and designing a database of input-speech sentences to be used in objective speech quality evaluation. We present various experimental results of this ongoing work.
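The synchronization problem mentioned above, aligning a real-time coder's output with its input before computing objective scores, is commonly attacked by searching for the lag that maximizes cross-correlation between the two signals. The sketch below illustrates that general idea; it is not claimed to be the paper's procedure.

```python
# Illustrative time-alignment by exhaustive cross-correlation search;
# not the synchronization procedure described in the paper.

def align_lag(ref, out, max_lag):
    """Return the delay (in samples) of `out` relative to `ref` that
    maximizes their cross-correlation, searching lags 0..max_lag."""
    best_lag, best_corr = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(ref), len(out) - lag)
        corr = sum(ref[i] * out[i + lag] for i in range(n))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag
```

Once the lag is known, objective measures such as segmental SNR can be computed frame by frame on the time-aligned pair.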


Archive | 1982

Medium and Low Bit Rate Speech Transmission

V. Viswanathan; John Makhoul; Richard M. Schwartz

We describe five types of speech coding systems with transmission bit rates spanning the range from 16,000 bits/sec (b/s) down to 100 b/s: adaptive predictive coders at 16 kb/s, baseband coders at 9.6 kb/s, linear predictive coders at 1.5–2.4 kb/s, clustering vocoders at 600–800 b/s, and diphone-based phonetic vocoders at 100 b/s. For each type of coder, we describe the coder configuration, discuss the important coding issues, and present the results available to date.

Collaboration

Top co-authors of V. Viswanathan include S. Roucos (University of Florida) and Kenneth N. Stevens (Massachusetts Institute of Technology).