Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Thippur V. Sreenivas is active.

Publication


Featured research published by Thippur V. Sreenivas.


IEEE Transactions on Speech and Audio Processing | 1996

Codebook constrained Wiener filtering for speech enhancement

Thippur V. Sreenivas; Pradeep Kirnapure

Speech enhancement using iterative Wiener filtering has been shown to require interframe and intraframe constraints in all-pole parameter estimation. We show that a clean-speech VQ codebook is more effective in providing intraframe constraints and, hence, better convergence of the iterative filtering scheme. Satisfactory speech enhancement results are obtained with a small codebook of 128 entries, and the algorithm is effective for both white noise and pink noise up to 0 dB SNR.
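The iterative scheme can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the codebook here holds clean-speech log-power spectra rather than all-pole parameters, and the function name, initialization, and iteration count are all assumptions.

```python
import numpy as np

def codebook_constrained_wiener(noisy_psd, noise_psd, codebook, n_iter=3):
    """Illustrative codebook-constrained iterative Wiener filtering.

    noisy_psd, noise_psd : per-bin power spectra of the noisy frame and noise.
    codebook             : rows are candidate clean-speech log-power spectra
                           (a stand-in for the paper's clean-speech VQ codebook).
    """
    clean_est = np.maximum(noisy_psd - noise_psd, 1e-10)  # spectral-subtraction init
    for _ in range(n_iter):
        # Project the current clean estimate onto the nearest codeword
        # (the intraframe constraint supplied by the clean-speech codebook).
        log_est = np.log(clean_est)
        idx = np.argmin(((codebook - log_est) ** 2).sum(axis=1))
        constrained = np.exp(codebook[idx])
        # Wiener gain from the constrained clean PSD, then re-filter.
        gain = constrained / (constrained + noise_psd)
        clean_est = gain * noisy_psd
    return clean_est, idx
```

Because the Wiener gain is always below one, the filtered spectrum never exceeds the noisy input in any bin.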


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2008

GMM-based Bayesian approach to speech enhancement in signal/transform domain

Achintya Kundu; Saikat Chatterjee; A. Sreenivasa Murthy; Thippur V. Sreenivas

Considering a general linear model of signal degradation, by modeling the probability density function (PDF) of the clean signal using a Gaussian mixture model (GMM) and additive noise by a Gaussian PDF, we derive the minimum mean square error (MMSE) estimator. The derived MMSE estimator is non-linear, and the linear MMSE estimator is shown to be a special case. For a speech signal corrupted by independent additive noise, by modeling the joint PDF of time-domain speech samples of a speech frame using a GMM, we propose a speech enhancement method based on the derived MMSE estimator. We also show that the same estimator can be used for transform-domain speech enhancement.
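A minimal sketch of the estimator's structure under the stated model, assuming for simplicity a diagonal-covariance GMM prior on the clean signal and white Gaussian noise: each mixture component contributes a linear (Wiener) estimate, and the components are combined by their posterior responsibilities. Names and the diagonal-covariance restriction are illustrative, not the paper's exact derivation.

```python
import numpy as np

def gmm_mmse_estimate(y, weights, means, variances, noise_var):
    """MMSE estimate of x from y = x + n, with a diagonal-covariance GMM
    prior on x and n ~ N(0, noise_var * I)."""
    total_var = variances + noise_var                    # (K, D)
    # log N(y; mu_k, Sigma_k + noise_var*I) per component, diagonal case
    log_lik = -0.5 * (np.log(2 * np.pi * total_var)
                      + (y - means) ** 2 / total_var).sum(axis=1)
    log_post = np.log(weights) + log_lik
    post = np.exp(log_post - log_post.max())
    post /= post.sum()                                   # responsibilities w_k(y)
    # Per-component linear MMSE (Wiener) estimate of x given y.
    comp_est = means + variances / total_var * (y - means)
    return post @ comp_est                               # posterior-weighted mix
```

With a single mixture component this collapses to the ordinary linear MMSE estimator, which matches the special-case claim in the abstract.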


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2004

Music instrument recognition: from isolated notes to solo phrases

A. G. Krishna; Thippur V. Sreenivas

Speech and audio processing techniques are used along with statistical pattern recognition principles to solve the problem of music instrument recognition. Only non-temporal, frame-level features are used, so that the proposed system scales from isolated notes to solo instrumental phrases without the need for temporal segmentation of solo music. Based on their effectiveness in speech, line spectral frequencies (LSF) are proposed as features for music instrument recognition. The proposed system has also been evaluated using MFCC and LPCC features. Gaussian mixture model and K-nearest neighbour classifiers are used for classification. The experimental dataset included the UIowa MIS and the RWC databases. Our best result is about 95% at the instrument-family level and about 90% at the instrument level when classifying 14 instruments.
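The frame-level GMM classification step can be sketched as below: each instrument model scores every feature frame independently, and the instrument with the highest total likelihood wins. The hand-rolled diagonal GMM, the model dictionary layout, and the instrument names are illustrative assumptions.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Total log-likelihood of frame-level features under a diagonal GMM."""
    ll = []
    for x in frames:
        comp = (np.log(weights)
                - 0.5 * (np.log(2 * np.pi * variances)
                         + (x - means) ** 2 / variances).sum(axis=1))
        m = comp.max()
        ll.append(m + np.log(np.exp(comp - m).sum()))  # log-sum-exp
    return float(np.sum(ll))

def classify(frames, models):
    """Pick the instrument whose GMM gives the frames the highest likelihood.
    Because every frame is scored independently (no temporal features), the
    same classifier applies to isolated notes and to solo phrases."""
    return max(models, key=lambda name: gmm_loglik(frames, *models[name]))
```

Scoring frames independently is exactly what makes the system scale from notes to phrases: no segmentation of the phrase is ever needed.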


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2005

Automatic speech segmentation using average level crossing rate information

Anindya Sarkar; Thippur V. Sreenivas

We explore new methods of automatically determining units for segmenting speech. For detecting signal changes, temporal features are more reliable than the standard feature-vector-domain methods, since both magnitude and phase information are retained. Motivated by auditory models, we present a method based on the average level crossing rate (ALCR) of the signal to detect significant temporal changes. The technique uses an adaptive level-allocation scheme that places levels depending on the signal pdf and SNR. We compare the segmentation performance to manual phonemic segmentation and also to that provided by maximum likelihood (ML) segmentation for 100 TIMIT sentences. The ALCR method matches the best segmentation performance without a priori knowledge of the number of segments, which ML segmentation requires.
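The ALCR feature itself can be sketched as follows: for each of several amplitude levels, count how often the waveform crosses that level within an analysis window, then average over levels. The paper additionally adapts the level placement to the signal pdf and SNR; here the levels are fixed, and all names and window sizes are illustrative.

```python
import numpy as np

def average_level_crossing_rate(x, levels, win):
    """Average level-crossing rate per analysis window.

    For each level l, count crossings of the signal through l in each
    window of `win` samples, then average the counts over levels.
    """
    n_win = len(x) // win
    alcr = np.zeros(n_win)
    for i in range(n_win):
        seg = x[i * win:(i + 1) * win]
        crossings = [np.count_nonzero(np.diff(np.sign(seg - l)) != 0)
                     for l in levels]
        alcr[i] = np.mean(crossings)
    return alcr
```

Sharp changes in this per-window ALCR profile are what mark candidate segment boundaries.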


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2002

Robust parameters for automatic segmentation of speech

A. K. V. SaiJayram; V. Ramasubramanian; Thippur V. Sreenivas

Automatic segmentation of speech is an important problem that is useful in speech recognition, synthesis and coding. We explore in this paper the robust parameter set, weighting function and distance measure for reliable segmentation of noisy speech. It is found that the MFCC parameters, successful in speech recognition, also hold the best promise for robust segmentation. We also explored a variety of symmetric and asymmetric weighting lifters, from which it is found that a symmetric lifter of the form 1 + A sin^(1/2)(πn/L), 0 ≤ n ≤ L − 1, for MFCC dimension L, is most effective. With regard to the distance measure, the direct L2 norm is found adequate.
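The reported lifter and the weighted L2 distance are simple to write down; this sketch assumes the formula as stated in the abstract, with the amplitude A and the default value below being illustrative choices, not values from the paper.

```python
import numpy as np

def symmetric_lifter(L, A):
    """The abstract's symmetric lifter: w[n] = 1 + A * sin(pi*n/L)**0.5,
    for 0 <= n <= L-1, where L is the MFCC dimension."""
    n = np.arange(L)
    return 1.0 + A * np.sin(np.pi * n / L) ** 0.5

def liftered_distance(c1, c2, A=3.0):
    """Weighted L2 distance between two MFCC vectors (A is illustrative)."""
    w = symmetric_lifter(len(c1), A)
    return float(np.sqrt(np.sum((w * (c1 - c2)) ** 2)))
```

The lifter leaves c0 at unit weight and emphasizes the mid-order cepstral coefficients, peaking at n = L/2, which is where the symmetric shape comes from.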


IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) | 2007

Joint decoding of multiple speech patterns for robust speech recognition

Nishanth Ulhas Nair; Thippur V. Sreenivas

We address a new problem: improving automatic speech recognition performance given multiple utterances of patterns from the same class. We formulate the problem of jointly decoding K multiple patterns given a single hidden Markov model. It is shown that such a solution is possible by aligning the K patterns using the proposed multi-pattern dynamic time warping algorithm, followed by the constrained multi-pattern Viterbi algorithm. The new formulation is tested in the context of speaker-independent isolated word recognition for both clean and noisy patterns. When 10 percent of the speech is affected by burst noise at -5 dB signal-to-noise ratio (local), it is shown that joint decoding using only two noisy patterns reduces the recognition error rate by about 51 percent compared to single-pattern decoding using the Viterbi algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about a 7 percent reduction in error rate.
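The alignment step can be illustrated with standard two-pattern DTW; the paper's multi-pattern DTW generalizes this to K patterns before the constrained Viterbi decoding, a generalization this sketch does not attempt. Names and the local distance are assumptions.

```python
import numpy as np

def dtw_align(a, b):
    """Standard two-pattern dynamic time warping over feature sequences
    a (n x d) and b (m x d); returns the cumulative alignment cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])     # local frame distance
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Intuitively, aligning two noisy utterances first lets the subsequent joint Viterbi pass treat matched frames together, so a burst corrupting one pattern can be compensated by the other.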


Signal Processing | 2007

Increased watermark-to-host correlation of uniform random phase watermarks in audio signals

S Krishna Kumar; Thippur V. Sreenivas

We analyze the watermark-to-host correlation (WHC) of random phase watermarks, in the context of additive embedding and blind correlation detection of the watermark in audio signals. Interestingly, we find that the WHC is higher when a legitimate watermark is present in the audio signal. Instead of trying to minimize the WHC, one could therefore attempt to harness this increased correlation to achieve better detection performance. The analysis shows that a uniformly distributed phase difference (between the host signal and the watermark) provides the maximum advantage. Experiments over a variety of audio signals verify that a uniformly distributed watermark phase provides the greatest advantage.
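Blind correlation detection itself is the following simple test; the function name and threshold value are illustrative, and this sketch omits the phase analysis that is the paper's actual contribution.

```python
import numpy as np

def correlation_detect(signal, watermark, threshold):
    """Blind correlation detection: declare the watermark present when the
    normalized correlation of the (possibly watermarked) signal with the
    candidate watermark exceeds a threshold (threshold is illustrative)."""
    c = signal @ watermark / (np.linalg.norm(signal) * np.linalg.norm(watermark))
    return c, c > threshold
```

For additive embedding, the embedded copy of the watermark pushes the correlation well above the chance level of an unmarked host, which is the statistic the WHC analysis refines.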


EURASIP Journal on Advances in Signal Processing | 2004

Adaptive window zero-crossing-based instantaneous frequency estimation

S. Chandra Sekhar; Thippur V. Sreenivas

We address the problem of estimating instantaneous frequency (IF) of a real-valued constant amplitude time-varying sinusoid. Estimation of polynomial IF is formulated using the zero-crossings of the signal. We propose an algorithm to estimate nonpolynomial IF by local approximation using a low-order polynomial, over a short segment of the signal. This involves the choice of window length to minimize the mean square error (MSE). The optimal window length found by directly minimizing the MSE is a function of the higher-order derivatives of the IF, which are not available a priori. However, an optimum solution is formulated using an adaptive window technique based on the concept of intersection of confidence intervals. The adaptive algorithm enables minimum MSE-IF (MMSE-IF) estimation without requiring a priori information about the IF. Simulation results show that the adaptive window zero-crossing-based IF estimation method is superior to fixed window methods and is also better than adaptive spectrogram and adaptive Wigner-Ville distribution (WVD)-based IF estimators for different signal-to-noise ratios (SNRs).
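The core zero-crossing idea can be sketched as follows: the phase of a real sinusoid passes through a multiple of π at each zero-crossing, so a polynomial fit to (crossing time, kπ) recovers the phase, and its derivative gives the IF. This fixed-window sketch omits the paper's adaptive window selection; names and the interpolation scheme are assumptions.

```python
import numpy as np

def zc_if_estimate(x, fs, order=2):
    """Estimate instantaneous frequency from the zero-crossings of x.

    Successive crossings correspond to phase values 0, pi, 2*pi, ...;
    a polynomial of the given order is fit to (crossing time, k*pi) and
    differentiated to obtain the IF (returned in Hz at the crossings).
    """
    s = np.sign(x)
    idx = np.where(np.diff(s) != 0)[0]
    # Linear interpolation for sub-sample crossing instants.
    t = (idx + x[idx] / (x[idx] - x[idx + 1])) / fs
    phase = np.pi * np.arange(len(t))
    coeffs = np.polyfit(t, phase, order)
    dphase = np.polyder(coeffs)
    return t, np.polyval(dphase, t) / (2 * np.pi)
```

The polynomial order plays the role of the local model in the abstract: a low order over a short window approximates a nonpolynomial IF, and the window length controls the bias-variance trade-off that the adaptive scheme optimizes.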


Signal Processing | 2008

Optimum switched split vector quantization of LSF parameters

Saikat Chatterjee; Thippur V. Sreenivas

We address the issue of rate-distortion (R/D) performance optimality of the recently proposed switched split vector quantization (SSVQ) method. The distribution of the source is modeled using a Gaussian mixture density; thus, the non-parametric SSVQ is analyzed in a parametric-model-based framework for achieving optimum R/D performance. Using high-rate quantization theory, we derive the optimum bit allocation formulae for the intra-cluster split vector quantizer (SVQ) and the inter-cluster switching. For wide-band speech line spectrum frequency (LSF) parameter quantization, it is shown that the Gaussian mixture model (GMM) based optimum parametric SSVQ method provides a 1 bit/vector advantage over the non-parametric SSVQ method.
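The split vector quantizer at the core of SSVQ can be sketched as follows: the LSF vector is partitioned into sub-vectors, each quantized with its own codebook. The inter-cluster switching and the optimum bit allocation derived in the paper are omitted; names and the split points are illustrative.

```python
import numpy as np

def split_vq(x, codebooks, split_at):
    """Split vector quantization: partition x into sub-vectors at the given
    indices and replace each with its nearest codeword from the matching
    codebook.  SSVQ additionally switches among several such SVQs based on
    a Gaussian-mixture clustering of the source."""
    parts = np.split(x, split_at)
    out = []
    for part, cb in zip(parts, codebooks):
        idx = np.argmin(((cb - part) ** 2).sum(axis=1))  # nearest codeword
        out.append(cb[idx])
    return np.concatenate(out)
```

Splitting trades some coding efficiency for much smaller codebooks, which is why the bit allocation across the splits (and across the switching clusters) is the quantity worth optimizing.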


Signal Processing | 2002

IF estimation using higher order TFRs

G. Viswanath; Thippur V. Sreenivas

Instantaneous frequency estimation of arbitrary polynomial phase signals is shown to be feasible using higher-order time–frequency representations such as the L-Wigner distribution and the complex time–frequency representation. These new extensions of the iterative algorithm based on the cross-Wigner distribution are shown to have certain computational advantages as well as noise robustness.

Collaboration


Dive into Thippur V. Sreenivas's collaboration.

Top Co-Authors


Saikat Chatterjee

Royal Institute of Technology

S. Chandra Sekhar

Indian Institute of Science

Achintya Kundu

Indian Institute of Science
