V. Ramasubramanian
PES University
Publication
Featured research published by V. Ramasubramanian.
IEEE Transactions on Signal Processing | 1992
V. Ramasubramanian; Kuldip Kumar Paliwal
Fast search algorithms are proposed and studied for vector quantization encoding using the K-dimensional (K-d) tree structure. Here, the emphasis is on the optimal design of the K-d tree for efficient nearest neighbor search in multidimensional space under a bucket-Voronoi intersection search framework. Efficient optimization criteria and procedures are proposed for designing the K-d tree, for the case when the test data distribution is available (as in vector quantization application in the form of training data) as well as for the case when the test data distribution is not available and only the Voronoi intersection information is to be used. The criteria and bucket-Voronoi intersection search procedure are studied in the context of vector quantization encoding of speech waveform. They are empirically observed to achieve constant search complexity for O(log N) tree depths and are found to be more efficient in reducing the search complexity. A geometric interpretation is given for the maximum product criterion, explaining reasons for its inefficiency with respect to the optimization criteria.
IEEE Transactions on Communications | 1989
Kuldip Kumar Paliwal; V. Ramasubramanian
Recently, C.D. Bei and R.M. Gray (1985) used a partial distance search algorithm that reduces the computational complexity of the minimum distortion encoding for vector quantization. The effect of ordering the codevectors on the computational complexity of the algorithm is studied. It is shown that the computational complexity of this algorithm can be reduced further by ordering the codevectors according to the sizes of their corresponding clusters.
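The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy codebook and the training vectors are all hypothetical, and cluster sizes are estimated here by a plain full-search assignment of the training vectors. The partial distance search abandons a codevector as soon as its accumulated squared distance reaches the best distance found so far, so placing codevectors with large clusters first tends to tighten that bound early.

```python
def partial_distance_search(x, codebook):
    """Nearest codevector by squared Euclidean distance with early exit.

    Returns (index, squared distance, number of multiplications).
    """
    best_idx, best_dist, mults = -1, float("inf"), 0
    for i, c in enumerate(codebook):
        d = 0.0
        for xk, ck in zip(x, c):
            d += (xk - ck) ** 2
            mults += 1
            if d >= best_dist:
                break  # partial distance already too large: reject early
        else:
            # Loop finished without a break, so this codevector is the new best.
            best_idx, best_dist = i, d
    return best_idx, best_dist, mults


def full_search(x, codebook):
    """Plain exhaustive search, used only to estimate cluster sizes."""
    dists = [sum((xk - ck) ** 2 for xk, ck in zip(x, c)) for c in codebook]
    i = min(range(len(codebook)), key=dists.__getitem__)
    return i, dists[i]


def order_by_cluster_size(codebook, training):
    """Reorder codevectors so that those with the largest clusters come first."""
    counts = [0] * len(codebook)
    for t in training:
        counts[full_search(t, codebook)[0]] += 1
    order = sorted(range(len(codebook)), key=lambda i: -counts[i])
    return [codebook[i] for i in order], order


# Hypothetical toy data: most training vectors fall near the second codevector.
codebook = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]]
training = [[9.0, 9.0], [10.0, 11.0], [11.0, 10.0], [5.0, 4.0], [0.0, 1.0]]
ordered, order = order_by_cluster_size(codebook, training)
x = [9.5, 10.5]
unordered_result = partial_distance_search(x, codebook)
ordered_result = partial_distance_search(x, ordered)
print(unordered_result[2], ordered_result[2])  # multiplication counts
```

The saving depends on how closely the test vectors follow the training distribution; the toy data above are chosen only to make the effect visible.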
Pattern Recognition Letters | 1992
V. Ramasubramanian; Kuldip Kumar Paliwal
In this paper, we present an efficient algorithm for fast nearest-neighbour search in multidimensional space under a so-called approximation-elimination framework. The algorithm is based on a new approximation procedure which selects codevectors for distance computation in the close proximity of the test vector and eliminates codevectors using triangle-inequality-based elimination. The algorithm is studied in the context of vector quantization of speech and compared with related algorithms proposed earlier. It is shown to be more efficient in terms of reducing the main search complexity, overhead costs and storage.
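As a rough illustration of triangle-inequality elimination, the sketch below uses a single anchor point, which is an assumption made here for brevity and not the approximation procedure proposed in the paper. For any codevector c and anchor a, |d(x, a) − d(c, a)| is a lower bound on d(x, c), so candidates can be visited in order of that bound and the search can stop once the bound reaches the best distance found. All names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def build_anchor_table(codebook, anchor):
    """Precompute the distance from every codevector to the anchor."""
    return [dist(c, anchor) for c in codebook]


def anchor_search(x, codebook, anchor, table):
    """Nearest-neighbour search with a one-anchor triangle-inequality bound.

    Returns (index, distance, number of full distance computations).
    """
    dxa = dist(x, anchor)
    # Approximation step: try the candidates with the smallest lower bound first.
    candidates = sorted(range(len(codebook)), key=lambda i: abs(dxa - table[i]))
    best_idx, best_dist, computed = -1, float("inf"), 0
    for i in candidates:
        # Elimination step: the bounds are sorted, so once one bound reaches
        # the current best distance, no remaining candidate can be closer.
        if abs(dxa - table[i]) >= best_dist:
            break
        d = dist(x, codebook[i])
        computed += 1
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist, computed


# Hypothetical toy codebook and test vector.
codebook = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [9.0, 12.0]]
anchor = [0.0, 0.0]
table = build_anchor_table(codebook, anchor)
result = anchor_search([3.1, 4.1], codebook, anchor, table)
print(result)
```

A single anchor gives a loose bound in general; using several anchors and taking the largest of the resulting bounds is a common way to tighten it, at the cost of extra storage.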
Computer Vision and Pattern Recognition | 2008
Srikanth Cherla; Kaustubh Kulkarni; Amit A. Kale; V. Ramasubramanian
In this paper, we propose a fast method to recognize human actions which accounts for intra-class variability in the way an action is performed. We propose the use of a low dimensional feature vector which consists of (a) the projections of the width profile of the actor onto an “action basis” and (b) simple spatio-temporal features. The action basis is built using eigenanalysis of walking sequences of different people. Given the limited amount of training data, Dynamic Time Warping (DTW) is used to perform recognition. We propose the use of the average-template with multiple features, first used in speech recognition, to better capture the intra-class variations for each action. We demonstrate the efficacy of this algorithm using our low dimensional feature to robustly recognize human actions. Furthermore, we show that view-invariant recognition can be performed by using a simple data fusion of two orthogonal views. For the actions that are still confusable, a temporal discriminative weighting scheme is used to distinguish between them. The effectiveness of our method is demonstrated by conducting experiments on the multi-view IXMAS dataset of persons performing various actions.
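For readers unfamiliar with DTW-based template matching, here is a minimal sketch under simplifying assumptions: sequences are one-dimensional, the local cost is the absolute difference, and each class has a single stored template. It does not reproduce the paper's features, its average-template construction or its weighting scheme, and the labels and values are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] is the cheapest alignment cost of a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def classify(query, templates):
    """Return the label whose template has the smallest DTW distance to query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical templates: one short feature sequence per action label.
templates = {
    "walk": [0.0, 1.0, 0.0, 1.0, 0.0],
    "wave": [0.0, 2.0, 4.0, 2.0, 0.0],
}
# A slower version of the "wave" pattern still aligns with zero cost.
label = classify([0.0, 2.0, 2.0, 4.0, 2.0, 0.0], templates)
print(label)
```

Because the warping path may repeat samples, the same action performed at a different speed can still match its template closely, which is why DTW suits small training sets.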
Pattern Recognition | 2000
V. Ramasubramanian; Kuldip Kumar Paliwal
In this paper, we provide an overview of fast nearest-neighbor search algorithms based on an 'approximation–elimination' framework under a class of elimination rules, namely, partial distance elimination, hypercube elimination and absolute-error-inequality elimination derived from approximations of Euclidean distance. Previous algorithms based on these elimination rules are reviewed in the context of approximation–elimination search. The main emphasis in this paper is a comparative study of these elimination constraints with reference to their approximation–elimination efficiency set within different approximation schemes.
IEEE Transactions on Image Processing | 2000
Kuldip Kumar Paliwal; V. Ramasubramanian
Previously, a modified K-means algorithm for vector quantization design has been proposed where the codevector updating step is as follows: new codevector = current codevector + scale factor × (new centroid − current codevector). This algorithm uses a fixed value for the scale factor. In this paper, we propose the use of a variable scale factor which is a function of the iteration number. For the vector quantization of image data, we show that it offers faster convergence than the modified K-means algorithm with a fixed scale factor, without affecting the optimality of the codebook.
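The update rule quoted in this abstract can be sketched as follows. The abstract does not state the actual schedule for the scale factor, so the function `decaying_scale` below is a placeholder assumption chosen only because it starts above 1 and decays toward 1; the data, names and iteration count are likewise hypothetical.

```python
def nearest(x, codebook):
    """Index of the codevector closest to x in squared Euclidean distance."""
    dists = [sum((p - q) ** 2 for p, q in zip(x, c)) for c in codebook]
    return min(range(len(codebook)), key=dists.__getitem__)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def modified_kmeans(data, codebook, iterations, scale_at):
    """K-means with the scaled update: c_new = c + s * (centroid - c).

    scale_at(it) returns the scale factor s for iteration it (1-based).
    A constant 1.0 gives the standard K-means update.
    """
    codebook = [list(c) for c in codebook]
    for it in range(1, iterations + 1):
        clusters = [[] for _ in codebook]
        for x in data:
            clusters[nearest(x, codebook)].append(x)
        s = scale_at(it)
        for k, members in enumerate(clusters):
            if not members:
                continue  # leave a codevector with an empty cluster unchanged
            cen = centroid(members)
            codebook[k] = [c + s * (m - c) for c, m in zip(codebook[k], cen)]
    return codebook


def decaying_scale(it):
    """Placeholder schedule (an assumption, not taken from the paper)."""
    return 1.0 + 1.0 / (1.0 + it)


data = [[0.0], [1.0], [10.0], [11.0]]
init = [[0.0], [10.0]]
standard = modified_kmeans(data, init, 1, lambda it: 1.0)
variable = modified_kmeans(data, init, 20, decaying_scale)
print(standard, variable)
```

A scale factor above 1 overshoots the centroid, which can speed up early iterations; letting it decay toward 1 means later iterations behave like ordinary K-means.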
International Conference on Acoustics, Speech, and Signal Processing | 2002
A. K. V. SaiJayram; V. Ramasubramanian; Thippur V. Sreenivas
Automatic segmentation of speech is an important problem that is useful in speech recognition, synthesis and coding. In this paper, we explore the robust parameter set, weighting function and distance measure for reliable segmentation of noisy speech. It is found that the MFCC parameters, successful in speech recognition, hold the best promise for robust segmentation as well. We also explored a variety of symmetric and asymmetric weighting lifters, from which it is found that a symmetric lifter of the form 1 + A sin^{1/2}(πn/L), 0 ≤ n ≤ L − 1, for MFCC dimension L, is most effective. With regard to the distance measure, the direct L2 norm is found adequate.
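The lifter in this abstract can be evaluated directly, reading the exponent on the sine as 1/2 (a square root). The sketch below computes the weights and applies them to one coefficient vector; the values A = 4.0 and L = 12 and the function names are hypothetical, since the abstract does not give them.

```python
import math


def sqrt_sine_lifter(L, A):
    """Weights w[n] = 1 + A * sin(pi * n / L) ** 0.5 for n = 0 .. L-1."""
    return [1.0 + A * math.sqrt(math.sin(math.pi * n / L)) for n in range(L)]


def apply_lifter(mfcc, weights):
    """Weight one MFCC vector element by element."""
    return [c * w for c, w in zip(mfcc, weights)]


def l2_distance(u, v):
    """Direct L2 norm between two liftered vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


weights = sqrt_sine_lifter(12, 4.0)  # hypothetical L and A
frame_a = apply_lifter([1.0] * 12, weights)
frame_b = apply_lifter([0.0] * 12, weights)
print(weights[0], weights[6], l2_distance(frame_a, frame_b))
```

The weights start at 1 for n = 0 and peak at 1 + A in the middle of the range, so the mid-order coefficients are emphasized most.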
International Conference on Acoustics, Speech, and Signal Processing | 2005
S.A. Santosh Kumar; V. Ramasubramanian
Recently, we established the equivalence of an ergodic HMM (EHMM) to a parallel sub-word recognition (PSWR) framework for language identification (LID). The states of the EHMM correspond to acoustic units of a language and its state transitions represent the bigram language model of unit sequences. We consider two alternatives to represent the state-observation densities of the EHMM, namely, the Gaussian mixture model (GMM) and the hidden Markov model (HMM). We present a segmental K-means algorithm for the training of both these types of EHMM (EHMM of GMM and EHMM of HMM) and compare their performance on a 6-language LID task in the OGI-TS database. EHMM of GMM has a performance comparable to PSWR and superior to EHMM of HMM; we provide reasons for the performance difference between EHMM(G) and EHMM(H), and identify ways of enhancing the performance of EHMM(H), which is a novel and powerful architecture, ideal for spoken language modeling.
International Conference on Acoustics, Speech, and Signal Processing | 2003
A.K.V.S. Jayram; V. Ramasubramanian; Thippur V. Sreenivas
Parallel sub-word recognition (PSWR) is a new model that has been proposed for language identification (LID) which does not need elaborate phonetic labeling of the speech data in a foreign language. The new approach performs a front-end tokenization in terms of sub-word units which are designed by automatic segmentation, segment clustering and segment HMM modeling. We develop PSWR-based LID in a framework similar to the parallel phone recognition (PPR) approach in the literature. This includes a front-end tokenizer and a back-end language model for each language to be identified. Considering various combinations of the statistical evaluation scores, it is found that PSWR can perform as well as PPR, even with broad acoustic sub-word tokenization, thus making it an efficient alternative to the PPR system.
International Conference on Acoustics, Speech, and Signal Processing | 2011
V. Ramasubramanian; R. Karthik; S. Thiyagarajan; Srikanth Cherla
We address the problem of audio analytics with respect to efficient modeling of audio classes and continuous decoding of the audio stream to automatically segment and label it, as required in audio indexing. We propose the use of left-to-right HMMs and ergodic HMMs to model definite and indefinite duration audio classes, respectively, and Viterbi decoding using these HMMs with non-emitting states for continuous decoding of audio streams. We quantify the decoding performance using detection and false-alarm rates and show that the proposed HMM-based modeling and Viterbi decoding can have high decoding accuracies with average (%Hit, %False-alarm) of (79.2%, 1.6%), which are significantly better than VQ, GMM and template-based decoding, indicating the viability of the proposed modeling and decoding technique for practical surveillance audio analytics.