Ruchir Travadi
University of Southern California
Publication
Featured research published by Ruchir Travadi.
IEEE Transactions on Audio, Speech, and Language Processing | 2015
Maarten Van Segbroeck; Ruchir Travadi; Shrikanth Narayanan
A critical challenge to automatic language identification (LID) is achieving accurate performance rapidly with the shortest possible speech segment. The accuracy of spoken language identification is highly sensitive to the duration of speech and is bounded by the amount of information available. The proposed approach for rapid language identification transforms the utterances to a low-dimensional i-vector representation upon which language classification methods are applied. In order to meet the challenges involved in rapidly making reliable decisions about the spoken language, a highly accurate and computationally efficient framework of i-vector extraction is proposed. The LID framework integrates the approach of universal background model (UBM) fused total variability modeling. UBM-fused modeling yields the estimation of a more discriminant, single i-vector space. This way, it is also a computationally more efficient alternative to system-level fusion. A further reduction in equal error rate is achieved by training the i-vector model on long-duration speech utterances and by deploying a robust feature extraction scheme that aims to capture the relevant language cues under various acoustic conditions. Evaluation results on the DARPA RATS data corpus suggest the potential of performing successful automated language identification at the level of one second of speech or even shorter duration.
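The abstract above reports results in terms of equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of one common way to estimate it from detection scores. The function name `equal_error_rate` and the threshold sweep are illustrative assumptions, not the evaluation code used in the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where false acceptance and false rejection
    rates are closest, and reported as their average.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores, for illustration only.
targets = [0.9, 0.8, 0.4, 0.7]
nontargets = [0.1, 0.5, 0.3, 0.2]
eer = equal_error_rate(targets, nontargets)
```

With finite score sets the two error rates may never be exactly equal, which is why the sketch averages them at the closest point. Evaluation toolkits often interpolate instead.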
Conference of the International Speech Communication Association | 2016
Ruchir Travadi; Shrikanth Narayanan
In this paper, we address the problem of parameter estimation for the Total Variability Model (TVM) [1]. Typically, the estimation of the Total Variability Matrix requires several iterations of the Expectation Maximization (EM) algorithm [2], and can be computationally demanding. As a result, fast and efficient parameter estimation remains a key challenge facing the model. We show that it is possible to reduce the Maximum Likelihood parameter estimation problem for TVM to a Singular Value Decomposition (SVD) problem by making some suitably justified approximations in the likelihood function. By using randomized algorithms for efficient computation of the SVD, it becomes possible to accelerate the parameter estimation task considerably. In addition, we show that this method is able to increase the efficiency of the i-vector extraction procedure, and also lends some interpretability to the extracted i-vectors.
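The abstract above mentions randomized algorithms for computing the SVD. The following is a minimal sketch of a generic randomized range-finder SVD in NumPy, shown only to illustrate the kind of algorithm referred to. It is not the paper's estimator: the input matrix `A`, the function name `randomized_svd`, and the parameters `oversample` and `n_iter` are assumptions made for illustration.

```python
import numpy as np


def randomized_svd(A, rank, oversample=10, n_iter=2, seed=0):
    """Approximate the top-`rank` SVD of A using a random projection."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m, n)

    # Sample the range of A with a Gaussian test matrix.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))

    # A few power iterations sharpen the captured subspace when the
    # singular values decay slowly; QR keeps the basis well conditioned.
    for _ in range(n_iter):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Project A onto the small subspace and take an exact SVD there.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]


# Synthetic low-rank matrix, for illustration only.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 80))
U, s, Vt = randomized_svd(A, rank=5)
```

The expensive exact SVD is applied only to the small projected matrix `B`, which is where the speed-up over a full decomposition comes from when `rank` is much smaller than the dimensions of `A`.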
Conference of the International Speech Communication Association | 2016
Brandon M. Booth; Rahul Gupta; Pavlos Papadopoulos; Ruchir Travadi; Shrikanth Narayanan
Sincerity is important in everyday human communication and perception of genuineness can greatly affect emotions and outcomes in social interactions. In this paper, submitted for the INTERSPEECH 2016 Sincerity Challenge, we examine a corpus of six different types of apologetic utterances from a variety of English speakers articulated in different prosodic styles, and we rate the sincerity of each remark. Since the utterances and semantic meaning in the examined database are controlled, we focus on tone of voice by exploring a plethora of acoustic and paralinguistic features not present in the baseline model and how well they contribute to human assessment of sincerity. We show that these additional features improve the performance using the baseline model, and furthermore that conditioning learning models on the prosody of utterances boosts the prediction accuracy. Our best system outperforms the challenge baseline and in principle can generalize well to other corpora.
Computer Speech & Language | 2019
Ruchir Travadi; Shrikanth Narayanan
A number of audio signal processing applications characterize different properties of the source underlying an audio signal by analyzing the distribution of a sequence of feature vectors obtained from the signal. The Total Variability Model has been widely used for this purpose as a mechanism for capturing the variability in the feature vector distribution across different signals within a low dimensional representation. In order to arrive at a compact representation, a number of assumptions are made within the model regarding the properties of this distribution. In this paper, we first present an analysis of a parameter estimation method for the model which offers a computationally efficient alternative to the widely used Expectation Maximization (EM) algorithm, but relies on the validity of the model assumptions, using experiments on speaker and language identification tasks. To explain some of the results obtained using this method, we present an extensive statistical analysis aimed at verifying the validity of some of the model assumptions. We show that many of these model assumptions are not valid for the observed data, and propose model generalizations to replace these assumptions. The proposed generalizations lead to a better performance while also opening up possibilities for discriminative training of the model.
Conference of the International Speech Communication Association | 2014
Maarten Van Segbroeck; Ruchir Travadi; Colin Vaz; Jangwon Kim; Matthew P. Black; Alexandros Potamianos; Shrikanth Narayanan
Conference of the International Speech Communication Association | 2014
Ruchir Travadi; Maarten Van Segbroeck; Shrikanth Narayanan
Conference of the International Speech Communication Association | 2014
Maarten Van Segbroeck; Ruchir Travadi; Shrikanth Narayanan
Conference of the International Speech Communication Association | 2017
Ruchir Travadi; Shrikanth Narayanan
Conference of the International Speech Communication Association | 2017
Pavlos Papadopoulos; Ruchir Travadi; Shrikanth Narayanan
International Conference on Acoustics, Speech, and Signal Processing | 2018
Manoj Kumar; Pavlos Papadopoulos; Ruchir Travadi; Daniel Bone; Shrikanth Narayanan