Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ritwik Giri is active.

Publication


Featured research published by Ritwik Giri.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning

Ritwik Giri; Michael L. Seltzer; Jasha Droppo; Dong Yu

In this paper, we propose two approaches to improve deep neural network (DNN) acoustic models for speech recognition in reverberant environments. Both methods utilize auxiliary information in training the DNN but differ in the type of information and the manner in which it is used. The first method uses parallel training data for multi-task learning, in which the network is trained to perform both a primary senone classification task and a secondary feature enhancement task using a shared representation. The second method uses a parameterization of the reverberant environment extracted from the observed signal to train a room-aware DNN. Experiments were performed on the single microphone task of the REVERB Challenge corpus. The proposed approach obtained a word error rate of 7.8% on the SimData test set, which is lower than all reported systems using the same training data and evaluation conditions, and 27.5% on the mismatched RealData test set, which is lower than all but two systems.
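The multi-task method above trains one network on a primary senone classification task and a secondary feature enhancement task over a shared representation. A minimal NumPy sketch of such a combined objective (the interpolation weight `alpha` and the toy batch shapes are illustrative assumptions, not values from the paper):

```python
import numpy as np

def multitask_loss(senone_logits, senone_labels, enh_pred, clean_feats, alpha=0.3):
    """Combined objective: cross-entropy on the primary senone task plus an
    alpha-weighted MSE on the secondary feature-enhancement task.
    alpha is a hypothetical interpolation weight."""
    # Numerically stable log-softmax for the classification loss
    shifted = senone_logits - senone_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(senone_labels)), senone_labels].mean()
    # Regression loss for the enhancement task (predict clean features)
    mse = np.mean((enh_pred - clean_feats) ** 2)
    return ce + alpha * mse

# Toy batch: 4 frames, 10 senone classes, 5-dimensional features
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
labels = np.array([1, 3, 0, 7])
loss = multitask_loss(logits, labels, rng.normal(size=(4, 5)), rng.normal(size=(4, 5)))
```

Both tasks share the representation through the network's hidden layers; only the loss combination is sketched here.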


IEEE Transactions on Signal Processing | 2016

Type I and Type II Bayesian Methods for Sparse Signal Recovery Using Scale Mixtures

Ritwik Giri; Bhaskar D. Rao

In this paper, we propose a generalized scale mixture family of distributions, namely the Power Exponential Scale Mixture (PESM) family, to model the sparsity-inducing priors currently in use for sparse signal recovery (SSR). We show that successful and popular methods such as LASSO, Reweighted ℓ1 and Reweighted ℓ2 can be formulated in a unified manner in a maximum a posteriori (MAP) or Type I Bayesian framework using an appropriate member of the PESM family as the sparsity-inducing prior. In addition, exploiting the natural hierarchical framework induced by the PESM family, we utilize these priors in a Type II framework and develop the corresponding EM-based estimation algorithms. Some insight into the differences between Type I and Type II methods is provided, and of particular interest in the algorithmic development is the Type II variant of the popular and successful reweighted ℓ1 method. Extensive empirical results are provided, and they show that the Type II methods exhibit better support recovery than the corresponding Type I methods.
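Of the Type I methods named above, the reweighted ℓ2 idea is the simplest to sketch: each iteration solves a weighted minimum-norm problem with weights taken from the previous estimate (a FOCUSS-style recursion; the problem sizes, weight floor `eps`, and iteration count below are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def reweighted_l2(A, y, iters=50, eps=1e-8):
    """FOCUSS-style reweighted l2 recursion for sparse recovery:
        x_{k+1} = W_k A^T (A W_k A^T)^{-1} y,  W_k = diag(|x_k| + eps).
    The small floor eps keeps the weight matrix invertible; every iterate
    satisfies A x = y exactly, and the weights concentrate on a sparse support."""
    x = A.T @ np.linalg.solve(A @ A.T, y)   # minimum-norm initialization
    for _ in range(iters):
        w = np.abs(x) + eps
        AW = A * w                          # A @ diag(w) via broadcasting
        x = w * (A.T @ np.linalg.solve(AW @ A.T, y))
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 20))               # underdetermined dictionary
x_true = np.zeros(20); x_true[[3, 12]] = [1.5, -2.0]
y = A @ x_true
x_hat = reweighted_l2(A, y)
```

Each step is a closed-form weighted least-norm solve, which is why the reweighted ℓ2 family is attractive when an ℓ1 solver is unavailable.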


IEEE Signal Processing Letters | 2017

Reweighted Algorithms for Independent Vector Analysis

Ritwik Giri; Bhaskar D. Rao; Harinath Garudadri

In this letter, we consider the problem of joint blind source separation of multiple datasets simultaneously using an independent vector analysis (IVA) framework. In particular, we propose a new paradigm of reweighted algorithms for IVA by employing a source prior from a multivariate generalized scale mixture distribution family. In addition, our proposed reweighted algorithms can also exploit second-order statistical information across datasets by learning the intrasource correlation matrix of each source component vector (SCV) along with higher-order statistics. Experimental results are provided to show the efficacy of our proposed algorithms in achieving reliable source separation in both cases: when correlation is present within an SCV, and when the sources are uncorrelated, i.e., there are no second-order dependencies across datasets.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Block sparse excitation based all-pole modeling of speech

Ritwik Giri; Bhaskar D. Rao

In this paper, it is shown that an appropriate model for voiced speech is an all-pole filter excited by a block sparse excitation sequence. The modeling approach is generalized in a novel manner to deal with a wide spectrum of speech signals: voiced speech, unvoiced speech, and mixed-excitation speech. In this context, the input sequence to the all-pole model is modeled as a suitably weighted linear combination of a block sparse signal and white noise. We develop the corresponding estimation procedure to reconstruct the generalized input sequence and model parameters via sparse Bayesian learning methods employing an Expectation-Maximization based procedure. Rigorous experiments have been performed to show the efficacy of our proposed model for the speech modeling task. By imposing a block sparse structure on the input sequence, the problems associated with the commonly used Linear Prediction approach are alleviated, leading to a more robust modeling scheme.


International Conference on Image Processing | 2016

Robust Bayesian method for simultaneous block sparse signal recovery with applications to face recognition

Igor Fedorov; Ritwik Giri; Bhaskar D. Rao; Truong Q. Nguyen

In this paper, we present a novel Bayesian approach to recover simultaneously block-sparse signals in the presence of outliers. The key advantage of our proposed method is the ability to handle non-stationary outliers, i.e., outliers with time-varying support. We validate our approach with empirical results showing the superiority of the proposed method over competing approaches in synthetic data experiments as well as the multiple-measurement face recognition problem.


Intelligent Data Engineering and Automated Learning | 2014

User Behavior Modeling in a Cellular Network Using Latent Dirichlet Allocation

Ritwik Giri; Heesook Choi; Kevin Soo Hoo; Bhaskar D. Rao

Insights into the behavior and preference of mobile device users from their web browsing/application activities are critical components of any successful dynamic content recommendation system, mobile advertisement platform, or web personalization initiative. In this paper we use an unsupervised topic model to understand the interests of the cellular users based upon their browsing profile. We posit that the length of time a user remains on a given website is positively correlated with the user’s interest in the website’s content. We propose an extended model to integrate this duration information efficiently by oversampling the URLs.
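The duration-weighted oversampling described above can be sketched by repeating each URL token in proportion to dwell time before handing the corpus to any standard LDA implementation (the 30-second unit and the session format are illustrative assumptions, not details from the paper):

```python
def duration_weighted_tokens(session, unit=30.0):
    """Turn a browsing session [(url, seconds), ...] into a bag of tokens
    where each URL is repeated once per `unit` seconds of dwell time, so
    longer visits carry proportionally more weight in the topic model."""
    tokens = []
    for url, seconds in session:
        repeats = max(1, round(seconds / unit))  # every visit counts at least once
        tokens.extend([url] * repeats)
    return tokens

# Hypothetical session: a long news visit and a brief mail check
session = [("news.example.com", 95), ("mail.example.com", 10)]
bag = duration_weighted_tokens(session)
```

The resulting bags of tokens plug directly into any LDA library's document-term pipeline, with no change to the model itself.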


International Conference on Acoustics, Speech, and Signal Processing | 2016

Dynamic relative impulse response estimation using structured sparse Bayesian learning

Ritwik Giri; Bhaskar D. Rao; Fred Mustiere; Tao Zhang

In this paper, we present a novel hierarchical Bayesian approach to estimate the Relative Impulse Response (ReIR) using short, noisy, and reverberant microphone recordings. The information contained in ReIRs between two microphones is useful for a wide range of multichannel speech processing applications such as speaker localization and speech enhancement. Several previous works have shown that the Relative Transfer Function (RTF) corresponding to a given ReIR is dynamic and depends on the environment, microphone positions, and target position. This is the main motivation of this work: we develop a structured sparse Bayesian learning algorithm to estimate the ReIR from very short recordings that is robust to changes in the environment. An extensive experimental study with real-world recordings has also been conducted to show the efficacy of our proposed approach over other competing approaches.


IEEE Computational Intelligence Magazine | 2016

Learning Distributional Parameters for Adaptive Bayesian Sparse Signal Recovery

Ritwik Giri; Bhaskar D. Rao

Power Exponential Scale Mixture (PESM), a generalized scale mixture family of distributions, has recently been proposed to model the sparsity-inducing prior distributions currently in use for Sparse Signal Recovery (SSR). In this paper, we review this generalized scale mixture family and establish the necessary and sufficient condition for a distribution (symmetric with respect to the origin) to have a PESM representation, which generalizes the results previously known for the Gaussian Scale Mixture (GSM) family. On the algorithmic front, we propose an adaptive Bayesian Sparse Signal Recovery (B-SSR) framework by learning the distributional parameters of a Generalized t-distribution (GT), which belongs to the PESM family. For specific choices of the distributional parameters of GT, our proposed framework corresponds to popular sparse recovery algorithms such as LASSO, Reweighted ℓ1 norm minimization, Reweighted ℓ2 norm minimization, etc. The tail behavior of the GT distribution family is extensively studied in this paper, and an adaptive algorithm is proposed in which the tail behavior of the prior is adapted over iterations based on the observations. Extensive experimental results based on the traditional SSR setup are also presented to show the efficacy of this adaptive approach.


Asilomar Conference on Signals, Systems and Computers | 2014

Bootstrapped sparse Bayesian learning for sparse signal recovery

Ritwik Giri; Bhaskar D. Rao

In this article, we study the sparse signal recovery problem in a Bayesian framework using a novel Bootstrapped Sparse Bayesian Learning method. The Sparse Bayesian Learning (SBL) framework is an effective tool for pruning out irrelevant features and arriving at a sparse representation. In SBL, the choice of prior over the variances of the Gaussian scale mixture has been an interesting area of research for some time. This motivates us to use a more generalized maximum entropy density as the prior, which results in a new variant of SBL. It has been shown empirically to perform better than traditional SBL, and it also accelerates the pruning procedure. Because of this advantage, this variant of SBL can be claimed to be the more robust choice, as it is less sensitive to the pruning threshold. Theoretical justifications are also provided to show that the proposed model indeed promotes sparse point estimates.
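For context, the baseline SBL recursion that such variants build on alternates a Gaussian posterior computation with an EM update of the per-coefficient prior variances (this is the standard SBL EM iteration, not the bootstrapped variant; the noise level, problem sizes, and iteration count are illustrative assumptions):

```python
import numpy as np

def sbl_em(A, y, sigma2=1e-4, iters=100):
    """Standard sparse Bayesian learning via EM with a Gaussian scale
    mixture prior x_i ~ N(0, gamma_i).
    E-step: Gaussian posterior of x given y and the current gammas.
    M-step: gamma_i <- E[x_i^2] = mu_i^2 + Sigma_ii.
    Variances of irrelevant coefficients are driven toward zero (pruning)."""
    m, n = A.shape
    gamma = np.ones(n)
    for _ in range(iters):
        Sigma_y = sigma2 * np.eye(m) + (A * gamma) @ A.T       # marginal cov of y
        SyA = np.linalg.solve(Sigma_y, A)                      # Sigma_y^{-1} A
        mu = gamma * (A.T @ np.linalg.solve(Sigma_y, y))       # posterior mean
        diag_Sigma = gamma - gamma**2 * np.einsum('ij,ij->j', A, SyA)
        gamma = mu**2 + diag_Sigma
    return mu, gamma

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 20))
x_true = np.zeros(20); x_true[[2, 9]] = [1.0, -1.0]
y = A @ x_true
mu, gamma = sbl_em(A, y)
```

The pruning behavior shows up in `gamma`: components not needed to explain `y` shrink toward zero, and the sensitivity of this shrinkage to the pruning threshold is exactly what the bootstrapped variant targets.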


Signal Processing | 2018

A unified framework for sparse non-negative least squares using multiplicative updates and the non-negative matrix factorization problem

Igor Fedorov; Alican Nalci; Ritwik Giri; Bhaskar D. Rao; Truong Q. Nguyen; Harinath Garudadri

We study the sparse non-negative least squares (S-NNLS) problem. S-NNLS occurs naturally in a wide variety of applications where an unknown, non-negative quantity must be recovered from linear measurements. We present a unified framework for S-NNLS based on a rectified power exponential scale mixture prior on the sparse codes. We show that the proposed framework encompasses a large class of S-NNLS algorithms and provide a computationally efficient inference procedure based on multiplicative update rules. Such update rules are convenient for solving large sets of S-NNLS problems simultaneously, which is required in contexts like sparse non-negative matrix factorization (S-NMF). We provide theoretical justification for the proposed approach by showing that the local minima of the objective function being optimized are sparse and the S-NNLS algorithms presented are guaranteed to converge to a set of stationary points of the objective function. We then extend our framework to S-NMF, showing that our framework leads to many well known S-NMF algorithms under specific choices of prior and providing a guarantee that a popular subclass of the proposed algorithms converges to a set of stationary points of the objective function. Finally, we study the performance of the proposed approaches on synthetic and real-world data.
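The multiplicative update rules mentioned above have a particularly simple well-known member: the Lee-Seung style update for the Frobenius objective with an ℓ1 penalty, which preserves non-negativity because every factor in the update is non-negative. A minimal sketch of that generic member (the penalty weight `lam` and problem sizes are illustrative assumptions, and this is not claimed to be the paper's exact rule):

```python
import numpy as np

def snnls_multiplicative(A, y, lam=0.1, iters=200):
    """Sparse NNLS via multiplicative updates for
        0.5 * ||y - A x||^2 + lam * sum(x),  subject to x >= 0:
        x <- x * (A^T y) / (A^T A x + lam).
    Starting from a positive x, every iterate stays non-negative, since the
    update only multiplies by ratios of non-negative quantities."""
    n = A.shape[1]
    x = np.full(n, 0.5)                    # positive initialization
    AtY = A.T @ y
    AtA = A.T @ A
    for _ in range(iters):
        x = x * AtY / (AtA @ x + lam + 1e-12)  # small constant guards div-by-zero
    return x

rng = np.random.default_rng(0)
A = rng.uniform(size=(15, 8))              # non-negative dictionary
x_true = np.zeros(8); x_true[[1, 5]] = [1.0, 2.0]
y = A @ x_true
x_hat = snnls_multiplicative(A, y)
```

Because the update is elementwise, a whole batch of S-NNLS problems (e.g., all columns in an S-NMF factor update) can be processed at once with matrix-shaped `x` and `y`, which is the convenience the abstract points to.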

Collaboration


Dive into Ritwik Giri's collaborations.

Top Co-Authors

Bhaskar D. Rao

University of California

Igor Fedorov

University of California

Alican Nalci

University of California