B C Haris
Indian Institute of Technology Guwahati
Publication
Featured research published by B C Haris.
international conference on acoustics, speech, and signal processing | 2012
B C Haris; Rohit Sinha
In this work, a speaker verification (SV) method is proposed that employs the sparse representation of GMM mean shifted supervectors over learned and discriminatively learned dictionaries. This work is motivated by recently proposed speaker verification methods employing sparse representation classification (SRC) over exemplar dictionaries created from either GMM mean shifted supervectors or i-vectors. On the NIST 2003 SRE dataset, the proposed approach with a discriminatively learned dictionary yields an equal error rate of 1.53%, which is found to be better than those of similar-complexity SV systems developed using the i-vector based approach and the exemplar based SRC approaches with session/channel variability compensation.
national conference on communications | 2011
B C Haris; Gayadhar Pradhan; A. Misra; Sumitra Shukla; Rohit Sinha; S. R. M. Prasanna
In this paper, we present our initial study with a recently collected speech database for developing robust speaker recognition systems in the Indian context. The database contains speech data collected across different sensors, languages, speaking styles, and environments, from 200 speakers. The speech data is collected across five different sensors in parallel, in English and multiple Indian languages, in reading and conversational speaking styles, and in office and uncontrolled environments such as laboratories, hostel rooms and corridors. The collected database is evaluated using an adapted Gaussian mixture model based speaker verification system following the NIST 2003 speaker recognition evaluation protocol, and it gives performance comparable to that obtained using NIST data sets. Our initial study of the impact of mismatch between training and test conditions finds that mismatch in sensor, speaking style, or environment results in significant degradation in performance compared to the matched case, whereas for the language mismatch case the degradation is found to be relatively smaller.
national conference on communications | 2014
Subhadeep Dey; Sujit Barman; Ramesh K. Bhukya; Rohan Kumar Das; B C Haris; S. R. M. Prasanna; Rohit Sinha
In this paper we present the development and implementation of a speech biometric based attendance system. The users access the system by making a call from a few pre-decided mobile phones. An interactive voice response (IVR) system guides a new user through enrollment and an enrolled user through verification. The system uses text-independent speaker verification with MFCC features and i-vector based speaker modeling for authenticating the user. Linear discriminant analysis and within-class covariance normalization are used for normalizing the effects of session/environment variations. A simple cosine distance scoring along with score normalization is used as the classifier, and a fixed threshold is used for making the decision. The developed system has been used by a group of 110 students for about two months on a regular basis. The system performance in terms of recognition rate is found to be 94.2%, and the average response time of the system for test data of 50 seconds duration is noted to be 26 seconds.
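The scoring step described in this abstract (cosine scoring followed by a fixed-threshold decision) can be sketched roughly as below. This is only an illustration under stated assumptions: the function names `cosine_score` and `verify` and the threshold value 0.5 are hypothetical and not taken from the paper, and the i-vector extraction, LDA/WCCN compensation and score normalization stages are omitted.

```python
import math


def cosine_score(enrolled, test):
    """Cosine similarity between two speaker vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(enrolled, test))
    norm_enrolled = math.sqrt(sum(a * a for a in enrolled))
    norm_test = math.sqrt(sum(b * b for b in test))
    if norm_enrolled == 0.0 or norm_test == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / (norm_enrolled * norm_test)


def verify(enrolled, test, threshold=0.5):
    """Accept the claimed identity when the score reaches a fixed threshold.

    The default threshold is an arbitrary placeholder, not the paper's value.
    """
    return cosine_score(enrolled, test) >= threshold
```

In a deployed system the threshold would normally be tuned on held-out trials rather than fixed by hand as it is here.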
national conference on communications | 2012
B C Haris; Rohit Sinha
In this work, we explore the use of sparse representation of GMM mean shifted supervectors over a learned dictionary for the speaker verification (SV) task. In this method, the dictionaries are learned using the KSVD algorithm, unlike the recently proposed SV methods that employ sparse representation classification (SRC) over exemplar dictionaries. The proposed approach with a learned dictionary results in an equal error rate of 1.56% on the NIST 2003 SRE dataset, which is found to be better than those of the state-of-the-art i-vector based approach and the exemplar based SRC approaches using either GMM mean shifted supervectors or i-vectors, with appropriate session/channel variability compensation techniques applied.
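To make "sparse representation over a dictionary" concrete, here is a minimal sketch of one greedy sparse coding scheme, plain matching pursuit. It is not the method of the paper: the abstract does not say which sparse coder is used, and the KSVD dictionary learning and the scoring stage are not shown. The names `normalize` and `matching_pursuit` are hypothetical, and the atoms are assumed to have unit norm.

```python
import math


def normalize(vector):
    """Scale a vector (list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding of `signal` over unit-norm `atoms`.

    Returns (coefficients, residual). At most `n_nonzero` greedy steps are
    taken; each step picks the atom most correlated with the residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlation of the current residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[best]) < 1e-12:
            break  # nothing left to explain
        coeffs[best] += corr[best]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual
```

With an exemplar dictionary the atoms would be training supervectors themselves, whereas a learned dictionary replaces them with atoms fitted to the training data.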
ieee india conference | 2013
Om Prakash Singh; B C Haris; Rohit Sinha
In recent times, sparse representation classification (SRC) has received a lot of attention in many signal processing domains, including language identification (LID). Traditionally, in SRC the dictionary is designed to be overcomplete. In SRC based LID systems that use GMM mean supervectors as the language representation, the resulting dictionary is undercomplete due to lack of data. In contrast, when lower dimensional i-vectors are used, an overcomplete dictionary can be achieved. In this work we examine the concern over whether sparse coding can succeed with an undercomplete dictionary. The experimental studies done on the NIST LRE 2007 dataset show that the performance with the undercomplete dictionary turns out to be better than that with the overcomplete dictionary, both with and without channel compensation.
IEEE Transactions on Information Forensics and Security | 2015
B C Haris; Rohit Sinha
This paper presents a novel paradigm for speaker verification (SV) exploiting sparse representation (SR) over a learned dictionary. The proposed approach is intended to overcome the shortcomings of existing SV systems based on SR over an exemplar dictionary. In this paper, the supervectors created by concatenating the mean vectors of adapted Gaussian mixture models are used as speaker representations. Both simple and discriminative methods are explored for learning the dictionary in the supervector domain. The learned dictionary-based approach is further extended to enable the compensation of the session/channel variability by using a joint sparse coding over speaker and channel dictionaries. The proposed systems are evaluated on the NIST 2012 SRE data set and are contrasted with the state-of-the-art i-vector probabilistic linear discriminant analysis-based SV system. The proposed system is found to possess the following attributes: 1) a significantly higher performance for very low false-alarm rates, which makes the system attractive for high-security applications; 2) a higher robustness to the short duration test data condition; 3) a competitive robustness to additive noise in test data; and 4) a much lower computational complexity. Even when compared with the fastest i-vector computation methods reported in the literature, the complexity of the proposed system is found to be comparable. With these features, the proposed approach seems to be a promising candidate for practical voice biometric applications.
Speech Communication | 2015
B C Haris; Rohit Sinha
A low complexity speaker verification system is presented. Sparse random projections and decimation are used for dimensionality reduction. A multi-offset decimation diversity based speaker verification system is proposed. This work explores the use of a few low-complexity data-independent projections for reducing the dimensionality of GMM supervectors in the context of speaker verification (SV). The projections derived using a sparse random matrix and decimation are explored and are used as speaker representations. The reported study is done on the NIST 2012 SRE task using a state-of-the-art PLDA based SV system. Interestingly, the systems incorporating the proposed projections result in performances competitive with that of the commonly used i-vector representation based one. Both the sparse random matrix and the decimation based approaches are noted to have very low computational requirements in finding the speaker representations. A novel SV system that exploits the diversity among the representations obtained by using different offsets in the decimation of the supervector is also proposed. The resulting system is found to achieve a relative improvement of 7% in terms of both detection cost and equal error rate over the default i-vector based system while still having lower overall complexity.
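The two data-independent projections named in this abstract can be illustrated with a short sketch. Everything below is an assumption made for illustration: the function names, the decimation factor, the density of the random matrix and its unscaled ±1 entries are not taken from the paper, whose exact construction is not given in the abstract.

```python
import random


def decimate(supervector, factor, offset=0):
    """Keep every `factor`-th element of the supervector, starting at `offset`."""
    return supervector[offset::factor]


def multi_offset_views(supervector, factor):
    """One decimated representation per offset 0 .. factor-1."""
    return [decimate(supervector, factor, offset) for offset in range(factor)]


def sparse_projection_rows(n_out, n_in, density, seed=0):
    """Rows of a sparse random sign matrix, stored as (column, sign) pairs.

    Each entry is non-zero with probability `density`; scaling is omitted.
    """
    rng = random.Random(seed)
    return [
        [(j, rng.choice((-1.0, 1.0))) for j in range(n_in) if rng.random() < density]
        for _ in range(n_out)
    ]


def project(vector, rows):
    """Apply the sparse projection to a vector."""
    return [sum(sign * vector[j] for j, sign in row) for row in rows]
```

Decimating with every offset partitions the supervector into disjoint views, which is one plausible reading of the diversity that a multi-offset system could combine.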
international conference on acoustics, speech, and signal processing | 2013
B C Haris; Gayadhar Pradhan; Rohit Sinha; S. R. M. Prasanna
In this paper, we describe the speaker verification (SV) systems developed by the Indian Institute of Technology Guwahati (IITG) for the NIST 2012 speaker recognition evaluation. The primary submission consists of five gender-dependent SV systems combined at score level. Among the five systems, two are based on sparse representation over learned and exemplar dictionaries, and the remaining three are based on the generic i-vector and its variants obtained by vowel and non-vowel conditioning. The exemplar dictionary based system in particular exploits the new evaluation rule allowing the knowledge of all targets in each detection trial. The performance of the system is presented for the NIST SRE 2012 core task.
international conference on signal processing | 2012
B C Haris; Rohit Sinha
The total variability i-vector based speaker verification system is one of the most successful systems in the recent NIST evaluations. It achieves significant improvement in performance over the conventional GMM-UBM based systems by using the projections of the GMM mean shifted supervectors to a low dimensional space for representation. These low dimensional projections are commonly referred to as the total variability i-vector features. In our recent works we have explored the use of sparse representation of the GMM mean shifted supervectors, derived using a learned redundant dictionary, as a feature for speaker verification. This approach resulted in a performance comparable to that of a similar-complexity i-vector based system. In this work, we explore a fusion of these two approaches in which the GMM mean supervectors are smoothed using the total variability space prior to creating the dictionary for sparse representation. The proposed method is found to give a relative improvement of 19% in EER compared to that of the i-vector based system for the experiments done using the NIST 2003 SRE database.
ieee india conference | 2013
Om Prakash Singh; B C Haris; Rohit Sinha; Bhusan Chettri; Abhishek Pradhan
This work explores the use of a prosodic feature based sparse representation classification (SRC) system for the language identification (LID) task. The prosodic features are computed by extracting syllable-like units with the help of a vowel onset point detection algorithm and are mapped to the i-vector domain for SRC using an exemplar dictionary. This work is motivated by a recently reported LID approach using low-dimensional i-vectors. The experiments are performed on a locally collected dataset consisting of five Indian languages. On comparing the SRC system with a contrast system based on cosine distance scoring (CDS), it is noted that the former performs significantly better than the latter. The performance of the best system with session/channel compensation in terms of equal error rate and minimum detection cost function turns out to be 7.46% and 0.1338, respectively.
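Several of the abstracts above report an equal error rate (EER), the operating point at which the false acceptance and false rejection rates coincide. A minimal sketch of how it can be estimated from trial scores is shown below. The function name `equal_error_rate` and the threshold sweep over observed scores are illustrative choices, not the NIST scoring tools or the procedure used in these papers.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The returned
    value is the mean of the false acceptance and false rejection rates at
    the threshold where the two are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for threshold in thresholds:
        # Fraction of genuine trials wrongly rejected.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # Fraction of impostor trials wrongly accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With few trials the two error rates rarely meet exactly, which is why the sketch reports their average at the closest point.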