Nirmalya Sen
Indian Institute of Technology Kharagpur
Publication
Featured research published by Nirmalya Sen.
ieee india conference | 2009
Nirmalya Sen; Hemant A. Patil; T. K. Basu
This paper proposes a new method of feature extraction for robust text-independent speaker identification. The focus of this work is on applications which yield higher identification accuracy without increasing the computational effort. The impetus for this new feature extraction technique comes from a new transformation, which we have proposed from a speaker identification perspective. A complete experimental evaluation was conducted on a database of 61 speakers with a Gaussian mixture speaker model. The new feature extraction technique has been compared with the mel-frequency cepstral coefficient (MFCC) feature. Evaluation results show that the new feature provides better identification accuracy than the MFCC feature. The discrimination capability of the feature sets has been evaluated statistically, using the F-ratio and the J-measure. Experimental results show that the new feature set is much more discriminative than the MFCC feature set.
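The abstract above evaluates discrimination capability with the F-ratio. As a minimal sketch, assuming the common textbook definition (variance of the speaker means divided by the average within-speaker variance, computed per feature dimension), it could be calculated as follows. The function name `f_ratio` and the toy data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def f_ratio(features, labels):
    """Per-dimension F-ratio, assuming the definition
    (variance of speaker means) / (mean within-speaker variance)."""
    features = np.asarray(features, dtype=float)   # shape (frames, dims)
    labels = np.asarray(labels)                    # one speaker label per frame
    speakers = np.unique(labels)
    # Mean and variance of each dimension, computed separately per speaker.
    means = np.stack([features[labels == s].mean(axis=0) for s in speakers])
    within = np.stack([features[labels == s].var(axis=0) for s in speakers])
    return means.var(axis=0) / within.mean(axis=0)

# Toy data: dimension 0 separates the two speakers, dimension 1 does not.
toy = [[0, 0], [2, 1], [10, 0], [12, 1]]
print(f_ratio(toy, [0, 0, 1, 1]))
```

A larger F-ratio in a dimension indicates that speakers are further apart relative to their own spread in that dimension.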
ieee india conference | 2009
Nirmalya Sen; T. K. Basu
This paper introduces a new Nyquist window. The proposed window has been compared with the Gaussian window. The time-bandwidth product of the proposed window is very close to the time-bandwidth product of the Gaussian window.
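The comparison above rests on the time-bandwidth product. The sketch below estimates it numerically as RMS duration times RMS bandwidth (in rad/s) and applies it to a Gaussian window, which is known to reach the lower bound of 1/2 under this definition. The proposed Nyquist window is not defined in the abstract, so it is not reproduced here; the function name and the sampling parameters are assumptions made for illustration.

```python
import numpy as np

def time_bandwidth_product(w, dt):
    """RMS duration times RMS bandwidth (rad/s) of a sampled window.
    Assumes the window has decayed to about zero inside the sampled span."""
    w = np.asarray(w, dtype=float)
    n = len(w)
    t = (np.arange(n) - (n - 1) / 2) * dt
    # Normalised energy density in time.
    p_t = w ** 2 / np.sum(w ** 2)
    t_mean = np.sum(t * p_t)
    sigma_t = np.sqrt(np.sum((t - t_mean) ** 2 * p_t))
    # Normalised energy density in angular frequency.
    spectrum = np.abs(np.fft.fft(w)) ** 2
    omega = 2 * np.pi * np.fft.fftfreq(n, d=dt)
    p_w = spectrum / np.sum(spectrum)
    w_mean = np.sum(omega * p_w)
    sigma_w = np.sqrt(np.sum((omega - w_mean) ** 2 * p_w))
    return sigma_t * sigma_w

# Gaussian window with an assumed standard deviation of 0.5 s.
n, dt, sigma = 1024, 0.01, 0.5
t = (np.arange(n) - (n - 1) / 2) * dt
gaussian = np.exp(-t ** 2 / (2 * sigma ** 2))
print(time_bandwidth_product(gaussian, dt))  # close to 0.5
```

Any other window sampled on the same grid can be passed to the same function to see how far it sits from the Gaussian value.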
international conference on industrial and information systems | 2010
Nirmalya Sen; T. K. Basu; Hemant A. Patil
This paper introduces a new method of feature extraction for robust text-independent speaker identification. The focus of this work is on applications which yield higher identification accuracy without increasing the computational effort. The impetus for this new feature extraction technique comes from a new transformation based on the Nyquist filter bank, which we have proposed from a speaker identification perspective. The new feature extraction technique has been compared with the Mel-frequency cepstral coefficient (MFCC) feature both theoretically and practically. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. On clean speech, the proposed feature set has 11.5% higher average accuracy than the MFCC feature set. On noisy speech, the proposed feature set also performs significantly better than the MFCC feature set.
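The evaluation above uses a Gaussian mixture speaker model. As a rough sketch of the identification step only, assuming diagonal-covariance GMMs that have already been trained, a test utterance can be assigned to the speaker whose model gives the highest average frame log-likelihood. The model parameters, speaker names, and function names below are hypothetical and serve only to illustrate the scoring rule.

```python
import numpy as np

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    X = np.asarray(X, dtype=float)                  # (frames, dims)
    weights = np.asarray(weights, dtype=float)      # (mixtures,)
    means = np.asarray(means, dtype=float)          # (mixtures, dims)
    variances = np.asarray(variances, dtype=float)  # (mixtures, dims)
    diff = X[:, None, :] - means[None, :, :]
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))
    # Log-sum-exp over mixtures, then average over frames.
    return np.logaddexp.reduce(log_comp, axis=1).mean()

def identify(X, speaker_models):
    """Return the name of the speaker model with the highest score."""
    scores = {name: gmm_avg_loglik(X, *params)
              for name, params in speaker_models.items()}
    return max(scores, key=scores.get)

# Hypothetical one-dimensional, single-mixture models for two speakers.
models = {
    "speaker_a": ([1.0], [[0.0]], [[1.0]]),
    "speaker_b": ([1.0], [[5.0]], [[1.0]]),
}
print(identify([[4.8], [5.1]], models))
```

In practice the feature vectors would be MFCC or the proposed features, and the models would have many mixtures per speaker.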
ieee students technology symposium | 2011
Nirmalya Sen; T. K. Basu
This paper demonstrates the use of two new methods of feature extraction, called temporal energy of subband cepstral coefficient (TESBCC) and temporal correlation of subband cepstral coefficient (TCSBCC), for text-independent speaker identification. The focus of this work is on applications which yield higher identification accuracy without increasing the computational effort. The impetus for these new feature extraction techniques comes from a new transformation based on the Nyquist filter bank, which we have proposed from a speaker identification perspective. TESBCC and TCSBCC have been compared with the Mel-frequency cepstral coefficient (MFCC) feature both theoretically and practically. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. The TESBCC feature set has 7.88% higher average accuracy than the MFCC feature set. Similarly, the TCSBCC feature set has 5.23% higher average accuracy than the MFCC feature set.
BioID'11 Proceedings of the COST 2101 European conference on Biometrics and ID management | 2011
Nirmalya Sen; T. K. Basu
This paper compares the feature sets extracted using the frequency-time analysis approach and the time-frequency analysis approach for text-independent speaker identification. The impetus for the frequency-time analysis approach comes from the band-pass filtering view of the STFT. Both a Nyquist filter bank and a Gaussian filter bank have been used for extracting features using the frequency-time analysis approach. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. Results reveal that the feature sets extracted using the frequency-time analysis approach perform significantly better than the feature set extracted using the time-frequency analysis approach.
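The band-pass filtering view of the STFT mentioned above can be checked numerically. For a fixed analysis frequency, the STFT channel equals the signal convolved with the window modulated to that frequency, followed by demodulation. The sketch below compares that filter-bank form with the direct windowed sum. The signal, the Hann window, and the function names are illustrative assumptions, not the Nyquist or Gaussian filter banks used in the paper.

```python
import numpy as np

def stft_channel_direct(x, w, omega):
    """Direct sum: X(n) = sum_m x[m] * w[n - m] * exp(-j * omega * m)."""
    N, L = len(x), len(w)
    out = np.zeros(N, dtype=complex)
    for n in range(N):
        for m in range(max(0, n - L + 1), n + 1):
            out[n] += x[m] * w[n - m] * np.exp(-1j * omega * m)
    return out

def stft_channel_bandpass(x, w, omega):
    """Filter-bank form: band-pass filter with w[n] * exp(j * omega * n),
    then demodulate by exp(-j * omega * n)."""
    h = w * np.exp(1j * omega * np.arange(len(w)))  # band-pass impulse response
    y = np.convolve(x, h)[:len(x)]                  # causal filter output
    return np.exp(-1j * omega * np.arange(len(x))) * y

x = np.sin(0.3 * np.arange(64))   # assumed test signal
w = np.hanning(16)                # assumed analysis window
print(np.allclose(stft_channel_direct(x, w, 0.3),
                  stft_channel_bandpass(x, w, 0.3)))
```

Seen this way, each frequency channel is a band-pass filtered time signal, which is the starting point for processing along time within each subband.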
international conference on mining intelligence and knowledge exploration | 2013
Nirmalya Sen; Hemant A. Patil; Shyamal Kr. Das Mandal; K. Sreenivasa Rao
This paper compares the performance of a GMM-UBM classifier and an SVM classifier with GMM supervectors as the linear kernel for text-independent speaker verification. The MFCC feature set has been used for this comparison. Experimental evaluation was conducted on the POLYCOST database. The importance of utterance partitioning for training speech has been discussed. Results reveal that, without utterance partitioning, the accuracy of the SVM classifier with GMM supervectors is poor for short test segments. With proper utterance partitioning of the training speech, the SVM classifier with GMM supervectors performs significantly better than the GMM-UBM baseline. The detailed derivation of the GMM supervector has also been discussed.
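The abstract above does not reproduce the supervector derivation, so the following is only a sketch under common assumptions: the UBM means are MAP-adapted to an utterance with a relevance factor, and each adapted mean is scaled by the square root of its mixture weight and the inverse standard deviation before stacking, so that a plain dot product between two supervectors acts as a linear kernel. The relevance factor of 16, the diagonal covariances, and the function name are assumptions and are not taken from the paper.

```python
import numpy as np

def map_adapted_supervector(X, weights, means, variances, relevance=16.0):
    """Stacked, normalised MAP-adapted means of a diagonal-covariance UBM."""
    X = np.asarray(X, dtype=float)                  # (frames, dims)
    weights = np.asarray(weights, dtype=float)      # (mixtures,)
    means = np.asarray(means, dtype=float)          # (mixtures, dims)
    variances = np.asarray(variances, dtype=float)  # (mixtures, dims)
    # Posterior probability of each mixture for each frame.
    diff = X[:, None, :] - means[None, :, :]
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))
    log_norm = np.logaddexp.reduce(log_comp, axis=1, keepdims=True)
    post = np.exp(log_comp - log_norm)
    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)
    first = post.T @ X
    # Mean-only MAP adaptation with a relevance factor.
    alpha = (n / (n + relevance))[:, None]
    data_mean = first / np.maximum(n, 1e-10)[:, None]
    adapted = alpha * data_mean + (1.0 - alpha) * means
    # Scale so that a dot product of supervectors is a linear kernel.
    return (np.sqrt(weights)[:, None] * adapted / np.sqrt(variances)).ravel()

# Toy UBM with two one-dimensional mixtures and 16 identical frames.
sv = map_adapted_supervector(np.full((16, 1), 6.0),
                             [0.5, 0.5], [[-5.0], [5.0]], [[1.0], [1.0]])
print(sv)
```

Utterance partitioning, as discussed in the abstract, would amount to computing one such supervector per segment of the training speech instead of one per speaker.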
ICFCE | 2012
Nirmalya Sen; T. K. Basu
This paper compares the performance of a Gaussian Mixture Model (GMM) classifier and a polynomial classifier for text-independent speaker identification. The MFCC feature set has been used for this comparison. Experimental evaluation was conducted on the POLYCOST database with 130 speakers. The importance of the prior in the polynomial classifier has been discussed in detail. Results reveal that the identification accuracy of the polynomial classifier strongly depends on the choice of prior. With proper prior selection, the polynomial classifier can perform better than the GMM classifier.
national conference on communications | 2011
Nirmalya Sen; T. K. Basu; Sandipan Chakroborty
This paper compares the feature sets extracted using the time-frequency analysis approach and the frequency-time analysis approach for text-independent speaker identification. The Mel-frequency cepstral coefficient (MFCC) feature set and the Inverted Mel-frequency cepstral coefficient (IMFCC) feature set are extracted using the time-frequency analysis approach. The temporal energy subband cepstral coefficient (TESBCC) feature set is extracted using the frequency-time analysis approach. The time-bandwidth products of the MFCC filter bank and the TESBCC filter bank have been compared. The RV coefficient has been used to calculate the correlation between the feature sets. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. The TESBCC feature set has 9.5% higher average accuracy than the MFCC feature set. It is found that the feature set extracted using the time-frequency analysis approach is practically uncorrelated with the feature set extracted using the frequency-time analysis approach. It is also demonstrated that the IMFCC feature set plays an important role in fusion.
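The RV coefficient used above measures the correlation between two feature sets as a whole. A minimal sketch, assuming the standard definition on column-centred matrices with one row per frame and the same frames in both sets, is shown below. The function name and the toy matrices are illustrative assumptions.

```python
import numpy as np

def rv_coefficient(X, Y):
    """RV coefficient between two feature matrices with matching rows.
    Uses tr(XX'YY') / sqrt(tr((XX')^2) * tr((YY')^2)) on centred data."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    sxy = X.T @ Y   # cross-products between the two feature sets
    sxx = X.T @ X
    syy = Y.T @ Y
    # Each trace equals a squared Frobenius norm of these small matrices.
    return np.sum(sxy ** 2) / np.sqrt(np.sum(sxx ** 2) * np.sum(syy ** 2))

a = np.array([[1.0], [-1.0], [1.0], [-1.0]])
b = np.array([[1.0], [1.0], [-1.0], [-1.0]])
print(rv_coefficient(a, 2 * a))  # identical up to scale
print(rv_coefficient(a, b))      # orthogonal columns
```

The value lies between 0 and 1, and the two feature sets may have different dimensions as long as the rows are aligned.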
international conference on industrial and information systems | 2010
Nirmalya Sen; Rahul Gupta; Sandipan Chakroborty
This paper introduces an index which can be used as a measure of correlation between the features of a multidimensional feature vector. The proposed index has been explained intuitively using the constant-distance loci of the Mahalanobis metric. A more rigorous explanation has been given using the orthogonal decomposition of the vector space. The application of the proposed index has been illustrated using 2D synthetic data.
international conference on devices and communications | 2011
Nirmalya Sen; T. K. Basu; Sandipan Chakroborty
This paper demonstrates the relation between the flatness index of the eigenvalues of the covariance matrix of the feature vectors and the correlation between the features in a multidimensional feature space. The constant-distance loci of the Mahalanobis metric have been used to interpret the relation. An intuitive interpretation of the flatness index and correlation has been given using 2D synthetic data. The usefulness of the log function in reducing the correlation between features has been shown using synthetic data and real feature vectors extracted from speech data for text-independent speaker identification using the MFCC, LFCC and IMFCC feature sets.
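The abstract above does not state how the flatness index is defined. The sketch below assumes the usual spectral-flatness form, the geometric mean of the covariance eigenvalues divided by their arithmetic mean, which is 1 when all eigenvalues are equal and approaches 0 as features with similar variances become strongly correlated. The function name and the 2D synthetic points are illustrative assumptions.

```python
import numpy as np

def eigenvalue_flatness(X):
    """Assumed flatness index: geometric mean over arithmetic mean of the
    eigenvalues of the sample covariance matrix of X (rows are frames)."""
    cov = np.cov(np.asarray(X, dtype=float), rowvar=False)
    eig = np.linalg.eigvalsh(cov)
    eig = np.clip(eig, 1e-12, None)   # guard against log(0)
    return np.exp(np.mean(np.log(eig))) / np.mean(eig)

# Uncorrelated 2D points with equal variances: circular distance loci.
uncorrelated = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
# Strongly correlated 2D points: elongated elliptical distance loci.
correlated = [[1, 1], [-1, -1], [2, 2.1], [-2, -2.1]]
print(eigenvalue_flatness(uncorrelated))
print(eigenvalue_flatness(correlated))
```

The two cases correspond to circular and elongated constant-distance loci of the Mahalanobis metric, which is the geometric picture the abstract refers to.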
Collaboration
Dhirubhai Ambani Institute of Information and Communication Technology