Mohaddeseh Nosratighods
University of New South Wales
Publication
Featured research published by Mohaddeseh Nosratighods.
international conference on acoustics, speech, and signal processing | 2009
Haizhou Li; Bin Ma; Kong-Aik Lee; Hanwu Sun; Donglai Zhu; Khe Chai Sim; Changhuai You; Rong Tong; Ismo Kärkkäinen; Chien-Lin Huang; Vladimir Pervouchine; Wu Guo; Yijie Li; Li-Rong Dai; Mohaddeseh Nosratighods; Thiruvaran Tharmarajah; Julien Epps; Eliathamby Ambikairajah; Eng Siong Chng; Tanja Schultz; Qin Jin
This paper describes the performance of the I4U speaker recognition system in the NIST 2008 Speaker Recognition Evaluation. The system consists of seven subsystems, each with different cepstral features and classifiers. We describe the I4U Primary system and report on its core test results as they were submitted, which were among the best-performing submissions. The I4U effort was led by the Institute for Infocomm Research, Singapore (IIR), with contributions from the University of Science and Technology of China (USTC), the University of New South Wales, Australia (UNSW), Nanyang Technological University, Singapore (NTU) and Carnegie Mellon University, USA (CMU).
Speech Communication | 2010
Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps; Michael J. Carey
The performance of speaker verification systems degrades considerably when the test segments are utterances of very short duration. This might be due either to variations in score-matching arising from the unobserved speech sounds of short speech utterances, or to the fact that the shorter the utterance, the greater the effect of individual speech sounds on the average likelihood score. In other words, in very short utterances the effects of individual speech sounds have not been cancelled out by a large number of speech sounds. This paper presents a score-based segment selection technique for discarding portions of speech that result in poor discrimination ability in a speaker verification task. Theory is developed to detect the most significant and reliable speech segments based on the probability that the test segment comes from a fixed set of cohort models. This approach, suitable for any duration of test utterance, reduces the effect of acoustic regions of the speech that are not accurately modelled due to sparse training data, and makes a decision based only on the segments that provide the best-matched scores from the segment selection algorithm. The proposed segment selection technique provides reductions in relative error rate of 22% and 7% in terms of minimum Detection Cost Function (DCF) and Equal Error Rate (EER), compared with a baseline using segment-based normalization, when evaluated on the short utterances of the NIST 2002 Speaker Recognition Evaluation dataset.
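The core selection idea can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: frame-level scores are ranked and only the best-matched fraction contributes to the utterance-level score, so poorly modelled acoustic regions are discarded. The `keep_fraction` parameter and the synthetic scores are assumptions for illustration.

```python
# Illustrative sketch of score-based segment selection (not the paper's code).
# Keep only the frames whose log-likelihood-ratio scores are best matched,
# then average those to form the utterance-level verification score.

def select_and_score(frame_scores, keep_fraction=0.5):
    """Average only the top `keep_fraction` of frame scores."""
    if not frame_scores:
        raise ValueError("no frame scores given")
    ranked = sorted(frame_scores, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return sum(ranked[:n_keep]) / n_keep

# Example: poorly modelled frames (large negative scores) are dropped, so
# they no longer dominate the average score of a short utterance.
scores = [1.2, 0.8, -3.5, 1.0, -4.1, 0.9]
print(select_and_score(scores, keep_fraction=0.5))  # mean of [1.2, 1.0, 0.9]
```

The effect is strongest for short utterances, where a few badly matched frames would otherwise carry large weight in the average.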
international conference on pattern recognition | 2006
Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps
Dynamic cepstral features such as delta and delta-delta cepstra have been shown to play an essential role in capturing the transitional characteristics of the speech signal. In this paper, a set of new dynamic features for speaker verification systems is introduced. These new features, known as delta cepstral energy (DCE) and delta-delta cepstral energy (DDCE), can compactly represent the information in the delta and delta-delta cepstra. Furthermore, it is shown theoretically that DCE carries the same information as the delta cepstrum using an entropy criterion. Experimental speaker verification results on the TIMIT database support the theoretical result, showing a significant improvement in terms of equal error rate compared with conventional feature extraction methods using delta and delta-delta cepstra.
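A hedged sketch of the quantities involved: the standard regression formula for delta cepstra, followed by one plausible reading of DCE as the per-frame energy (squared norm) of the delta cepstral vector. The exact DCE definition is in the paper; this sketch and its toy input are illustrative assumptions.

```python
# Delta cepstra via the standard regression formula, plus a per-frame
# "delta cepstral energy" taken here as the squared norm of the delta
# vector -- one plausible reading of DCE, not the paper's definition.

def delta(frames, K=2):
    """Regression-based deltas; `frames` is a list of equal-length lists."""
    T = len(frames)
    denom = 2 * sum(k * k for k in range(1, K + 1))
    out = []
    for t in range(T):
        d = [0.0] * len(frames[0])
        for k in range(1, K + 1):
            prev = frames[max(t - k, 0)]          # clamp at utterance edges
            nxt = frames[min(t + k, T - 1)]
            for i in range(len(d)):
                d[i] += k * (nxt[i] - prev[i]) / denom
        out.append(d)
    return out

def delta_cepstral_energy(frames, K=2):
    """One scalar per frame: squared norm of the delta cepstral vector."""
    return [sum(x * x for x in d) for d in delta(frames, K)]

cepstra = [[0.0, 1.0], [0.5, 1.1], [1.0, 0.9], [1.5, 1.2]]
print(delta_cepstral_energy(cepstra))
```

Collapsing the delta vector to a single energy value per frame is what makes the representation compact: one coefficient stands in for the whole delta cepstrum.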
international conference on acoustics, speech, and signal processing | 2009
Mohaddeseh Nosratighods; Tharmarajah Thiruvaran; Julien Epps; Eliathamby Ambikairajah; Bin Ma; Haizhou Li
In this paper, the fusion of two speaker recognition subsystems, one based on Frequency Modulation (FM) and another on MFCC features, is reported. The motivation for their fusion was to improve recognition accuracy across different types of channel variation, since the two features are believed to contain complementary information. It was found that the MFCC-based subsystem outperformed the FM-based subsystem on telephone conversations from the NIST SRE-06 dataset, while the opposite was true for NIST SRE-08 telephone data. On the NIST 2008 core condition, the FM-based subsystem performed as well as the MFCC-based subsystem, and their fusion gave up to a 23% relative improvement in terms of EER over the MFCC subsystem alone.
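Score-level fusion of two subsystems can be sketched as a weighted sum of per-trial scores. The equal weights and synthetic score lists below are assumptions for illustration; in practice the fusion weights are tuned on development data.

```python
# Minimal sketch of linear score-level fusion of two subsystems
# (e.g. an MFCC-based and an FM-based recogniser). Equal weighting
# is an assumption; real systems tune `w` on development data.

def fuse(scores_a, scores_b, w=0.5):
    """Weighted sum of per-trial scores from two subsystems."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must align trial-by-trial")
    return [w * a + (1.0 - w) * b for a, b in zip(scores_a, scores_b)]

mfcc_scores = [0.9, -0.2, 1.4]   # hypothetical per-trial scores
fm_scores = [0.5, 0.1, 1.0]
print(fuse(mfcc_scores, fm_scores))
```

Fusion helps exactly when the two feature streams make different errors, which is the complementarity the paper investigates across SRE-06 and SRE-08 channel conditions.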
international conference on acoustics, speech, and signal processing | 2011
Jia Min Karen Kua; Julien Epps; Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Eric H. C. Choi
Recent results seem to cast some doubt over the assumption that improvements in fused recognition accuracy for speaker recognition systems based on different acoustic features are due mainly to the different origins of the features (e.g. magnitude, phase, modulation information). In this study, we utilize clustering comparison measures to investigate acoustic and speaker modelling aspects of the speaker recognition task separately and demonstrate that front-end diversity can be achieved purely through different ‘partitioning’ of the acoustic space. Further, features that exhibit good ‘stability’ with respect to repeated clustering are shown to also give good EER performance in speaker recognition. This has implications for feature choice, fusion of systems employing different features, and for UBM data selection. A method for the latter problem is presented that gives up to an 11% relative reduction in EER using only 20–30% of the usual UBM training data set.
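The idea of comparing two 'partitionings' of the acoustic space can be illustrated with a simple pair-counting agreement measure (the Rand index). The paper's actual clustering comparison measures may differ; this sketch only shows how partition similarity between two front-ends could be quantified.

```python
# Sketch: compare two clusterings of the same frames with the Rand index,
# i.e. the fraction of point pairs on which the two partitions agree.
# Illustrative only; the paper's comparison measures may differ.
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs grouped consistently by both clusterings."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Two clusterings of six frames, identical up to label renaming -> 1.0
a = [0, 0, 1, 1, 2, 2]
b = [1, 1, 0, 0, 2, 2]
print(rand_index(a, b))  # 1.0
```

A front-end whose partition stays stable under repeated clustering (index near 1 across runs) would, on the paper's evidence, be expected to give good EER performance.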
international conference on acoustics, speech, and signal processing | 2007
Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps; Michael J. Carey
This paper presents a segment selection technique for discarding portions of speech that result in poor discrimination ability in speaker verification tasks. Theory supporting the significance of a frame selection procedure for test segments, prior to making decisions, is also developed. This approach has the ability to reduce the effect of the acoustic regions of speech that are not accurately represented due to a lack of training data. Compared with a baseline system using both CMS and variance normalization, the proposed segment selection technique brings a 24% relative reduction in error rate over the entire testing data of the NIST 2002 dataset in terms of minimum DCF. For short test segments, i.e. less than 15 seconds, the proposed frame-dropping technique produces a significant relative error rate reduction of 23% in terms of minimum DCF.
international conference on signal processing | 2007
Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps; Michael J. Carey
This paper presents a method for re-weighting the frame-based scores of a speaker recognition system according to the discrimination level of the best-matched Gaussian mixture for that frame. This approach focuses on particular feature space regions that either have been modelled accurately or contain the phonemes which are inherently most discriminative. The performance of individual Gaussian mixtures in terms of equal error rate (EER) and minimum detection cost function (DCF) on training, development and testing datasets consistently suggests that some Gaussian mixtures are inherently more discriminative regardless of their occurrence in training data. Therefore, it is possible to enhance the performance of speaker verification systems by re-weighting the frames that are mainly produced by those discriminative Gaussian mixtures. Compared with the baseline, results show relative improvements of 5.82% and 5.46% on male speakers from the NIST 2002 dataset, in terms of EER and minimum DCF, respectively.
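The re-weighting step can be sketched as below. The per-component weights (which the paper would estimate offline from each mixture's discrimination performance) and the alignment of frames to their best-matched components are assumed interfaces, not the authors' code.

```python
# Hedged sketch of re-weighting frame scores by the discrimination level
# of each frame's best-matched Gaussian component. `mixture_weights` is an
# assumed interface: one weight per component, estimated offline.

def reweight_scores(frame_scores, best_components, mixture_weights):
    """Weight each frame score by its top component's discrimination
    weight, then return the weighted-average utterance score."""
    weighted = [mixture_weights[c] * s
                for s, c in zip(frame_scores, best_components)]
    total_w = sum(mixture_weights[c] for c in best_components)
    return sum(weighted) / total_w

# Three frames aligned to components 0/1/2; component 1 is assumed to be
# the most discriminative, so its frame counts more in the final score.
scores = [1.0, 2.0, 0.5]
best = [0, 1, 2]
weights = {0: 1.0, 1: 2.0, 2: 0.5}
print(reweight_scores(scores, best, weights))
```

Frames aligned to poorly discriminating components are thus down-weighted rather than discarded outright, in contrast to the hard segment selection of the authors' other work.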
Odyssey | 2010
Jia Min Karen Kua; Tharmarajah Thiruvaran; Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps
conference of the international speech communication association | 2010
Ville Hautamäki; Tomi Kinnunen; Mohaddeseh Nosratighods; Kong-Aik Lee; Bin Ma; Haizhou Li
Electronics Letters | 2009
Tharmarajah Thiruvaran; Mohaddeseh Nosratighods; Eliathamby Ambikairajah; Julien Epps