P. Sivakumaran
University of Hertfordshire
Publication
Featured research published by P. Sivakumaran.
IEEE Transactions on Audio, Speech, and Language Processing | 2007
Amit S. Malegaonkar; Aladdin M. Ariyaeeinia; P. Sivakumaran
A new approach to speaker change detection is proposed and investigated. The method, which is based on a probabilistic framework, provides an effective means of tackling the problem posed by phonetic variation in high-resolution speaker change detection. Additionally, the approach incorporates the capability for dealing with undesired effects of variations in speech characteristics. Using experimental investigations conducted with clean and broadcast news audio, it is shown that the proposed method is significantly more effective than the currently popular techniques for speaker change detection. To enhance the computational efficiency of the proposed method, modified implementation algorithms are introduced which are based on the exploitation of redundant operations and a fast scoring procedure. It is shown that, through the use of the proposed fast algorithm, the computational efficiency of the approach can be increased by over 77% without significant reduction in its accuracy. The paper discusses the principles and characteristics of the proposed speaker change detection method, and provides a detailed description of its efficient implementation. The experiments, investigating the performance of the proposed method and its effectiveness in relation to other approaches, are described and an analysis of the results is presented.
IEEE Signal Processing Letters | 2006
Amit S. Malegaonkar; Aladdin M. Ariyaeeinia; P. Sivakumaran; J. Fortuna
This letter presents an investigation into the use of a probabilistic pattern matching approach for detecting speaker changes in audio streams. The experiments are conducted using clean speech as well as broadcast news material. It is shown that, in the proposed approach, the use of bilateral scoring is considerably more effective than unilateral scoring. Appropriate score normalization methods are considered in the study. It is observed that, in all cases, the bilateral scoring approach outperforms the currently popular Bayesian information criterion (BIC) method for speaker change detection. This letter discusses the principles of the proposed approach and details the experimental investigations.
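For orientation, the BIC baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the widely used delta-BIC test, restricted to one-dimensional features with a single Gaussian per segment; it is not the bilateral scoring method proposed in the letter. The function name `delta_bic_1d`, the `penalty_weight` parameter, and the variance floor are assumptions made for this sketch.

```python
import math
from statistics import pvariance


def delta_bic_1d(window, split, penalty_weight=1.0):
    """Delta-BIC for a candidate change point in a 1-D feature sequence.

    Compares one Gaussian over the whole window against two Gaussians,
    one on each side of `split`. A positive value favours a change.
    """
    left, right = window[:split], window[split:]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("each side of the split needs at least two samples")

    floor = 1e-12  # assumed guard against log(0) on constant segments
    n, n1, n2 = len(window), len(left), len(right)
    var_all = max(pvariance(window), floor)
    var_left = max(pvariance(left), floor)
    var_right = max(pvariance(right), floor)

    # Log-likelihood gain of the two-model hypothesis (maximum-likelihood variances).
    gain = 0.5 * (n * math.log(var_all)
                  - n1 * math.log(var_left)
                  - n2 * math.log(var_right))
    # A 1-D Gaussian has two free parameters (mean and variance).
    penalty = 0.5 * penalty_weight * 2 * math.log(n)
    return gain - penalty
```

A window whose two halves come from clearly different distributions gives a positive value, while a homogeneous window gives a negative one because only the penalty term remains.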
Speech Communication | 2003
P. Sivakumaran; Aladdin M. Ariyaeeinia; Martin J. Loomes
Original article can be found at: http://www.sciencedirect.com/science/journal/01676393 --Copyright Elsevier B.V.
International Carnahan Conference on Security Technology | 2008
Amit S. Malegaonkar; Aladdin M. Ariyaeeinia; P. Sivakumaran; J. Fortuna
This paper presents investigations into an effective bilateral scoring method in open-set speaker identification. The approach is based on the fact that two different speakers usually are not reciprocal. A difficulty in deploying bilateral scoring is that test utterances are normally much shorter than training utterances. To tackle this problem, the proposed approach provides the final identification score based on a weighted combination of independently normalised forward and reverse scores. Based on the experimental results obtained using clean and telephone quality speech, it is shown that the proposed approach is more effective than the conventional scoring methods in open-set speaker identification.
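The weighted combination described in the abstract above can be sketched as follows. This is a minimal illustration, assuming that each direction is normalised with its own precomputed mean and standard deviation; the abstract does not specify the normalisation or how the weight is chosen, so `normalise`, `bilateral_score`, the statistics tuples, and the default `weight` are all hypothetical.

```python
def normalise(score, mean, std):
    """Shift and scale a raw score using precomputed statistics."""
    if std <= 0:
        raise ValueError("std must be positive")
    return (score - mean) / std


def bilateral_score(forward, reverse, forward_stats, reverse_stats, weight=0.5):
    """Combine independently normalised forward and reverse scores.

    forward: score of the test utterance against the speaker model.
    reverse: score of the training data against a model of the test utterance.
    forward_stats, reverse_stats: (mean, std) pairs, one per direction (assumed).
    weight: share given to the forward score, between 0 and 1 (assumed).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fwd = normalise(forward, *forward_stats)
    rev = normalise(reverse, *reverse_stats)
    return weight * fwd + (1.0 - weight) * rev
```

Normalising each direction separately before mixing keeps the two scores on a comparable scale even when the test utterance is much shorter than the training material.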
International Conference on Acoustics, Speech, and Signal Processing | 2000
P. Sivakumaran; Aladdin M. Ariyaeeinia
This paper focuses on the spectral representation of the sub-band cepstrum in relation to that of the full-band cepstrum. Through theoretical analysis it is shown that the net spectral information content of the cepstral coefficients with the same index in different sub-bands is only comparable to that of a full-band cepstral parameter whose quefrency is given by the product of that specific index with the number of sub-bands. A new method is proposed to tackle this deficiency of the sub-band cepstrum when it is used in the context of text-dependent speaker verification. The experimental investigations have clearly demonstrated the effectiveness of this method in speaker verification.
IET Biometrics | 2012
Surosh G. Pillay; Aladdin M. Ariyaeeinia; P. Sivakumaran; Mark Ipswich Pawlewski
This paper presents a new approach to condition-adjusted T-norm (CT-Norm) for speaker verification under significantly mismatched noise conditions. The study is motivated by the fact that, though the standard CT-Norm method offers enhanced accuracy under mismatched data conditions, its effectiveness reduces with the increased severity of such conditions. The proposed approach attempts to address this challenge by providing a more effective reduction of data mismatch through the incorporation of multi-signal-to-noise ratio (SNR) universal background models (UBMs). The effectiveness of the proposed approach is demonstrated through experiments based on examples of real-world noise. It is shown that the superiority of the approach over CT-Norm is particularly significant for the severe levels of test data degradation considered in the study, such as 5 dB SNR and below. The paper provides a description of the characteristics of the proposed approach and details the experimental analysis of its effectiveness under different noise conditions.
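As background for the abstract above, plain test normalisation (T-norm) can be sketched as follows. This shows only the standard T-norm step, in which the score against the claimed speaker model is normalised by the scores the same test utterance obtains against a set of cohort models. The condition adjustment of CT-Norm and the multi-SNR UBMs of the proposed approach are not reproduced here, and the function name `t_norm` is an assumption for this sketch.

```python
from statistics import mean, stdev


def t_norm(target_score, cohort_scores):
    """Normalise a verification score with cohort-model statistics.

    target_score: score of the test utterance against the claimed model.
    cohort_scores: scores of the same utterance against cohort models.
    """
    if len(cohort_scores) < 2:
        raise ValueError("at least two cohort scores are required")
    mu = mean(cohort_scores)
    sigma = stdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero spread")
    return (target_score - mu) / sigma
```

Because the cohort statistics are computed from the test utterance itself, the normalised score partly compensates for utterance-level effects such as noise, which is the property that condition-adjusted variants build on.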
Human Factors in Computing Systems | 2000
Andi Bateman; Jill Hewitt; Aladdin M. Ariyaeeinia; P. Sivakumaran; Andrew Lambourne
This paper reports on ongoing work on the creation of live television subtitles by speaking them. It describes an editing interface that has been developed for rapidly correcting errors produced by the speech recogniser.
International Carnahan Conference on Security Technology | 1995
Aladdin M. Ariyaeeinia; P. Sivakumaran
The effectiveness, for text-dependent speaker verification, of orthogonal instantaneous and transitional feature parameters of speech is investigated. Instantaneous spectral features are represented by cepstral coefficients obtained through a linear prediction analysis of speech. Transitional spectral information is characterised using differential cepstral coefficients. Sets of orthogonal parameters are obtained by applying an eigenvector analysis to instantaneous and transitional feature coefficients. The experimental work is based on the use of a subset of the BT Millar speech database, consisting of repetitions of isolated digit utterances 1 to 9 and zero spoken by twenty male speakers. The investigation includes an examination of the relative speaker discrimination abilities of the above two types of orthogonal feature parameters. It is shown experimentally that the equal error rate in verification can be reduced significantly by forming a spectral distance based on a combination of orthogonal instantaneous and transitional feature parameters. It is further demonstrated that, when the input utterance consists of a sequence of five digits, an equal error rate of less than 0.5% can be achieved.
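The transitional features mentioned in the abstract above can be illustrated with a short sketch. It uses the common regression-based delta formula over a symmetric window with edge frames repeated at the boundaries; the abstract does not state which differential formula the paper uses, so this choice, the function name `delta_coefficients`, and the `half_window` parameter are assumptions.

```python
def delta_coefficients(frames, half_window=2):
    """Regression-based delta (differential) coefficients.

    frames: list of per-frame cepstral vectors (lists of floats).
    half_window: number of frames on each side used in the regression.
    Frames beyond either end are replaced by the nearest edge frame.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas = []
    for t in range(n):
        row = []
        for i in range(len(frames[t])):
            acc = 0.0
            for k in range(1, half_window + 1):
                ahead = frames[min(t + k, n - 1)][i]
                behind = frames[max(t - k, 0)][i]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas
```

For a coefficient that rises by one unit per frame, the interior delta values equal 1.0, and for a constant coefficient they are 0.0, which matches the interpretation of the delta as a local slope.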
Conference of the International Speech Communication Association | 2001
P. Sivakumaran; J. Fortuna; Aladdin M. Ariyaeeinia
Conference of the International Speech Communication Association | 1997
Aladdin M. Ariyaeeinia; P. Sivakumaran