
Publication


Featured research published by P. Dhanalakshmi.


Expert Systems With Applications | 2009

Classification of audio signals using SVM and RBFNN

P. Dhanalakshmi; S. Palanivel; Vennila Ramalingam

In the age of digital information, audio data has become an important part of many modern computer applications, and audio classification has become a focus of research in audio processing and pattern recognition. Automatic audio classification is useful for audio indexing, content-based audio retrieval and online audio distribution, but it is challenging to extract the most common and salient themes from unstructured raw audio data. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. Support vector machines are applied to classify audio clips into their respective classes by learning from training data. The proposed method then extends to a radial basis function neural network (RBFNN), which applies a nonlinear transformation followed by a linear transformation to reach a higher-dimensional hidden space. Experiments on different genres across the categories show that the classification results are significant and effective.
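The linear predictive coefficients mentioned in the abstract are conventionally computed from a frame's autocorrelation sequence with the Levinson-Durbin recursion. The following is a minimal sketch of that standard method, not the paper's implementation; the AR(2) test signal and its coefficients are made up for illustration:

```python
import numpy as np

def lpc(signal, order):
    """Linear predictive coefficients via the autocorrelation method
    and Levinson-Durbin recursion (minimal sketch)."""
    n = len(signal)
    # biased autocorrelation for lags 0..order
    r = np.array([signal[:n - k] @ signal[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # reflection coefficient for this model order
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err
        prev = a.copy()
        for j in range(1, i + 1):
            a[j] = prev[j] + k * prev[i - j]
        err *= 1.0 - k * k
    return a

# made-up test signal: an AR(2) process whose true prediction-error
# filter is 1 - 0.75 z^-1 + 0.5 z^-2
rng = np.random.default_rng(0)
e = rng.normal(size=4096)
x = np.zeros(4096)
for t in range(2, 4096):
    x[t] = 0.75 * x[t - 1] - 0.5 * x[t - 2] + e[t]

coeffs = lpc(x, 2)
print(coeffs)  # roughly [1.0, -0.75, 0.5]
```

In practice the cepstral variants (LPCC) are derived from these coefficients by a further recursion.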


Applied Soft Computing | 2011

Classification of audio signals using AANN and GMM

P. Dhanalakshmi; S. Palanivel; Vennila Ramalingam

Today, digital audio applications are part of our everyday lives, and audio classification can provide powerful tools for content management. If an audio clip can be classified automatically, it can be stored in an organised database, which dramatically improves audio management. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. The autoassociative neural network (AANN) model is used to capture the distribution of the acoustic feature vectors of a class, and the backpropagation learning algorithm adjusts the network weights to minimize the mean square error for each feature vector. The proposed method also compares the performance of AANN with a Gaussian mixture model (GMM), in which the feature vectors from each class are used to train a GMM for that class. During testing, the likelihood of a test sample under each model is computed and the sample is assigned to the class whose model produces the highest likelihood.
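The decision rule the abstract describes — score a test vector under each class model and pick the class with the highest likelihood — can be sketched with a single diagonal-covariance Gaussian per class standing in for a full GMM. The class names are taken from the abstract, but the feature vectors below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic "acoustic feature vectors" for two of the six classes
music = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
news = rng.normal(loc=3.0, scale=1.0, size=(200, 4))

def fit_gaussian(X):
    # one diagonal-covariance Gaussian per class, a simplification of
    # the multi-component GMM used in the paper
    return X.mean(axis=0), X.var(axis=0) + 1e-6

def log_likelihood(x, model):
    mu, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

models = {"music": fit_gaussian(music), "news": fit_gaussian(news)}

def classify(x):
    # assign the sample to the class whose model gives the highest likelihood
    return max(models, key=lambda c: log_likelihood(x, models[c]))

print(classify(np.zeros(4)))      # lies in the "music" cluster
print(classify(np.full(4, 3.0)))  # lies in the "news" cluster
```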


Journal of Computer Science | 2014

Speech/Music Classification Using Wavelet-Based Feature Extraction Techniques

Thiruvengatanadhan Ramalingam; P. Dhanalakshmi

Audio classification is a fundamental step in coping with the rapid growth in audio data volume. Given the increasing size of multimedia sources, speech/music classification is one of the most important issues in multimedia information retrieval. In this work, a speech/music discrimination system is developed that uses the Discrete Wavelet Transform (DWT) as the acoustic feature. Multiresolution analysis is a powerful statistical way to extract features from the input signal, and in this study a method is deployed to model the extracted wavelet features. Support Vector Machines (SVM) are based on the principle of structural risk minimization; an SVM is applied to classify audio into the two classes, speech and music, by learning from training data. The proposed method then extends to Gaussian Mixture Models (GMM), which estimate the probability density function and apply maximum-likelihood decision methods. The system shows significant results, with an accuracy of 94.5%.
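A single level of the Haar transform illustrates the kind of wavelet decomposition this line of work builds on: each level splits the signal into low-pass (approximation) and high-pass (detail) coefficients, and per-level detail energy is one simple multiresolution statistic. This is an illustrative sketch, not the authors' feature set:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        x = x[:-1]          # drop the last sample so pairs line up
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail

def wavelet_features(x, levels=3):
    """Detail-coefficient energy per level: a simple multiresolution
    statistic of the kind usable for speech/music discrimination."""
    feats = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        feats.append(float(np.sum(d ** 2)))
    return feats

x = np.sin(np.linspace(0.0, 8.0 * np.pi, 256))
a, d = haar_dwt(x)        # one level, energy-preserving
feats = wavelet_features(x)
```

The Haar transform is orthonormal, so the energy of the approximation and detail coefficients together equals the energy of the input frame.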


Engineering Applications of Artificial Intelligence | 2011

Pattern classification models for classifying and indexing audio signals

P. Dhanalakshmi; S. Palanivel; Vennila Ramalingam

In the age of digital information, audio data has become an important part of many modern computer applications, and audio classification and indexing have become a focus of research in audio processing and pattern recognition. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. The autoassociative neural network (AANN) model is used to capture the distribution of the acoustic feature vectors. The proposed method then uses a Gaussian mixture model (GMM)-based classifier, in which the feature vectors from each class are used to train a GMM for that class. During testing, the likelihood of a test sample under each model is computed and the sample is assigned to the class whose model produces the highest likelihood. Audio clip extraction, feature extraction, index creation, and retrieval of the query clip are the major issues in automatic audio indexing and retrieval. A method for indexing the classified audio using LPCC features and the k-means clustering algorithm is proposed.
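The k-means indexing step mentioned at the end can be sketched as follows: cluster the stored clips' feature vectors, then keep an inverted index from cluster id to clip ids so a query only searches its nearest cluster. The clip vectors and cluster count are made up, and the farthest-point initialisation is chosen only to keep this example deterministic, not taken from the paper:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [X[0]]
    for _ in range(k - 1):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[dist.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        # assign each feature vector to its nearest centroid, then re-fit
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(1)
# synthetic "LPCC vectors" for 40 stored clips in two well-separated groups
clips = np.vstack([rng.normal(0.0, 0.3, (20, 4)),
                   rng.normal(4.0, 0.3, (20, 4))])
centroids, labels = kmeans(clips, k=2)

# inverted index: cluster id -> clip ids; a query vector is matched
# against the centroids first, then only that cluster's clips are searched
index = {}
for clip_id, lab in enumerate(labels):
    index.setdefault(int(lab), []).append(clip_id)
```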


FICTA (1) | 2015

SVM and HMM Modeling Techniques for Speech Recognition Using LPCC and MFCC Features

S. Ananthi; P. Dhanalakshmi

Speech recognition aims to recognize text from a speech utterance, which can be helpful to people with hearing impairments. Support Vector Machines (SVM) and Hidden Markov Models (HMM) are widely used techniques in speech recognition systems. Acoustic features, namely Linear Predictive Coding (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Mel Frequency Cepstral Coefficients (MFCC), are extracted. Modeling techniques such as SVM and HMM are used to model each individual word, resulting in 620 models trained into the system. Each isolated word segment from a test sentence is matched against these models to find the semantic representation of the test input speech. The performance of the system is evaluated on words related to the computer domain, and the system shows an accuracy of 91.46% for SVM and 98.92% for HMM. From this exhaustive analysis, it is evident that HMM performs better than SVM.


International Journal of Computer Applications | 2013

Speech Recognition System and Isolated Word Recognition based on Hidden Markov Model (HMM) for Hearing Impaired

S. Ananthi; P. Dhanalakshmi

Word recognition, or isolated word recognition, is the ability of a reader to recognize written words correctly, virtually and effortlessly, identifying each word from its shape. Speech recognition is technology that converts spoken words into written text, a process also called speech-to-text (STT). The usual methods in speech recognition (SR) are neural networks, Hidden Markov Models (HMM) and Dynamic Time Warping (DTW), of which HMM is the most widely used. A Hidden Markov Model treats each unit of a word as a state and assumes that successive acoustic features of a spoken word are independent given the state: the occurrence of one feature does not depend on features emitted in other states. Based on the state probabilities, the model generates the most likely word sequence for the spoken input. Instead of listening to the speech, a user can simply read the generated text, so people with hearing impairments can benefit from this kind of speech recognition.
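The per-word HMM scoring described above comes down to the forward algorithm: sum the probability of the observation sequence over all state paths, and pick the word model that scores highest. A toy two-state, left-to-right model shows the recursion; every number here is invented for illustration, not a trained parameter:

```python
import numpy as np

A = np.array([[0.7, 0.3],
              [0.0, 1.0]])     # left-to-right state transitions
B = np.array([[0.8, 0.2],
              [0.2, 0.8]])     # P(observation symbol | state)
pi = np.array([1.0, 0.0])      # always start in state 0

def forward_log_likelihood(obs):
    # forward recursion: alpha[s] = P(observations so far, state s now)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(np.log(alpha.sum()))

# a word model scores observation patterns matching its state layout higher
ll_match = forward_log_likelihood([0, 0, 1, 1])
ll_other = forward_log_likelihood([1, 1, 0, 0])
```

In a real recognizer the observations are continuous acoustic features and each word has its own trained model; the test utterance is assigned to the word whose model yields the highest forward likelihood.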


Archive | 2016

Analysis of Throat Microphone Using MFCC Features for Speaker Recognition

R. Visalakshi; P. Dhanalakshmi; S. Palanivel

In this paper, a system has been developed to help people with sight loss (the visually impaired) distinguish among several speakers. We analyze the performance of a speaker recognition system based on features extracted from speech recorded with a throat microphone in clean and noisy environments. In general, clean speech gives better speaker recognition performance; with a transducer held at the throat, the recorded signal remains clean even in noisy conditions. The characteristics are extracted by means of Mel-Frequency Cepstral Coefficients (MFCC). The radial basis function neural network (RBFNN) and the autoassociative neural network (AANN) are the two modeling techniques used to capture the features and identify speakers in clean and noisy environments; both models are trained to reduce the mean square error over the feature vectors. The proposed work also compares the performance of RBFNN with AANN. Comparing the results of the two models, AANN performs better than RBFNN using MFCC features in terms of accuracy.
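An AANN identifies a speaker by reconstructing the test vector through each speaker's network and picking the one with the lowest reconstruction error. That idea can be sketched with a linear stand-in: one principal subspace per speaker (a linear "autoencoder" via SVD) instead of a trained nonlinear AANN. All data below is synthetic and the subspace substitution is my simplification, not the paper's model:

```python
import numpy as np

def fit_subspace(X, dim=1):
    """Per-speaker linear model: mean plus top principal direction(s)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:dim]

def recon_error(x, model):
    mu, V = model
    z = (x - mu) @ V.T                             # encode
    return float(np.sum(((x - mu) - z @ V) ** 2))  # decode and compare

rng = np.random.default_rng(2)
t = rng.normal(size=(100, 1))
# two "speakers" whose feature vectors vary along different directions
spk_a = t * np.array([[1.0, 0.0, 0.0, 0.0]]) + 0.05 * rng.normal(size=(100, 4))
spk_b = t * np.array([[0.0, 1.0, 0.0, 0.0]]) + 0.05 * rng.normal(size=(100, 4))
models = {"A": fit_subspace(spk_a), "B": fit_subspace(spk_b)}

def identify(x):
    # the speaker whose model reconstructs x with the lowest error wins
    return min(models, key=lambda s: recon_error(x, models[s]))
```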


atlantic web intelligence conference | 2003

Rough-fuzzy reasoning for customized text information retrieval

S. P. Singh; P. Dhanalakshmi; Lipika Dey

Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which turns out to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we present the design of a customized text information filtering system that learns user preferences and uses a rough-fuzzy reasoning scheme to filter out irrelevant documents. The rough-set-based reasoning handles natural-language nuances such as synonymy elegantly, while the fuzzy decider assigns qualitative grades to the documents for the user's perusal. We provide the detailed design of the various modules and some results from the performance analysis of the system.


Archive | 2016

MRI Classification of Parkinson’s Disease Using SVM and Texture Features

S. Pazhanirajan; P. Dhanalakshmi

A novel method is presented for the automatic classification of magnetic resonance images (MRI) into normal and Parkinson's disease (PD) categories, with further classification according to the severity of the neurological problem. In recent years, despite advancement in all fields, humans suffer from numerous neurological disorders such as brain disorders, epilepsy, Alzheimer's and Parkinson's. Parkinson's disease involves the malfunction and death of vital nerve cells in the brain, known as neurons. As PD progresses, the amount of dopamine produced in the brain decreases, leaving a person unable to control movements normally. In the proposed system, T2-weighted (spin-spin relaxation time) MR images are obtained from potential PD subjects. To categorize the MRI data, histogram features and gray level co-occurrence matrix (GLCM) features are extracted. The extracted features are given as input to an SVM classifier, which classifies the data into normal or PD classes. The system shows a satisfactory performance of over 87%.
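The GLCM features mentioned above count how often pairs of gray levels co-occur at a fixed pixel offset; texture statistics such as contrast and energy are then read off the normalized matrix. A minimal sketch for one offset, with synthetic images (the real pipeline would quantize MRI intensities and pool several offsets):

```python
import numpy as np

def glcm_features(img, dx=1, dy=0, levels=4):
    """Gray level co-occurrence matrix for a single pixel offset,
    plus two of the usual texture statistics derived from it."""
    M = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[img[y, x], img[y + dy, x + dx]] += 1
    M /= M.sum()                       # joint co-occurrence probabilities
    i, j = np.indices(M.shape)
    contrast = float(np.sum(M * (i - j) ** 2))   # local intensity variation
    energy = float(np.sum(M ** 2))               # texture uniformity
    return M, contrast, energy

flat = np.zeros((8, 8), dtype=int)     # uniform texture
stripes = np.tile([0, 3], (8, 4))      # columns alternating 0, 3
_, c_flat, e_flat = glcm_features(flat)
_, c_stripes, e_stripes = glcm_features(stripes)
```

A uniform image has zero contrast and maximal energy; the striped image, whose horizontal neighbors always differ by 3 levels, has contrast 9.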


International Journal of Speech Technology | 2016

Performance of speaker localization using microphone array

R. Visalakshi; P. Dhanalakshmi; S. Palanivel

Speaker localization is a technique to locate and track an active speaker among multiple acoustic sources using a microphone array. A microphone array is used to improve the quality of speech recorded in meeting rooms and other places. In this work, the time delay between the source and each microphone is estimated using a localization method based on time differences of arrival (TDOA). TDOA localization consists of two steps: (a) a time delay estimator and (b) a localization estimator. For time delay estimation, generalized cross-correlation with phase transform, generalized cross-correlation with maximum likelihood, the linear prediction (LP) residual and the Hilbert envelope of the LP residual are used to estimate the location of a person. A new speaker localization algorithm based on group search optimization (GSO) is proposed, and its performance is analyzed and compared with the Gauss–Newton nonlinear least squares method and a genetic algorithm. Experimental results show that the proposed GSO method outperforms the other methods in terms of mean square error, root mean square error, mean absolute error, mean absolute percentage error, Euclidean distance and mean absolute relative error.
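The first of the delay estimators listed above, generalized cross-correlation with phase transform (GCC-PHAT), whitens the cross-spectrum so only phase (i.e., delay) information remains, then picks the correlation peak. A minimal sketch with a synthetic two-channel signal (the sample rate and delay are made up):

```python
import numpy as np

def gcc_phat(sig, ref, fs=16000):
    """Time delay estimate (seconds) via generalized cross-correlation
    with the phase transform (GCC-PHAT)."""
    n = len(sig) + len(ref)
    R = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
    R /= np.abs(R) + 1e-12          # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n)
    max_shift = n // 2
    # re-center so index max_shift corresponds to zero lag
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# synthetic check: one channel is the other delayed by 5 samples
rng = np.random.default_rng(3)
x = rng.normal(size=1024)
est = gcc_phat(np.concatenate((np.zeros(5), x)), x, fs=16000)
```

Given such pairwise delays from several microphone pairs, the second stage (the localization estimator) solves for the source position, which is where the optimization methods compared in the paper come in.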

Collaboration


Dive into P. Dhanalakshmi's collaboration.

Top Co-Authors

Lipika Dey

Indian Institute of Technology Delhi


S. P. Singh

Indian Institute of Technology (BHU) Varanasi
