
Publication


Featured research published by Amita Dev.


International Journal of Computer Applications | 2010

Robust Features for Noisy Speech Recognition using MFCC Computation from Magnitude Spectrum of Higher Order Autocorrelation Coefficients

Amita Dev; Poonam Bansal

Robustness is one of the most challenging problems in automatic speech recognition. The goal of robust feature extraction is to improve the performance of speech recognition in adverse conditions. The mel-scaled frequency cepstral coefficients (MFCCs) derived from Fourier transform and filter bank analysis are perhaps the most widely used front ends in state-of-the-art speech recognition systems. One of the major issues with MFCCs is that they are very sensitive to additive noise. To improve the robustness of speech front ends, we introduce in this paper a new set of MFCC vectors estimated in three steps. First, the relative higher-order autocorrelation coefficients are extracted. Then the magnitude spectrum of the resulting signal is estimated through the fast Fourier transform (FFT) and differentiated with respect to frequency. Finally, the differentiated magnitude spectrum is transformed into MFCC-like coefficients, called MFCCs extracted from the Differentiated Relative Higher Order Autocorrelation Sequence Spectrum (DRHOASS). Speech recognition experiments on various tasks indicate that the new feature vector is more robust than traditional MFCCs under additive noise conditions.
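The abstract outlines the three processing steps but not their exact parameters; the sketch below is a minimal illustration of such a pipeline, where the lag offset, FFT size and mel filterbank settings are assumptions rather than the paper's values.

```python
# Illustrative sketch of the three-step DRHOASS-MFCC front end described above.
# Frame length, autocorrelation lag offset and mel filterbank parameters are
# assumptions for illustration, not values taken from the paper.
import numpy as np
from scipy.fft import dct

def mel_filterbank(n_filters, n_fft, sr):
    """Build a standard triangular mel filterbank."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz_pts = inv_mel(np.linspace(mel(0), mel(sr / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def drhoass_mfcc(frame, sr=16000, lag_offset=2, n_fft=512, n_filters=26, n_ceps=13):
    # Step 1: higher-order (lag-shifted) one-sided autocorrelation of the frame.
    full = np.correlate(frame, frame, mode="full")
    acorr = full[len(frame) - 1 + lag_offset:]           # drop the lowest lags
    # Step 2: FFT magnitude spectrum, then differentiate w.r.t. frequency.
    mag = np.abs(np.fft.rfft(acorr, n_fft))
    diff_mag = np.abs(np.diff(mag, prepend=mag[0]))
    # Step 3: mel filterbank + log + DCT gives MFCC-like coefficients.
    energies = np.maximum(mel_filterbank(n_filters, n_fft, sr) @ diff_mag, 1e-10)
    return dct(np.log(energies), type=2, norm="ortho")[:n_ceps]

frame = np.random.default_rng(0).normal(size=400)         # one 25 ms frame at 16 kHz
print(drhoass_mfcc(frame).shape)                          # (13,)
```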


AI & Society | 2008

Effect of retroflex sounds on the recognition of Hindi voiced and unvoiced stops

Amita Dev

As the development of a speech recognition system depends entirely on the spoken language used for its development, and since speech technology is highly language dependent and reverse engineering is not possible, there is an utmost need to develop such systems for Indian languages. In this paper we present the implementation of a time-delay neural network (TDNN) system in a modular fashion by exploiting the hidden structure of previously trained phonetic subcategory networks for the recognition of Hindi consonants. For the present study we selected all the Hindi phonemes for recognition. A vocabulary of 207 Hindi words was designed for the task-specific environment and used as a database. For phoneme recognition, a three-layered network was constructed and trained using the back-propagation learning algorithm. Experiments were conducted to categorize Hindi voiced and unvoiced stops, semivowels, vowels, nasals and fricatives. A close observation of the confusion matrix of Hindi stops revealed maximum confusion of retroflex stops with their non-retroflex counterparts.
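The abstract describes the network only as three-layered and trained with back-propagation; the following is a minimal sketch of a TDNN-style classifier (time delays realised as 1-D convolutions over feature frames), with layer sizes, context widths and class count chosen purely for illustration.

```python
# Minimal sketch of a TDNN-style phoneme classifier. Layer widths, context sizes
# and the number of phoneme classes are illustrative assumptions, not the
# paper's actual configuration.
import torch
import torch.nn as nn

class TDNN(nn.Module):
    def __init__(self, n_features=16, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            # each unit looks at a 3- or 5-frame context window (the "time delay")
            nn.Conv1d(n_features, 8, kernel_size=3), nn.Sigmoid(),
            nn.Conv1d(8, 3, kernel_size=5), nn.Sigmoid(),
        )
        self.out = nn.Linear(3, n_classes)

    def forward(self, x):            # x: (batch, n_features, n_frames)
        h = self.net(x)              # (batch, 3, n_frames')
        h = h.mean(dim=2)            # integrate evidence over time
        return self.out(h)           # class scores per token

model = TDNN()
frames = torch.randn(4, 16, 15)      # 4 tokens, 16 features, 15 frames each
logits = model(frames)               # (4, 10)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 10, (4,)))
loss.backward()                      # trained with back-propagation, as in the paper
```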


IETE Journal of Research | 2008

Optimum HMM combined with vector quantization for Hindi speech word recognition

Poonam Bansal; Amita Dev; Shail Bala Jain

This paper proposes an optimum speaker-independent, isolated-word Hidden Markov Model (HMM) recognizer for the Hindi language. The recognition system is based on the combination of the vector quantization (VQ) technique at the acoustical level and Markovian modeling at the recognition level. The recognizer consists of three modules: feature extraction, vector quantization, and HMM training and testing. The proposed scheme first computes the acoustic features in terms of Linear Predictive Cepstral (LPC) coefficients, Mel-Frequency Cepstral Coefficients (MFCCs) and delta MFCCs, along with noise and silence detection. Then codebooks are created using VQ, and finally, in the recognition phase, an optimum set of parameters is derived from the different phases to obtain the highest recognition score. The training and testing database consists of a set of 35 utterances of nine Indian cities/states and 35 utterances of nine digits spoken in Hindi by male and female speakers. The recognition rate was observed to be 98.61%.
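As an illustration of the vector quantization stage, the sketch below builds a codebook from training feature vectors and maps test frames to discrete symbols on which word-level HMMs could be trained and scored; the codebook size and feature dimensionality are assumptions, not the paper's settings.

```python
# Sketch of the VQ step: build a codebook from training feature vectors and map
# frames to discrete symbols for a discrete HMM. Codebook size and feature
# dimensionality are illustrative assumptions.
import numpy as np
from scipy.cluster.vq import kmeans2, vq

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(5000, 39))   # e.g. 13 cepstral coefficients + deltas per frame
test_feats = rng.normal(size=(120, 39))     # frames of one test utterance

codebook, _ = kmeans2(train_feats, k=64, minit="++")   # 64-entry codebook
symbols, distortion = vq(test_feats, codebook)          # nearest-codeword index per frame

# `symbols` is the discrete observation sequence on which the word-level HMMs are
# trained and scored; the word whose HMM gives the highest likelihood wins.
print(symbols[:10], distortion.mean())
```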


AI & Society | 2003

Categorization of Hindi phonemes by neural networks

Amita Dev; Shyam S. Agrawal; D. R. Choudhury

The prime objective of this paper is to conduct phoneme categorization experiments for Indian languages. In this direction, a major effort has been made to categorize Hindi phonemes using a time-delay neural network (TDNN) and to compare the recognition scores with those of other languages. A total of six neural nets, aimed at the major coarse phonetic classes in Hindi, were trained. Evaluation of each net on 350 training tokens and 40 test tokens revealed a 99% recognition rate for vowel classes, 87% for unvoiced stops, 82% for voiced stops, 94.7% for semivowels, 98.1% for nasals and 96.4% for fricatives. A new feature vector normalisation technique has been proposed to improve the recognition scores.
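Per-class recognition rates such as those quoted above are read off a confusion matrix; the short sketch below shows that computation on a made-up three-class matrix (the counts are illustrative, not the paper's data).

```python
# Per-class recognition rate from a confusion matrix (rows = true class,
# columns = predicted class). The counts are made up for illustration.
import numpy as np

classes = ["vowel", "unvoiced stop", "voiced stop"]
confusion = np.array([
    [39,  1,  0],   # true vowels
    [ 2, 35,  3],   # true unvoiced stops
    [ 1,  6, 33],   # true voiced stops
])

per_class_rate = confusion.diagonal() / confusion.sum(axis=1)
for name, rate in zip(classes, per_class_rate):
    print(f"{name}: {rate:.1%} recognition rate")
print(f"overall: {confusion.trace() / confusion.sum():.1%}")
```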


AI & Society | 2012

Automatic phonetic segmentation of Hindi speech using hidden Markov model

Archana Balyan; Shyam S. Agrawal; Amita Dev

In this paper, we study the performance of a baseline hidden Markov model (HMM) for the segmentation of speech signals. It is applied to a single-speaker segmentation task using a Hindi speech database. The automatic phoneme segmentation framework developed here imitates the human phoneme segmentation process. A set of 44 Hindi phonemes was chosen for the segmentation experiment, wherein we used a continuous-density hidden Markov model (CDHMM) with a mixture of Gaussian distributions. The left-to-right topology with no skip states was selected, as it is effective for speech recognition and consistent with the natural way of articulating spoken words. The system accepts speech utterances along with their orthographic transcriptions and generates segmentation information for the speech. This corpus was used to develop context-independent HMMs for each of the Hindi phonemes. The system was trained on numerous sentences relevant to providing information to passengers of the Metro Rail, and validated against a few manually segmented speech utterances. The evaluation shows that the best performance is obtained with a combination of two Gaussian mixtures and five HMM states. A category-wise phoneme error analysis has been performed, and the performance of the phonetic segmentation is reported. The HMM modeling has been implemented in Microsoft Visual Studio 2005 (C++), and the system is designed to run on the Windows operating system. The goal of this study is the automatic segmentation of speech at the phonetic level.
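To make the left-to-right, no-skip topology concrete, the sketch below builds the transition matrix of a five-state phoneme HMM; the self-loop and advance probabilities are placeholders that would normally be re-estimated during training.

```python
# Transition matrix of a 5-state left-to-right HMM with no skips: each state may
# only stay where it is or advance to the next state. The 0.6/0.4 probabilities
# are placeholders, not trained values.
import numpy as np

n_states = 5
stay, advance = 0.6, 0.4
A = np.zeros((n_states, n_states))
for i in range(n_states):
    if i < n_states - 1:
        A[i, i], A[i, i + 1] = stay, advance
    else:
        A[i, i] = 1.0            # final state absorbs the remaining frames

print(A)
# Skips (e.g. state 1 -> state 3) are impossible because those entries stay 0,
# matching the "no skip states" topology described above.
```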


International Journal of Computer Applications | 2011

A Novel Feature Extraction Technique for Speaker Identification

Amita Dev

This paper presents a novel feature extraction approach for speaker identification when the speech is corrupted by additive noise. The environmental mismatch between training and testing data degrades the performance of a speaker identification system; the degradation is primarily due to the presence of background noise when trying to match a given speaker against the set of known speakers in a database. Mel-frequency cepstral coefficients (MFCCs) are perhaps the most widely used front ends in state-of-the-art speaker identification systems. One of the major issues with MFCCs is that they are very sensitive to additive noise. To overcome this bottleneck, a temporal filtering procedure on the autocorrelation sequence is proposed to minimize the effect of additive noise. The proposed feature, called Relative Autocorrelation Mel-Frequency Cepstral Coefficients (A-MFCCs), is derived by filtering the temporal trajectories of the short-time one-sided autocorrelation sequence. This filtering process minimizes the effect of additive noise; no prior knowledge of the noise characteristics is required, and the additive noise can be colored. For speaker identification, a Hindi database was constructed from speech samples of each known speaker. Feature vectors (MFCCs and A-MFCCs) were extracted from the samples by short-term spectral analysis and processed further by vector quantization to locate the clusters in the feature space. Experimental results indicate that A-MFCCs significantly improve the performance of the speaker identification system in noisy environments.
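As an illustration of the VQ-based identification stage (not of the A-MFCC extraction itself), the sketch below scores a test utterance against per-speaker codebooks and selects the speaker with the lowest average quantization distortion; the codebook size and the random feature frames are placeholders.

```python
# Sketch of VQ-based speaker identification: each enrolled speaker has a codebook
# of feature vectors, and a test utterance is assigned to the speaker whose
# codebook quantizes its frames with the lowest average distortion.
import numpy as np
from scipy.cluster.vq import kmeans2, vq

rng = np.random.default_rng(1)

def train_codebook(frames, size=32):
    codebook, _ = kmeans2(frames, k=size, minit="++")
    return codebook

# Enrol three speakers from their (here random) training feature frames.
codebooks = {f"speaker_{i}": train_codebook(rng.normal(loc=i, size=(800, 13)))
             for i in range(3)}

test_frames = rng.normal(loc=1, size=(150, 13))      # frames of an unknown utterance
scores = {name: vq(test_frames, cb)[1].mean()        # mean quantization distortion
          for name, cb in codebooks.items()}
identified = min(scores, key=scores.get)             # lowest distortion wins
print(scores, "->", identified)
```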


International Conference Oriental COCOSDA held jointly with Conference on Asian Spoken Language Research and Evaluation | 2013

Emotional Hindi speech database

Sweeta Bansal; Amita Dev

To carry out any research in the field of spoken language, the essential requirement is a speech database. In this paper an attempt is made to present the features of the Hindi language, why we need speech corpora, and the current status of speech corpus development for Hindi. Speech databases have been used for speech recognition, speech synthesis, speaker recognition, language translation, emotion recognition and emotion conversion. An attempt has been made to create an emotional speech corpus, and a brief introduction to the Emotional Hindi Speech database developed at IPEC is given. A study is also performed to understand the effect of emotion on the F0 contour and the spectrum.
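F0 contours of the kind studied above can be estimated in several ways; a minimal autocorrelation-based estimator is sketched below, with the frame size, search range and synthetic input chosen only for illustration.

```python
# Minimal autocorrelation-based F0 contour estimator, illustrating the kind of
# pitch analysis mentioned above. Frame/hop sizes, the 75-400 Hz search range and
# the synthetic input are assumptions for illustration only.
import numpy as np

def f0_contour(signal, sr=16000, frame=0.04, hop=0.01, fmin=75, fmax=400):
    n, h = int(frame * sr), int(hop * sr)
    lo, hi = int(sr / fmax), int(sr / fmin)            # lag search range in samples
    f0 = []
    for start in range(0, len(signal) - n, h):
        x = signal[start:start + n] - signal[start:start + n].mean()
        ac = np.correlate(x, x, mode="full")[n - 1:]   # one-sided autocorrelation
        lag = lo + int(np.argmax(ac[lo:hi]))           # strongest periodicity
        f0.append(sr / lag if ac[lag] > 0.3 * ac[0] else 0.0)  # crude voicing check
    return np.array(f0)

# Synthetic 150 Hz "voiced" tone just to exercise the function.
t = np.arange(0, 1.0, 1 / 16000)
print(f0_contour(np.sin(2 * np.pi * 150 * t))[:5])     # roughly 150 Hz per frame
```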


International Conference Oriental COCOSDA held jointly with Conference on Asian Spoken Language Research and Evaluation | 2013

Semi-automatic syllable-like segmentation for Hindi

Archana Balyan; Shyam S. Agrawal; Amita Dev; Ruchika Kumari

The goal of this study is the automatic segmentation of speech at the syllable level, on the premise that a reasonable number of syllables may suffice for travel-domain applications. This paper presents a study of an algorithm for identifying syllables in Hindi words based on linguistic rules. After a survey of the relevant literature, a set of rules is identified and implemented as a simple, easy-to-implement algorithm. The algorithm is tested on 2400 distinct words and performs with 99.5% accuracy for the segmentation of written text. A baseline group-delay-based segmentation technique is applied to spoken sentences to generate a labelled database at the syllable level. The system is validated against a few manually segmented speech utterances. It is observed that vowels are segmented more accurately than fricatives, and that nearly accurate segmentation is achieved if the window scale factor is modified for each sentence.
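The abstract does not list the linguistic rules themselves; as a rough illustration of rule-based splitting of Devanagari text into syllable-like (akshara) units, one common heuristic is sketched below. It is not the authors' algorithm.

```python
# Illustrative akshara-like splitting of Devanagari text into syllable-like units.
# This is NOT the paper's rule set; it is a common heuristic: a unit is an optional
# consonant cluster joined by halant, followed by an optional vowel sign, plus
# anusvara/visarga/chandrabindu, or an independent vowel.
import re

CONSONANT = "[\u0915-\u0939\u0958-\u095F]"      # ka..ha plus nukta consonants
INDEP_VOWEL = "[\u0904-\u0914]"                 # independent vowels
MATRA = "[\u093E-\u094C]"                       # dependent vowel signs
MODIFIER = "[\u0901\u0902\u0903]?"              # chandrabindu, anusvara, visarga
HALANT = "\u094D"

AKSHARA = re.compile(
    f"(?:{CONSONANT}{HALANT})*{CONSONANT}{MATRA}?{MODIFIER}"   # consonant cluster (+matra)
    f"|{INDEP_VOWEL}{MODIFIER}"                                # or independent vowel
)

def syllabify(word):
    return AKSHARA.findall(word)

print(syllabify("दिल्ली"))    # ['दि', 'ल्ली']
print(syllabify("नमस्ते"))    # ['न', 'म', 'स्ते']
```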


IETE Journal of Research | 2016

Labelling of Hindi Speech

Archana Balyan; Amita Dev; Ruchika Kumari; Shyam S. Agrawal

The goal of this paper is to obtain segmented and labelled speech at the syllable level, on the premise that a reasonable number of syllables may suffice for travel-domain applications. A baseline group-delay-based segmentation technique is applied to spoken sentences to generate a labelled database at the syllable level. The system is validated against 50 manually segmented speech utterances, and the segmentation accuracy is evaluated by performing a time-error analysis. It is observed that 63.07% of syllables have a time error of less than 30 ms, and that vowels are segmented more accurately than fricatives. The confidence interval is found to be 0.1147 ms for a confidence level of 95%. This paper also presents the implementation of an algorithm for identifying syllables in Hindi words based on linguistic rules. After a survey of the relevant literature, a set of rules is identified and implemented as a simple, easy-to-implement algorithm. The text segmentation algorithm is tested on 2400 distinct words and performs with 99.5% accuracy for the segmentation of written text.
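The time-error analysis reported above can be computed from paired manual and automatic boundary times as sketched below; the boundary values are synthetic placeholders, and the 95% confidence interval is taken on the mean absolute error.

```python
# Sketch of the time-error analysis: compare automatic syllable boundaries with
# manual ones, report the fraction within a 30 ms tolerance and a 95% confidence
# interval on the mean absolute error. The boundary values are placeholders.
import numpy as np
from scipy import stats

manual = np.array([0.120, 0.310, 0.505, 0.742, 0.950])       # seconds
automatic = np.array([0.128, 0.295, 0.534, 0.741, 0.972])

errors = np.abs(automatic - manual) * 1000.0                  # time error in ms
within_30ms = np.mean(errors < 30.0)

mean_err = errors.mean()
sem = stats.sem(errors)                                       # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(errors) - 1,
                                   loc=mean_err, scale=sem)

print(f"{within_30ms:.1%} of boundaries within 30 ms")
print(f"mean error {mean_err:.2f} ms, 95% CI [{ci_low:.2f}, {ci_high:.2f}] ms")
```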


International Conference Oriental COCOSDA held jointly with Conference on Asian Spoken Language Research and Evaluation | 2013

Hindi speech corpora: A review

Nivedita; P. Ahmed; Amita Dev; Shyam S. Agrawal

A benchmark dataset provides insight into the phenomena that generate the data; hence it is an essential requirement for research that involves concept discovery from data. In this paper, we examine the current status of twenty-six datasets for Hindi speech (Hindi speech corpora). The paper also aims at studying their impact on the development of Hindi-speech-based, computer-mediated applications. During this study, we found that researchers have paid little attention to issues relating to data collection from realistic environments through mobile phones. Of the twenty-six Hindi speech corpora reviewed, only one was created for speaker recognition, in which conversational speech samples are recorded through mobile phones under both noisy and clean conditions.

Collaboration


Dive into Amita Dev's collaborations.

Top Co-Authors

Shyam S. Agrawal (Maharshi Dayanand University)
Archana Balyan (Maharaja Surajmal Institute of Technology)
Poonam Bansal (Guru Gobind Singh Indraprastha University)
Shail Bala Jain (Indira Gandhi Institute of Technology)
Ruchika Kumari (Maharaja Surajmal Institute of Technology)
Sweeta Bansal (Inderprastha Engineering College)
Archana Agarwal (Inderprastha Engineering College)
D. R. Choudhury (Delhi Technological University)
Manju Khari (Guru Gobind Singh Indraprastha University)