Publication


Featured research published by Mumtaz Begum Mustafa.


Expert Systems With Applications | 2015

Exploring the influence of general and specific factors on the recognition accuracy of an ASR system for dysarthric speaker

Mumtaz Begum Mustafa; Fadhilah Rosdi; Siti Salwah Salim; Muhammad Umair Mughal

Reviewed existing literature on the performance of ASR systems for dysarthria. Analysed the influence of speech/speaker mode, vocabulary size and speaking style on WER. Identified specific factors of dysarthric speech affecting WER. Analysed the influence of specific factors on WER. Measured the correlation of general and specific factors with WER.

Automatic speech recognition (ASR) is becoming an important assistive tool for speech-impaired individuals, such as those with dysarthria. Currently, existing ASR systems are unable to recognise dysarthric speech to an acceptable degree. Little research has been carried out to identify factors that influence the performance of ASR systems in recognising dysarthric speech. This article aims to identify factors that potentially influence the recognition accuracy of ASR systems in recognising dysarthric speech. Some of the factors that influence the recognition accuracy of ASR, which have been confirmed in existing research on ASR systems, are speech mode, speaker mode, vocabulary size and speaking style. We have also focused on factors that are more specific to dysarthric speech, such as speech intelligibility, severity and intra-speaker variability, that potentially influence the recognition accuracy. We have evaluated the influence of these factors on recognition accuracy using data published in existing research. It was found that the general factors considered in this review have little influence on the recognition accuracy. However, factors more specific to dysarthric speech were found to have a significant influence on the recognition accuracy of the ASR system. From the findings, it can be concluded that intelligibility and severity have a significant influence on the recognition accuracy. To improve the recognition accuracy of ASR systems, methods and techniques that reduce the influence of these specific factors should be identified.


Journal of the Acoustical Society of America | 2013

Emotional speech acoustic model for Malay: Iterative versus isolated unit training

Mumtaz Begum Mustafa; Raja Noor Ainon

The ability of a speech synthesis system to synthesize emotional speech enhances the user's experience when using this kind of system and its related applications. However, the development of an emotional speech synthesis system is a daunting task in view of the complexity of human emotional speech. The more recent state-of-the-art speech synthesis systems, such as the one based on hidden Markov models, can synthesize emotional speech with acceptable naturalness with the use of a good emotional speech acoustic model. However, building an emotional speech acoustic model requires adequate resources including segment-phonetic labels of emotional speech, which is a problem for many under-resourced languages, including Malay. This research shows how it is possible to build an emotional speech acoustic model for Malay with minimal resources. To achieve this objective, two forms of initialization methods were considered: iterative training using the deterministic annealing expectation maximization algorithm and the isolated unit training. The seed model for the automatic segmentation is a neutral speech acoustic model, which was transformed to the target emotion using two transformation techniques: model adaptation and context-dependent boundary refinement. Two forms of evaluation have been performed: an objective evaluation measuring the prosody error and a listening evaluation to measure the naturalness of the synthesized emotional speech.


PLOS ONE | 2014

Severity-based adaptation with limited data for ASR to aid dysarthric speakers

Mumtaz Begum Mustafa; Siti Salwah Salim; Noraini Mohamed; Bassam Ali Al-Qatab; Chng Eng Siong

Automatic speech recognition (ASR) is currently used in many assistive technologies, for example to help individuals with speech impairment communicate. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution for building a speech acoustic model of impaired speech is to employ adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality speech-impaired data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on the statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor to WER in dysarthric speech for all levels of severity. The results show that the speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
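
The abstract above reports accuracy as word error rate, broken down into insertion, substitution and deletion counts. As a rough illustration only, the following Python sketch computes those counts with a standard minimum-edit-distance alignment. The function names `align_counts` and `word_error_rate` and the sample sentences are hypothetical and are not taken from the paper, whose scoring tool is not named in the abstract. The same routine works on phoneme sequences if phoneme labels are passed instead of words.

```python
from typing import List, Tuple


def align_counts(reference: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] holds (total_edits, subs, dels, ins) for reference[:i] vs hypothesis[:j].
    cost = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference token deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis token inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, k)          # match, no extra cost
            else:
                diagonal = (t + 1, s + 1, d, k)  # substitution
            t, s, d, k = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, k)
            t, s, d, k = cost[i][j - 1]
            insertion = (t + 1, s, d, k + 1)
            # Tuples compare on total edits first, so the total stays minimal.
            cost[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = cost[n][m]
    return subs, dels, ins


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """WER = (S + D + I) / N, where N is the number of reference tokens."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    subs, dels, ins = align_counts(reference, hypothesis)
    return (subs + dels + ins) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sit on mat".split()
    print(align_counts(ref, hyp))
    print(round(word_error_rate(ref, hyp), 3))
```

When several alignments share the same minimum number of edits, the split between substitutions, deletions and insertions can differ between scoring tools, so per-category counts from this sketch may not match a published scorer exactly.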


2014 Third ICT International Student Project Conference (ICT-ISPC) | 2014

Automatic speech recognition system for Malay speaking children

Feisal Dani Rahman; Noraini Mohamed; Mumtaz Begum Mustafa; Siti Salwah Salim

Automatic speech recognition, or ASR for short, is a recent innovation in human-computer interaction. An ASR system recognizes human speech and transforms it into outputs such as text or other machine-readable formats. ASR is increasingly used in various applications such as dictation systems, voice or speaker recognition and so on. Despite the advancement in the development of ASR systems, not many such systems are developed for children. Children today are increasingly using computers for many daily activities, including education. The lack of ASR systems for children causes them to lag behind adult users. One of the reasons for the poor development of ASR systems for children is the difficulty of obtaining or creating a speech corpus of children. Researchers find it more difficult to engage children than adults in the recording process. This research aims at developing an ASR system for Malay-speaking children with the use of a small speech database. The ASR system developed in this research is able to recognize words with 76% accuracy.


International Conference Oriental COCOSDA held jointly with Conference on Asian Spoken Language Research and Evaluation | 2013

Context-dependent labels for an HMM-based speech synthesis system for Malay

Mumtaz Begum Mustafa; Zuraidah Mohd Don; Gerry Knowles

In a Hidden Markov Model-based speech synthesis system, an arbitrary text input has to be converted to context-dependent labels before it can be synthesized. This research proposes the development of a context-dependent label generating module for Malay, an under-resourced language with no available system to generate context-dependent labels. We have developed a grapheme-to-phoneme database and identified the contextual factors of Malay to generate the context-dependent labels for synthesizing Malay speech using an HMM-based speech synthesis system. We assessed the effectiveness of the generating module through intelligibility and naturalness evaluation of the synthetic utterances and these confirm the effectiveness of the proposed generating module. The high scores of intelligibility and naturalness indicate the accuracy of the synthesis labels generated by the generating module for any text input.


2011 International Conference on Speech Database and Assessments (Oriental COCOSDA) | 2011

Assessing the naturalness of Malay emotional voice corpora

Mumtaz Begum Mustafa; Raja Noor Ainon; Roziati Zainuddin; Zuraidah Mohd Don; Gerry Knowles

This research reports the development and evaluation of Malay emotional voice corpora through listening evaluation, and how the number of emotion choices offered to evaluators affects the result of the evaluation. The voice corpora comprise three emotions, namely anger, sadness and happiness, expressed by two male and two female actors. The voice corpora were evaluated in two separate listening tests involving a number of native Malay evaluators balanced for gender, age and profession. In the first listening test, evaluators were given twenty-five choices of emotions to choose from. For the second test, the number of emotion choices was only five. Each test was conducted separately with a different group of evaluators. The results of the two tests are grossly different, with the emotion identification rate of the first test lower than that of the second test.


International Journal of Speech Technology | 2018

Speech emotion recognition research: an analysis of research focus

Mumtaz Begum Mustafa; Mansoor A. M. Yusoof; Zuraidah Mohd Don; Mehdi Malekzadeh

This article analyses research in speech emotion recognition (“SER”) from 2006 to 2017 in order to identify the current focus of research, and areas in which research is lacking. The objective is to examine what is being done in this field of research. Searching on selected keywords, we extracted and analysed 260 articles from well-known online databases. The analysis indicates that SER is an active field of research, with dozens of articles being published each year in journals and conference proceedings. The majority of articles concentrate on three critical aspects of SER, namely (1) databases, (2) suitable speech features, and (3) classification techniques to maximize the recognition accuracy of SER systems. Having carried out association analysis of the critical aspects and how they influence the performance of the SER system in terms of recognition accuracy, we found that certain combinations of databases, speech features and classifiers influence the recognition accuracy of the SER system. We have also suggested aspects of SER that could be taken into consideration in future work based on our review.


Applied Soft Computing | 2016

The experimental applications of search-based techniques for model-based testing

Aneesa Saeed; Siti Hafizah Ab Hamid; Mumtaz Begum Mustafa

Highlights:
A systematic review of applications of search-based techniques for model-based testing is provided.
Four taxonomies are proposed to classify the applications based on the purpose, problems, solutions and evaluations.
The applications are analyzed based on the proposed taxonomies.
The development of search-based techniques for model-based testing is discussed.
Limitations and potential research directions are summarized.

Context: Model-based testing (MBT) aims to generate executable test cases from behavioral models of software systems. MBT is gaining interest in industry and academia due to its provision of systematic, automated, and comprehensive testing. Researchers have successfully applied search-based techniques (SBTs) by automating the search for an optimal set of test cases at reasonable cost compared to other more expensive techniques. Thus, there is a recent surge toward the applications of SBTs for MBT because the generated test cases are optimal and have low computational cost. However, successful future applications of SBTs for MBT demand deep insight into the existing experimental applications, one that underlines the stringent issues and challenges, and such insight is lacking in the literature.

Objective: The objective of this study is to comprehensively analyze the current state of the art of the experimental applications of SBTs for MBT and present the limitations of the current literature to direct future research.

Method: We conducted a systematic literature review (SLR) using 72 experimental papers from six data sources. We proposed a taxonomy based on the literature to categorize the characteristics of the current applications.

Results: The results indicate that the majority of the existing applications of SBTs for MBT focus on functional and structural coverage purposes, as opposed to stress testing, regression testing and graphical user interface (GUI) testing. We found research gaps in the existing applications in five areas: applying multi-objective SBTs, proposing hybrid techniques, handling complex constraints, addressing data and requirement-based adequacy criteria, and adapting landscape visualization. Only twelve studies proposed and empirically evaluated the SBTs for complex systems in MBT.

Conclusion: This extensive systematic analysis of the existing literature based on the proposed taxonomy assists researchers in exploring the existing research efforts and reveals the limitations that need additional investigation.


Conference on Industrial Electronics and Applications | 2015

Test oracles based on artificial neural networks and info fuzzy networks: A comparative study

Muhammad Elrashid Yousif; Seyed Reza Shahamiri; Mumtaz Begum Mustafa

One of the key software development activities that ensures software quality is software testing, which needs automation due to the scarcity of resources in software production. However, automating the testing process faces several issues, especially those associated with automated test oracles. Test oracles offer simple and reliable sources of expected software behavior that guide testers in undertaking the testing process and detecting faults. This paper performs a comparative study of two existing test oracles using a black-box approach. We compare experimental studies, processes, and evaluation procedures reported so far. The two test oracles are Multi-Network oracles based on ANNs, and the IFN-based regression tester. ANN-based oracles have the capability of processing complex relationships, while IFN is a test oracle that is limited to one functionality, although it performs best when applied to regression testing. The results obtained from existing experiments and evaluations indicate that Multi-Network oracles have a better accuracy rate of 98.26% and a minimal misclassification error rate of 1.74% compared to the IFN regression tester. Consequently, Multi-Network oracles based on ANNs are more suitable, offering better quality and reliability for a software testing process.


Asia Modelling Symposium | 2014

Severity Based Adaptation for ASR to Aid Dysarthric Speakers

Bassam Ali Al-Qatab; Mumtaz Begum Mustafa; Siti Salwah Salim

Automatic speech recognition (ASR) for dysarthric speakers is one of the most challenging research areas. The lack of corpora for dysarthric speakers makes it even more difficult. This paper introduces Intra-Severity adaptation, which uses a small amount of speech data: data from all participants in a given severity type are used for the adaptation of that type. The adaptation is performed for two types of acoustic models, which are the Controlled Acoustic Model (CAM), developed using a phonetically rich corpus, and the Dysarthric Acoustic Model (DAM), which includes speech collected from dysarthric speakers with various levels of severity. This paper compares two adaptation techniques for building ASR systems for dysarthric speakers, which are Maximum Likelihood Linear Regression (MLLR) and Constrained Maximum Likelihood Linear Regression (CMLLR). The results show that the Word Recognition Accuracy (WRA) for the CAM outperformed that of the DAM for both Speaker Independent (SI) and Speaker Adaptation (SA). On the other hand, it was found that MLLR outperformed CMLLR for both Controlled Speaker Adaptation (CSA) and Dysarthric Speaker Adaptation (DSA).

Collaboration


Top Co-Authors

Siti Salwah Salim
Information Technology University

Raja Noor Ainon
Information Technology University

Noraini Mohamed
Information Technology University

Fadhilah Rosdi
National University of Malaysia

Aneesa Saeed
Information Technology University