Publication


Featured research published by Fengpei Ge.


International Symposium on Neural Networks | 2009

An SVM-Based Mandarin Pronunciation Quality Assessment System

Fengpei Ge; Fuping Pan; Changliang Liu; Bin Dong; Shui-duen Chan; Xinhua Zhu; Yonghong Yan

This paper presents our Mandarin pronunciation quality assessment system for the Putonghua Shuiping Kaoshi (PSK) examination and investigates a novel Support Vector Machine (SVM) based method to improve its assessment accuracy. First, a selective speaker adaptation module is introduced, in which we select well-pronounced speech from the results of first-pass automatic pronunciation scoring as the adaptation data and adopt Maximum Likelihood Linear Regression to update the acoustic model (AM). Then, the monophone-based AM is studied and compared with the traditional triphone-based AM. Finally, we propose a new method of combining various kinds of posterior probabilities using an SVM classifier. Experimental results show that the average correlation coefficient between machine and human scores improves from 83.72% to 85.48%. This suggests that the two methods, selective speaker adaptation and multi-model combination using SVM, are very effective in improving the accuracy of pronunciation quality assessment.


International Symposium on Computer Science and Society | 2011

Experimental Investigation of Mandarin Pronunciation Quality Assessment System

Fengpei Ge; Li Lu; Yonghong Yan

The posterior probability is the most effective confidence measure in computer-assisted language learning systems and is widely used, typically with some tricks applied to reduce the computational complexity. In this paper, we analyze the defects of the traditional algorithm and propose some improvements. First, the traditional algorithm takes the maximum instead of the sum when calculating the denominator, which seriously reduces the accuracy of the posterior probability. Therefore, taking into account both computational complexity and system performance, we propose a novel algorithm based on a phoneme confusion extended network. Second, in the traditional algorithm, the posterior probability is normalized by its segment duration. In fact, the acoustic likelihood is more closely related to time and grows with the number of frames, so we propose an acoustic likelihood based normalization algorithm. Experimental results show that, compared to the traditional algorithm, the proposed algorithm improves system performance significantly, with a relative reduction of about 35% in the average score error rate, while the computational complexity hardly increases.
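The difference between a summed and a maximized denominator can be illustrated with a small sketch. This is not the paper's phoneme confusion extended network; it only contrasts the two denominators for a single segment, assuming equal phoneme priors. The function names and the log-likelihood values are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_posterior(target, log_likelihoods, use_max=False):
    """Log posterior of `target` among competing phonemes, assuming equal priors.

    log_likelihoods: dict mapping phoneme -> acoustic log-likelihood of the segment.
    use_max=True replaces the sum in the denominator with its largest term.
    """
    scores = list(log_likelihoods.values())
    denominator = max(scores) if use_max else log_sum_exp(scores)
    return log_likelihoods[target] - denominator

# Hypothetical segment-level log-likelihoods for three competing phonemes.
segment = {"a": -10.0, "o": -10.5, "e": -12.0}

exact = log_posterior("a", segment)                 # sum in the denominator
approx = log_posterior("a", segment, use_max=True)  # max in the denominator
```

With these made-up values the maximized denominator assigns the best-scoring phoneme a posterior of exactly 1, while the summed denominator leaves room for the close competitor, which is the loss of accuracy the abstract refers to.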


International Conference on Research Challenges in Computer Science | 2009

An Effective CALL System for Strongly Accented Mandarin Speech

Tonghai Jiang; Ming Tang; Fengpei Ge; Changliang Liu; Bin Dong

In this paper, we investigate some specific acoustic problems of the computer assisted language learning (CALL) system by modifying the acoustic model and features within the speech recognition framework. First, to alleviate channel and speaker distortion, speaker-dependent Cepstrum Mean Normalization (Speaker CMN) is adopted, which improves the average correlation coefficient (ACC) between human and machine scores from 78.00% to 84.14%. Then, Heteroscedastic Linear Discriminant Analysis (HLDA) is applied to enhance the discriminative ability of the acoustic model, which increases ACC from 84.14% to 84.62%. Additionally, HLDA reduces the large human-machine scoring difference for speech of very good or very poor quality, and so raises the correct-rank rate from 85.59% to 90.99%. Finally, we use Maximum a Posteriori (MAP) adaptation to tune the acoustic model to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.


International Conference on Natural Computation | 2008

Application of LVCSR to the Detection of Chinese Mandarin Reading Miscues

Changliang Liu; Fuping Pan; Fengpei Ge; Bin Dong; Qingwei Zhao; Yonghong Yan

For a reading tutor, the most important task is to detect reading miscues such as insertions and omissions. This paper constructs a Chinese reading miscue detection system based on general large vocabulary continuous speech recognition (LVCSR) technology and proposes two methods to improve detection performance. The first aligns the reference to the confusion network resulting from recognition, instead of to the 1-best result, to find the reading miscues. The second uses knowledge of the reference to guide decoding by weighting the n-gram language model probability when the n-gram occurs in the reference. Experiments on a Chinese Mandarin reading corpus demonstrate the effectiveness of these two modifications. Together, the two methods reduce the detection MDerr and FArate by 50.1% and 70.8%, respectively.
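As a rough illustration of the first idea, the sketch below aligns a reference word sequence to confusion-network slots with a simple dynamic program. It is a simplification under stated assumptions, not the paper's system: slot posteriors are ignored, each slot is just a set of candidate words, a slot containing the hypothetical `<eps>` token can be skipped for free, and the function name is invented for this example.

```python
def align_to_confusion_network(reference, network, eps="<eps>"):
    """Align reference words to confusion-network slots.

    reference: list of words the reader was expected to say.
    network:   list of slots, each a set of candidate words.
    Returns (omissions, insertions): indices of reference words with no
    matching slot, and indices of slots holding unexpected speech.
    """
    n, m = len(reference), len(network)

    def skip_cost(slot):
        # Skipping a slot is free if the recognizer allows "no word" there.
        return 0 if eps in slot else 1

    # cost[i][j]: cheapest alignment of reference[:i] with network[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + skip_cost(network[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [cost[i - 1][j] + 1,
                       cost[i][j - 1] + skip_cost(network[j - 1])]
            if reference[i - 1] in network[j - 1]:
                options.append(cost[i - 1][j - 1])  # word found in slot
            cost[i][j] = min(options)

    # Trace back through the table to recover which words and slots were unmatched.
    omissions, insertions = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and reference[i - 1] in network[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            omissions.append(i - 1)
            i -= 1
        else:
            if eps not in network[j - 1]:
                insertions.append(j - 1)
            j -= 1
    return sorted(omissions), sorted(insertions)

# Hypothetical example: the reader skipped "du"; the 1-best path alone
# ("wo xiwang hen shui") would have flagged several more words.
reference = ["wo", "xihuan", "du", "shu"]
network = [{"wo"}, {"xihuan", "xiwang"}, {"hen", "<eps>"}, {"shu", "shui"}]
omissions, insertions = align_to_confusion_network(reference, network)
```

Because a reference word counts as read whenever it appears anywhere in a slot, recognition errors on the 1-best path are less likely to be reported as miscues.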


IEICE Transactions on Information and Systems | 2008

Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech

Fengpei Ge; Changliang Liu; Jian Shao; Fuping Pan; Bin Dong; Yonghong Yan

In this paper we present our investigation into improving the performance of our computer-assisted language learning (CALL) system through exploiting the acoustic model and features within the speech recognition framework. First, to alleviate channel distortion, speaker-dependent cepstrum mean normalization (CMN) is adopted and the average correlation coefficient (average CC) between machine and expert scores is improved from 78.00% to 84.14%. Second, heteroscedastic linear discriminant analysis (HLDA) is adopted to enhance the discriminability of the acoustic model, which successfully increases the average CC from 84.14% to 84.62%. Additionally, HLDA causes the scoring accuracy to be more stable at various pronunciation proficiency levels, and thus leads to an increase in the speaker correct-rank rate from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) estimation to tune the acoustic model to fit strongly accented test speech. As a result, the average CC is improved from 84.62% to 86.57%. These three novel techniques improve the accuracy of evaluating pronunciation quality.
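Speaker-dependent cepstral mean normalization can be sketched in a few lines: the mean cepstral vector is estimated over all frames of a speaker rather than per utterance, then subtracted from every frame. This is a minimal illustration assuming features arrive as plain Python lists; the function name and data layout are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def speaker_cmn(utterances):
    """Subtract each speaker's mean cepstral vector from that speaker's frames.

    utterances: list of (speaker_id, frames), where frames is a list of
    equal-length lists of cepstral coefficients.
    Returns a list of (speaker_id, normalized_frames) in the same order.
    """
    sums = {}
    counts = defaultdict(int)
    # First pass: accumulate per-speaker sums over all utterances.
    for speaker, frames in utterances:
        for frame in frames:
            if speaker not in sums:
                sums[speaker] = [0.0] * len(frame)
            for k, value in enumerate(frame):
                sums[speaker][k] += value
            counts[speaker] += 1
    means = {s: [total / counts[s] for total in sums[s]] for s in sums}
    # Second pass: remove the speaker-level mean from every frame.
    return [
        (speaker, [[v - mu for v, mu in zip(frame, means[speaker])]
                   for frame in frames])
        for speaker, frames in utterances
    ]

# Hypothetical two-dimensional features for two speakers.
data = [
    ("spk1", [[1.0, 2.0], [3.0, 4.0]]),
    ("spk1", [[5.0, 6.0]]),
    ("spk2", [[10.0, 0.0]]),
]
normalized = speaker_cmn(data)
```

Pooling the mean over all of a speaker's utterances gives a more stable estimate than per-utterance normalization when individual utterances are short.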


International Conference on Audio, Language and Image Processing | 2008

Some acoustic improvements for pronunciation quality assessment for strongly accented Mandarin speech

Fengpei Ge; Fuping Pan; Changliang Liu; Bin Dong; Qingwei Zhao; Yonghong Yan

This paper presents our recent study on resolving some specific acoustic problems of the computer assisted language learning (CALL) system by modifying the acoustic model (AM) and features within the ASR framework. First, speaker-dependent cepstrum mean normalization (Speaker CMN) is adopted to alleviate channel distortion, which improves the average human-machine scoring correlation coefficient (ACC) from 78.00% to 84.14%. Heteroscedastic linear discriminant analysis (HLDA) is then applied to enhance the discriminative ability of the AM, which increases ACC from 84.14% to 84.62%. Additionally, HLDA reduces the large human-machine scoring difference for speech of very good or very poor pronunciation quality, and so raises the correct-rank rate (CRR) from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) adaptation to tune the AM to match the strongly accented test speech. As a result, ACC improves from 84.62% to 86.57%.


International Conference on Research Challenges in Computer Science | 2009

A Mandarin Pronunciation Quality Assessment System Using Two Kinds of Acoustic Models

Fengpei Ge; Li Lu; Changliang Liu; Fuping Pan; Bin Dong; Yonghong Yan

This paper presents our Mandarin pronunciation quality assessment system for the Putonghua Shuiping Kaoshi (PSK) examination and investigates some measures to improve its assessment accuracy. A selective speaker adaptation method is studied: in the adaptation module, we select well-pronounced speech as the adaptation data and adopt Maximum Likelihood Linear Regression (MLLR) to update the speaker-independent (SI) acoustic model. Besides the triphone-based acoustic model, a monophone-based acoustic model is also applied in our system. Further improvements are obtained by using a Support Vector Machine (SVM) to combine posterior probabilities computed with the triphone-based and monophone-based acoustic models when assessing the goodness of pronunciation. The experimental results show that the average correlation coefficient (ACC) between machine and human scores reaches 0.8549, almost equivalent to the ACC between different experts. The improved system achieves usable performance in actual applications.
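The evaluation metric can be sketched as follows. The abstract does not spell out how the average is taken, so this example assumes the Pearson correlation between the machine scores and each human rater is computed and then averaged over raters. All scores are made up and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_correlation(machine, raters):
    """Mean Pearson correlation between machine scores and each rater (assumed definition)."""
    return sum(pearson(machine, r) for r in raters) / len(raters)

# Hypothetical scores for four speakers from the machine and two experts.
machine = [1.0, 2.0, 3.0, 4.0]
raters = [[2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 2.0, 4.0]]
acc = average_correlation(machine, raters)
```

The same `pearson` function applied to pairs of experts would give the inter-expert agreement that the abstract uses as a point of comparison.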


International Conference on Intelligent Computation Technology and Automation | 2012

Improvement of Acoustic Model in Text-independent Pronunciation Quality Assessment

Yaohui Qi; Changhai Shi; Fengpei Ge; Yonghong Yan

In order to give an accurate assessment, the test speech must first be recognized in a text-independent pronunciation quality assessment system. Field test data has some flaws that degrade recognition performance, such as noise, accent and a spontaneous speaking style. In this paper, we address these factors by improving the acoustic model (AM) of the speech recognition system. Background noise is added to the training data to improve noise robustness. Speaker-based Cepstral Mean and Variance Normalization (SCMVN) is adopted to alleviate channel distortion and the impact of inter-speaker pronunciation variability. Maximum a Posteriori (MAP) adaptation is applied twice to tune the acoustic model to the pronunciation characteristics of the accent and of the spontaneous style of spoken language. According to the experimental results, these measures increase the word correct rate by a relative 44.1% and the correlation coefficient between machine and expert scores by a relative 6.3%.


Journal of the Acoustical Society of America | 2012

Text-independent pronunciation quality automatic assessment system for English retelling test

Yaohui Qi; Bin Dong; Fengpei Ge; Yonghong Yan

An automatic grading system for a spoken English retelling test is presented in this paper. Speech recognition technology is used in the system to evaluate the quality of retelling according to a pre-defined scoring rubric that includes speech fluency, pronunciation accuracy and content integrity. Scoring features for these quality aspects are first extracted by applying LVCSR, keyword spotting, forced alignment and confidence measures. These features are then mapped to a score using an SVM model pre-trained on human-rated test items. According to the experimental results, the correlation coefficient between machine scores and expert scores is 0.729, which means that the system can be used in real examinations to replace human scoring. This work is partially supported by the National Natural Science Foundation of China (No. 10925419, 90920302, 10874203, 60875014, 61072124, 11074275, 11161140319).


International Symposium on Chinese Spoken Language Processing | 2010

Forward optimal measures for automatic mispronunciation detection

Changliang Liu; Fuping Pan; Fengpei Ge; Bin Dong; Yonghong Yan

Pronunciation measure computation is a vital part of a Computer Assisted Pronunciation Training (CAPT) system. This paper studies pronunciation measures based on two popular measures, Log Posterior Probability (LPP) and Goodness of Pronunciation (GOP). A modified GOP, AGOP, is proposed, which directly uses the segmentation information from forced alignment instead of a free phone recognizer (FPR) when computing the denominator of GOP, to avoid the effect of FPR inaccuracy. Context-dependent acoustic models are also investigated for mispronunciation detection. It is found that the triphone AM performs better in mispronunciation detection for continuous speech. This paper also proposes a fast pronunciation measure algorithm, FAGOP, which uses maximization instead of summation to approximate the denominator of AGOP and applies the Viterbi algorithm with an effective pruning strategy to reduce the computational complexity. It achieves much better efficiency while barely impairing detection precision.

Collaboration


Dive into Fengpei Ge's collaborations.

Top Co-Authors

Yonghong Yan | Chinese Academy of Sciences
Bin Dong | Chinese Academy of Sciences
Changliang Liu | Chinese Academy of Sciences
Fuping Pan | Chinese Academy of Sciences
Qingwei Zhao | Chinese Academy of Sciences
Li Lu | Chinese Academy of Sciences
Ran Xu | Chinese Academy of Sciences
Tonghai Jiang | Chinese Academy of Sciences
Yaohui Qi | Beijing Institute of Technology
Shui-duen Chan | Hong Kong Polytechnic University