Yanqing Sun
Chinese Academy of Sciences
Publication
Featured research published by Yanqing Sun.
International Conference on Information Engineering and Computer Science | 2009
Yu Zhou; Yanqing Sun; Jianping Zhang; Yonghong Yan
In this paper, we propose a speech emotion recognition system that uses both spectral and prosodic features. Most traditional systems have focused on either spectral or prosodic features alone; since both carry emotion information, combining them can be expected to improve recognition performance. For the spectral features, a GMM-supervector-based SVM is applied. For the prosodic features, a set of features clearly correlated with emotional states in speech is selected, and an SVM is again used for classification. Combining the two feature streams is posed as a data fusion problem to obtain the final decision. Experimental results show that combining spectral and prosodic features yields relative emotion error reduction rates of 18.0% and 52.8% over using only spectral features and only prosodic features, respectively.
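As a rough illustration of the late-fusion idea described above, the sketch below trains separate SVMs on synthetic spectral and prosodic features and combines their class posteriors with a fixed weight. The data, dimensions, and fusion weight are placeholders, not the paper's configuration.

```python
# Minimal sketch of score-level fusion of a spectral-feature classifier and a
# prosodic-feature classifier; all data and parameters are synthetic stand-ins.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 utterances, 64-dim "spectral" (e.g. GMM-supervector)
# features, 12-dim "prosodic" features, 4 emotion classes.
n, n_classes = 200, 4
X_spec = rng.normal(size=(n, 64))
X_pros = rng.normal(size=(n, 12))
y = rng.integers(0, n_classes, size=n)

spec_svm = SVC(probability=True).fit(X_spec, y)
pros_svm = SVC(probability=True).fit(X_pros, y)

# Late fusion: combine per-class posteriors with a weight that would normally
# be tuned on a development set (0.6 here is an arbitrary placeholder).
w = 0.6
fused = w * spec_svm.predict_proba(X_spec) + (1 - w) * pros_svm.predict_proba(X_pros)
pred = fused.argmax(axis=1)
print("fused training accuracy:", (pred == y).mean())
```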
International Conference on Research Challenges in Computer Science | 2009
Yu Zhou; Yanqing Sun; Lin Yang; Yonghong Yan
In this paper, we present an approach that uses articulatory features (AFs) derived from spectral features for speech emotion recognition, and we investigate the combination of AFs with spectral features. Systems based on AFs alone and on combined spectral-articulatory features are tested on the CASIA Mandarin emotional corpus. Experimental results show that AFs alone are not suitable for speech emotion recognition, and that combining spectral features with AFs does not improve performance over using spectral features alone.
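A minimal sketch of the feature-level combination tested here, assuming the AF stream is appended to the spectral vector frame by frame; the dimensions and arrays are illustrative placeholders, not the paper's setup.

```python
# Frame-by-frame concatenation of spectral features with articulatory-feature
# (AF) posteriors; all shapes and values below are synthetic examples.
import numpy as np

frames = 300
spectral = np.random.randn(frames, 39)     # e.g. MFCC + deltas per frame
af_posteriors = np.random.rand(frames, 8)  # e.g. posteriors for 8 AF classes

combined = np.hstack([spectral, af_posteriors])  # (300, 47) combined features
print(combined.shape)
```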
International Conference on Information Engineering and Computer Science | 2009
Yanqing Sun; Yu Zhou; Qingwei Zhao; Yonghong Yan
This paper addresses the problem of performance degradation in emotion-affected speech recognition. The F-Ratio method from statistics is used to analyze the significance of different frequency bands for speech unit classification, and the result is then used to optimize the filter bank design for Mel-frequency cepstral coefficient (MFCC) and Perceptual Linear Prediction (PLP) features in emotion-affected speech recognition. Under comparable conditions, the modified features achieve relative sentence error rate reductions of 40.14% for MFCC and 34.93% for PLP.
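The F-Ratio of a frequency band can be computed as the ratio of between-class to within-class variance of the band's energy. A minimal sketch on synthetic data, assuming per-band energies grouped by speech-unit class:

```python
# F-Ratio per frequency band: between-class variance of band energy divided
# by within-class variance. Data below is synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_bands, n_classes, per_class = 24, 5, 100

# Synthetic band energies: shape (class, sample, band).
energy = rng.normal(size=(n_classes, per_class, n_bands))
energy[0] += 1.0  # shift class 0 so every band carries some discrimination

class_means = energy.mean(axis=1)                  # (n_classes, n_bands)
global_mean = class_means.mean(axis=0)             # (n_bands,)
between = ((class_means - global_mean) ** 2).mean(axis=0)
within = energy.var(axis=1).mean(axis=0)
f_ratio = between / within                         # one score per band

# Bands with high F-Ratio would receive finer filter-bank resolution.
print("most discriminative bands:", np.argsort(f_ratio)[::-1][:5])
```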
International Conference on e-Education, e-Business, e-Management and e-Learning | 2010
Yanqing Sun; Qingwei Zhao; Changliang Liu; Yonghong Yan
This paper presents our confidence measure system for speech recognition, intended to be integrated with e-Services to make human-computer interaction more convenient. To make the system robust enough for practical use, the confidence measure is optimized for both performance and speed relative to a traditional state-based confidence measure. First, the decoding likelihood of the best path is normalized over all surviving paths to form a one-pass posterior. After decoding, once the recognition result and the phoneme-level segmentation are available, a phoneme-loop posterior confidence is calculated. Different models are compared for speed and performance, and then combined to form the final confidence used for the accept/reject decision. In experiments, the proposed confidence measure achieves relative improvements of 20% and 19.33% in equal error rate, and of 37.19% and 35.17% in false acceptance rate for the out-of-vocabulary set, on the development and test sets respectively, with no loss in false rejection rate for the in-vocabulary set.
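A minimal sketch of the first normalization step, assuming the decoder exposes log-likelihoods for the surviving paths: the best path's log-likelihood is normalized over all survivors via log-sum-exp to give a posterior-style confidence.

```python
# Normalize the best path's log-likelihood by all surviving paths to obtain a
# posterior-style confidence; the path scores below are synthetic examples.
import numpy as np

path_loglikes = np.array([-120.3, -124.8, -125.1, -130.0])  # best path first

# log p(best | survivors) = ll_best - logsumexp(all lls), computed stably
m = path_loglikes.max()
log_norm = m + np.log(np.exp(path_loglikes - m).sum())
confidence = np.exp(path_loglikes[0] - log_norm)
print(f"posterior confidence: {confidence:.3f}")  # accept above a tuned threshold
```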
Web Information Systems Modeling | 2009
Jie Gao; Yanqing Sun; Hongbin Suo; Qingwei Zhao; Yonghong Yan
We address the problem of effectively monitoring audio programs on the web. The paper presents how to construct such an audio program surveillance system using several state-of-the-art speech technologies, taking the real-world system WAPS (Web Audio Program Surveillance) as an example. WAPS is described in detail in terms of the challenges it faces, its system architecture, and its component modules, and an objective evaluation of the whole system is given. Experiments show that WAPS performs satisfactorily on both artificially created data and real web data.
International Workshop on Advanced Computational Intelligence | 2010
Yanqing Sun; Qingwei Zhao; Qingqing Zhang; Yu Zhou; Yonghong Yan
This paper reports our recent work on optimizing AF (articulatory feature) based confidence measures and combining them with traditional HMM-based confidence measures. Different articulatory properties are analyzed using a separate AF-based confidence calculation method proposed in this paper, and are observed to be both complementary and redundant. Based on these analyses and contrast experiments, a more compact subset is chosen and assembled, which achieves a relative EER improvement of 12.7% over using the whole AF set. The optimized AF-based confidence is finally combined with the HMM-based confidence, which increases the rejection rate for out-of-vocabulary tests with no accuracy loss on in-vocabulary tests compared with the baseline HMM system; the relative improvement in false acceptance rate is 34% on the development sets and 35.3% on the test sets.
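A minimal sketch of the final fusion step, assuming both confidences are scaled to [0, 1]; the weight and threshold below are placeholders that would be tuned on a development set, not values from the paper.

```python
# Weighted combination of an HMM-based and an AF-based confidence score,
# thresholded to reject likely out-of-vocabulary hypotheses.
def combined_confidence(c_hmm: float, c_af: float, w: float = 0.7) -> float:
    """Linear fusion of two per-word confidence scores in [0, 1]."""
    return w * c_hmm + (1.0 - w) * c_af

def accept(c_hmm: float, c_af: float, threshold: float = 0.5) -> bool:
    """Accept the hypothesis only if the fused confidence clears the threshold."""
    return combined_confidence(c_hmm, c_af) >= threshold

print(accept(0.82, 0.64))  # True: confident in-vocabulary hypothesis
print(accept(0.30, 0.25))  # False: likely out-of-vocabulary, rejected
```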
International Conference on Future Biomedical Information Engineering | 2009
Yanqing Sun; Yu Zhou; Qingwei Zhao; Yonghong Yan; Xiao Wu
In this paper, a fuzzy retrieval algorithm is designed to work with LVCSR in a speech navigation system. Inverted indexing and other search techniques are used to speed up retrieval while preserving accuracy, and several indexing unit levels are tried instead of using only words. The framework easily reaches 90% sentence accuracy on a normal-sized database, and it scales to very large databases of millions of entries while preserving at least 70% accuracy in both Chinese and English.
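A minimal sketch of the inverted-index lookup the abstract credits with keeping retrieval fast: toy data, word-level units only (the paper also tries other unit levels), and a crude shared-unit count standing in for the fuzzy matching.

```python
# Inverted index over recognized transcripts: unit -> documents containing it.
from collections import defaultdict

docs = {
    0: "play the next song",
    1: "call the office",
    2: "play my favourite playlist",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for unit in text.split():
        index[unit].add(doc_id)

def search(query: str) -> list:
    """Rank candidates by how many query units they share (a crude fuzziness)."""
    hits = defaultdict(int)
    for unit in query.split():
        for doc_id in index.get(unit, ()):
            hits[doc_id] += 1
    return sorted(hits, key=hits.get, reverse=True)

print(search("play song"))  # -> [0, 2]: doc 0 matches both units, doc 2 one
```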
International Workshop on Advanced Computational Intelligence | 2010
Yanqing Sun; Jie Gao; Yu Zhou; Fuping Pan; Qingwei Zhao; Yonghong Yan
This paper describes the TBNR system, which incorporates many state-of-the-art speech recognition technologies covering the decoder, acoustic modeling, and feature extraction. In integrating these technologies, several optimizations were performed to exploit multi-processor resources. Together with models for several typical languages, the system can be used immediately for a range of applications, and its API functions are designed so that users unfamiliar with speech recognition can complete a full recognition task by writing only a few lines of code.
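The abstract does not give the actual TBNR interface, so the sketch below only illustrates the design intent with invented placeholder names: a facade that hides model loading, decoding, and post-processing behind a single call.

```python
# Hypothetical illustration of the "few lines of code" API style described
# above; the class, method names, and internals are invented placeholders,
# not the real TBNR API.
class SpeechRecognizer:
    """Facade bundling acoustic model, decoder, and post-processing."""

    def __init__(self, language: str = "mandarin"):
        self.language = language  # a real system would load models here

    def recognize(self, wav_path: str) -> str:
        # A real system would read the audio, extract features, and decode;
        # this stub only shows the single-call shape of the interface.
        return f"<transcript of {wav_path} ({self.language})>"

# A complete "application" in three lines, as the design intends:
recognizer = SpeechRecognizer(language="mandarin")
text = recognizer.recognize("meeting.wav")
print(text)
```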
IEICE Transactions on Information and Systems | 2010
Yanqing Sun; Yu Zhou; Qingwei Zhao; Yonghong Yan
Conference of the International Speech Communication Association | 2009
Yu Zhou; Yanqing Sun; Junfeng Li; Jianping Zhang; Yonghong Yan