Ching-feng Yeh
National Taiwan University
Publications
Featured research published by Ching-feng Yeh.
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Hung-yi Lee; Sz-Rung Shiang; Ching-feng Yeh; Yun-Nung Chen; Yu Huang; Sheng-yi Kong; Lin-Shan Lee
Going through a complete online course takes a very long time, and without the proper background it is also difficult to understand retrieved spoken paragraphs. This paper therefore presents a new approach to spoken knowledge organization for course lectures, aimed at efficient personalized learning. Automatically extracted key terms are taken as the fundamental elements of the semantics of the course. A key term graph, constructed by connecting related key terms, forms the backbone of the global semantic structure. Audio/video signals are divided into a multi-layer temporal structure comprising paragraphs, sections, and chapters, each of which includes a summary as the local semantic structure. The interconnection between the semantic and temporal structures, together with spoken term detection, offers learners efficient ways to navigate the course knowledge along personalized learning paths that take into account their personal interests, available time, and background knowledge. A preliminary prototype system has also been successfully developed.
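As a rough illustration of the key term graph described above, here is a minimal sketch that connects two key terms when they co-occur in enough lecture paragraphs. The relatedness measure (paragraph-level co-occurrence counts), the threshold, and the function names are illustrative assumptions, not the paper's exact method.

```python
# Sketch: build a key-term graph from paragraph-level co-occurrence.
from collections import defaultdict
from itertools import combinations

def build_key_term_graph(paragraphs, key_terms, min_cooccurrence=2):
    """Connect two key terms when they co-occur in enough paragraphs.

    paragraphs: list of token lists (one per transcribed paragraph)
    key_terms:  set of automatically extracted key terms
    Returns an adjacency map {term: {related_term: weight}}.
    """
    counts = defaultdict(int)
    for tokens in paragraphs:
        present = sorted(key_terms.intersection(tokens))
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1

    graph = defaultdict(dict)
    for (a, b), c in counts.items():
        if c >= min_cooccurrence:
            graph[a][b] = c
            graph[b][a] = c
    return graph

# Toy usage: two tokenized "paragraphs" from a lecture transcript.
paras = [["hidden", "markov", "model", "viterbi"],
         ["viterbi", "decoding", "hidden", "markov", "model"]]
print(build_key_term_graph(paras, {"viterbi", "model", "decoding"}, 1))
```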
international conference on acoustics, speech, and signal processing | 2012
Ching-feng Yeh; Aaron Heidel; Hung-yi Lee; Lin-Shan Lee
In this work, we propose a new framework for recognizing highly imbalanced code-mixed bilingual speech by adding a frame-level language detector to the conventional recognition system. Blurred posteriorgram features (BPFs) are also proposed for use in the language detector. The approach was evaluated on real spontaneous lectures offered at National Taiwan University; the highly imbalanced language distribution in code-mixed speech makes the task difficult. Preliminary experimental results showed not only very good performance improvement, but also that the improvement is complementary to that brought by better acoustic models, whether obtained through better adaptation approaches or increased training data. Code-mixed bilingual speech is frequently used in the daily lives of many people in today's globalized world.
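For a concrete picture of what "blurred" posteriorgram features might look like, here is a minimal sketch that averages frame-level phone posteriors over a temporal window. The window-mean blurring and window size are assumptions for illustration; the paper defines its own BPF computation.

```python
# Sketch: smooth a phone posteriorgram over time to obtain BPF-like features.
import numpy as np

def blur_posteriorgram(posteriors, half_window=5):
    """posteriors: (num_frames, num_phones) array of per-frame posteriors.
    Returns an array of the same shape, with each frame replaced by the
    mean over a (2 * half_window + 1)-frame neighborhood."""
    T = posteriors.shape[0]
    blurred = np.empty_like(posteriors)
    for t in range(T):
        lo, hi = max(0, t - half_window), min(T, t + half_window + 1)
        blurred[t] = posteriors[lo:hi].mean(axis=0)
    return blurred

# Toy usage: 100 frames over 40 phone classes.
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(40), size=100)
bpf = blur_posteriorgram(post)   # smoother frame-level features
print(post.shape, bpf.shape)     # (100, 40) (100, 40)
```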
international conference on acoustics, speech, and signal processing | 2011
Ching-feng Yeh; Liang-Che Sun; Chao-Yu Huang; Lin-Shan Lee
This paper presents a bilingual acoustic modeling approach for transcribing Mandarin-English code-mixed lectures with a highly unbalanced language distribution. Special terminology for the content is produced in the guest language, English (about 15%), and embedded in utterances produced in the host language, Mandarin (about 85%). The code-mixing nature of the target corpus and the very small percentage of English data make the task difficult. State mapping and merging approaches, plus three stages of model adaptation, handle this problem. Significant improvements in recognition accuracy were obtained in experiments with a real bilingual code-mixed lecture corpus recorded at National Taiwan University. The code-mixing situation considered here arises very naturally in the spoken language of many people's daily lives in today's globalized world.
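To make the state mapping idea concrete, the sketch below maps each guest-language HMM state to its nearest host-language state by KL divergence between single diagonal Gaussians. The distance measure and the single-Gaussian simplification are assumptions for illustration; the paper's acoustic models are GMM-based.

```python
# Sketch: cross-lingual HMM state mapping by Gaussian KL divergence.
import numpy as np

def diag_gauss_kl(mu0, var0, mu1, var1):
    """KL(N0 || N1) for diagonal-covariance Gaussians."""
    return 0.5 * np.sum(np.log(var1 / var0)
                        + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def map_states(eng_states, man_states):
    """Map each English state to its closest Mandarin state.
    Both arguments: {state_id: (mean_vector, variance_vector)}."""
    mapping = {}
    for eid, (mu_e, var_e) in eng_states.items():
        best = min(man_states,
                   key=lambda mid: diag_gauss_kl(mu_e, var_e, *man_states[mid]))
        mapping[eid] = best
    return mapping

# Toy usage with hypothetical state names and 3-dimensional features.
eng = {"EN_AE_s2": (np.zeros(3), np.ones(3))}
man = {"ZH_a_s2": (np.full(3, 0.1), np.ones(3)),
       "ZH_i_s2": (np.full(3, 2.0), np.ones(3))}
print(map_states(eng, man))   # {'EN_AE_s2': 'ZH_a_s2'}
```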
spoken language technology workshop | 2010
Hung-yi Lee; Chia-ping Chen; Ching-feng Yeh; Lin-Shan Lee
This paper presents a new framework integrating different relevance feedback scenarios (pseudo relevance feedback and user relevance feedback in short- and long-term contexts) and different approaches (model- and example-based) in a spoken term detection system, and shows that retrieval performance can be improved step by step. Short-term context user relevance feedback is found to further improve retrieval performance after pseudo relevance feedback, regardless of whether the acoustic models have been adapted with matched data or long-term context user relevance feedback. Moreover, model-based and example-based methods are shown to be additive when integrated in the short-term context user relevance feedback scenario.
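As one hedged sketch of the example-based pseudo relevance feedback step, the snippet below treats the top-N first-pass detections as pseudo-relevant examples and re-ranks every hypothesis by its average similarity to them. The cosine similarity, fixed-length features, and interpolation weight are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: example-based pseudo relevance feedback re-ranking.
import numpy as np

def prf_rerank(scores, features, top_n=5, alpha=0.5):
    """scores:   (H,) first-pass detection scores for H hypotheses.
    features: (H, D) fixed-length feature vector per hypothesized region.
    Returns new scores interpolating the first-pass score with the mean
    cosine similarity to the pseudo-relevant (top-N) set."""
    order = np.argsort(scores)[::-1]
    pseudo = features[order[:top_n]]
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = pseudo / np.linalg.norm(pseudo, axis=1, keepdims=True)
    sim = (f @ p.T).mean(axis=1)
    return alpha * scores + (1 - alpha) * sim

# Toy usage: 20 hypotheses with 16-dimensional features.
rng = np.random.default_rng(0)
print(prf_rerank(rng.random(20), rng.normal(size=(20, 16))).shape)  # (20,)
```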
international conference on acoustics, speech, and signal processing | 1988
Ching-feng Yeh; Ju-Hong Lee; Yu-Hao Chen
Estimating the two-dimensional (2-D) angle of arrival of radiating sources in a coherent environment is studied. The concept of spatial smoothing is first extended to a rectangular planar array, and a 2-D search function is formed to estimate the source directions. To avoid performing a 2-D search, an approach based on two one-dimensional (1-D) searches is also discussed; this approach uses the rows and columns of the rectangular array to perform the 1-D searches. To match the data obtained, a 2-D verification is then performed. Computer simulation results for both approaches, based on the MUSIC method, are presented.
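The 1-D search underlying the two-1-D-searches approach can be sketched as below: forward spatial smoothing to decorrelate coherent sources, followed by a MUSIC pseudo-spectrum search over candidate angles. The array geometry, subarray size, and toy source setup are assumptions for illustration only.

```python
# Sketch: 1-D MUSIC with forward spatial smoothing for coherent sources.
import numpy as np

def spatial_smoothing(R, sub_size):
    """Average the covariance over overlapping subarrays to restore rank
    for coherent sources. R: (M, M) sample covariance of an M-element ULA."""
    M = R.shape[0]
    num_sub = M - sub_size + 1
    Rs = np.zeros((sub_size, sub_size), dtype=complex)
    for k in range(num_sub):
        Rs += R[k:k + sub_size, k:k + sub_size]
    return Rs / num_sub

def music_spectrum(Rs, num_sources, angles_deg, spacing=0.5):
    """Evaluate the MUSIC pseudo-spectrum over candidate angles
    (element spacing in wavelengths)."""
    _, vecs = np.linalg.eigh(Rs)              # ascending eigenvalues
    En = vecs[:, :Rs.shape[0] - num_sources]  # noise subspace
    m = np.arange(Rs.shape[0])
    spec = []
    for th in np.deg2rad(angles_deg):
        a = np.exp(2j * np.pi * spacing * m * np.sin(th))
        spec.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return np.array(spec)

# Toy usage: 8-element ULA, two coherent sources at -20 and 30 degrees.
rng = np.random.default_rng(1)
M, T = 8, 200
m = np.arange(M)
A = np.exp(2j * np.pi * 0.5 * np.outer(m, np.sin(np.deg2rad([-20.0, 30.0]))))
s = rng.standard_normal((1, T))               # one shared waveform -> coherent
X = A @ np.vstack([s, s]) + 0.1 * (rng.standard_normal((M, T))
                                   + 1j * rng.standard_normal((M, T)))
R = X @ X.conj().T / T
grid = np.linspace(-90, 90, 361)
spec = music_spectrum(spatial_smoothing(R, 6), 2, grid)
print(grid[spec.argmax()])                    # a peak near one true angle
```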
IEEE Transactions on Audio, Speech, and Language Processing | 2015
Ching-feng Yeh; Lin-Shan Lee
This paper considers the recognition of a widely observed type of bilingual code-switched speech: the speaker speaks primarily the host language (usually the native language), but inserts a few words or phrases of the guest language (usually a second language) into many host-language utterances. In this case, not only are the languages switched back and forth within an utterance, making language identification difficult, but also much less data is available for the guest language, which results in poor recognition accuracy for the guest-language parts. Unit merging approaches at three levels of acoustic modeling (triphone models, HMM states, and Gaussians) have previously been proposed for cross-lingual data sharing in such highly imbalanced bilingual code-switched speech. In this paper, we present an improved overall framework on top of the previously proposed unit merging approaches for recognizing such code-switched speech. This includes unit recovery for reconstructing the identities of the units of the two languages after merging, unit occupancy ranking to offer much more flexible data sharing between units both across and within languages based on the accumulated occupancy of the HMM states, and estimation of frame-level language posteriors using blurred posteriorgram features (BPFs) for use in decoding. We also present a complete set of experimental results comparing all the approaches involved in a real-world application scenario under unified conditions, and show very good improvement with the proposed approaches.
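One way to picture how frame-level language posteriors could enter decoding is the log-linear combination sketched below, where the acoustic log-likelihood of a unit is offset by the log posterior of that unit's language at the current frame. The additive combination and the weight are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: biasing per-frame acoustic scores with language posteriors.
import math

def rescored_loglike(acoustic_loglike, unit_language, frame_lang_posteriors,
                     weight=1.0):
    """acoustic_loglike:      log p(o_t | unit) from the acoustic model
    unit_language:         'mandarin' or 'english' for this unit
    frame_lang_posteriors: {'mandarin': p, 'english': 1 - p} for frame t."""
    return acoustic_loglike + weight * math.log(frame_lang_posteriors[unit_language])

# At a frame the detector believes is mostly Mandarin, English units
# are penalized relative to Mandarin ones.
post = {"mandarin": 0.9, "english": 0.1}
print(rescored_loglike(-5.0, "mandarin", post))  # -5.0 + log 0.9
print(rescored_loglike(-5.0, "english", post))   # -5.0 + log 0.1
```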
international conference on acoustics, speech, and signal processing | 2014
Ching-feng Yeh; Lin-Shan Lee
This paper considers the transcription of widely observed yet less investigated bilingual code-switched speech: words or phrases of the guest language are inserted within utterances of the host language, so the languages are switched back and forth within an utterance, and much less data is available for the guest language. Two approaches utilizing deep neural networks (DNNs) were tested and analyzed: using DNN bottleneck features in an HMM/GMM system (BF-HMM/GMM) and modeling context-dependent HMM senones with a DNN (CD-DNN-HMM). In both cases, unit merging (and recovery) techniques in acoustic modeling were used to handle the data imbalance problem. Improved recognition accuracy was observed with unit merging (and recovery) for both approaches under different conditions.
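A minimal sketch of the bottleneck-feature idea behind BF-HMM/GMM follows: a senone-classifying MLP with one narrow hidden layer, whose activations replace the original spectral features fed to the HMM/GMM system. The layer sizes, activation choice, and random weights here are placeholders for illustration, not the paper's configuration.

```python
# Sketch: extracting bottleneck features from a narrow MLP layer.
import numpy as np

rng = np.random.default_rng(0)
D_in, D_hid, D_bn, D_out = 440, 1024, 40, 3000  # e.g. stacked frames -> senones

W1 = rng.normal(0, 0.01, (D_in, D_hid))
W2 = rng.normal(0, 0.01, (D_hid, D_bn))         # narrow bottleneck layer
W3 = rng.normal(0, 0.01, (D_bn, D_out))         # senone layer, used only
                                                # when training the network

def bottleneck_features(x):
    """x: (num_frames, D_in) stacked input frames.
    Returns (num_frames, D_bn) bottleneck activations; extraction stops
    before the senone output layer (W3)."""
    h = np.tanh(x @ W1)
    return np.tanh(h @ W2)

frames = rng.normal(size=(100, D_in))
print(bottleneck_features(frames).shape)        # (100, 40)
```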
conference of the international speech communication association | 2010
Chia-ping Chen; Hung-yi Lee; Ching-feng Yeh; Lin-Shan Lee
conference of the international speech communication association | 2011
Yun-Nung Chen; Yu Huang; Ching-feng Yeh; Lin-Shan Lee
conference of the international speech communication association | 2011
Ching-feng Yeh; Chao-Yu Huang; Lin-Shan Lee