Yi-Chung Lin
National Tsing Hua University
Publications
Featured research published by Yi-Chung Lin.
International Conference on Computational Linguistics | 1992
Tung-Hui Chiang; Yi-Chung Lin; Keh-Yih Su
In this paper, a discrimination- and robustness-oriented adaptive learning procedure is proposed for the task of syntactic ambiguity resolution. Owing to insufficient training data and the approximation error introduced by the language model, traditional statistical approaches, which resolve ambiguities indirectly and implicitly through maximum likelihood estimation, fail to achieve high performance in real applications. The proposed method remedies these problems by adjusting the parameters to maximize the accuracy rate directly. To make the algorithm robust, possible variations between the training corpus and the real task are also taken into account by enlarging the separation margin between the correct candidate and its competing members. Significant improvement has been observed in the test: the accuracy rate of syntactic disambiguation is raised from 46.0% to 60.62% by this approach.
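The adaptive step can be pictured as a margin-based update over linear candidate scores. The sketch below illustrates that idea rather than reproducing the paper's exact procedure: candidates are feature-count dictionaries, and whenever the correct parse fails to beat its strongest competitor by a fixed margin, the weights are nudged apart. The feature names, learning rate, and update rule are all assumptions.

```python
# A minimal sketch of margin-based discriminative updating for ambiguity
# resolution. Candidates are scored as weighted feature sums; if the correct
# candidate does not beat its best rival by at least `margin`, the weights
# are adjusted toward the correct candidate. Illustrative, not the paper's
# exact formulation.

def score(weights, features):
    """Linear score of a candidate given its feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())

def margin_update(weights, correct, competitors, margin=1.0, lr=0.1):
    """One adaptive step: enlarge the gap between the correct candidate
    and its strongest competitor."""
    best_rival = max(competitors, key=lambda c: score(weights, c))
    if score(weights, correct) - score(weights, best_rival) >= margin:
        return  # already separated by the desired margin
    for f, v in correct.items():      # reward the correct candidate's features
        weights[f] = weights.get(f, 0.0) + lr * v
    for f, v in best_rival.items():   # penalize the rival's features
        weights[f] = weights.get(f, 0.0) - lr * v

# Toy usage: two candidate parses described by rule-count features.
w = {}
gold = {"NP->Det N": 2, "VP->V NP": 1}
rivals = [{"NP->Det N": 1, "VP->V PP": 2}]
for _ in range(10):
    margin_update(w, gold, rivals)
print(score(w, gold) > score(w, rivals[0]))  # True after a few updates
```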
International Conference on Acoustics, Speech, and Signal Processing | 1992
Keh-Yih Su; Tung-Hui Chiang; Yi-Chung Lin
To enhance the performance of spoken language processing, a unified framework is proposed to integrate speech and language information. The framework uses probabilistic formulations to characterize the different language analyses generated by a language processing module. Because probabilistic formulations are used in both the speech and language processing modules, information from the two modules can be easily integrated. To further improve performance, a discrimination- and robustness-oriented learning procedure is proposed to adjust the parameters of the probabilistic formulations. Significant improvement has been observed on the task of reading aloud Chinese computer manuals, which operates in a speaker-dependent, isolated-word mode.
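One way to read the unified framework: if both modules emit log-probabilities, hypotheses can be ranked by a weighted log-linear combination, and the weight is exactly the kind of parameter a discriminative learning procedure would tune. A minimal sketch, with toy probabilities and an assumed interpolation weight `lam`:

```python
import math

# A minimal sketch of combining speech and language evidence under one
# probabilistic score, assuming both modules emit log-probabilities.
# The weight `lam` and all probabilities below are toy assumptions.

def combined_score(acoustic_logp, language_logp, lam=0.6):
    """Log-linear fusion of acoustic and language-model evidence."""
    return lam * acoustic_logp + (1.0 - lam) * language_logp

# Rank candidate word hypotheses by the fused score.
candidates = [
    ("word_a", math.log(0.40), math.log(0.05)),
    ("word_b", math.log(0.35), math.log(0.30)),
]
best = max(candidates, key=lambda c: combined_score(c[1], c[2]))
print(best[0])  # "word_b": weaker acoustically but favoured by the LM
```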
International Conference on Computational Linguistics | 1994
Yi-Chung Lin; Tung-Hui Chiang; Keh-Yih Su
Statistical NLP models usually consider only coarse information and a very restricted context to make parameter estimation feasible. To reduce the modeling error introduced by such a simplified probabilistic model, the Classification and Regression Tree (CART) method is adopted in this paper to select more discriminative features for automatic model refinement. Because features are applied one after another as the classification tree is split, the amount of training data in each terminal node becomes small, which makes the labeling of terminal nodes unreliable. This over-tuning phenomenon cannot be completely removed by the cross-validation (i.e., pruning) process. A probabilistic classification model based on the selected discriminative features is therefore proposed to use the training data more efficiently. In tagging the Brown Corpus, our probabilistic classification model reduces the error rate of the top 10 error-dominant words from 5.71% to 4.35%, a 23.82% improvement over the unrefined model.
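The feature-selection half of this idea can be sketched as follows: rank candidate context features by a CART-style split criterion (information gain here) and keep the most discriminative ones, which would then feed a separate probabilistic classifier rather than sparse terminal-node labels. The feature names, toy data, and choice of information gain below are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

# A minimal sketch of CART-style feature selection for tagging. In the
# paper's setting the selected features would then feed a separate
# probabilistic classification model; everything here is illustrative.

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(data, feat):
    """Reduction in tag entropy obtained by splitting on `feat`."""
    base = entropy([tag for _, tag in data])
    groups = defaultdict(list)
    for feats, tag in data:
        groups[feats[feat]].append(tag)
    remainder = sum(len(g) / len(data) * entropy(g) for g in groups.values())
    return base - remainder

def select_features(data, candidates, k=1):
    """Keep the k most discriminative candidate features."""
    return sorted(candidates, key=lambda f: info_gain(data, f), reverse=True)[:k]

# Toy tagging data: (context features, tag).
data = [
    ({"prev_tag": "DT", "suffix": "s"}, "NN"),
    ({"prev_tag": "DT", "suffix": "e"}, "NN"),
    ({"prev_tag": "TO", "suffix": "e"}, "VB"),
    ({"prev_tag": "TO", "suffix": "s"}, "VB"),
]
print(select_features(data, ["prev_tag", "suffix"]))  # ['prev_tag']
```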
Computer Speech & Language | 1995
Yi-Chung Lin; Tung-Hui Chiang; Keh-Yih Su
To reduce the estimation error introduced by insufficient training data, the parameters of probabilistic models are usually smoothed with techniques such as Good–Turing smoothing and back-off smoothing. However, the discriminative power of the model cannot be significantly enhanced by smoothing alone. Therefore, in this paper an adaptive learning method is adopted to enhance the discriminative power of a probabilistic model. In addition, a novel tying scheme is proposed to tie the unreliable parameters that never or rarely occur in the training data, so that those parameters have a greater chance of being adjusted by the learning procedure. In the task of tagging the Brown Corpus, this approach greatly reduces the number of parameters from 578,759 to 27,947 and reduces the error rate on ambiguous words (i.e. words with more than one possible part of speech) from 5.48% to 4.93%, a 10.4% error reduction rate. Furthermore, a probabilistic model is usually simplified to enable reliable estimation of its parameters from the limited amount of training data. As a consequence, the modelling error increases because some discriminative features are sacrificed in simplifying the model. Therefore, a probabilistic classification model is proposed to reduce the modelling error by better using the discriminative features selected by the Classification and Regression Tree method. The proposed model achieves a 19.16% error reduction rate for the top 30 error-contributing words, which account for 31.64% of the overall tagging errors.
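The tying scheme can be illustrated with a count threshold: contexts observed too rarely to support their own parameter are pooled into one shared parameter that later learning steps can adjust. A minimal sketch, in which the threshold and the single shared bucket are assumptions rather than the paper's actual scheme:

```python
from collections import Counter

# A minimal sketch of parameter tying: contexts observed fewer than
# `threshold` times are pooled into one shared ("tied") parameter, so rare
# events still receive probability mass that learning can adjust.
# Illustrative assumptions throughout.

def tie_parameters(counts, threshold=3):
    """Map each context to itself (reliable) or to a shared TIED bucket
    (unreliable), pooling the bucket's counts."""
    mapping, tied_counts = {}, Counter()
    for ctx, c in counts.items():
        key = ctx if c >= threshold else "TIED"
        mapping[ctx] = key
        tied_counts[key] += c
    return mapping, tied_counts

counts = Counter({("DT", "NN"): 120, ("TO", "VB"): 45,
                  ("FW", "SYM"): 1, ("UH", "LS"): 2})
mapping, tied = tie_parameters(counts)
print(mapping[("FW", "SYM")], tied["TIED"])  # TIED 3 (two rare contexts pooled)
```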
International Conference on Acoustics, Speech, and Signal Processing | 1994
Tung-Hui Chiang; Yi-Chung Lin; Keh-Yih Su
A joint learning algorithm, which enables the parameters of an integrated speech and language model to be trained jointly, is proposed in this paper. The integrated model enhances the spoken language system with high-level knowledge and operates in a character-synchronous mode. This integrated model was tested on the task of recognizing isolated Chinese characters in speaker-independent mode with a very large vocabulary of 90,495 words, and a character accuracy rate of 88.26% was obtained. In contrast, only a 75.71% accuracy rate was achieved by the baseline system, which directly couples the speech recognizer with a character-bigram language module. The parameters of the speech and language modules were then jointly adjusted according to their contributions to discrimination, and the dynamic-range variations among the parameters of the different modules were also well tuned during learning. After applying this procedure to the character-synchronous integration model, a very promising result of 94.16% character accuracy (a 75.96% error reduction rate) was obtained.
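Character-synchronous integration can be pictured as a beam search that extends hypotheses one character at a time, scoring each extension by a weighted sum of acoustic evidence and a character-bigram probability. The sketch below is an assumption-laden illustration: the weights `w_ac` and `w_lm` stand in for the jointly trained module parameters, and all probabilities are toy values.

```python
import math

# A minimal sketch of character-synchronous decoding: at each character
# position the recognizer proposes scored candidates, and hypotheses are
# extended with a weighted sum of acoustic and bigram language scores.
# Weights and probabilities are toy assumptions.

def decode(cand_lists, bigram_logp, w_ac=1.0, w_lm=0.8, beam=3):
    """cand_lists[t] = [(char, acoustic_logp), ...] for position t."""
    hyps = [([], 0.0)]                        # (character sequence, score)
    for cands in cand_lists:                  # advance one character at a time
        new_hyps = []
        for seq, s in hyps:
            prev = seq[-1] if seq else "<s>"
            for ch, ac in cands:
                lm = bigram_logp.get((prev, ch), math.log(1e-4))
                new_hyps.append((seq + [ch], s + w_ac * ac + w_lm * lm))
        hyps = sorted(new_hyps, key=lambda h: h[1], reverse=True)[:beam]
    return hyps[0][0]

cand_lists = [[("電", math.log(0.6)), ("店", math.log(0.4))],
              [("腦", math.log(0.5)), ("惱", math.log(0.5))]]
bigram_logp = {("<s>", "電"): math.log(0.3), ("電", "腦"): math.log(0.4)}
print("".join(decode(cand_lists, bigram_logp)))  # 電腦: the bigram breaks the tie
```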
International Joint Conference on Artificial Intelligence | 2017
Chao-Chun Liang; Yu-Shiang Wong; Yi-Chung Lin; Keh-Yih Su
A goal-oriented, meaning-based statistical framework is presented in this paper to solve math word problems that require multiple arithmetic operations with understanding, reasoning, and explanation. It first analyzes and transforms sentences into meaning-based logic forms, which represent the associated context of each quantity with role-tags (e.g., nsubj, verb, etc.). Logic forms with role-tags provide a flexible and simple way to specify the physical meaning of a quantity. Afterwards, the main goal of the problem is decomposed recursively into its associated sub-goals. For each sub-goal, the associated operator and operands are selected with statistical models. Lastly, the framework performs inference on the logic expressions to obtain the answer and explains how the answer is obtained in a human-comprehensible way. The process thus resembles human cognitive understanding of the problem and produces a more meaningful problem-solving interpretation.
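The decompose-and-infer stage can be sketched as recursive evaluation of a goal tree whose leaves are role-tagged quantities and whose internal nodes are arithmetic operators. In the sketch below the statistically selected operators and operands are replaced by a hand-built tree, and all names are illustrative assumptions.

```python
# A minimal sketch of goal decomposition and inference for a math word
# problem. A goal is either a known quantity (resolved from the logic forms)
# or an operator over sub-goals; evaluating the tree yields both the answer
# and a human-readable trace. The statistical operator/operand selection is
# replaced by a fixed tree here.

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b}

def solve(goal, quantities, trace):
    """Recursively evaluate a goal tree, logging each reasoning step."""
    if isinstance(goal, str):                 # leaf: a role-tagged quantity
        trace.append(f"{goal} = {quantities[goal]} (given)")
        return quantities[goal]
    op, left, right = goal                    # internal node: (op, sub, sub)
    a = solve(left, quantities, trace)
    b = solve(right, quantities, trace)
    value = OPS[op](a, b)
    trace.append(f"{a} {op} {b} = {value}")
    return value

# "Tom had 5 apples, bought 3 more, then ate 2. How many are left?"
quantities = {"apples_had": 5, "apples_bought": 3, "apples_eaten": 2}
main_goal = ("-", ("+", "apples_had", "apples_bought"), "apples_eaten")
trace = []
print(solve(main_goal, quantities, trace))    # 6
print("\n".join(trace))                       # step-by-step explanation
```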
ROCLING | 1992
Yi-Chung Lin; Tung-Hui Chiang; Keh-Yih Su
Meeting of the Association for Computational Linguistics | 1995
Jing-Shin Chang; Yi-Chung Lin; Keh-Yih Su
Computational Linguistics | 1995
Tung-Hui Chiang; Keh-Yih Su; Yi-Chung Lin
International Conference on Computational Linguistics | 2015
Yi-Chung Lin; Chao-Chun Liang; Kuang-Yi Hsu; Chien-Tsung Huang; Shen-Yun Miao; Wei-Yun Ma; Lun-Wei Ku; Churn-Jung Liau; Keh-Yih Su