Lung-Hao Lee
National Taiwan Normal University
Publication
Featured research published by Lung-Hao Lee.
North American Chapter of the Association for Computational Linguistics | 2016
Liang-Chih Yu; Lung-Hao Lee; Shuai Hao; Jin Wang; Yunchao He; Jun Hu; K. Robert Lai; Xuejie Zhang
An increasing amount of research has recently focused on representing affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space. Compared to the categorical approach that represents affective states as several classes (e.g., positive and negative), the dimensional approach can provide more fine-grained sentiment analysis. However, affective resources with valence-arousal ratings are still very rare, especially for the Chinese language. Therefore, this study builds 1) an affective lexicon called Chinese valence-arousal words (CVAW) containing 1,653 words, and 2) an affective corpus called Chinese valence-arousal text (CVAT) containing 2,009 sentences extracted from web texts. To improve the annotation quality, a corpus cleanup procedure is used to remove outlier ratings and improper texts. Experiments using CVAW words to predict the VA ratings of the CVAT corpus show results comparable to those obtained using English affective resources.
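As a rough illustration of how a word-level lexicon can be used to predict sentence-level VA ratings, the sketch below averages the ratings of the lexicon words found in a sentence. This is only one simple baseline and is not claimed to be the method used in the paper; the toy lexicon, its values, the 1-9 rating scale, the neutral fallback, and the function name `predict_va` are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical toy lexicon (NOT taken from CVAW): word -> (valence, arousal),
# assuming a 1-9 rating scale. Tokens are assumed to be pre-segmented.
TOY_LEXICON = {
    "happy": (8.0, 6.0),
    "calm": (6.0, 2.0),
    "angry": (2.0, 8.0),
}


def predict_va(tokens, lexicon, neutral=(5.0, 5.0)):
    """Average the VA ratings of all tokens that appear in the lexicon.

    Falls back to an assumed neutral point when no token is covered.
    """
    rated = [lexicon[t] for t in tokens if t in lexicon]
    if not rated:
        return neutral
    valence = mean(v for v, _ in rated)
    arousal = mean(a for _, a in rated)
    return (valence, arousal)


if __name__ == "__main__":
    print(predict_va(["happy", "and", "calm"], TOY_LEXICON))
```

Predictions of this kind could then be compared against the human ratings of a corpus such as CVAT, for instance with an error or correlation measure.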
Conference of the European Chapter of the Association for Computational Linguistics | 2014
Yuen Hsien Tseng; Lung-Hao Lee; Shu-Yen Lin; Bo-Shun Liao; Mei-Jun Liu; Hsin-Hsi Chen; Oren Etzioni; Anthony Fader
This study presents the Chinese Open Relation Extraction (CORE) system, which extracts entity-relation triples from Chinese free text using a series of NLP techniques: word segmentation, POS tagging, syntactic parsing, and extraction rules. We employ the proposed CORE techniques to extract more than 13 million entity-relations for an open domain question answering application. To the best of our knowledge, CORE is the first Chinese Open IE system for knowledge acquisition.
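To make the idea of an extraction rule concrete, the sketch below applies a single pattern (noun, verb, noun) to tokens that have already been segmented and POS-tagged. It is a deliberately minimal stand-in: the CORE system also relies on syntactic parsing and its own rule set, neither of which is reproduced here. The tag names `N` and `V`, the function name `extract_triples`, and the sample sentence are hypothetical.

```python
def extract_triples(tagged_tokens):
    """Return (entity, relation, entity) triples for adjacent N-V-N patterns.

    `tagged_tokens` is a list of (word, tag) pairs; the tags "N" and "V"
    are an assumed, simplified tag set used only for this illustration.
    """
    triples = []
    for i in range(len(tagged_tokens) - 2):
        (w1, t1), (w2, t2), (w3, t3) = tagged_tokens[i:i + 3]
        if t1 == "N" and t2 == "V" and t3 == "N":
            triples.append((w1, w2, w3))
    return triples


if __name__ == "__main__":
    # Hypothetical pre-segmented, pre-tagged input.
    sentence = [("Edison", "N"), ("invented", "V"), ("phonograph", "N")]
    print(extract_triples(sentence))
```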
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing | 2014
Liang-Chih Yu; Lung-Hao Lee; Yuen Hsien Tseng; Hsin-Hsi Chen
This paper introduces a Chinese Spelling Check campaign organized for the SIGHAN 2014 bake-off, including task description, data preparation, performance metrics, and evaluation results based on essays written by learners of Chinese as a foreign language. The hope is that such evaluations can produce more advanced Chinese spelling check techniques.
Meeting of the Association for Computational Linguistics | 2015
Lung-Hao Lee; Liang-Chih Yu; Li-Ping Chang
This paper introduces the NLP-TEA 2015 shared task for Chinese grammatical error diagnosis. We describe the task, data preparation, performance metrics, and evaluation results. The hope is that such an evaluation campaign may produce more advanced Chinese grammatical error diagnosis techniques. All data sets with gold standards and evaluation tools are publicly available for research purposes.
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing | 2015
Yuen Hsien Tseng; Lung-Hao Lee; Li-Ping Chang; Hsin-Hsi Chen
This paper introduces the SIGHAN 2015 Bake-off for Chinese Spelling Check, including task description, data preparation, performance metrics, and evaluation results. The competition reveals current state-of-the-art NLP techniques in dealing with Chinese spelling checking. All data sets with gold standards and the evaluation tool used in this bake-off are publicly available for future research.
International Conference on Asian Language Processing | 2016
Liang-Chih Yu; Lung-Hao Lee; Kam-Fai Wong
This paper presents the IALP 2016 shared task on Dimensional Sentiment Analysis for Chinese Words (DSAW), which seeks to identify a real-valued sentiment score of Chinese words in both the valence and arousal dimensions. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 22 teams registered for this shared task for two-dimensional sentiment analysis, 16 submitted results. We expected that this evaluation campaign could produce more advanced dimensional sentiment analysis techniques, especially for Chinese affective computing. All data sets with gold standards and the scoring script are made publicly available to researchers.
Knowledge-Based Systems | 2016
Liang-Chih Yu; Lung-Hao Lee; Jui-Feng Yeh; Hsiu-Min Shih; Yu-Ling Lai
Near-synonyms are fundamental and useful knowledge resources for computer-assisted language learning (CALL) applications. For example, in online language learning systems, learners may need to express a similar meaning using different words. However, it is usually difficult to choose suitable near-synonyms to fit a given context because the differences between near-synonyms are not easily grasped in practical use, especially for second language (L2) learners. Accordingly, it is worth developing algorithms to verify whether near-synonyms match given contexts. Such algorithms could be used in applications to assist L2 learners in discovering the collocational differences between near-synonyms. We propose a discriminative vector space model for the near-synonym substitution task, and treat this task as a classification task. There are two components: a vector space model and discriminative training. The vector space model is used as a baseline classifier to classify test examples into one of the near-synonyms in a given near-synonym set. A discriminative training technique is then employed to improve the vector space model by distinguishing positive and negative features for each near-synonym. Experimental results show that the proposed model (DT-VSM) achieves higher accuracy than both pointwise mutual information and n-gram-based methods that have been used in previous studies.
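The baseline vector space component described above can be sketched as follows: each near-synonym is represented by a bag-of-words profile built from its training contexts, and a test context is assigned to the near-synonym whose profile is most similar. This sketch covers only a plain VSM baseline with raw counts and cosine similarity; the discriminative training step is not implemented, and the weighting scheme, function names, and toy data are assumptions rather than details from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(training):
    """Sum the context words seen with each near-synonym into one profile."""
    profiles = {}
    for word, context in training:
        profiles.setdefault(word, Counter()).update(context)
    return profiles


def choose_near_synonym(context, profiles):
    """Pick the near-synonym whose profile is closest to the test context."""
    query = Counter(context)
    return max(profiles, key=lambda w: cosine(query, profiles[w]))


if __name__ == "__main__":
    # Hypothetical toy training data: (near-synonym, context words).
    training = [
        ("strong", ["tea", "coffee"]),
        ("powerful", ["engine", "computer"]),
        ("strong", ["coffee", "wind"]),
    ]
    profiles = build_profiles(training)
    print(choose_near_synonym(["cup", "of", "coffee"], profiles))
```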
Conference on Information and Knowledge Management | 2011
Lung-Hao Lee; Hsin-Hsi Chen
This paper presents an intent conformity model to collaboratively generate blacklists for cyberporn filtering. A novel porn detection framework via searches-and-clicks is proposed to explore collective intelligence embedded in query logs. First, the clicked pages are represented in terms of the weighted queries to reflect the degrees related to pornography. Next, these weighted queries are regarded as discriminative features to calculate the pornography indicator by an inverse chi-square method for candidate determination. Finally, a candidate whose URL contains at least one pornographic keyword is included in our collaborative blacklists. The experiments on an MSN porn data set indicate that the generated blacklist achieves a high precision, while maintaining a favorably low false positive rate. In addition, real-life filtering simulations reveal that our blacklist is more effective than some publicly released blacklists.
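The abstract does not spell out the inverse chi-square computation. As an illustration only, the sketch below uses the Fisher-style inverse chi-square combination that is widely used in statistical spam filtering, and assumes that each weighted query contributes a probability-like score in (0, 1). The function names, the clamping constant, and the sample scores are hypothetical and are not taken from the paper.

```python
import math


def chi2_survival_even(x2, dof):
    """P(X >= x2) for a chi-square variable with an even number of
    degrees of freedom, computed with the closed-form series."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combined_indicator(scores, eps=1e-6):
    """Combine per-feature scores in (0, 1) into one indicator in [0, 1].

    Values near 1 mean the features jointly point to the target class,
    values near 0 the opposite, and 0.5 means the evidence is balanced.
    """
    if not scores:
        return 0.5
    # Clamp to avoid log(0) for extreme scores.
    ps = [min(max(p, eps), 1.0 - eps) for p in scores]
    dof = 2 * len(ps)
    s = 1.0 - chi2_survival_even(-2.0 * sum(math.log(1.0 - p) for p in ps), dof)
    h = 1.0 - chi2_survival_even(-2.0 * sum(math.log(p) for p in ps), dof)
    return (s - h + 1.0) / 2.0


if __name__ == "__main__":
    # Hypothetical per-query scores for one clicked page.
    print(combined_indicator([0.99, 0.98, 0.97]))
```

A page whose indicator exceeds a chosen threshold would then be treated as a candidate for the blacklist; the threshold itself is a tuning choice not specified here.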
Workshop on Innovative Use of NLP for Building Educational Applications | 2016
Lung-Hao Lee; Bo-Lin Lin; Liang-Chih Yu; Yuen Hsien Tseng
This study describes the design of the NTNU-YZU system for the automated evaluation of scientific writing shared task. We employ a convolutional neural network with the Word2Vec/GloVe embedding representation to predict whether a sentence needs language editing. For the Boolean prediction track, our best F-score of 0.6108 ranked second among the ten submissions. Our system also achieved an F-score of 0.7419 for the probabilistic estimation track, ranking fourth among the nine submissions.
International Conference on Computational Linguistics | 2014
Lung-Hao Lee; Liang-Chih Yu; Kuei-Ching Lee; Yuen Hsien Tseng; Li-Ping Chang; Hsin-Hsi Chen