Lun-Wei Ku
Academia Sinica
Publication
Featured research published by Lun-Wei Ku.
International Conference on Computational Linguistics | 2014
Soujanya Poria; Erik Cambria; Lun-Wei Ku; Chen Gui; Alexander F. Gelbukh
Sentiment analysis is a rapidly growing research field that has attracted both academia and industry because of the challenging research problems it poses and the potential benefits it can provide in many real-life applications. Aspect-based opinion mining, in particular, is one of the fundamental challenges within this research field. In this work, we aim to solve the problem of aspect extraction from product reviews by proposing a novel rule-based approach that exploits common-sense knowledge and sentence dependency trees to detect both explicit and implicit aspects. Two popular review datasets were used to evaluate the system against state-of-the-art aspect extraction techniques, and the system obtained higher detection accuracy on both datasets.
International Journal of Computational Linguistics & Chinese Language Processing, Volume 13, Number 3, September 2008: Special Issue on Selected Papers from ROCLING XIX | 2008
Lun-Wei Ku; Yu-Ting Liang; Hsin-Hsi Chen
Question answering systems provide an elegant way for people to access an underlying knowledge base. However, people are interested not only in factual questions, but also in opinions. This paper deals with question analysis and answer passage retrieval in opinion QA systems. For question analysis, six opinion question types are defined. A two-layered framework utilizing two question type classifiers is proposed, and algorithms for these two classifiers are described. Performance reaches 87.8% in general question classification and 92.5% in opinion question classification. The question focus is detected to form a query for the information retrieval system, and the question polarity is detected to retain relevant sentences that have the same polarity as the question. For answer passage retrieval, three components are introduced. Retrieved relevant sentences are further examined to determine whether the focus (Focus Detection) is within a scope of opinion (Opinion Scope Identification) and, if so, whether the polarity of the scope matches the polarity of the question (Polarity Detection). The best model achieves an F-measure of 40.59% by adopting partial match for relevance detection at the level of meaningful unit. With relevance issues removed, the F-measure of the best model rises to 84.96%.
Topic detection and tracking | 2002
Hsin-Hsi Chen; Lun-Wei Ku
This paper presents algorithms for Chinese and English-Chinese topic detection. Named entities, other nouns and verbs are cue patterns to relate news stories describing the same event. Lexical translation and name transliteration resolve lexical differences between English and Chinese. A two-threshold scheme determines relevance (irrelevance) between a news story and a topic cluster. Lookahead information deals with ambiguous cases in clustering. The least-recently-used removal strategy models the time factor in such a way that older and unimportant terms have no effect on clustering. Experimental results show that nouns and verbs together with the least-recently-used removal strategy outperform other models. The performance of the named-entity-only approach is slightly lower, but it avoids the overhead of the nouns-and-verbs approach with the least-recently-used removal strategy.
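The two-threshold decision described in this abstract can be illustrated with a minimal sketch. The cosine similarity over term counts, the threshold values `low` and `high`, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of terms (assumed similarity measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def decide(story: Counter, cluster: Counter, low: float = 0.2, high: float = 0.5) -> str:
    """Two-threshold decision; `low` and `high` are hypothetical values."""
    sim = cosine(story, cluster)
    if sim >= high:
        return "relevant"      # confidently assign the story to the cluster
    if sim < low:
        return "irrelevant"    # confidently reject the cluster
    return "undecided"         # ambiguous zone, e.g. defer until later stories arrive


cluster = Counter({"earthquake": 3, "taiwan": 2, "rescue": 1})
print(decide(Counter({"earthquake": 1, "taiwan": 1}), cluster))
print(decide(Counter({"election": 2, "vote": 1}), cluster))
```

The middle band between the two thresholds is where a lookahead strategy, as mentioned in the abstract, would come into play.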
Sensors | 2014
C.-Y. Lin; Edward T.-H. Chu; Lun-Wei Ku; Jane W.-S. Liu
In recent years, major natural disasters have made it more challenging for us to protect human lives. Examples include the 2011 Japan Tohoku earthquake and Typhoon Morakot in 2009 in Taiwan. However, modern disaster warning systems cannot automatically respond efficiently and effectively to disasters. As a result, it is necessary to develop an automatic response system to prevent, diminish or eliminate the damage caused by disasters. In this paper, we develop an active emergency disaster system to automatically process standard warning messages, such as CAP (Common Alerting Protocol) messages. After receiving official earthquake warning messages, our system automatically shuts down natural gas lines to protect buildings from fire and opens the building doors for easy evacuation. Our system also stops elevators when they reach the closest floor. In addition, our system can be applied to hospitals to tell surgeons to pause ongoing operations, or to supermarkets to inform consumers of the way to exit the building. According to our experimental results, the proposed system can avoid possible dangers and save human lives during major disasters.
Meeting of the Association for Computational Linguistics | 2007
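As a rough illustration of processing a CAP message and mapping it to building actions, the sketch below parses a CAP 1.2 alert with Python's standard library. The action names, the event-to-action table, and the severity policy are hypothetical and are not taken from the paper; the CAP `event` field is free text, so the string "Earthquake" is also an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace used by CAP 1.2 documents.
CAP_NS = {"cap": "urn:oasis:names:tc:emergency:cap:1.2"}

# Hypothetical event-to-action table; the action names are placeholders.
ACTIONS = {
    "Earthquake": [
        "shut_gas_valve",
        "open_building_doors",
        "stop_elevators_at_nearest_floor",
    ],
}

# Assumed policy: only act on the two highest CAP severity levels.
ACT_ON_SEVERITY = {"Extreme", "Severe"}


def actions_for_alert(cap_xml: str) -> List[str]:
    """Return the (hypothetical) actions triggered by one CAP alert."""
    root = ET.fromstring(cap_xml)
    event = root.findtext("cap:info/cap:event", default="", namespaces=CAP_NS)
    severity = root.findtext("cap:info/cap:severity", default="", namespaces=CAP_NS)
    if severity not in ACT_ON_SEVERITY:
        return []
    return list(ACTIONS.get(event, []))


SAMPLE_ALERT = """<alert xmlns="urn:oasis:names:tc:emergency:cap:1.2">
  <identifier>demo-001</identifier>
  <info>
    <event>Earthquake</event>
    <severity>Extreme</severity>
  </info>
</alert>"""

print(actions_for_alert(SAMPLE_ALERT))
```

A real deployment would also need to validate the sender, handle multiple `info` blocks, and drive actual actuators; none of that is modeled here.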
Lun-Wei Ku; Yong-Sheng Lo; Hsin-Hsi Chen
Opinion analysis has been an important research topic in recent years. However, there are no common methods for creating evaluation corpora. This paper introduces a method for developing opinion corpora involving multiple annotators. The characteristics of the created corpus are discussed, and methodologies for selecting more consistent testing collections and their corresponding gold standards are proposed. Under the gold standards, an opinion extraction system is evaluated. The experimental results show some interesting phenomena.
International Journal of Computational Linguistics & Chinese Language Processing, Volume 14, Number 4, December 2009 | 2009
Lun-Wei Ku; Chia-Ying Lee; Hsin-Hsi Chen
Opinion holder identification aims to extract entities that express opinions in sentences. In this paper, opinion holder identification is divided into two subtasks: author's opinion recognition and opinion holder labeling. A support vector machine (SVM) is adopted to recognize authors' opinions, and a conditional random field (CRF) algorithm is utilized to label opinion holders. New features are proposed for both methods. Our method achieves an f-score of 0.734 in the NTCIR7 MOAT task on the Traditional Chinese side, which is the best performance among the machine learning methods proposed by participants and is close to the best performance in this task. In addition, inconsistent annotations of opinion holders are analyzed, and the best way to utilize training instances with inconsistent annotations is proposed.
Information Reuse and Integration | 2012
Lun-Wei Ku; Cheng-Wei Sun
This paper utilized sentences containing emoticons from articles in Yahoo! blogs to automatically detect users' emotions from messenger logs. Four approaches were proposed: a topical approach, an emotional approach, a retrieval approach, and a lexicon approach. Forty emoticon classes found in Yahoo! blog articles were used for experiments. Two experiments were performed. The first experiment classified sentences into 40 emoticon classes by calculating emotional scores of words. The second experiment took Yahoo! and MSN messenger logs collected from users as the experimental materials, classified them into 40 emoticon classes using the proposed approaches, and mapped the 40 emoticon classes to 6 emotion classes to determine the user's emotion. The best performance of the proposed approaches for user emotion detection was achieved by the topical approach, and its micro-average precision of 0.48 was satisfactory.
Meeting of the Association for Computational Linguistics | 2015
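To make the evaluation step concrete, the sketch below maps predicted emoticon classes to emotion classes and computes micro-average precision. The class names and the mapping are hypothetical; the abstract does not list the 40 emoticon classes or the 6 emotion classes.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical fragment of an emoticon-to-emotion mapping.
EMOTICON_TO_EMOTION = {
    "smile": "happiness",
    "laugh": "happiness",
    "cry": "sadness",
    "angry": "anger",
}


def micro_precision(gold: List[str], predicted: List[Optional[str]]) -> float:
    """Micro-average precision: pooled true positives over pooled predictions."""
    tp, fp = Counter(), Counter()
    for g, p in zip(gold, predicted):
        if p is None:          # no prediction made for this instance
            continue
        if g == p:
            tp[p] += 1
        else:
            fp[p] += 1
    total = sum(tp.values()) + sum(fp.values())
    return sum(tp.values()) / total if total else 0.0


predicted_emoticons = ["smile", "cry", "angry", "laugh"]
predicted_emotions = [EMOTICON_TO_EMOTION.get(e) for e in predicted_emoticons]
gold_emotions = ["happiness", "sadness", "sadness", "happiness"]
print(micro_precision(gold_emotions, predicted_emotions))  # 3 of 4 correct -> 0.75
```

Pooling counts across classes means frequent classes dominate the score, which matters when the class distribution is uneven.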
Lun-Wei Ku; Shafqat Mumtaz Virk; Yann-Huei Lee
We describe a well-performing semantic role labeling system that further extracts concepts (smaller semantic expressions) from unstructured natural language sentences in a language-independent manner. A dual-layer semantic role labeling (SRL) system is built using Chinese Treebank and Propbank data. Contextual information is incorporated while labeling the predicate arguments to achieve better performance. Experimental results show that the proposed approach is superior to the best CoNLL 2009 systems and comparable to the state of the art, with the advantage that it requires no feature engineering process. Concepts are further extracted according to templates formulated from the labeled semantic roles to serve as features in other NLP tasks, providing semantically related cues that may help in related research problems. We also show that it is easy to generate a version of this system for a different language by building an English system that performs satisfactorily.
Advances in Social Networks Analysis and Mining | 2016
Wei-Chung Wang; Lun-Wei Ku
The lexical inference problem is a significant component of some recent core AI and NLP research problems such as machine reading and textual entailment. In this paper, we propose a method utilizing the Probabilistic Soft Logic (PSL) model for Chinese lexical inference. The proposed PSL model not only integrates two complementary traditional methods, i.e., the lexical-knowledge-based method and the distributional probabilistic method, but also optimizes the lexical inference network globally through the transitivity property of entailment relations. We build a large domain-specific verb inference corpus containing 18,029 verb pairs with gold inference labels from math world problems. A five-fold experiment is performed. Results show that the proposed PSL model greatly outperforms our baseline.
International Conference on HCI in Business | 2015
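The transitivity property mentioned above can be illustrated with the Lukasiewicz relaxation that PSL uses for soft truth values in [0, 1]. This is a minimal sketch of how a rule such as "A entails B and B entails C implies A entails C" is scored; the function names and example values are illustrative and are not the paper's model.

```python
def lukasiewicz_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the rule body -> head is from being satisfied (0 means satisfied)."""
    return max(0.0, body - head)


def transitivity_penalty(ab: float, bc: float, ac: float) -> float:
    """Penalty for the rule entails(A,B) & entails(B,C) -> entails(A,C)."""
    return distance_to_satisfaction(lukasiewicz_and(ab, bc), ac)


# Strong A->B and B->C but weak A->C violates transitivity and is penalized.
print(transitivity_penalty(0.9, 0.8, 0.2))
# A strong A->C satisfies the rule, so the penalty is zero.
print(transitivity_penalty(0.9, 0.8, 0.9))
```

In PSL-style inference, weighted penalties like this one are summed over all grounded rules and minimized jointly, which is how a single rule can influence the whole inference network.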
Wei-Fan Chen; Yann-Hui Lee; Lun-Wei Ku
Recent opinion mining techniques have succeeded in analyzing sentiment on social media, but processing skewed data or data with few labels about political or social issues remains difficult. In this paper, we introduce a two-step approach that starts from only five seed words for detecting the stance of Facebook posts toward the anti-reconstruction of the nuclear power plant. First, InterestFinder, which detects interest words, is adopted to filter out irrelevant documents. Second, we employ machine learning methods including SVM and co-training, as well as a compositional sentiment scoring tool, CopeOpi, to determine the stance of each relevant post. Experimental results show that when applying the proposed transition process, CopeOpi outperforms the other machine learning methods. The best precision scores for predicting the three stance categories (i.e., supportive, neutral and unsupportive) are 94.62%, 88.86% and 10.47%, respectively, indicating that the proposed approach can capture the sentiment of documents from lack-of-label, skewed data.