Liang-Chih Yu | Researchain

Publication

Featured researches published by Liang-Chih Yu.

Knowledge-Based Systems | 2013

Using a contextual entropy model to expand emotion words and their intensity for the sentiment classification of stock market news

Liang-Chih Yu; Jheng-Long Wu; Pei-Chann Chang; Hsuan-Shou Chu

Sentiment classification of stock market news involves identifying positive and negative news articles, and is an emerging technique for making stock trend predictions that can facilitate investor decision making. In this paper, we propose the presence and intensity of emotion words as features to classify the sentiment of stock market news articles. To identify such words and their intensity, a contextual entropy model is developed to expand a set of seed words generated from a small corpus of stock market news articles with sentiment annotation. The contextual entropy model measures the similarity between two words by comparing their contextual distributions using an entropy measure, allowing for the discovery of words similar to the seed words. Experimental results show that the proposed method can discover more useful emotion words and their corresponding intensity, thus improving classification performance. Performance was further improved by the incorporation of intensity into the classification, and the proposed method outperformed the previously proposed pointwise mutual information (PMI)-based expansion methods.
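As a rough illustration of comparing contextual distributions with an entropy measure, the Python sketch below builds a context-word distribution for each target word from a toy corpus and scores the pair with a symmetric, smoothed Kullback-Leibler divergence. The window size, smoothing constant, toy sentences, and function names are all assumptions made for illustration; the paper's exact formulation may differ.

```python
import math
from collections import Counter

def context_distribution(sentences, target, window=2):
    """Count the words that appear within `window` tokens of `target`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts

def kl_divergence(p_counts, q_counts, vocab, alpha=0.1):
    """KL(P || Q) with additive smoothing (alpha is an assumed constant)."""
    p_total = sum(p_counts.values()) + alpha * len(vocab)
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for w in vocab:
        p = (p_counts[w] + alpha) / p_total
        q = (q_counts[w] + alpha) / q_total
        kl += p * math.log(p / q)
    return kl

def contextual_distance(sentences, word_a, word_b, window=2):
    """Symmetric divergence between two words' context distributions.

    Smaller values mean the two words occur in more similar contexts.
    """
    a = context_distribution(sentences, word_a, window)
    b = context_distribution(sentences, word_b, window)
    vocab = set(a) | set(b)
    return kl_divergence(a, b, vocab) + kl_divergence(b, a, vocab)

# Hypothetical toy corpus, not data from the paper.
toy_sentences = [
    "shares surge after strong earnings".split(),
    "shares soar after strong earnings".split(),
    "shares plunge after weak earnings".split(),
]
d_similar = contextual_distance(toy_sentences, "surge", "soar")
d_different = contextual_distance(toy_sentences, "surge", "plunge")
```

In this toy setting a seed word such as "surge" would be expanded with candidates whose distance falls below some threshold; how that threshold is chosen, and how intensity is assigned, is not specified here.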

IEEE Transactions on Audio, Speech, and Language Processing | 2010

Sentence Correction Incorporating Relative Position and Parse Template Language Models

Chung-Hsien Wu; Chao-Hong Liu; Matthew Harris; Liang-Chih Yu

Sentence correction has been an important emerging issue in computer-assisted language learning. However, existing techniques based on grammar rules or statistical machine translation are still not robust enough to tackle the common errors in sentences produced by second language learners. In this paper, a relative position language model and a parse template language model are proposed to complement traditional language modeling techniques in addressing this problem. A corpus of erroneous English-Chinese language transfer sentences along with their corrected counterparts is created and manually judged by human annotators. Experimental results show that compared to a state-of-the-art phrase-based statistical machine translation system, the error correction performance of the proposed approach achieves a significant improvement using human evaluation.

North American Chapter of the Association for Computational Linguistics | 2016

Building Chinese Affective Resources in Valence-Arousal Dimensions

Liang-Chih Yu; Lung-Hao Lee; Shuai Hao; Jin Wang; Yunchao He; Jun Hu; K. Robert Lai; Xuejie Zhang

An increasing amount of research has recently focused on representing affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space. Compared to the categorical approach that represents affective states as several classes (e.g., positive and negative), the dimensional approach can provide more fine-grained sentiment analysis. However, affective resources with valence-arousal ratings are still very rare, especially for the Chinese language. Therefore, this study builds 1) an affective lexicon called Chinese valence-arousal words (CVAW) containing 1,653 words, and 2) an affective corpus called Chinese valence-arousal text (CVAT) containing 2,009 sentences extracted from web texts. To improve the annotation quality, a corpus cleanup procedure is used to remove outlier ratings and improper texts. Experiments using CVAW words to predict the VA ratings of the CVAT corpus show results comparable to those obtained using English affective resources.
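One plausible way to remove outlier ratings before averaging annotator scores is sketched below in Python. The rule used here (drop ratings more than k standard deviations from the mean), the value of k, and the sample ratings are assumptions for illustration; the abstract does not state the actual cleanup criterion.

```python
from statistics import mean, pstdev

def clean_ratings(ratings, k=1.5):
    """Drop ratings farther than k population standard deviations from the mean.

    The threshold k is an assumed parameter, not a value from the paper.
    """
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        return list(ratings)
    return [r for r in ratings if abs(r - mu) <= k * sigma]

# Hypothetical valence ratings from five annotators for one item.
raw = [7, 7, 8, 7, 1]
kept = clean_ratings(raw)
final_valence = mean(kept)
```

The same function would be applied separately to the valence and arousal ratings of each word or sentence.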

IEEE Intelligent Systems | 2005

Using semantic dependencies to mine depressive symptoms from consultation records

Chung-Hsien Wu; Liang-Chih Yu; Fong Lin Jang

With the rapid growth of depressive disorders, many psychiatric Web sites have developed various psychiatric screening services for mental health care and crisis prevention. We propose a framework for mining depressive symptoms and their relations from consultation records. The records contain many kinds of depressive symptoms, such as depressed mood, suicidal ideas, anxiety, sleep disturbances, and so on. The depressive symptoms are embedded in a single sentence or a discourse segment - that is, successive sentences describing the same depressive symptom. Our framework infers the semantic label according to a sentence's semantic dependencies and the HowNet knowledge base, a Chinese-language concept hierarchy that defines higher-level abstractions, or hypernyms, for Chinese words, concepts, and interconcept relations. Moreover, the framework computes the lexical cohesion between sentences to enhance its semantic labeling power and adopts a domain ontology to mine the semantic relations. Preliminary experiments show that the semantic dependencies within and between sentences and the domain ontology used in this approach are significant features in the mining task.

Meeting of the Association for Computational Linguistics | 2016

Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model

Jin Wang; Liang-Chih Yu; K. Robert Lai; Xuejie Zhang

Dimensional sentiment analysis aims to recognize continuous numerical values in multiple dimensions such as the valence-arousal (VA) space. Compared to the categorical approach that focuses on sentiment classification such as binary classification (i.e., positive and negative), the dimensional approach can provide more fine-grained sentiment analysis. This study proposes a regional CNN-LSTM model consisting of two parts: regional CNN and LSTM to predict the VA ratings of texts. Unlike a conventional CNN which considers a whole text as input, the proposed regional CNN uses an individual sentence as a region, dividing an input text into several regions such that the useful affective information in each region can be extracted and weighted according to its contribution to the VA prediction. Such regional information is sequentially integrated across regions using LSTM for VA prediction. By combining the regional CNN and LSTM, both local (regional) information within sentences and long-distance dependency across sentences can be considered in the prediction process. Experimental results show that the proposed method outperforms lexicon-based, regression-based, and NN-based methods proposed in previous studies.

Information Processing and Management | 2010

Annotation and verification of sense pools in OntoNotes

Liang-Chih Yu; Chung-Hsien Wu; Ru-Yng Chang; Chao-Hong Liu; Eduard H. Hovy

This paper describes OntoNotes, a multilingual (English, Chinese and Arabic) corpus with large-scale semantic annotations, including predicate-argument structure, word senses, ontology linking, and coreference. The underlying semantic model of OntoNotes involves word senses that are grouped into so-called sense pools, i.e., sets of near-synonymous senses of words. Such information is useful for many applications, including query expansion for information retrieval (IR) systems, (near-)duplicate detection for text summarization systems, and alternative word selection for writing support systems. Although a sense pool provides a set of near-synonymous senses of words, there is still no knowledge about whether two words in a pool are interchangeable in practical use. Therefore, this paper devises an unsupervised algorithm that incorporates Google n-grams and a statistical test to determine whether a word in a pool can be substituted by other words in the same pool. The n-gram features are used to measure the degree of context mismatch for a substitution. The statistical test is then applied to determine whether the substitution is adequate based on the degree of mismatch. The proposed method is compared with a supervised method, namely Linear Discriminant Analysis (LDA). Experimental results show that the proposed unsupervised method can achieve performance comparable to that of the supervised method.

International Joint Conference on Natural Language Processing | 2015

Predicting Valence-Arousal Ratings of Words Using a Weighted Graph Method

Liang-Chih Yu; Jin Wang; K. Robert Lai; Xuejie Zhang

Compared to the categorical approach that represents affective states as several discrete classes (e.g., positive and negative), the dimensional approach represents affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space, thus allowing for more fine-grained sentiment analysis. In building dimensional sentiment applications, affective lexicons with valence-arousal ratings are useful resources but are still very rare. Therefore, this study proposes a weighted graph model that considers both the relations of multiple nodes and their similarities as weights to automatically determine the VA ratings of affective words. Experiments on both English and Chinese affective lexicons show that the proposed method yielded a smaller error rate on VA prediction than the linear regression, kernel method, and PageRank algorithm used in previous studies.
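The general idea of using similarity-weighted graph neighbors to estimate VA ratings can be sketched in Python as below. This is only one simple reading of the abstract: seed words keep their known ratings, and each unseen word is repeatedly set to the similarity-weighted average of its neighbors. The update rule, the neutral starting value of 5.0 (assuming a 1-9 scale), the iteration count, and the toy similarities are all assumptions rather than the paper's actual model.

```python
def propagate_ratings(similarity, seeds, iterations=50, neutral=(5.0, 5.0)):
    """Estimate (valence, arousal) for unseen words on a weighted graph.

    similarity: dict mapping word -> {neighbor: similarity weight}
    seeds:      dict mapping word -> (valence, arousal) with known ratings
    """
    ratings = {w: seeds.get(w, neutral) for w in similarity}
    for _ in range(iterations):
        updated = {}
        for word, neighbors in similarity.items():
            if word in seeds:
                # Seed words act as fixed anchors.
                updated[word] = seeds[word]
                continue
            total = sum(neighbors.values())
            if total == 0:
                updated[word] = ratings[word]
                continue
            valence = sum(wt * ratings[n][0] for n, wt in neighbors.items()) / total
            arousal = sum(wt * ratings[n][1] for n, wt in neighbors.items()) / total
            updated[word] = (valence, arousal)
        ratings = updated
    return ratings

# Hypothetical toy graph: "glad" is unseen and is far more similar to "happy".
toy_similarity = {
    "happy": {"glad": 0.9},
    "sad": {"glad": 0.1},
    "glad": {"happy": 0.9, "sad": 0.1},
}
toy_seeds = {"happy": (8.0, 6.0), "sad": (2.0, 4.0)}
predicted = propagate_ratings(toy_similarity, toy_seeds)
```

In practice the similarity weights would come from some word-similarity measure, which is not specified here.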

Applied Soft Computing | 2014

An intelligent stock trading system using comprehensive features

Jheng-Long Wu; Liang-Chih Yu; Pei-Chann Chang

The aim of this study is to predict automatic trading decisions in stock markets. Comprehensive features (CF) for predicting future trends are very difficult to generate in a complex environment, especially in stock markets. According to related work, the relevant stock information can help investors formulate objects that may result in better profits. With this in mind, we present a framework of an intelligent stock trading system using comprehensive features (ISTSCF) to predict future stock trading decisions. The ISTSCF consists of stock information extraction, prediction model learning and stock trading decision. We apply three different methods to generate comprehensive features, including sentiment analysis (SA) that provides sensitive market events from stock news articles for sentiment indices (SI), technical analysis (TA) that yields effective trading rules based on trading information on the stock exchange for technical indices (TI), as well as the trend-based segmentation method (TBSM) that raises trading decisions from stock price for trading signals (TS). Experiments on the Taiwan stock market show that the results of employing comprehensive features are significantly better than traditional methods using numeric features alone (without textual sentiment features).

IEEE Transactions on Evolutionary Computation | 2008

HAL-Based Evolutionary Inference for Pattern Induction From Psychiatry Web Resources

Liang-Chih Yu; Chung-Hsien Wu; Jui-Feng Yeh; Fong-Lin Jang

Negative and stressful life events play a significant role in triggering depressive episodes. Psychiatric services that can identify such events efficiently are vital for mental health care and prevention. Meaningful patterns, e.g., <lost, parents>, must be extracted from psychiatric texts before these services can be provided. This study presents an evolutionary text-mining framework capable of inducing variable-length patterns from unannotated psychiatry Web resources. The proposed framework can be divided into two parts: 1) a cognitively motivated model such as hyperspace analog to language (HAL) and 2) an evolutionary inference algorithm (EIA). The HAL model constructs a high-dimensional context space to represent words as well as combinations of words. Based on the HAL model, the EIA bootstraps with a small set of seed patterns, and then iteratively induces additional relevant patterns. To avoid moving in the wrong direction, the EIA further incorporates relevance feedback to guide the induction process. Experimental results indicate that combining the HAL model and relevance feedback enables the EIA to not only induce patterns from the unannotated Web corpora, but also achieve useful results in a reasonable amount of time. The proposed framework thus significantly reduces reliance on annotated corpora.
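To make the HAL context space more concrete, the Python sketch below builds a small HAL-style co-occurrence table using the commonly described scheme: a sliding window over the text, with each preceding word weighted inversely to its distance from the current word. The window size, the function name, and the toy sentence are assumptions for illustration, and the paper's own HAL configuration and its evolutionary inference step are not reproduced.

```python
from collections import defaultdict

def build_hal(tokens, window=3):
    """Build a HAL-style table: hal[word][preceding_word] -> accumulated weight.

    A word at distance d (1 <= d <= window) before the current word adds
    window - d + 1, so closer words contribute more.
    """
    hal = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            hal[word][tokens[j]] += window - d + 1
    return hal

# Hypothetical toy input echoing the <lost, parents> pattern example.
toy_tokens = "i lost my parents".split()
toy_hal = build_hal(toy_tokens, window=3)
```

Each row of the table can be read as a context vector for a word; a pattern such as <lost, parents> could then be represented by combining the vectors of its words, though how the combination is done is left open here.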

Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing | 2014

Overview of SIGHAN 2014 Bake-off for Chinese Spelling Check

Liang-Chih Yu; Lung-Hao Lee; Yuen Hsien Tseng; Hsin-Hsi Chen

This paper introduces a Chinese Spelling Check campaign organized for the SIGHAN 2014 bake-off, including task description, data preparation, performance metrics, and evaluation results based on essays written by learners of Chinese as a foreign language. The hope is that such evaluations can produce more advanced Chinese spelling check techniques.
