Yunfang Wu
Peking University
Publication
Featured research published by Yunfang Wu.
meeting of the association for computational linguistics | 2007
Peng Jin; Yunfang Wu; Shiwen Yu
The Multilingual Chinese-English lexical sample task at SemEval-2007 provides a framework to evaluate Chinese word sense disambiguation and to promote research. This paper reports on the task preparation and the results of six participants.
international conference on computational linguistics | 2011
Likun Qiu; Yunfang Wu; Yanqiu Shao
Supersense tagging classifies unknown words into semantic categories defined by lexicographers and inserts them into a thesaurus. Previous studies on supersense tagging show that context-based methods perform well for English unknown words while structure-based methods perform well for Chinese unknown words. The challenge is how to combine contextual and structural information for supersense tagging of Chinese unknown words. We propose a simple yet effective approach to address this challenge. In this approach, contextual information is used to measure contextual similarity between words, while structural information is used to filter candidate synonyms and adjust the contextual similarity score. Experimental results show that the proposed approach outperforms the state-of-the-art context-based and structure-based methods.
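The idea of scoring candidates by contextual similarity and then adjusting that score with structural evidence can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: context is a bag-of-words `Counter`, similarity is cosine, the structural cue is a shared final character, and `boost` is a hypothetical multiplier.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words context vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def structural_match(unknown, candidate):
    # Assumption: sharing the final character stands in for structural similarity.
    return unknown[-1] == candidate[-1]

def tag_supersense(unknown, unknown_ctx, thesaurus, boost=1.5):
    """Return (supersense, score) of the best-scoring thesaurus word.

    thesaurus maps word -> (supersense label, context Counter).
    The contextual score is multiplied by `boost` (hypothetical) when
    the candidate also matches structurally.
    """
    best_label, best_score = None, 0.0
    for word, (label, ctx) in thesaurus.items():
        score = cosine(unknown_ctx, ctx)
        if structural_match(unknown, word):
            score *= boost
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```

With `boost=1.0` the function reduces to a purely context-based tagger, which makes the contribution of the structural cue easy to compare on toy data.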
international conference on computational linguistics | 2009
Peng Jin; Xu Sun; Yunfang Wu; Shiwen Yu
The main disadvantage of collocation-based word sense disambiguation is that its recall is low, although its precision is relatively high. How can recall be improved without decreasing precision? In this paper, we investigate a word-class approach to extend the collocation list, which is constructed from a manually sense-tagged corpus. The word classes, in contrast, are obtained from a larger corpus that is not sense-tagged. The experimental results show that the F-measure improves to 71%, compared with 54% for the baseline system in which word classes are not considered, although the precision decreases slightly. Further study reveals the relationship between the F-measure and the number of word classes trained from corpora of various sizes.
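As a rough illustration of how word classes can extend a collocation list, the sketch below first looks for an exact collocate and then backs off to any context word that shares a class with a known collocate. The data structures, the function name `disambiguate`, and the back-off order are assumptions for illustration, not the paper's implementation.

```python
def disambiguate(target, context, collocations, word_class=None):
    """Pick a sense for `target` given the words in `context`.

    collocations: {(target, collocate): sense} from a sense-tagged corpus.
    word_class:   {word: class_id} from an untagged corpus (optional).
    Returns None when no evidence applies.
    """
    # Step 1: exact collocation match (high precision, low recall).
    for w in context:
        if (target, w) in collocations:
            return collocations[(target, w)]

    # Step 2 (assumed back-off): match via the collocate's word class.
    if word_class:
        class_sense = {}
        for (t, collocate), sense in collocations.items():
            if t == target and collocate in word_class:
                class_sense.setdefault(word_class[collocate], sense)
        for w in context:
            cls = word_class.get(w)
            if cls in class_sense:
                return class_sense[cls]
    return None
```

The back-off step is where recall can rise: a context word never seen with the target in the tagged corpus can still trigger a sense through its class, at the risk of some wrong matches.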
international conference natural language processing | 2008
Peng Jin; Fuxin Li; Danqing Zhu; Yunfang Wu; Shiwen Yu
This paper proposes a novel approach to improving kernel-based word sense disambiguation (WSD). We first explain why linear kernels are more suitable for WSD and many other natural language processing problems than translation-invariant kernels. Based on the linear kernel, two external knowledge sources are integrated. The first is a set of linguistic rules for finding the crucial features. The second is a distributional similarity thesaurus, which is used to alleviate data sparseness by generalizing crucial features when they do not match the word form exactly. The experiments show that our approach outperforms the state-of-the-art system on the benchmark data from the English lexical sample task of SemEval-2007, and that the improvement is statistically significant.
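For readers unfamiliar with the terminology, the sketch below contrasts a linear kernel with one translation-invariant kernel, the Gaussian (RBF) kernel, whose value depends only on the difference between its inputs. It illustrates the definitions only and does not reproduce the paper's argument; the `gamma` value and the helper `shift` are chosen arbitrarily for the example.

```python
import math

def linear_kernel(x, y):
    """k(x, y) = x . y"""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2); depends only on x - y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def shift(x, c):
    """Translate every coordinate of x by the constant c."""
    return [a + c for a in x]
```

Translating both inputs by the same vector leaves `rbf_kernel` unchanged, while `linear_kernel` generally changes, since it is sensitive to where the vectors sit relative to the origin.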
international conference natural language processing | 2008
Yunfang Wu; Miao Wang; Peng Jin
This paper presents a systematic study of disambiguating sentiment-ambiguous adjectives in context in real text, a task at the intersection of word sense disambiguation and sentiment analysis. First, we address the issue of inter-annotator agreement on assigning semantic orientations to word occurrences in real text. Second, we demonstrate that co-occurring sentiment-monosemous adjectives cannot effectively disambiguate sentiment-ambiguous adjectives. We then apply collocation-based disambiguation and a support vector machine (SVM) algorithm to the disambiguation task. We present a new approach that combines collocation information and SVM to disambiguate sentiment-ambiguous words. The experimental results show that the combined Coll+SVM approach outperforms both the collocation-based method and the SVM algorithm.
international conference on computational linguistics | 2012
Fei Wang; Yunfang Wu
Blogs have become an important medium for people to post their ideas and share new information, and market price trends (Up/Down) always draw people's attention. In this paper, we make a thorough study of mining market trends from blog titles in the housing and stock markets, based on lexical semantic similarity. We focus on the automatic extraction and construction of a Chinese Up/Down verb lexicon, using both Chinese and Chinese-English bilingual semantic similarity. The experimental results show that verb lexicon extraction based on semantic similarity is of great use in mining public opinion on market trends, and that the performance of applying English similar words to Chinese verb lexicon extraction compares well with that of using Chinese similar words.
linguistic annotation workshop | 2007
Yunfang Wu; Peng Jin; Tao Guo; Shiwen Yu
This paper presents the procedure for building a Chinese sense-annotated corpus. A set of software tools is designed to help human annotators speed up annotation and maintain consistency. The software tools include 1) a tagger for word segmentation and POS tagging, 2) an annotation interface for describing senses in the lexicon and annotating senses in the corpus, 3) a checker for maintaining consistency, 4) a transformer for converting text files to XML format, and 5) a counter for calculating the sense frequency distribution.
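Two of the listed tools, the text-to-XML transformer and the sense frequency counter, can be sketched in a few lines. The XML element and attribute names (`sentence`, `w`, `pos`, `sense`), the sense identifier format, and the token tuple layout are all hypothetical; the abstract does not specify the corpus format.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def to_xml(tokens):
    """Serialize one annotated sentence to an XML string.

    tokens: list of (word, pos, sense) tuples; sense is None for
    words that carry no sense annotation. Tag names are hypothetical.
    """
    sentence = ET.Element("sentence")
    for word, pos, sense in tokens:
        w = ET.SubElement(sentence, "w", pos=pos)
        if sense is not None:
            w.set("sense", sense)
        w.text = word
    return ET.tostring(sentence, encoding="unicode")

def sense_counts(tokens):
    """Count how often each (word, sense) pair occurs."""
    return Counter(
        (word, sense) for word, _pos, sense in tokens if sense is not None
    )
```

Keeping the counter separate from the serializer means the frequency distribution can be recomputed from the token list at any point without parsing the XML back in.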
international conference on computational linguistics | 2010
Yunfang Wu; Miaomiao Wen
international conference on computational linguistics | 2012
Fei Wang; Yunfang Wu; Likun Qiu
meeting of the association for computational linguistics | 2010
Peng Jin; Yunfang Wu