
Publication


Featured research published by Jinying Chen.


Language and Technology Conference | 2006

An Empirical Study of the Behavior of Active Learning for Word Sense Disambiguation

Jinying Chen; Andrew I. Schein; Lyle H. Ungar; Martha Palmer

This paper shows that two uncertainty-based active learning methods, combined with a maximum entropy model, work well on learning English verb senses. Analysis of the learning process, at both the instance and feature levels, suggests that careful treatment of feature extraction is important for active learning to be useful for WSD. Based on this analysis, the overfitting phenomena observed during active learning are identified as classic overfitting in machine learning.
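
Below is a minimal sketch of the setup described above: pool-based active learning with a maximum entropy (multinomial logistic regression) classifier and least-confidence uncertainty sampling. It is not the authors' code; the pool, batch size, and feature representation are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sampling(model, X_pool, batch_size=10):
    """Select the pool instances whose most probable sense has the lowest
    predicted probability (least-confidence uncertainty sampling)."""
    probs = model.predict_proba(X_pool)
    confidence = probs.max(axis=1)              # confidence in the top sense
    return np.argsort(confidence)[:batch_size]  # least confident first

def active_learning_loop(X_labeled, y_labeled, X_pool, y_oracle, rounds=20):
    """Grow the labeled set by repeatedly querying the most uncertain instances."""
    for _ in range(rounds):
        model = LogisticRegression(max_iter=1000)  # maximum entropy classifier
        model.fit(X_labeled, y_labeled)
        picked = uncertainty_sampling(model, X_pool)
        # "Query the oracle": move the selected instances into the labeled set.
        X_labeled = np.vstack([X_labeled, X_pool[picked]])
        y_labeled = np.concatenate([y_labeled, y_oracle[picked]])
        keep = np.setdiff1d(np.arange(len(X_pool)), picked)
        X_pool, y_oracle = X_pool[keep], y_oracle[keep]
    # Retrain once more on the final labeled set before returning.
    return LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)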


International Joint Conference on Natural Language Processing | 2005

Towards robust high performance word sense disambiguation of English verbs using rich linguistic features

Jinying Chen; Martha Palmer

This paper shows that our WSD system using rich linguistic features achieved high accuracy in the classification of English SENSEVAL-2 verbs for both fine-grained (64.6%) and coarse-grained (73.7%) senses. We describe three specific enhancements to our treatment of rich linguistic features and present their separate and combined contributions to our system's performance. Further experiments showed that our system had robust performance on test data without high-quality rich features.


Language Resources and Evaluation | 2009

Improving English verb sense disambiguation performance with linguistically motivated features and clear sense distinction boundaries

Jinying Chen; Martha Palmer

This paper presents a high-performance broad-coverage supervised word sense disambiguation (WSD) system for English verbs that uses linguistically motivated features and a smoothed maximum entropy machine learning model. We describe three specific enhancements to our system’s treatment of linguistically motivated features which resulted in the best published results on SENSEVAL-2 verbs. We then present the results of training our system on OntoNotes data, both the SemEval-2007 task and additional data. OntoNotes data is designed to provide clear sense distinctions, based on using explicit syntactic and semantic criteria to group WordNet senses, with sufficient examples to constitute high quality, broad coverage training data. Using similar syntactic and semantic features for WSD, we achieve performance comparable to that of human taggers, and competitive with the top results for the SemEval-2007 task. Empirical analysis of our results suggests that clarifying sense boundaries and/or increasing the number of training instances for certain verbs could further improve system performance.


Meeting of the Association for Computational Linguistics | 2005

A Parallel Proposition Bank II for Chinese and English

Martha Palmer; Nianwen Xue; Olga Babko-Malaya; Jinying Chen; Benjamin Snyder

The Proposition Bank (PropBank) project is aimed at creating a corpus of text annotated with information about semantic propositions. The second phase of the project, PropBank II, adds additional levels of semantic annotation, including eventuality variables, co-reference, coarse-grained sense tags, and discourse connectives. This paper presents the results of the parallel PropBank II project, which adds these richer layers of semantic annotation to the first 100K of the Chinese Treebank and its English translation. Our preliminary analysis supports the hypothesis that this additional annotation reconciles many of the surface differences between the two languages.


Meeting of the Association for Computational Linguistics | 2006

Aligning Features with Sense Distinction Dimensions

Nianwen Xue; Jinying Chen; Martha Palmer

In this paper we present word sense disambiguation (WSD) experiments on ten highly polysemous verbs in Chinese, where significant performance improvements are achieved using rich linguistic features. Our system performs significantly better, and in some cases substantially better, than the baseline on all ten verbs. Our results also demonstrate that features extracted from the output of an automatic Chinese semantic role labeling system in general benefited the WSD system, even though the amount of improvement was not consistent across the verbs. For a few verbs, semantic role information actually hurt WSD performance. The inconsistency of feature performance is a general characteristic of the WSD task, as has been observed by others. We argue that this result can be explained by the fact that word senses are partitioned along different dimensions for different verbs and the features therefore need to be tailored to particular verbs in order to achieve adequate accuracy on verb sense disambiguation.


International Joint Conference on Natural Language Processing | 2004

Using a smoothing maximum entropy model for Chinese nominal entity tagging

Jinying Chen; Nianwen Xue; Martha Palmer

This paper treats nominal entity tagging as a six-way (five categories plus non-entity) classification problem and applies a smoothing maximum entropy (ME) model with a Gaussian prior to a Chinese nominal entity tagging task. The experimental results show that the model performs consistently better than an ME model using a simple count cut-off. The results also suggest that simple semantic features extracted from an electronic dictionary improve the model’s performance, especially when the training data is insufficient.
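
As a rough illustration of the two smoothing strategies compared above, the sketch below uses scikit-learn's logistic regression as a stand-in for a maximum entropy toolkit: a Gaussian prior on the weights corresponds to an L2 penalty (with C roughly proportional to the prior variance), while the baseline simply discards feature types below a count cut-off. The feature dictionaries, cut-off value, and sigma are illustrative assumptions, not values from the paper.

from collections import Counter
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def train_gaussian_prior(feature_dicts, labels, sigma=1.0):
    """ME model smoothed with a Gaussian prior on the weights,
    approximated here by an L2 penalty (C grows with the prior variance)."""
    vec = DictVectorizer()
    X = vec.fit_transform(feature_dicts)
    model = LogisticRegression(penalty="l2", C=sigma ** 2, max_iter=1000)
    model.fit(X, labels)
    return vec, model

def train_count_cutoff(feature_dicts, labels, cutoff=2):
    """Baseline: drop feature types seen fewer than `cutoff` times,
    then train an effectively unregularized ME model."""
    counts = Counter(f for d in feature_dicts for f in d)
    pruned = [{f: v for f, v in d.items() if counts[f] >= cutoff}
              for d in feature_dicts]
    vec = DictVectorizer()
    X = vec.fit_transform(pruned)
    model = LogisticRegression(penalty="l2", C=1e6, max_iter=1000)
    model.fit(X, labels)
    return vec, model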


Meeting of the Association for Computational Linguistics | 2004

Chinese Verb Sense Discrimination Using an EM Clustering Model with Rich Linguistic Features

Jinying Chen; Martha Palmer

This paper discusses the application of the Expectation-Maximization (EM) clustering algorithm to the task of Chinese verb sense discrimination. The model utilized rich linguistic features that capture predicate-argument structure information of the target verbs. A semantic taxonomy for Chinese nouns, which was built semi-automatically based on two electronic Chinese semantic dictionaries, was used to provide semantic features for the model. Purity and normalized mutual information were used to evaluate the clustering performance on 12 Chinese verbs. The experimental results show that the EM clustering model can learn sense or sense group distinctions for most of the verbs successfully. We further enhanced the model with certain fine-grained semantic categories called lexical sets. Our results indicate that these lexical sets improve the model's performance for the three most challenging verbs chosen from the first set of experiments.
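
For reference, the sketch below computes the two evaluation measures named above, purity and normalized mutual information, for a predicted clustering against gold sense labels; the toy labels are illustrative, not data from the paper.

import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def purity(gold_labels, cluster_ids):
    """Fraction of instances that fall in the majority gold sense of their cluster."""
    gold = np.asarray(gold_labels)
    clusters = np.asarray(cluster_ids)
    correct = 0
    for c in np.unique(clusters):
        members = gold[clusters == c]
        _, counts = np.unique(members, return_counts=True)
        correct += counts.max()  # size of the dominant sense in this cluster
    return correct / len(gold)

# Toy example: six instances, three gold senses, two induced clusters.
gold = ["s1", "s1", "s2", "s2", "s3", "s3"]
pred = [0, 0, 0, 1, 1, 1]
print("purity:", purity(gold, pred))                 # 4/6, about 0.67
print("NMI:", normalized_mutual_info_score(gold, pred))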


International Conference on Semantic Computing | 2007

Towards Large-scale High-Performance English Verb Sense Disambiguation by Using Linguistically Motivated Features

Jinying Chen; Dmitriy Dligach; Martha Palmer

In this paper we describe the results of training high performance word sense disambiguation (WSD) systems on a new data set based on groupings of WordNet senses. This data set is designed to provide clear sense distinctions with sufficient examples in order to provide high quality training data. The sense distinctions are based on explicit syntactic and semantic criteria. Our WSD features utilize similar syntactic and semantic linguistic information. We demonstrate that this approach, using both maximum entropy and SVM models, produces systems whose performance is comparable to that of humans.


Archive | 2017

VerbNet/OntoNotes-Based Sense Annotation

Meredith Green; Orin Hargraves; Claire Bonial; Jinying Chen; Lindsay Clark; Martha Palmer

In this chapter, we present our challenges and successes in producing the OntoNotes word sense groupings [41], which represent a slightly more coarse-grained set of English verb senses drawn from WordNet [13], and which have provided the foundation for our VerbNet sense annotation. These sense groupings were based on the successive merging of WordNet senses into more coarse-grained senses according to the results of inter-annotator agreement [10]. We find that the sense granularity, or level of semantic specificity found in this inventory, reflects sense distinctions that can be made consistently and accurately by human annotators, who achieve a high inter-annotator agreement rate of 89%. This, in turn, leads to a correspondingly high system performance for automatic WSD: sense distinctions with this level of granularity can be detected automatically at 87–89% accuracy, making them effective for NLP applications [9].


International Conference on Natural Language Processing | 2005

Clustering-based feature selection for verb sense disambiguation

Jinying Chen; Martha Palmer

This paper presents a novel feature selection algorithm for supervised verb sense disambiguation. The algorithm disambiguates and aggregates the WordNet synsets of a verb's noun phrase (NP) arguments in the training data; the aggregated synsets are then used to filter out irrelevant WordNet semantic features introduced by the ambiguity of the verb's NP arguments. Experimental results showed that our new feature selection method boosted our system's performance on verbs whose meanings depend heavily on their NP arguments. Furthermore, our method outperformed two standard feature selection methods, indicating its effectiveness and advantages, especially for small-sample machine learning tasks such as supervised WSD.
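
As a loose illustration of the kind of WordNet features involved, the sketch below collects synset and hypernym features for a verb's noun arguments and prunes rare ones by frequency. This is a simple stand-in rather than the clustering-based selection algorithm the paper describes, and the helper names and threshold are assumptions.

from collections import Counter
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def candidate_features(noun):
    """All synsets and hypernyms reachable from a noun argument."""
    feats = set()
    for syn in wn.synsets(noun, pos=wn.NOUN):
        feats.add(syn.name())
        for path in syn.hypernym_paths():
            feats.update(s.name() for s in path)
    return feats

def select_features(instances, min_count=2):
    """Keep WordNet features observed in at least `min_count` training instances.
    `instances` is a list of (noun_argument, verb_sense) pairs."""
    counts = Counter(f for noun, _sense in instances
                     for f in candidate_features(noun))
    return {f for f, c in counts.items() if c >= min_count}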

Collaboration


Dive into Jinying Chen's collaborations.

Top Co-Authors

Martha Palmer, University of Colorado Boulder
Andrew I. Schein, University of Pennsylvania
Benjamin Snyder, University of Pennsylvania
Claire Bonial, University of Colorado Boulder
Dmitriy Dligach, University of Colorado Boulder
Lyle H. Ungar, University of Pennsylvania
Meredith Green, University of Colorado Boulder
Orin Hargraves, University of Colorado Boulder