Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Heng Ji is active.

Publication


Featured research published by Heng Ji.


Meeting of the Association for Computational Linguistics | 2014

Incremental Joint Extraction of Entity Mentions and Relations

Qi Li; Heng Ji

We present an incremental joint framework to simultaneously extract entity mentions and relations using a structured perceptron with efficient beam search. A segment-based decoder based on the idea of a semi-Markov chain is adapted to the new framework, as opposed to traditional token-based tagging. In addition, by virtue of the inexact search, we develop a number of new and effective global features as soft constraints to capture the interdependency among entity mentions and relations. Experiments on Automatic Content Extraction (ACE) corpora demonstrate that our joint model significantly outperforms a strong pipelined baseline, which itself attains better performance than the best-reported end-to-end system.
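The beam-search decoding at the heart of this framework can be sketched as follows. This is a minimal, hypothetical illustration of beam search over tag sequences with a linear (perceptron-style) scoring model; the feature templates, tag names, weights, and the paper's joint relation-extraction step are all invented or omitted here.

```python
# Toy sketch of beam-search decoding with a linear (perceptron-style) model:
# at each token, every hypothesis in the beam is extended with every candidate
# tag, and only the top-k partial sequences by score are kept.

def score(weights, prev_tag, tag, token):
    # Simple emission + transition features (illustrative only).
    return (weights.get(("emit", token, tag), 0.0)
            + weights.get(("trans", prev_tag, tag), 0.0))

def beam_decode(tokens, tags, weights, beam_size=2):
    beam = [([], 0.0)]  # list of (tag sequence so far, score)
    for tok in tokens:
        candidates = []
        for seq, s in beam:
            prev = seq[-1] if seq else "<s>"
            for t in tags:
                candidates.append((seq + [t], s + score(weights, prev, t, tok)))
        # Inexact search: prune to the k best partial hypotheses.
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beam[0][0]

# Invented weights favoring a PER-O-GPE reading of the sentence.
weights = {("emit", "Obama", "PER"): 1.0, ("emit", "visited", "O"): 1.0,
           ("emit", "Paris", "GPE"): 1.0, ("trans", "PER", "O"): 0.5}
print(beam_decode(["Obama", "visited", "Paris"], ["PER", "GPE", "O"], weights))
# -> ['PER', 'O', 'GPE']
```

In perceptron training, weights for features fired by the gold sequence would be increased and those fired by the predicted sequence decreased whenever the two disagree.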


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Exploring Context and Content Links in Social Media: A Latent Space Method

Guo-Jun Qi; Charu C. Aggarwal; Qi Tian; Heng Ji; Thomas S. Huang

Social media networks contain both content and context-specific information. Most existing methods work with either of the two for the purpose of multimedia mining and retrieval. In reality, both content and context information are rich sources of information for mining, and the full power of mining and processing algorithms can be realized only with the use of a combination of the two. This paper proposes a new algorithm which mines both context and content links in social media networks to discover the underlying latent semantic space. This mapping of the multimedia objects into latent feature vectors enables the use of any off-the-shelf multimedia retrieval algorithms. Compared to the state-of-the-art latent methods in multimedia analysis, this algorithm effectively solves the problem of sparse context links by mining the geometric structure underlying the content links between multimedia objects. Specifically for multimedia annotation, we show that an effective algorithm can be developed to directly construct annotation models by simultaneously leveraging both context and content information based on latent structure between correlated semantic concepts. We conduct experiments on the Flickr data set, which contains user tags linked with images. We illustrate the advantages of our approach over the state-of-the-art multimedia retrieval techniques.
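The core idea of combining sparse context links with dense content similarity in one latent space can be sketched numerically. This is a hypothetical simplification, not the paper's model: it embeds objects by eigendecomposing a weighted combination of the two affinity matrices, with all matrices and the mixing weight invented for illustration.

```python
import numpy as np

# Sketch: embed objects into a shared latent space using BOTH sparse context
# links (e.g. shared user tags) and dense content similarity, by taking the
# leading eigenvectors of a weighted combination of the two affinity matrices.

def latent_embedding(context_links, content_sim, alpha=0.5, dim=2):
    combined = alpha * context_links + (1 - alpha) * content_sim
    vals, vecs = np.linalg.eigh(combined)   # combined matrix is symmetric
    top = np.argsort(vals)[::-1][:dim]      # indices of leading eigenvalues
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))

context = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], float)   # sparse context links
content = np.array([[1.0, 0.8, 0.1, 0.0],
                    [0.8, 1.0, 0.0, 0.1],
                    [0.1, 0.0, 1.0, 0.9],
                    [0.0, 0.1, 0.9, 1.0]])  # dense content similarity
emb = latent_embedding(context, content)
# Objects 0/1 and 2/3 end up close together in the latent space, so any
# off-the-shelf vector-based retrieval method can now be applied to emb.
```

The sketch shows why sparse context links alone are insufficient: the content-similarity term fills in structure between objects that share no explicit link.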


Graph-based Methods for Natural Language Processing | 2009

Graph-based Event Coreference Resolution

Zheng Chen; Heng Ji

In this paper, we address the problem of event coreference resolution as specified in the Automatic Content Extraction (ACE) program. In contrast to entity coreference resolution, event coreference resolution has not received great attention from researchers. We first demonstrate the diverse scenarios of event coreference with an example. We then model event coreference resolution as a spectral graph clustering problem and evaluate the clustering algorithm on ground-truth event mentions using the ECM F-measure. We obtain ECM-F scores of 0.8363 and 0.8312, respectively, using two different methods for computing the coreference matrix.
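The spectral formulation can be sketched in a few lines. This is a hypothetical minimal instance, not the paper's system: it bipartitions event mentions by the sign of the Fiedler vector of the graph Laplacian built from a pairwise coreference-score matrix, with the affinity values invented for illustration.

```python
import numpy as np

# Sketch of spectral graph clustering over a pairwise coreference-score
# matrix: split event mentions into two clusters using the eigenvector of
# the Laplacian associated with the second-smallest eigenvalue.

def spectral_bipartition(affinity):
    degree = np.diag(affinity.sum(axis=1))
    laplacian = degree - affinity
    vals, vecs = np.linalg.eigh(laplacian)
    fiedler = vecs[:, np.argsort(vals)[1]]   # Fiedler vector
    return (fiedler > 0).astype(int)         # cluster label per mention

# Affinity between 4 event mentions: {0,1} and {2,3} are likely coreferent.
aff = np.array([[0.0, 0.9, 0.1, 0.0],
                [0.9, 0.0, 0.0, 0.1],
                [0.1, 0.0, 0.0, 0.8],
                [0.0, 0.1, 0.8, 0.0]])
labels = spectral_bipartition(aff)
# Mentions 0 and 1 share one label; mentions 2 and 3 share the other.
```

For more than two clusters, one would instead take several leading eigenvectors and run k-means over them, which is the standard generalization.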


North American Chapter of the Association for Computational Linguistics | 2009

Language Specific Issue and Feature Exploration in Chinese Event Extraction

Zheng Chen; Heng Ji

In this paper, we present a Chinese event extraction system. We point out a language-specific issue in Chinese trigger labeling, and then discuss the contributions of the lexical, syntactic and semantic features applied in trigger labeling and argument labeling. As a result, we achieve competitive performance: an F-measure of 59.9 in trigger labeling and an F-measure of 43.8 in argument labeling.


International Joint Conference on Natural Language Processing | 2015

A Dependency-Based Neural Network for Relation Classification

Yang Liu; Furu Wei; Sujian Li; Heng Ji; Ming Zhou; Houfeng Wang

Previous research on relation classification has verified the effectiveness of using dependency shortest paths or subtrees. In this paper, we further explore how to make full use of this dependency information in combination. We first propose a new structure, termed the augmented dependency path (ADP), which is composed of the shortest dependency path between two entities and the subtrees attached to that path. To exploit the semantic representation behind the ADP structure, we develop dependency-based neural networks (DepNN): a recursive neural network designed to model the subtrees, and a convolutional neural network to capture the most important features on the shortest path. Experiments on the SemEval-2010 dataset show that our proposed method achieves state-of-the-art results.
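Extracting the shortest dependency path, the backbone of the ADP structure, can be sketched with a breadth-first search over the dependency tree treated as an undirected graph. The sentence, edges, and function below are invented for illustration; the attached subtrees and the neural models are omitted.

```python
from collections import deque

# Sketch: find the shortest dependency path between two entity head words
# by BFS over the (undirected) dependency edges.

def shortest_dep_path(edges, source, target):
    graph = {}
    for head, dep in edges:                      # build adjacency lists
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue, parent = deque([source]), {source: None}
    while queue:
        node = queue.popleft()
        if node == target:                       # reconstruct the path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Head -> dependent pairs for "A thief who stole the painting was arrested".
edges = [("stole", "thief"), ("stole", "painting"),
         ("arrested", "stole"), ("painting", "the")]
print(shortest_dep_path(edges, "thief", "painting"))
# -> ['thief', 'stole', 'painting']
```

In the ADP, each word on this path would additionally carry the subtree hanging off it (here, "the" under "painting"), which the recursive network then embeds.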


Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing | 2006

Re-Ranking Algorithms for Name Tagging

Heng Ji; Cynthia Rudin; Ralph Grishman

Integrating information from different stages of an NLP processing pipeline can yield significant error reduction. We demonstrate how re-ranking can improve name tagging in a Chinese information extraction system by incorporating information from relation extraction, event extraction, and coreference. We evaluate three state-of-the-art re-ranking algorithms (MaxEnt-Rank, SVMRank, and p-Norm Push Ranking), and show the benefit of multi-stage re-ranking for cross-sentence and cross-document inference.
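The re-ranking setup can be sketched as linear rescoring of N-best hypotheses. This is a hypothetical illustration of the general idea only: the feature names, weights, and hypotheses below are invented, and the paper's actual rankers (MaxEnt-Rank, SVMRank, p-Norm Push Ranking) learn the weights from data.

```python
# Sketch of re-ranking: each N-best name-tagging hypothesis carries features
# derived from downstream stages (relation extraction, event extraction,
# coreference), and a weight vector picks the best-scoring hypothesis.

def rerank(hypotheses, weights):
    def score(hyp):
        return sum(weights.get(f, 0.0) * v for f, v in hyp["features"].items())
    return max(hypotheses, key=score)

hypotheses = [
    {"tags": ["PER", "O", "GPE"],
     "features": {"baseline_score": 0.90, "in_coref_chain": 1.0}},
    {"tags": ["ORG", "O", "GPE"],
     "features": {"baseline_score": 0.92, "in_coref_chain": 0.0}},
]
weights = {"baseline_score": 1.0, "in_coref_chain": 0.1}
best = rerank(hypotheses, weights)
print(best["tags"])  # -> ['PER', 'O', 'GPE']
```

The point of the example: downstream evidence (here, membership in a coreference chain) can overturn the baseline tagger's slightly preferred hypothesis.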


IEEE Signal Processing Magazine | 2008

Speech segmentation and spoken document processing

Mari Ostendorf; Benoit Favre; Ralph Grishman; D. Hakkani-Tur; Mary P. Harper; D. Hillard; J. Hirschberg; Heng Ji; Jeremy G. Kahn; Yang Liu; Sameer Maskey; Hermann Ney; Andrew Rosenberg; Elizabeth Shriberg; Wen Wang; C. Wooters

Progress in both speech and language processing has spurred efforts to support applications that rely on spoken rather than written language input. A key challenge in moving from text-based documents to such spoken documents is that spoken language lacks explicit punctuation and formatting, which can be crucial for good performance. This article describes different levels of speech segmentation, approaches to automatically recovering segment boundary locations, and experimental results demonstrating impact on several language processing tasks. The results also show a need for optimizing segmentation for the end task rather than independently.


Empirical Methods in Natural Language Processing | 2005

Using Semantic Relations to Refine Coreference Decisions

Heng Ji; David Westbrook; Ralph Grishman

We present a novel mechanism for improving reference resolution by using the output of a relation tagger to rescore coreference hypotheses. Experiments show that this new framework can improve performance on two quite different languages: English and Chinese.


Empirical Methods in Natural Language Processing | 2014

Constructing Information Networks Using One Single Model

Qi Li; Heng Ji; Yu Hong; Sujian Li

In this paper, we propose a new framework that unifies the output of three information extraction (IE) tasks (entity mentions, relations and events) as an information network representation, and extracts all of them using one single joint model based on structured prediction. This novel formulation allows different parts of the information network to fully interact with each other. For example, many relations can now be considered as the resultant states of events. Our approach achieves substantial improvements over traditional pipelined approaches, and significantly advances the state of the art in end-to-end event argument extraction.


Proceedings of the Workshop on Information Extraction Beyond The Document | 2006

Data Selection in Semi-supervised Learning for Name Tagging

Heng Ji; Ralph Grishman

We present two semi-supervised learning techniques to improve a state-of-the-art multi-lingual name tagger. For English and Chinese, the overall system obtains a 1.7%–2.1% improvement in F-measure, representing a 13.5%–17.4% relative reduction in spurious, missing, and incorrect tags. We also conclude that simply relying upon large corpora is not in itself sufficient: we must pay attention to unlabeled data selection too. We describe effective measures to automatically select documents and sentences.
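The data-selection step can be sketched as confidence-based filtering before self-training. This is a hypothetical simplification of the idea, not the paper's measures: the sentences, tags, confidence scores, and threshold below are all invented for illustration.

```python
# Sketch of data selection for semi-supervised name tagging: keep only
# automatically labeled sentences whose tagger confidence clears a
# threshold, so self-training adds mostly reliable examples.

def select_for_selftraining(tagged_sentences, min_confidence=0.9):
    return [(sent, tags) for sent, tags, conf in tagged_sentences
            if conf >= min_confidence]

# (sentence, predicted tags, tagger confidence) triples over unlabeled text.
unlabeled = [
    ("Obama visited Paris", ["PER", "O", "GPE"], 0.97),
    ("Jordan beat Jordan",  ["PER", "O", "GPE"], 0.55),  # ambiguous, skipped
    ("IBM hired Smith",     ["ORG", "O", "PER"], 0.93),
]
selected = select_for_selftraining(unlabeled)
print([sent for sent, _ in selected])
# -> ['Obama visited Paris', 'IBM hired Smith']
```

The selected sentences would then be added to the training set and the tagger retrained, which is the standard self-training loop this kind of selection criterion is meant to keep clean.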

Collaboration


Dive into Heng Ji's collaborations.

Top Co-Authors

Boliang Zhang (Rensselaer Polytechnic Institute)
Xiaoman Pan (Rensselaer Polytechnic Institute)
Lifu Huang (Rensselaer Polytechnic Institute)
Kevin Knight (University of Southern California)
Qi Li (City University of New York)
Zheng Chen (City University of New York)
Hongzhao Huang (City University of New York)
Di Lu (Rensselaer Polytechnic Institute)
Xiang Ren (University of Southern California)