Zhu Zhang | Researchain

Publication

Featured research published by Zhu Zhang.

language resources and evaluation | 2004

MEAD - A Platform for Multidocument Multilingual Text Summarization

Dragomir R. Radev; Timothy Allison; Sasha Blair-Goldensohn; John Blitzer; Arda Çelebi; Stanko Dimitrov; Elliott Franco Drábek; Ali Hakim; Wai Lam; Danyu Liu; Jahna Otterbacher; Hong Qi; Horacio Saggion; Simone Teufel; Michael Topper; Adam Winkel; Zhu Zhang

This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has been thus far downloaded by more than 500 organizations. MEAD has been used in a variety of summarization applications ranging from summarization for mobile devices to Web page summarization within a search engine and to novelty detection.

international conference on human language technology research | 2001

NewsInEssence: a system for domain-independent, real-time news clustering and multi-document summarization

Dragomir R. Radev; Sasha Blair-Goldensohn; Zhu Zhang; Revathi Sundara Raghavan

NEWSINESSENCE is a system for finding, visualizing and summarizing a topic-based cluster of news stories. In the generic scenario for NEWSINESSENCE, a user selects a single news story from a news Web site. Our system then searches other live sources of news for other stories related to the same event and produces summaries of a subset of the stories that it finds, according to parameters specified by the user.

conference on information and knowledge management | 2004

Weakly-supervised relation classification for information extraction

Zhu Zhang

This paper approaches the relation classification problem in an information extraction framework with bootstrapping on top of Support Vector Machines. A new bootstrapping algorithm is proposed and empirically evaluated on the ACE corpus. We show that the supervised SVM classifier using various lexical and syntactic features can achieve promising classification accuracy. More importantly, the proposed BootProject algorithm based on random feature projection can significantly reduce the need for labeled training data with only limited sacrifice of performance.

conference on information and knowledge management | 2003

Learning cross-document structural relationships using boosting

Zhu Zhang; Jahna Otterbacher; Dragomir R. Radev

Multi-document discourse analysis has emerged with the potential of improving various information retrieval applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that uses boosting to classify CST relationships between sentence pairs extracted from topically related documents. We show that the binary classifier for determining existence of structural relationships significantly outperforms the baseline. We also achieve promising results on the multi-class case in which the full taxonomy of relationships is considered.

european conference on research and advanced technology for digital libraries | 2001

Interactive, Domain-Independent Identification and Summarization of Topically Related News Articles

Dragomir R. Radev; Sasha Blair-Goldensohn; Zhu Zhang; Revathi Sundara Raghavan

In this paper we present NewsInEssence, a fully deployed digital news system. A user selects a current news story of interest, which is used as a seed article by NewsInEssence to find in real time other related stories from a large number of news sources. The output is a single document summary presenting the most salient information gleaned from the different sources. We discuss the algorithm used by NewsInEssence, module interoperability, and conclude the paper with a number of empirical analyses.

international joint conference on natural language processing | 2004

Combining labeled and unlabeled data for learning cross-document structural relationships

Zhu Zhang; Dragomir R. Radev

Multi-document discourse analysis has emerged with the potential of improving various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this paper describes an empirical study that classifies CST relationships between sentence pairs extracted from topically related documents, exploiting both labeled and unlabeled data. We investigate a binary classifier for determining existence of structural relationships and a full classifier using the full taxonomy of relationships. We show that in both cases the exploitation of unlabeled data helps improve the performance of learned classifiers.

international joint conference on natural language processing | 2005

Tense tagging for verbs in cross-lingual context: a case study

Yang Ye; Zhu Zhang

The current work applies Conditional Random Fields to the problem of temporal reference mapping from Chinese text to English text. The learning algorithm utilizes a moderate number of linguistic features that are easy and inexpensive to obtain. We train a tense classifier upon a small amount of manually labeled data. The evaluation results are promising according to standard measures as well as in comparison with a pilot tense annotation experiment involving human judges. Our study exhibits potential value for full-scale machine translation systems and other natural language processing tasks in a cross-lingual scenario.

international joint conference on natural language processing | 2005

Mining inter-entity semantic relations using improved transductive learning

Zhu Zhang

This paper studies the problem of mining relational data hidden in natural language text. In particular, it approaches the relation classification problem with the strategy of transductive learning. Different algorithms are presented and empirically evaluated on the ACE corpus. We show that transductive learners exploiting various lexical and syntactic features can achieve promising classification performance. More importantly, transductive learning performance can be significantly improved by using an induced similarity function.

Archive | 2001