Publication


Featured research published by Regina Barzilay.


Advances in Automatic Text Summarization | 1997

Using lexical chains for text summarization

Regina Barzilay; Michael Elhadad

We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, but instead relying on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several robust knowledge sources: the WordNet thesaurus, a part-of-speech tagger, a shallow parser for the identification of nominal groups, and a segmentation algorithm. Summarization proceeds in four steps: the original text is segmented, lexical chains are constructed, strong chains are identified, and significant sentences are extracted. We present in this paper empirical results on the identification of strong chains and of significant sentences. Preliminary results indicate that quality indicative summaries are produced. Pending problems are identified. Plans to address these shortcomings are briefly presented.
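
The last three of the four steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the segmentation step is skipped, the `TOY_CONCEPTS` dictionary is a hypothetical stand-in for WordNet, and both the "longer than the mean chain" strength criterion and the "first sentence of each strong chain" extraction rule are assumptions made for the example.

```python
from statistics import mean

# Hypothetical toy thesaurus standing in for WordNet: word -> concept label.
TOY_CONCEPTS = {
    "car": "vehicle", "truck": "vehicle", "engine": "vehicle",
    "rain": "weather", "storm": "weather",
}

def build_chains(sentences):
    """Group known words by concept; a chain is a list of (sentence index, word)."""
    chains = {}
    for idx, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            word = token.strip(".,;:!?")
            concept = TOY_CONCEPTS.get(word)
            if concept is not None:
                chains.setdefault(concept, []).append((idx, word))
    return chains

def strong_chains(chains):
    """Assumed criterion: a chain is strong if it is longer than the mean chain length."""
    if not chains:
        return {}
    threshold = mean(len(members) for members in chains.values())
    return {c: m for c, m in chains.items() if len(m) > threshold}

def extract_summary(sentences):
    """Assumed heuristic: keep the first sentence in which each strong chain appears."""
    chains = build_chains(sentences)
    picked = sorted({members[0][0] for members in strong_chains(chains).values()})
    return [sentences[i] for i in picked]
```

For example, given `["The car had a new engine.", "Rain fell all day.", "The truck followed the car."]`, the "vehicle" chain has four members and the "weather" chain has one, so only the first sentence is extracted.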


North American Chapter of the Association for Computational Linguistics | 2003

Learning to paraphrase: an unsupervised approach using multiple-sequence alignment

Regina Barzilay; Lillian Lee

We address the text-to-text generation problem of sentence-level paraphrasing, a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered from unannotated comparable corpora: it learns a set of paraphrasing patterns represented by word lattice pairs and automatically determines how to apply these patterns to rewrite new sentences. The results of our evaluation experiments show that the system derives accurate paraphrases, outperforming baseline systems.


Meeting of the Association for Computational Linguistics | 2001

Extracting Paraphrases from a Parallel Corpus

Regina Barzilay; Kathleen R. McKeown

While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases.


Computational Linguistics | 2008

Modeling local coherence: An entity-based approach

Regina Barzilay; Mirella Lapata

This article proposes a novel framework for representing and measuring local coherence. Central to this approach is the entity-grid representation of discourse, which captures patterns of entity distribution in a text. The algorithm introduced in the article automatically abstracts a text into a set of entity transition sequences and records distributional, syntactic, and referential information about discourse entities. We re-conceptualize coherence assessment as a learning task and show that our entity-based representation is well-suited for ranking-based generation and text classification tasks. Using the proposed representation, we achieve good performance on text ordering, summary coherence evaluation, and readability assessment.
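
A minimal sketch of the entity-grid idea described above, under stated assumptions: each sentence is supplied already annotated as a mapping from entity to grammatical role (`S` subject, `O` object, `X` other), whereas a full system would derive these with a parser and coreference resolution. The function names `build_grid` and `transition_features` are hypothetical and chosen for this illustration.

```python
from collections import Counter
from itertools import product

ROLES = ("S", "O", "X", "-")  # "-" marks an entity absent from a sentence

def build_grid(annotated_sentences):
    """Return {entity: [role in sentence 0, role in sentence 1, ...]}."""
    entities = sorted({e for sent in annotated_sentences for e in sent})
    return {e: [sent.get(e, "-") for sent in annotated_sentences] for e in entities}

def transition_features(grid, n=2):
    """Relative frequency of every length-n role transition over all grid columns."""
    counts = Counter()
    for column in grid.values():
        for i in range(len(column) - n + 1):
            counts[tuple(column[i:i + n])] += 1
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0)
            for t in product(ROLES, repeat=n)}
```

The resulting fixed-length feature vector (16 entries for `n=2`) is the kind of representation that could then be passed to a ranking or classification learner.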


Meeting of the Association for Computational Linguistics | 1999

Information Fusion in the Context of Multi-Document Summarization

Regina Barzilay; Kathleen R. McKeown; Michael Elhadad

We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its usage of language generation to reformulate the wording of the summary.


Computational Linguistics | 2005

Sentence Fusion for Multidocument News Summarization

Regina Barzilay; Kathleen R. McKeown

A system that can produce informative summaries, highlighting common information found in many online documents, will help Web users to pinpoint information that they need without extensive reading. In this article, we introduce sentence fusion, a novel text-to-text generation technique for synthesizing common information across documents. Sentence fusion involves bottom-up local multisequence alignment to identify phrases conveying similar information and statistical generation to combine common phrases into a sentence. Sentence fusion moves the summarization field from the use of purely extractive methods to the generation of abstracts that contain sentences not found in any of the input documents and can synthesize information across sources.


Journal of Artificial Intelligence Research | 2002

Inferring strategies for sentence ordering in multidocument news summarization

Regina Barzilay; Noémie Elhadad; Kathleen R. McKeown

The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed for the task. Based on these experiments, we implemented a strategy for ordering information that combines constraints from chronological order of events and topical relatedness. Evaluation of our augmented algorithm shows a significant improvement of the ordering over two baseline strategies.


Meeting of the Association for Computational Linguistics | 2006

Minimum Cut Model for Spoken Lecture Segmentation

Igor Malioutov; Regina Barzilay

We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves beyond localized comparisons and takes into account long-range cohesion dependencies. Our results demonstrate that global analysis improves the segmentation accuracy and is robust in the presence of speech recognition errors.
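
The normalized cut criterion mentioned above can be illustrated with a small sketch. It assumes bag-of-words cosine similarity between non-empty sentences and searches exhaustively for a single boundary; the abstract does not specify the similarity measure or the search procedure, so both are assumptions of this example, and all function names are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count bags."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def similarity_matrix(sentences):
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    return [[cosine(bags[i], bags[j]) for j in range(n)] for i in range(n)]

def normalized_cut(sim, k):
    """Ncut for splitting sentences into [0, k) and [k, n).

    cut = total similarity across the boundary; each volume sums a side's
    similarity to all sentences (self-similarity included, which keeps the
    volumes positive for non-empty sentences).
    """
    n = len(sim)
    left, right = range(k), range(k, n)
    cut = sum(sim[i][j] for i in left for j in right)
    vol_left = sum(sim[i][j] for i in left for j in range(n))
    vol_right = sum(sim[i][j] for i in right for j in range(n))
    return cut / vol_left + cut / vol_right

def best_boundary(sentences):
    """Index k (1 <= k < n) of the single boundary minimizing the normalized cut."""
    sim = similarity_matrix(sentences)
    return min(range(1, len(sentences)), key=lambda k: normalized_cut(sim, k))
```

Because every pairwise similarity contributes to the score, sentences far from the boundary still influence where it is placed, which is the sense in which the criterion goes beyond comparing only adjacent sentences.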


Language and Technology Conference | 2006

Paraphrasing for Automatic Evaluation

David Kauchak; Regina Barzilay

This paper studies the impact of paraphrases on the accuracy of automatic evaluation. Given a reference sentence and a machine-generated sentence, we seek to find a paraphrase of the reference sentence that is closer in wording to the machine output than the original reference. We apply our paraphrasing method in the context of machine translation evaluation. Our experiments show that the use of a paraphrased synthetic reference refines the accuracy of automatic evaluation. We also found a strong connection between the quality of automatic paraphrases as judged by humans and their contribution to automatic evaluation.


National Conference on Artificial Intelligence | 1999

Towards multidocument summarization by reformulation: progress and prospects

Kathleen R. McKeown; Judith L. Klavans; Vasileios Hatzivassiloglou; Regina Barzilay; Eleazar Eskin

By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We are developing a multidocument summarization system to automatically generate a concise summary by identifying and synthesizing similarities across a set of related documents. Our approach is unique in its integration of machine learning and statistical techniques to identify similar paragraphs, intersection of similar phrases within paragraphs, and language generation to reformulate the wording of the summary. Our evaluation of system components shows that learning over multiple extracted linguistic features is more effective than information retrieval approaches at identifying similar text units for summarization and that it is possible to generate a fluent summary that conveys similarities among documents even when full semantic interpretations of the input text are not available.

Collaboration


Explore Regina Barzilay's collaborations.

Top Co-Authors

Tommi S. Jaakkola

Massachusetts Institute of Technology

Tao Lei

Massachusetts Institute of Technology

Benjamin Snyder

Massachusetts Institute of Technology

Jacob Eisenstein

Georgia Institute of Technology

Karthik Narasimhan

Massachusetts Institute of Technology

Satchuthanan R. Branavan

Massachusetts Institute of Technology

Yuan Zhang

Massachusetts Institute of Technology

Harr Chen

Massachusetts Institute of Technology

Tahira Naseem

Massachusetts Institute of Technology
