Publication


Featured research published by John DeNero.


International Joint Conference on Natural Language Processing | 2009

Better Word Alignments with Supervised ITG Models

Aria Haghighi; John Blitzer; John DeNero; Daniel Klein

This work investigates supervised word alignment methods that exploit inversion transduction grammar (ITG) constraints. We consider maximum margin and conditional likelihood objectives, including the presentation of a new normal form grammar for canonicalizing derivations. Even for non-ITG sentence pairs, we show that it is possible to learn ITG alignment models by simple relaxations of structured discriminative learning objectives. For efficiency, we describe a set of pruning techniques that together allow us to align sentences two orders of magnitude faster than naive bitext CKY parsing. Finally, we introduce many-to-one block alignment features, which significantly improve our ITG models. Altogether, our method results in the best reported AER numbers for Chinese-English and a performance improvement of 1.1 BLEU over GIZA++ alignments.
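
To make the search space concrete, here is a minimal sketch of Viterbi bitext parsing under a one-state normal-form ITG, assuming a precomputed log score for each word pair and no null alignments. It implements the naive O(n^6) CKY baseline that the paper's pruning techniques accelerate; the supervised objectives, block features, and pruning are not reproduced.

```python
import numpy as np

def itg_viterbi(score):
    """Viterbi bitext parse under a one-state normal-form ITG.

    score[i, k]: log score for linking source word i to target word k.
    Every word must be linked (no nulls), so only span pairs of equal
    width are feasible. Returns the best total log score.
    """
    n, m = score.shape
    # best[i, j, k, l]: best score covering source span [i, j)
    # and target span [k, l).
    best = np.full((n + 1, n + 1, m + 1, m + 1), -np.inf)
    # Terminal productions: single word-pair links.
    for i in range(n):
        for k in range(m):
            best[i, i + 1, k, k + 1] = score[i, k]
    # Binary productions: combine adjacent span pairs, in either
    # straight (monotone) or inverted order.
    for ws in range(2, n + 1):              # source span width
        for i in range(n - ws + 1):
            j = i + ws
            for wt in range(2, m + 1):      # target span width
                for k in range(m - wt + 1):
                    l = k + wt
                    cell = -np.inf
                    for s in range(i + 1, j):        # source split point
                        for t in range(k + 1, l):    # target split point
                            straight = best[i, s, k, t] + best[s, j, t, l]
                            inverted = best[i, s, t, l] + best[s, j, k, t]
                            cell = max(cell, straight, inverted)
                    best[i, j, k, l] = cell
    return best[0, n, 0, m]

# Toy 3x3 score matrix with one crossing word pair; an inverted
# rule lets the ITG cover the swap of the last two words.
scores = np.log(np.array([[0.8, 0.1, 0.1],
                          [0.1, 0.2, 0.7],
                          [0.1, 0.7, 0.2]]))
print(itg_viterbi(scores))
```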


Empirical Methods in Natural Language Processing | 2008

Sampling Alignment Structure under a Bayesian Translation Model

John DeNero; Alexandre Bouchard-Côté; Daniel Klein

We describe the first tractable Gibbs sampling procedure for estimating phrase pair frequencies under a probabilistic model of phrase alignment. We propose and evaluate two nonparametric priors that successfully avoid the degenerate behavior noted in previous work, where overly large phrases memorize the training data. Phrase table weights learned under our model yield an increase in BLEU score over the word-alignment based heuristic estimates used regularly in phrase-based translation systems.
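
As a rough illustration of the sampler's structure, the sketch below resamples each sentence pair's phrase analysis using the collapsed predictive probability of a Dirichlet process prior. The `base`, `propose`, and `analysis` names are hypothetical stand-ins; the paper's actual priors and sampling operators differ in detail.

```python
import math
import random

def crp_predictive(pair, counts, total, alpha, base):
    """Collapsed predictive P(pair) under a DP(alpha, base) prior."""
    return (counts[pair] + alpha * base(pair)) / (total + alpha)

def gibbs_sweep(sentence_pairs, counts, total, alpha, base, propose):
    """One sweep: resample each sentence pair's phrase analysis.

    `propose(sp)` enumerates alternative local analyses (e.g., merging
    or splitting adjacent phrase pairs); each analysis is a list of
    phrase pairs. By exchangeability, we remove the current analysis
    from the counts, score the alternatives, and add the sample back.
    """
    for sp in sentence_pairs:
        for pair in sp.analysis:           # remove the current analysis
            counts[pair] -= 1
            total -= 1
        options = propose(sp)
        # For brevity, pairs within one analysis are scored
        # independently, ignoring within-analysis count updates.
        weights = [
            math.prod(crp_predictive(p, counts, total, alpha, base)
                      for p in analysis)
            for analysis in options
        ]
        sp.analysis = random.choices(options, weights=weights)[0]
        for pair in sp.analysis:           # add the sampled analysis back
            counts[pair] += 1
            total += 1
    return counts, total
```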


Workshop on Statistical Machine Translation | 2006

Why Generative Phrase Models Underperform Surface Heuristics

John DeNero; Daniel Gillick; James Zhang; Daniel Klein

We investigate why weights from generative models underperform heuristic estimates in phrase-based machine translation. We first propose a simple generative, phrase-based model and verify that its estimates are inferior to those given by surface statistics. The performance gap stems primarily from the addition of a hidden segmentation variable, which increases the capacity for overfitting during maximum likelihood training with EM. In particular, while word level models benefit greatly from re-estimation, phrase-level models do not: the crucial difference is that distinct word alignments cannot all be correct, while distinct segmentations can. Alternate segmentations rather than alternate alignments compete, resulting in increased determinization of the phrase table, decreased generalization, and decreased final BLEU score. We also show that interpolation of the two methods can result in a modest increase in BLEU score.
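
The surface heuristic the paper defends is easy to state in code: count extracted phrase pairs and normalize by relative frequency. A minimal sketch, assuming phrase extraction from word alignments (e.g., GIZA++ output) has already been done:

```python
from collections import Counter, defaultdict

def relative_frequency_table(extracted):
    """extracted: iterable of (src_phrase, tgt_phrase) tuples."""
    pair_counts = Counter(extracted)
    src_counts = Counter(src for src, _ in extracted)
    table = defaultdict(dict)
    for (src, tgt), c in pair_counts.items():
        # phi(tgt | src) by relative frequency: no hidden segmentation,
        # no EM re-estimation, so alternate segmentations never compete.
        table[src][tgt] = c / src_counts[src]
    return table

pairs = [("la maison", "the house"), ("la maison", "the house"),
         ("la", "the"), ("maison", "house"), ("la maison", "house")]
table = relative_frequency_table(pairs)
print(table["la maison"])  # {'the house': 2/3, 'house': 1/3}
```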


Meeting of the Association for Computational Linguistics | 2008

The Complexity of Phrase Alignment Problems

John DeNero; Daniel Klein

Many phrase alignment models operate over the combinatorial space of bijective phrase alignments. We prove that finding an optimal alignment in this space is NP-hard, while computing alignment expectations is #P-hard. On the other hand, we show that the problem of finding an optimal alignment can be cast as an integer linear program, which provides a simple, declarative approach to Viterbi inference for phrase alignment models that is empirically quite efficient.
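
The integer linear program can be sketched with an off-the-shelf solver. The variables and constraints below are an illustrative reading of the bijective setting (each word covered by exactly one selected phrase link), using the PuLP library and its bundled CBC solver; the paper's exact formulation may differ.

```python
import pulp

def align_ilp(candidates, n_src, n_tgt):
    """candidates: list of ((i, j), (k, l), score) phrase links,
    covering source words i..j-1 and target words k..l-1."""
    prob = pulp.LpProblem("phrase_alignment", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{idx}", cat="Binary")
         for idx in range(len(candidates))]
    # Objective: total score of the selected phrase links.
    prob += pulp.lpSum(score * x[idx]
                       for idx, (_, _, score) in enumerate(candidates))
    # Bijectivity: every source and target word covered exactly once.
    for w in range(n_src):
        prob += pulp.lpSum(x[idx]
                           for idx, ((i, j), _, _) in enumerate(candidates)
                           if i <= w < j) == 1
    for w in range(n_tgt):
        prob += pulp.lpSum(x[idx]
                           for idx, (_, (k, l), _) in enumerate(candidates)
                           if k <= w < l) == 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [c for idx, c in enumerate(candidates) if x[idx].value() > 0.5]

links = [((0, 1), (0, 1), 1.0), ((1, 2), (1, 2), 0.5),
         ((0, 2), (0, 2), 1.2)]
print(align_ilp(links, 2, 2))  # two single-word links win (1.5 > 1.2)
```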


International Joint Conference on Natural Language Processing | 2009

Fast Consensus Decoding over Translation Forests

John DeNero; David Chiang; Kevin Knight

The minimum Bayes risk (MBR) decoding objective improves BLEU scores for machine translation output relative to the standard Viterbi objective of maximizing model score. However, MBR targeting BLEU is prohibitively slow to optimize over k-best lists for large k. In this paper, we introduce and analyze an alternative to MBR that is equally effective at improving performance, yet is asymptotically faster, running 80 times faster than MBR in experiments with 1000-best lists. Furthermore, our fast decoding procedure can select output sentences based on distributions over entire forests of translations, in addition to k-best lists. We evaluate our procedure on translation forests from two large-scale, state-of-the-art hierarchical machine translation systems. Our forest-based decoding objective consistently outperforms k-best list MBR, giving improvements of up to 1.0 BLEU.
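
A simplified version of the consensus objective can be computed in linear time over a k-best list, as sketched below: score each hypothesis against posterior-expected n-gram counts rather than against every other hypothesis, avoiding the O(k^2) pairwise comparisons of k-best MBR. The forest-based expectations the paper also supports are not shown.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def consensus_decode(kbest, max_n=4):
    """kbest: list of (token_list, log_prob). Returns the consensus pick."""
    # Normalize the posterior over the k-best list.
    z = max(lp for _, lp in kbest)
    post = [math.exp(lp - z) for _, lp in kbest]
    total = sum(post)
    post = [p / total for p in post]
    # Expected n-gram counts under the posterior.
    expected = [Counter() for _ in range(max_n)]
    for (toks, _), p in zip(kbest, post):
        for n in range(1, max_n + 1):
            for g, c in ngrams(toks, n).items():
                expected[n - 1][g] += p * c
    # Pick the hypothesis with the most (clipped) expected matches.
    def gain(toks):
        return sum(min(c, expected[n - 1][g])
                   for n in range(1, max_n + 1)
                   for g, c in ngrams(toks, n).items())
    return max((toks for toks, _ in kbest), key=gain)

kbest = [("the house is red".split(), -1.0),
         ("the home is red".split(), -1.1),
         ("a house is red".split(), -1.2)]
print(consensus_decode(kbest))
```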


Empirical Methods in Natural Language Processing | 2009

Consensus Training for Consensus Decoding in Machine Translation

Adam Pauls; John DeNero; Daniel Klein

We propose a novel objective function for discriminatively tuning log-linear machine translation models. Our objective explicitly optimizes the BLEU score of expected n-gram counts, the same quantities that arise in forest-based consensus and minimum Bayes risk decoding methods. Our continuous objective can be optimized using simple gradient ascent. However, computing critical quantities in the gradient necessitates a novel dynamic program, which we also present here. Assuming BLEU as an evaluation measure, our objective function has two principal advantages over standard max BLEU tuning. First, it specifically optimizes model weights for downstream consensus decoding procedures. An unexpected second benefit is that it reduces overfitting, which can improve test set BLEU scores when using standard Viterbi decoding.
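
To see why the gradient is tractable, consider the simpler expected-gain objective over a k-best list, where the gradient under a softmax posterior takes a covariance form. The sketch below substitutes that linear proxy for the paper's BLEU-of-expected-counts objective and its forest dynamic program, which are not reproduced here.

```python
import numpy as np

def train(features, gains, steps=200, lr=0.1):
    """Gradient ascent on sum_i p_i(theta) * gain_i.

    features: k x d model feature vectors, one row per hypothesis.
    gains: length-k array of per-hypothesis similarity scores.
    """
    theta = np.zeros(features.shape[1])
    for _ in range(steps):
        logits = features @ theta
        p = np.exp(logits - logits.max())   # softmax posterior
        p /= p.sum()
        # d/dtheta E[gain] = E[gain * f] - E[gain] * E[f]  (covariance)
        e_f = p @ features
        e_gf = (p * gains) @ features
        theta += lr * (e_gf - (p @ gains) * e_f)
    return theta

# Tuning shifts weight toward hypotheses with high consensus gain.
features = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
gains = np.array([0.9, 0.2, 0.6])
print(train(features, gains))
```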


North American Chapter of the Association for Computational Linguistics | 2009

Efficient Parsing for Transducer Grammars

John DeNero; Mohit Bansal; Adam Pauls; Daniel Klein

The tree-transducer grammars that arise in current syntactic machine translation systems are large, flat, and highly lexicalized. We address the problem of parsing efficiently with such grammars in three ways. First, we present a pair of grammar transformations that admit an efficient cubic-time CKY-style parsing algorithm despite leaving most of the grammar in n-ary form. Second, we show how the number of intermediate symbols generated by this transformation can be substantially reduced through binarization choices. Finally, we describe a two-pass coarse-to-fine parsing approach that prunes the search space using predictions from a subset of the original grammar. In all, parsing time reduces by 81%. We also describe a coarse-to-fine pruning scheme for forest-based language model reranking that allows a 100-fold increase in beam size while reducing decoding time. The resulting translations improve by 1.3 BLEU.
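
The intermediate-symbol reduction can be illustrated with a plain left-branching binarizer that shares intermediate symbols across rules with a common prefix, one simple instance of the binarization choices the paper optimizes.

```python
def binarize(rules):
    """rules: list of (lhs, rhs_symbol_list). Returns binary rules."""
    binary, cache = [], {}
    for lhs, rhs in rules:
        if len(rhs) <= 2:                    # already unary or binary
            binary.append((lhs, list(rhs)))
            continue
        prev = rhs[0]
        for sym in rhs[1:-1]:
            key = (prev, sym)
            if key not in cache:             # reuse shared intermediates
                cache[key] = f"<{prev}+{sym}>"
                binary.append((cache[key], [prev, sym]))
            prev = cache[key]
        binary.append((lhs, [prev, rhs[-1]]))
    return binary

grammar = [("S", ["NP", "VP", "PP", "."]),
           ("S", ["NP", "VP", "."])]
for rule in binarize(grammar):
    print(rule)
# <NP+VP> is created once and shared by both original rules.
```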


Meeting of the Association for Computational Linguistics | 2009

Asynchronous Binarization for Synchronous Grammars

John DeNero; Adam Pauls; Daniel Klein

Binarization of n-ary rules is critical for the efficiency of syntactic machine translation decoding. Because the target side of a rule will generally reorder the source side, it is complex (and sometimes impossible) to find synchronous rule binarizations. However, we show that synchronous binarizations are not necessary in a two-stage decoder. Instead, the grammar can be binarized one way for the parsing stage, then rebinarized in a different way for the reranking stage. Each individual binarization considers only one monolingual projection of the grammar, entirely avoiding the constraints of synchronous binarization and allowing binarizations that are separately optimized for each stage. Compared to n-ary forest reranking, even simple target-side binarization schemes improve overall decoding accuracy.
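
A toy example makes the asynchronous idea concrete: project a synchronous rule onto each language, then binarize the two projections independently, since each decoding stage parses only one side. The rule encoding and binarizer below are illustrative.

```python
# A synchronous rule: position i of the target RHS takes source
# symbol perm[i], so the target side reorders the source side.
sync_rule = ("X", ["A", "B", "C"], [2, 0, 1])    # target order: C A B

def project(rhs, perm=None):
    """Monolingual projection of a synchronous right-hand side."""
    return list(rhs) if perm is None else [rhs[i] for i in perm]

def left_binarize(lhs, rhs):
    out, prev = [], rhs[0]
    for sym in rhs[1:-1]:
        inter = f"<{prev}+{sym}>"
        out.append((inter, [prev, sym]))
        prev = inter
    out.append((lhs, [prev, rhs[-1]]))
    return out

src = project(sync_rule[1])                 # ['A', 'B', 'C']
tgt = project(sync_rule[1], sync_rule[2])   # ['C', 'A', 'B']

print(left_binarize("X", src))  # binarized for the parsing stage
print(left_binarize("X", tgt))  # rebinarized for the reranking stage
```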


Learning at Scale | 2015

Problems Before Solutions: Automated Problem Clarification at Scale

Soumya Basu; Albert Wu; Brian Hou; John DeNero

Automatic assessment reduces the need for individual feedback in massive courses, but often focuses only on scoring solutions, rather than assessing whether students correctly understand problems. We present an enriched approach to automatic assessment that explicitly assists students in understanding the detailed specification of technical problems that they are asked to solve, in addition to evaluating their solutions. Students are given a suite of solution test cases, but they must first unlock each test case by validating its behavior before they are allowed to apply it to their proposed solution. When provided with this automated feedback early in the problem-solving process, students ask fewer clarificatory questions and express less confusion about assessments. As a result, instructors spend less time explaining problems to students. In a 1300-person university course, we observed that the vast majority of students chose to validate their understanding of test cases before attempting to solve problems. These students reported that the validation process improved their understanding.
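
The unlocking flow might look like the hypothetical sketch below: a student must reproduce each test case's expected output before that case can be run against their own solution. Function names and I/O here are illustrative, not the course's actual autograder.

```python
def unlock(cases):
    """cases: list of (expression, expected_output) string pairs.
    Each case stays locked until the student predicts its output."""
    unlocked = []
    for expr, expected in cases:
        guess = input(f">>> {expr}\n? ")
        while guess.strip() != expected:
            print("Not quite -- re-read the problem statement.")
            guess = input("? ")
        unlocked.append((expr, expected))
    return unlocked

def grade(solution_env, unlocked_cases):
    """Run only unlocked cases against the student's solution."""
    for expr, expected in unlocked_cases:
        actual = repr(eval(expr, solution_env))
        status = "ok" if actual == expected else f"got {actual}"
        print(f"{expr} -> expected {expected}: {status}")

# Example flow: grade(my_env, unlock([("double(5)", "10")]))
```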


Meeting of the Association for Computational Linguistics | 2014

A Constrained Viterbi Relaxation for Bidirectional Word Alignment

Yin-Wen Chang; Alexander M. Rush; John DeNero; Michael Collins

Bidirectional models of word alignment are an appealing alternative to post-hoc combinations of directional word aligners. Unfortunately, most bidirectional formulations are NP-hard to solve, and a previous attempt to use a relaxation-based decoder yielded few exact solutions (6%). We present a novel relaxation for decoding the bidirectional model of DeNero and Macherey (2011). The relaxation can be solved with a modified version of the Viterbi algorithm. To find optimal solutions on difficult instances, we alternate between incrementally adding constraints and applying optimality-preserving coarse-to-fine pruning. The algorithm finds provably exact solutions on 86% of sentence pairs and shows improvements over directional models.
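
The general flavor of relaxation-based bidirectional decoding can be sketched with subgradient updates on multipliers that push two directional aligners toward agreement. This is a generic dual-decomposition sketch, not the paper's constrained Viterbi relaxation; the directional subproblems here are trivial independent argmaxes rather than Viterbi aligners.

```python
import numpy as np

def argmax_rows(scores):
    """One link per row: a stand-in for a directional aligner."""
    a = np.zeros_like(scores)
    a[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1
    return a

def bidirectional(scores_ef, scores_fe, iters=100, rate=0.5):
    """scores_ef, scores_fe: n_e x n_f link score matrices."""
    u = np.zeros_like(scores_ef)             # agreement multipliers
    for t in range(iters):
        a_ef = argmax_rows(scores_ef + u)            # one link per e word
        a_fe = argmax_rows((scores_fe - u).T).T      # one link per f word
        if np.array_equal(a_ef, a_fe):
            return a_ef, True                # agreement: provably exact
        u -= rate / (t + 1) * (a_ef - a_fe)  # subgradient step on dual
    return a_ef, False                       # no certificate of exactness
```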

Collaboration


Dive into John DeNero's collaboration.

Top Co-Authors

Daniel Klein, University of California
Adam Pauls, University of California
Aria Haghighi, University of California
Mohit Bansal, Toyota Technological Institute at Chicago
Brian Hou, University of California
Daniel Gillick, University of California