Spence Green
Stanford University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Spence Green.
human factors in computing systems | 2013
Spence Green; Jeffrey Heer; Christopher D. Manning
Language translation is slow and expensive, so various forms of machine assistance have been devised. Automatic machine translation systems process text quickly and cheaply, but with quality far below that of skilled human translators. To bridge this quality gap, the translation industry has investigated post-editing, or the manual correction of machine output. We present the first rigorous, controlled analysis of post-editing and find that post-editing leads to reduced time and, surprisingly, improved quality for three diverse language pairs (English to Arabic, French, and German). Our statistical models and visualizations of experimental data indicate that some simple predictors (like source text part of speech counts) predict translation time, and that post-editing results in very different interaction patterns. From these results we distill implications for the design of new language translation interfaces.
Computational Linguistics | 2013
Spence Green; Marie-Catherine de Marneffe; Christopher D. Manning
Multiword expressions lie at the syntax/semantics interface and have motivated alternative theories of syntax like Construction Grammar. Until now, however, syntactic analysis and multiword expression identification have been modeled separately in natural language processing. We develop two structured prediction models for joint parsing and multiword expression identification. The first is based on context-free grammars and the second uses tree substitution grammars, a formalism that can store larger syntactic fragments. Our experiments show that both models can identify multiword expressions with much higher accuracy than a state-of-the-art system based on word co-occurrence statistics.We experiment with Arabic and French, which both have pervasive multiword expressions. Relative to English, they also have richer morphology, which induces lexical sparsity in finite corpora. To combat this sparsity, we develop a simple factored lexical representation for the context-free parsing model. Morphological analyses are automatically transformed into rich feature tags that are scored jointly with lexical items. This technique, which we call a factored lexicon, improves both standard parsing and multiword expression identification accuracy.
meeting of the association for computational linguistics | 2014
Will Monroe; Spence Green; Christopher D. Manning
Segmentation of clitics has been shown to improve accuracy on a variety of Arabic NLP tasks. However, state-of-the-art Arabic word segmenters are either limited to formal Modern Standard Arabic, performing poorly on Arabic text featuring dialectal vocabulary and grammar, or rely on linguistic knowledge that is hand-tuned for each dialect. We extend an existing MSA segmenter with a simple domain adaptation technique and new features in order to segment informal and dialectal Arabic text. Experiments show that our system outperforms existing systems on newswire, broadcast news and Egyptian dialect, improvingsegmentationF1 scoreonarecently released Egyptian Arabic corpus to 95.1%, compared to 90.8% for another segmenter designed specifically for Egyptian Arabic.
workshop on statistical machine translation | 2014
Spence Green; Daniel M. Cer; Christopher D. Manning
We present a new version of Phrasal, an open-source toolkit for statistical phrasebased machine translation. This revision includes features that support emerging research trends such as (a) tuning with large feature sets, (b) tuning on large datasets like thebitext, and(c)web-basedinteractivemachine translation. A direct comparison with Moses shows favorable results in terms of decoding speed and tuning time.
empirical methods in natural language processing | 2014
Spence Green; Sida I. Wang; Jason Chuang; Jeffrey Heer; Sebastian Schuster; Christopher D. Manning
Analyses of computer aided translation typically focus on either frontend interfaces and human effort, or backend translation and machine learnability of corrections. However, this distinction is artificial in practice since the frontend and backend must work in concert. We present the first holistic, quantitative evaluation of these issues by contrasting two assistive modes: postediting and interactive machine translation (MT). We describe a new translator interface, extensive modifications to a phrasebased MT system, and a novel objective function for re-tuning to human corrections. Evaluation with professional bilingual translators shows that post-edit is faster than interactive at the cost of translation quality for French-English and EnglishGerman. However, re-tuning the MT system to interactive output leads to larger, statistically significant reductions in HTER versus re-tuning to post-edit. Analysis shows that tuning directly to HTER results in fine-grained corrections to subsequent machine output.
workshop on statistical machine translation | 2014
Spence Green; Daniel M. Cer; Christopher D. Manning
Scalable discriminative training methods are now broadly available for estimating phrase-based, feature-rich translation models. However, the sparse feature sets typically appearing in research evaluations are less attractive than standard dense features such as language and translation model probabilities: they often overfit, do not generalize, or require complex and slow feature extractors. This paper introduces extended features, which are more specific than dense features yet more general than lexicalized sparse features. Large-scale experiments show that extended features yield robust BLEU gains for both Arabic-English (+1.05) and Chinese-English (+0.67) relative to a strong feature-rich baseline. We also specialize the feature set to specific datadomains, identifyanobjectivefunction that is less prone to overfitting, and release fast, scalable, and language-independent tools for implementing the features.
user interface software and technology | 2014
Spence Green; Jason Chuang; Jeffrey Heer; Christopher D. Manning
The standard approach to computer-aided language translation is post-editing: a machine generates a single translation that a human translator corrects. Recent studies have shown this simple technique to be surprisingly effective, yet it underutilizes the complementary strengths of precision-oriented humans and recall-oriented machines. We present Predictive Translation Memory, an interactive, mixed-initiative system for human language translation. Translators build translations incrementally by considering machine suggestions that update according to the users current partial translation. In a large-scale study, we find that professional translators are slightly slower in the interactive mode yet produce slightly higher quality translations despite significant prior experience with the baseline post-editing condition. Our analysis identifies significant predictors of time and quality, and also characterizes interactive aid usage. Subjects entered over 99% of characters via interactive aids, a significantly higher fraction than that shown in previous work.
workshop on statistical machine translation | 2014
Julia Neidert; Sebastian Schuster; Spence Green; Kenneth Heafield; Christopher D. Manning
We describe Stanford’s participation in the French-English and English-German tracks of the 2014 Workshop on Statistical Machine Translation (WMT). Our systems used large feature sets, word classes, and an optional unconstrained language model. Among constrained systems, ours performed the best according to uncased BLEU: 36.0% for French-English and 20.9% for English-German.
empirical methods in natural language processing | 2015
Joern Wuebker; Spence Green; John DeNero
We present an incremental adaptation approach for statistical machine translation that maintains a flexible hierarchical domain structure within a single consistent model. Both weights and rules are updated incrementallyonastreamofpost-edits. Our multi-level domain hierarchy allows the system to adapt simultaneously towards local context at dierent levels of granularity, including genres and individual documents. Our experiments show consistent improvements in translation quality from all components of our approach.
meeting of the association for computational linguistics | 2016
Joern Wuebker; Spence Green; John DeNero; Sasa Hasan; Minh-Thang Luong
We apply phrase-based and neural models to a core task in interactive machine translation: suggesting how to complete a partial translation. For the phrase-based system, we demonstrate improvements in suggestion quality using novel objective functions, learning techniques, and inference algorithms tailored to this task. Our contributions include new tunable metrics, an improved beam search strategy, an n-best extraction method that increases suggestion diversity, and a tuning procedure for a hierarchical joint model of alignment and translation. The combination of these techniques improves next-word suggestion accuracy dramatically from 28.5% to 41.2% in a large-scale English-German experiment. Our recurrent neural translation system increases accuracy yet further to 53.0%, but inference is two orders of magnitude slower. Manual error analysis shows the strengths and weaknesses of both approaches.