Lane Schwartz | Researchain

Publication

Featured research published by Lane Schwartz.

Workshop on Statistical Machine Translation | 2009

Joshua: An Open Source Toolkit for Parsing-Based Machine Translation

Zhifei Li; Chris Callison-Burch; Chris Dyer; Sanjeev Khudanpur; Lane Schwartz; Wren N. G. Thornton; Jonathan Weese; Omar F. Zaidan

We describe Joshua, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for synchronous context free grammars (SCFGs): chart-parsing, n-gram language model integration, beam- and cube-pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We demonstrate that the toolkit achieves state-of-the-art translation performance on the WMT09 French-English translation task.

Computational Linguistics | 2010

Broad-coverage parsing using human-like memory constraints

William Schuler; Samir E. AbdelRahman; Timothy A. Miller; Lane Schwartz

Human syntactic processing shows many signs of taking place within a general-purpose short-term memory. But this kind of memory is known to have a severely constrained storage capacity, possibly constrained to as few as three or four distinct elements. This article describes a model of syntactic processing that operates successfully within these severe constraints, by recognizing constituents in a right-corner transformed representation (a variant of left-corner parsing) and mapping this representation to random variables in a Hierarchic Hidden Markov Model, a factored time-series model which probabilistically models the contents of a bounded memory store over time. Evaluations of the coverage of this model on a large syntactically annotated corpus of English sentences, and the accuracy of a bounded-memory parsing strategy based on this model, suggest this model may be cognitively plausible.

Computational Linguistics | 2009

A framework for fast incremental interpretation during speech decoding

William Schuler; Stephen T. Wu; Lane Schwartz

This article describes a framework for incorporating referential semantic information from a world model or ontology directly into a probabilistic language model of the sort commonly used in speech recognition, where it can be probabilistically weighted together with phonological and syntactic factors as an integral part of the decoding process. Introducing world model referents into the decoding search greatly increases the search space, but by using a single integrated phonological, syntactic, and referential semantic language model, the decoder is able to incrementally prune this search based on probabilities associated with these combined contexts. The result is a single unified referential semantic probability model which brings several kinds of context to bear in speech decoding, and performs accurate recognition in real time on large domains in the absence of example in-domain training sentences.

Meeting of the Association for Computational Linguistics | 2009

Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation

Zhifei Li; Chris Callison-Burch; Chris Dyer; Juri Ganitkevitch; Sanjeev Khudanpur; Lane Schwartz; Wren N. G. Thornton; Jonathan Weese; Omar F. Zaidan

We describe Joshua (Li et al., 2009a), an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for translation via synchronous context free grammars (SCFGs): chart-parsing, n-gram language model integration, beam- and cube-pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We also provide a demonstration outline for illustrating the toolkit's features to potential users, whether they be newcomers to the field or power users interested in extending the toolkit.

International Conference on Computational Linguistics | 2008

Toward a Psycholinguistically-Motivated Model of Language Processing

William Schuler; Samir E. AbdelRahman; Timothy A. Miller; Lane Schwartz

Psycholinguistic studies suggest a model of human language processing that 1) performs incremental interpretation of spoken utterances or written text, 2) preserves ambiguity by maintaining competing analyses in parallel, and 3) operates within a severely constrained short-term memory store, possibly constrained to as few as four distinct elements. This paper describes a relatively simple model of language as a factored statistical time-series process that meets all three of the above desiderata, and presents corpus evidence that this model is sufficient to parse naturally occurring sentences using human-like bounds on memory.

The Prague Bulletin of Mathematical Linguistics | 2010

Hierarchical Phrase-based Grammar Extraction in Joshua: Suffix Arrays and Prefix Trees

Lane Schwartz; Chris Callison-Burch

While example-based machine translation has long used corpus information at run-time, statistical phrase-based approaches typically include a preprocessing stage where an aligned parallel corpus is split into phrases, and parameter values are calculated for each phrase using simple relative frequency estimates. This paper describes an open source implementation of the crucial algorithms presented by Lopez (2008), which allow direct run-time calculation of SCFG translation rules in Joshua.

Workshop on Statistical Machine Translation | 2014

Machine Translation and Monolingual Postediting: The AFRL WMT-14 System

Lane Schwartz; Timothy Anderson; Jeremy Gwinnup; Katherine Young

This paper describes the AFRL statistical MT system and the improvements that were developed during the WMT14 evaluation campaign. As part of these efforts we experimented with a number of extensions to the standard phrase-based model that improve performance on Russian-to-English and Hindi-to-English translation tasks. In addition, we describe our efforts to make use of monolingual English speakers to correct the output of machine translation, and present the results of monolingual postediting of all 3003 sentences of the WMT14 Russian-English test set.

Intelligent User Interfaces | 2008

Exploiting referential context in spoken language interfaces for data-poor domains

Stephen T. Wu; Lane Schwartz; William Schuler

This paper describes an implementation of a shell-like programming interface that utilizes referential context (that is, information about the current state of an interfaced application) in order to achieve accurate recognition, even in user-defined domains with no available domain-specific training corpora. The interface incorporates knowledge of context into its model of syntax, yielding a referential semantic language model. Interestingly, the referential semantic language model exploits context dynamically, unlike other recent systems, by using incremental processing and the limited stack memory of an HMM-like time series model.

International Conference on Acoustics, Speech, and Signal Processing | 2008

Referential semantic language modeling for data-poor domains

Stephen T. Wu; Lane Schwartz; William Schuler

This paper describes a referential semantic language model that achieves accurate recognition in user-defined domains with no available domain-specific training corpora. This model is interesting in that, unlike similar recent systems, it exploits context dynamically, using incremental processing and the limited stack memory of an HMM-like time series model to constrain search.

Workshop on Statistical Machine Translation | 2015

The University of Illinois submission to the WMT 2015 Shared Translation Task

Lane Schwartz; Bill Bryce; Chase Geigle; Sean Massung; Yisi Liu; Haoruo Peng; Vignesh Raja; Subhro Roy; Shyam Upadhyay

In this year’s WMT translation task, Finnish-English was introduced as a competition language pair for the first time. We present experiments examining several variations on a morphologically-aware statistical phrase-based machine translation system for translating Finnish into English. Our system variations attempt to mitigate the issue of rich agglutinative morphology when translating from Finnish into English. Our WMT submission for Finnish-English preprocesses Finnish data with omorfi (Pirinen, 2015), a Finnish morphological analyzer. We also present results for two other language pairs with morphologically interesting source languages, namely German-English and Czech-English.