Publication


Featured research published by David Farwell.


Archive | 1998

Machine Translation and the Information Soup

David Farwell; Laurie Gerber; Eduard H. Hovy

We present two problems for statistically extracting bilingual lexicons: (1) How can noisy parallel corpora be used? (2) How can non-parallel yet comparable corpora be used? We describe our own work and contribution in relaxing the constraint of using only clean parallel corpora. DKvec is a method for extracting bilingual lexicons from noisy parallel corpora based on arrival distances of words in those corpora. Using DKvec on noisy parallel corpora in English/Japanese and English/Chinese, our evaluations show a 55.35% precision from a small corpus and 89.93% precision from a larger corpus. Our major contribution is in the extraction of bilingual lexicons from non-parallel corpora. We present a first such result in this area, from a new method, Convec. Convec is based on context information of a word to be translated. We show a 30% to 76% precision when top-one to top-20 translation candidates are considered. Most of the top-20 candidates are either collocations or words related to the correct translation. Since non-parallel corpora contain a lot more polysemous words, many-to-many translations, and different lexical items in the two languages, we conclude that the output from Convec is reasonable and useful.


Machine Translation | 1993

Automatically Creating Lexical Entries for ULTRA, a Multilingual MT System

David Farwell; Louise Guthrie; Yorick Wilks

In this paper, we describe both a multi-lingual, interlingual MT system (ULTRA) and a method of extracting lexical entries for it automatically from an existing machine-readable dictionary (LDOCE). We believe the latter is original and the former, although not the first interlingual MT system by any means, may be the first that is symmetrically multi-lingual. It translates between English, German, Chinese, Japanese and Spanish and has vocabularies in each language based on about 10,000 word senses.


North American Chapter of the Association for Computational Linguistics | 2004

Interlingual Annotation of Multilingual Text Corpora

David Farwell; Stephen Helmreich; Florence Reeder; Keith J. Miller; Lori S. Levin; Eduard H. Hovy; Bonnie J. Dorr; Nizar Habash; Teruko Mitamura; Owen Rambow; Advaith Siddharthan

This paper describes a multi-site project to annotate six sizable bilingual parallel corpora for interlingual content. After presenting the background and objectives of the effort, we will go on to describe the data set that is being annotated, the interlingua representation language used, an interface environment that supports the annotation task and the annotation process itself. We will then present a preliminary version of our evaluation methodology and conclude with a summary of the current status of the project along with a number of issues which have arisen.


International Conference on Computational Linguistics | 1992

The automatic creation of lexical entries for a multilingual MT system

David Farwell; Louise Guthrie; Yorick Wilks

In this paper, we describe a method of extracting information from an on-line resource for the construction of lexical entries for a multi-lingual, interlingual MT system (ULTRA). We have been able to automatically generate lexical entries for interlingual concepts corresponding to nouns, verbs, adjectives and adverbs. Although several features of these entries continue to be supplied manually, we have greatly decreased the time required to generate each entry and see this as a promising method for the creation of large-scale lexicons.


Natural Language Engineering | 2010

Interlingual annotation of parallel text corpora: A new framework for annotation and evaluation

Bonnie J. Dorr; Rebecca J. Passonneau; David Farwell; Rebecca Green; Nizar Habash; Stephen Helmreich; Eduard H. Hovy; Lori S. Levin; Keith J. Miller; Teruko Mitamura; Owen Rambow; Advaith Siddharthan

This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.


Conference of the Association for Machine Translation in the Americas | 2004

Interlingual Annotation for MT Development

Florence Reeder; Bonnie J. Dorr; David Farwell; Nizar Habash; Stephen Helmreich; Eduard H. Hovy; Lori S. Levin; Teruko Mitamura; Keith J. Miller; Owen Rambow; Advaith Siddharthan

MT systems that use only superficial representations, including the current generation of statistical MT systems, have been successful and useful. However, they will experience a plateau in quality, much like other “silver bullet” approaches to MT. We pursue work on the development of interlingual representations for use in symbolic or hybrid MT systems. In this paper, we describe the creation of an interlingua and the development of a corpus of semantically annotated text, to be validated in six languages and evaluated in several ways. We have established a distributed, well-functioning research methodology, designed a preliminary interlingua notation, created annotation manuals and tools, developed a test collection in six languages with associated English translations, annotated some 150 translations, and designed and applied various annotation metrics. We describe the data sets being annotated and the interlingual (IL) representation language which uses two ontologies and a systematic theta-role list. We present the annotation tools built and outline the annotation process. Following this, we describe our evaluation methodology and conclude with a summary of issues that have arisen.


Archive | 1992

Building an Intelligent Second Language Tutoring System from Whatever Bits you Happen to Have Lying Around

Yorick Wilks; David Farwell

The Computing Research Laboratory (CRL) at New Mexico State University is currently engaged in the design of language teaching software, based on mature artificial intelligence and machine translation technologies previously developed within CRL. Our approach is unique because it uses the robustness of a natural language processing (NLP) system which incorporates both general world knowledge and task domain knowledge (Metallel), belief ascription (ViewGen), and semantic parsing techniques (PREMO) in the service of better student-system interaction.


Machine Translation | 1998

Translation Differences and Pragmatics-Based MT

Stephen Helmreich; David Farwell

This paper examines differences between two professional translations into English of the same Spanish newspaper article. Among other explanations for these differences, such as outright errors and free variation, we find a significant number of differences are due to differing beliefs on the part of the translators about the subject matter and about what the author wished to say. Furthermore, these differences are consistent with divergent global views of the translators about the likelihood of future events (earthquakes and tidal waves) and about (rational or irrational) reactions of people to such likelihood. We discuss the requirements for a pragmatics-based model of translation that would account for these differences.


North American Chapter of the Association for Computational Linguistics | 2000

An interlingual-based approach to reference resolution

David Farwell; Stephen Helmreich

In this paper we outline an interlingual-based procedure for resolving reference and suggest a practical approach to implementing it. We assume a two-stage language analysis system. First, a syntactic analysis of an input text results in a functional structure in which certain cases of pronominal reference are resolved. Second, the f-structure is mapped onto an interlingual representation. As part of this mapping, the reference of the various f-structure elements is resolved resulting in the addition of information to certain existing IL objects (coreference) or in the creation of new IL objects which are added to the domain of discourse (initial reference). For this effort, we adopt Text Meaning Representation for our IL and rely on the ONTOS ontology (Mahesh & Nirenburg, 1995) as a general knowledge base. Since the central barrier to developing such a system today is the incompleteness of the knowledge base, we outline a strategy starting with the implementation of a series of form-based resolution algorithms that are applied directly to the referring expressions of the input text. These are initially supplemented by a knowledge-based resolution procedure which, as the knowledge base grows and the adequacy of the f-structure and IL-representation increases, takes on more and more of the processing load. We examine the operation of the form-based algorithms on a sample Spanish text and show their limitations. We then demonstrate how an IL-based approach can be used to resolve the problematic cases of reference. This research effort is part of the CREST project at the CRL funded by DARPA.


Conference of the Association for Machine Translation in the Americas | 2000

Text Meaning Representation as a Basis for Representation of Text Interpretation

Stephen Helmreich; David Farwell

In this paper we propose a representation for what we have called an interpretation of a text. We base this representation on TMR (Text Meaning Representation), an interlingual representation developed for Machine Translation purposes. A TMR consists of a complex feature-value structure, with the feature names and filler values drawn from an ontology, in this case, ONTOS, developed concurrently with TMR. We suggest, on the basis of previous work, that a representation of an interpretation of a text must build on a TMR structure for the text in several ways: (1) by the inclusion of additional required features and feature values (which may themselves be complex feature structures); (2) by pragmatically filling in empty slots in the TMR structure itself; and (3) by supporting the connections between feature values by including, as part of the TMR itself, the chains of inferencing that link various parts of the structure.

Collaboration


Dive into David Farwell's collaborations.

Top Co-Authors

Stephen Helmreich

New Mexico State University

Eduard H. Hovy

Carnegie Mellon University

Yorick Wilks

University of Sheffield

Lori S. Levin

Carnegie Mellon University

Teruko Mitamura

Carnegie Mellon University

Nizar Habash

New York University Abu Dhabi
