Paul McFetridge
Simon Fraser University
Publication
Featured research published by Paul McFetridge.
Machine Translation | 2000
Fred Popowich; Paul McFetridge; Davide Turcato; Janine Toole
Traditional Machine Translation (MT) systems are designed to translate documents. In this paper we describe an MT system that translates the closed captions that accompany most North American television broadcasts. This domain has two identifying characteristics. First, the captions themselves have properties quite different from the type of textual input that many MT systems have been designed for. This is because captions generally represent speech and hence contain many of the phenomena that characterize spoken language. Second, the operational characteristics of the closed-caption domain are also quite distinctive. Unlike most other translation domains, the translated captions are only one of several sources of information that are available to the user. In addition, the user has limited time to comprehend the translation since captions only appear on the screen for a few seconds. In this paper, we look at some of the theoretical and implementational challenges that these characteristics pose for MT. We present a fully automatic large-scale multilingual MT system, ALTo. Our approach is based on Whitelock's Shake and Bake MT paradigm, which relies heavily on lexical resources. The system currently provides wide-coverage translation from English to Spanish. In addition to discussing the design of the system, we also address the evaluation issues that are associated with this domain and report on our current performance.
North American Chapter of the Association for Computational Linguistics | 2000
Davide Turcato; Fred Popowich; Paul McFetridge; Devlan Nicholson; Janine Toole
We describe an approach to Machine Translation of transcribed speech, as found in closed captions. We discuss how the colloquial nature and input format peculiarities of closed captions are dealt with in a pre-processing pipeline that prepares the input for effective processing by a core MT system. In particular, we describe components for proper name recognition and input segmentation. We evaluate the contribution of such modules to the system performance. The described methods have been implemented on an MT system for translating English closed captions to Spanish and Portuguese.
ASSESSEVALNLP '99 Proceedings of a Symposium on Computer Mediated Language Assessment and Evaluation in Natural Language Processing | 1999
Trude Heift; Paul McFetridge
A typical problem in Natural Language Processing (NLP) is the combinatorial explosion of parses, and this is aggravated in an Intelligent Language Tutoring System (ILTS) because the grammar is unconstrained and admits even more analyses. NLP applications frequently incorporate techniques for selecting a preferred parse. Computational criteria, however, are insufficient for a pedagogic system because the parse chosen may result in misleading feedback for the learner. Preferably, the analysis emphasizes language teaching pedagogy by selecting the sentence interpretation a student most likely intended. In the system described in this paper, several modules are responsible for selecting the appropriate analysis, and these are informed by the Student Model. Aspects of the Student Model play an important pedagogic role in determining the desired sentence interpretation, handling multiple errors, and deciding on the level of interaction with the student.
International Conference on Computational Linguistics | 1992
Fred Popowich; Paul McFetridge; Dan Fass; Gary Hall
Analysis of a corpus of queries to a statistical database has shown considerable variation in the location and order of modifiers in complex noun phrases. Nevertheless, restrictions can be defined on nominal modification because of certain correspondences between nominal modifiers and the role they fulfill in a statistical database, notably that the names of database tables and columns, and values of columns, are all determined by the modifiers. These restrictions are described. Incorporating these restrictions into Head-Driven Phrase Structure Grammar (HPSG) has caused us to examine the treatment of nominal modification in HPSG. A new treatment is proposed and an implementation within an HPSG based natural language front-end to a statistical database is described.
Conference of the Association for Machine Translation in the Americas | 1998
Janine Toole; Davide Turcato; Fred Popowich; Dan Fass; Paul McFetridge
This paper defines the class of time-constrained applications: applications in which the user has limited time to process the system output. This class is differentiated from real-time systems, where it is production time rather than comprehension time that is constrained. Examples of time-constrained MT applications include the translation of multi-party dialogue and the translation of closed captions. The constraints on comprehension time in such systems have significant implications for the system's objectives, its design, and its evaluation. In this paper we outline these challenges and discuss how they have been met in an English-Spanish MT system designed to translate the closed captions used on television.
KBCS '89 Proceedings of the international conference on Knowledge based computer systems | 1989
Paul McFetridge; Chris Groeneboer
An approach for dealing with novel terms in input to a natural language interface to databases is presented. Traditionally, terms not found in the lexicon are assumed to be database values. It is thus taken for granted that customization is complete, i.e., the lexicon contains all synonyms for all attributes. The problem then becomes one of determining the attribute of which the novel term is a value. The present approach entertains the possibility that the novel term is either a database value or a structural term. We argue that there are linguistic phenomena which, in conjunction with the usual methods of defining database values, can be used to distinguish between values and structural terms. Novel terms are treated as ambiguous, and the interface attempts to constrain the set of candidate interpretations using certain heuristics. If these methods fail to disambiguate the term, a focussed, informative response is generated for the set of interpretations. When appropriate, the user is solicited for information which the interface uses to create a lexical entry for the previously novel term.
Data and Knowledge Engineering | 1996
Paul McFetridge; Fred Popowich; Dan Fass
Compounds can be analyzed in HPSG as head/complement structures, corresponding to verbal compounds, and head/adjunct structures, corresponding to non-verbal compounds. The rules that create these structures are also responsible for paraphrases using prepositional modification. Although compounds are often thought to have a high structural ambiguity, we show that the distinction between head/complement compounding and head/adjunct compounding together with general semantic considerations eliminates ambiguity. Finally, for database interfaces, the notion of a semantic field is useful for solving problems of noncompositionality of compounds.
Intelligent Information Systems | 1994
Nick Cercone; Jiawei Han; Paul McFetridge; Fred Popowich; Yandong Cai; Dan Fass; Chris Groeneboer; Gary Hall; Yue Huang
Two systems developed in the Centre for Systems Science at Simon Fraser University over the past several years are described briefly. These systems permit easy ad hoc access to information in databases, often stored implicitly, for decision-makers. We argue for their relevance and utility and explain their architectural characteristics, providing pointers to appropriate references for specific theoretical and operational details. Examples are given to illustrate the power and usefulness of these systems; they are drawn from actual databases in use with Rogers Cablesystems Ltd. and the Natural Sciences and Engineering Research Council of Canada (NSERC).
Brazilian Symposium on Artificial Intelligence | 1995
Paul McFetridge; Aline Villavicencio
This paper presents an analysis of the Portuguese verb using DATR, a language designed for lexical representation by nonmonotonic inheritance hierarchies. The analysis shows that the verb has a consistent structure and that variations can be organized into subclasses that are themselves regular. The default inheritance rule of DATR permits an economical description of the structure of the verb while also retaining sufficient information that exceptions can be described at the appropriate points of the grammar.
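The idea of default inheritance with local exceptions can be illustrated with a minimal Python sketch. This is not DATR syntax and not the analysis from the paper: the class `LexNode`, the node names, and the feature labels are hypothetical, and the lookup is simplified to exact feature names with fallback to a parent node.

```python
class LexNode:
    """A lexical node that inherits forms from its parent unless it overrides them."""

    def __init__(self, name, parent=None, stem=None, **forms):
        self.name = name
        self.parent = parent
        self.stem = stem
        self.forms = forms  # feature -> literal string or function of the stem

    def form(self, feature, stem=None):
        # The stem of the node originally queried is carried up the hierarchy.
        stem = stem if stem is not None else self.stem
        if feature in self.forms:
            value = self.forms[feature]
            return value(stem) if callable(value) else value
        if self.parent is None:
            raise KeyError(f"{feature} is not defined for {self.name}")
        # Default case: inherit the definition from the parent node.
        return self.parent.form(feature, stem)


# Hypothetical hierarchy: a general verb class, a conjugation subclass, two verbs.
VERB = LexNode("VERB", pres_1sg=lambda s: s + "o")
AR_VERB = LexNode("AR_VERB", parent=VERB, infinitive=lambda s: s + "ar")
falar = LexNode("falar", parent=AR_VERB, stem="fal")
estar = LexNode("estar", parent=AR_VERB, stem="est", pres_1sg="estou")  # exception

print(falar.form("pres_1sg"))    # inherited default
print(estar.form("pres_1sg"))    # local override
print(estar.form("infinitive"))  # still inherited from the subclass
```

The regular verb states only its stem, while the irregular verb overrides a single form and inherits the rest, which is the economy the abstract attributes to default inheritance.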
Canadian Conference on Artificial Intelligence | 2000
Scott McDonald; Davide Turcato; Paul McFetridge; Fred Popowich; Janine Toole
The accurate translation of collocations, or multi-word units, is essential for high quality machine translation. However, many collocations do not translate compositionally, thus requiring individual entries in the bilingual lexicon. We present a technique for collocation extraction from large corpora that takes into account the dispersion of the collocations throughout the corpus. Collocations are ranked to more accurately reflect how likely they are to occur in a wide variety of texts; collocations which are specific to a particular text are less useful for lexicon development. Once the collocations are extracted, appropriate bilingual lexical entries can be developed by lexicographers.
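A rough Python sketch of dispersion-aware ranking follows. The abstract does not give the actual measure, so the scoring here is an assumption chosen for illustration: each adjacent word pair is scored by its corpus frequency multiplied by the fraction of documents in which it appears. The function name `rank_collocations` and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def rank_collocations(documents, min_count=2):
    """Rank adjacent word pairs by frequency weighted by dispersion (assumed score)."""
    total = Counter()     # occurrences of each bigram across the whole corpus
    doc_freq = Counter()  # number of documents that contain each bigram
    for text in documents:
        tokens = text.lower().split()
        bigrams = list(zip(tokens, tokens[1:]))
        total.update(bigrams)
        doc_freq.update(set(bigrams))

    n_docs = len(documents)
    scored = []
    for bigram, count in total.items():
        if count < min_count:
            continue  # drop rare pairs
        dispersion = doc_freq[bigram] / n_docs
        scored.append((count * dispersion, bigram))

    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(bigram, score) for score, bigram in scored]


docs = [
    "prime minister spoke",
    "the prime minister left",
    "hot dog hot dog hot dog",
]
for bigram, score in rank_collocations(docs):
    print(bigram, round(score, 2))
```

In this toy corpus the pair ("hot", "dog") is more frequent overall but confined to one document, so ("prime", "minister"), which occurs in two of the three documents, ranks above it.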