Publication


Featured research published by Janine Toole.


Recent Advances in Natural Language Processing | 2000

Adapting a synonym database to specific domains

Davide Turcato; Fred Popowich; Janine Toole; Dan Fass; Devlan Nicholson; Gordon W. Tisher

This paper describes a method for adapting a general purpose synonym database, like WordNet, to a specific domain, where only a subset of the synonymy relations defined in the general database hold. The method adopts an eliminative approach, based on incrementally pruning the original database. The method is based on a preliminary manual pruning phase and an algorithm for automatically pruning the database. This method has been implemented and used for an Information Retrieval system in the aviation domain.


Machine Translation | 2000

Machine Translation of Closed Captions

Fred Popowich; Paul McFetridge; Davide Turcato; Janine Toole

Traditional Machine Translation (MT) systems are designed to translate documents. In this paper we describe an MT system that translates the closed captions that accompany most North American television broadcasts. This domain has two identifying characteristics. First, the captions themselves have properties quite different from the type of textual input that many MT systems have been designed for. This is because captions generally represent speech and hence contain many of the phenomena that characterize spoken language. Second, the operational characteristics of the closed-caption domain are also quite distinctive. Unlike most other translation domains, the translated captions are only one of several sources of information that are available to the user. In addition, the user has limited time to comprehend the translation, since captions only appear on the screen for a few seconds. In this paper, we look at some of the theoretical and implementational challenges that these characteristics pose for MT. We present a fully automatic large-scale multilingual MT system, ALTo. Our approach is based on Whitelock's Shake-and-Bake MT paradigm, which relies heavily on lexical resources. The system currently provides wide-coverage translation from English to Spanish. In addition to discussing the design of the system, we also address the evaluation issues that are associated with this domain and report on our current performance.


North American Chapter of the Association for Computational Linguistics | 2000

Pre-processing closed captions for machine translation

Davide Turcato; Fred Popowich; Paul McFetridge; Devlan Nicholson; Janine Toole

We describe an approach to Machine Translation of transcribed speech, as found in closed captions. We discuss how the colloquial nature and input format peculiarities of closed captions are dealt with in a pre-processing pipeline that prepares the input for effective processing by a core MT system. In particular, we describe components for proper name recognition and input segmentation. We evaluate the contribution of such modules to the system performance. The described methods have been implemented on an MT system for translating English closed captions to Spanish and Portuguese.


Computer Assisted Language Learning | 2002

The Tutor Assistant: An Authoring Tool for an Intelligent Language Tutoring System

Janine Toole; Trude Heift

This paper describes the Tutor Assistant, an authoring tool for an Intelligent Language Tutoring System (ILTS) for English as a Second Language (ESL). The common goal of authoring tools for intelligent tutoring systems (ITSs) is to reduce the costs in expertise and time that are required to produce a usable intelligent learning environment. The Tutor Assistant is designed to be usable by language instructors with little or no experience of ILTSs and ILTS authoring tools. This paper reports on a recent study which evaluates the degree to which typical users of our system can author good quality content for an ILTS and establishes benchmarks for development times.


Conference on Applied Natural Language Processing | 2000

Categorizing Unknown Words: Using Decision Trees to Identify Names and Misspellings

Janine Toole

This paper introduces a system for categorizing unknown words. The system is based on a multicomponent architecture where each component is responsible for identifying one class of unknown words. The focus of this paper is the components that identify names and spelling errors. Each component uses a decision tree architecture to combine multiple types of evidence about the unknown word. The system is evaluated using data from live closed captions - a genre replete with a wide variety of unknown words.


Conference of the Association for Machine Translation in the Americas | 1998

Time-Constrained Machine Translation

Janine Toole; Davide Turcato; Fred Popowich; Dan Fass; Paul McFetridge

This paper defines the class of time-constrained applications: applications in which the user has limited time to process the system output. This class is differentiated from real-time systems, where it is production time rather than comprehension time that is constrained. Examples of time-constrained MT applications include the translation of multi-party dialogue and the translation of closed captions. The constraints on comprehension time in such systems have significant implications for the system's objectives, its design, and its evaluation. In this paper we outline these challenges and discuss how they have been met in an English-Spanish MT system designed to translate the closed captions used on television.


Australian Joint Conference on Artificial Intelligence | 1999

Categorizing Unknown Words: A Decision Tree-Based Misspelling Identifier

Janine Toole

This paper introduces a robust, portable system for categorizing unknown words. It is based on a multi-component architecture where each component is responsible for identifying one class of unknown words. The focus of this paper is the component that identifies spelling errors. The misspelling identifier uses a decision tree architecture to combine multiple types of evidence about the unknown word. The misspelling identifier is evaluated using data from live closed captions - a genre replete with a wide variety of unknown words.


Canadian Conference on Artificial Intelligence | 2000

Collocation Discovery for Optimal Bilingual Lexicon Development

Scott McDonald; Davide Turcato; Paul McFetridge; Fred Popowich; Janine Toole

The accurate translation of collocations, or multi-word units, is essential for high quality machine translation. However, many collocations do not translate compositionally, thus requiring individual entries in the bilingual lexicon. We present a technique for collocation extraction from large corpora that takes into account the dispersion of the collocations throughout the corpus. Collocations are ranked to more accurately reflect how likely they are to occur in a wide variety of texts; collocations which are specific to a particular text are less useful for lexicon development. Once the collocations are extracted, appropriate bilingual lexical entries can be developed by lexicographers.
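
The abstract does not state the ranking formula, so the following is only a minimal sketch of the general idea of dispersion-aware ranking, under stated assumptions: collocation candidates are simplified to adjacent word pairs, and the score (total frequency multiplied by the fraction of documents containing the pair) is an illustrative choice, not the authors' measure. The function name `rank_collocations` is hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Bigram = Tuple[str, str]


def rank_collocations(documents: List[List[str]]) -> List[Tuple[Bigram, float]]:
    """Rank adjacent word pairs by frequency weighted by dispersion.

    Assumed score (illustrative only): total count of the pair across the
    corpus, multiplied by the fraction of documents in which it occurs.
    """
    total_freq: Counter = Counter()  # occurrences across the whole corpus
    doc_freq: Counter = Counter()    # number of documents containing the pair

    for tokens in documents:
        bigrams = list(zip(tokens, tokens[1:]))
        total_freq.update(bigrams)
        doc_freq.update(set(bigrams))  # count each pair once per document

    n_docs = len(documents)
    scores = {
        pair: total_freq[pair] * doc_freq[pair] / n_docs
        for pair in total_freq
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy corpus: "flux capacitor" is frequent but confined to one document,
# while "heavy rain" is less frequent but occurs in two documents.
corpus = [
    "heavy rain fell and heavy rain stayed".split(),
    "heavy rain again".split(),
    "flux capacitor flux capacitor flux capacitor flux capacitor".split(),
]
ranked = rank_collocations(corpus)
```

With this toy corpus, the pair that is spread over more documents outranks the one that is more frequent overall but specific to a single text, which is the behaviour the abstract describes as desirable for lexicon development.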


Archive | 2001

Method and system for describing and identifying concepts in natural language text for information retrieval and processing

Daniel C. Fass; Davide Turcato; Gordon W. Tisher; James Devlan Nicholson; Milan Mosny; Frederick P. Popowich; Janine Toole; Paul McFetridge; Frederick W. Kroon


Archive | 2001

Method and system for adapting synonym resources to specific domains

Davide Turcato; Frederick P. Popowich; Janine Toole; Daniel C. Fass; James Devlan Nicholson; Gordon W. Tisher

Collaboration


Dive into Janine Toole's collaborations.

Top Co-Authors

Trude Heift

Simon Fraser University

Dan Fass

Simon Fraser University
