James R. Cowie
New Mexico State University
Publications
Featured research published by James R. Cowie.
International Conference on Computational Linguistics | 1992
James R. Cowie; Joe A Guthrie; Louise Guthrie
The resolution of lexical ambiguity is important for most natural language processing tasks, and a range of computational techniques has been proposed for its solution. None of these has yet proven effective on a large scale. In this paper, we describe a method for lexical disambiguation of text using the definitions in a machine-readable dictionary together with the technique of simulated annealing. The method operates on complete sentences and attempts to select the optimal combination of word senses for all the words in the sentence simultaneously. The words in the sentences may be any of the 28,000 headwords in the Longman Dictionary of Contemporary English (LDOCE) and are disambiguated relative to the senses given in LDOCE. Our initial results on a sample set of 50 sentences are comparable to those of other researchers, and the fully automatic method requires no hand-coding of lexical entries or hand-tagging of text.
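The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sense inventory `SENSES`, the stopword list, the overlap-based energy function, and the cooling schedule are all assumptions chosen for illustration, and a real system would draw its definitions from LDOCE.

```python
import math
import random

# Hypothetical toy sense inventory (word -> list of definitions).
# A real system would read these from a machine-readable dictionary.
SENSES = {
    "bank": ["land beside a river or stream of water",
             "institution that keeps money and gives loans"],
    "deposit": ["layer of mud left by river water",
                "money paid into an institution account"],
    "money": ["coins and notes kept in an account"],
}
STOPWORDS = {"a", "an", "and", "by", "in", "into", "of", "or", "that"}

def content_words(definition):
    """Return the set of non-stopword tokens in a definition."""
    return {tok for tok in definition.lower().split() if tok not in STOPWORDS}

def energy(words, choice):
    """Assumed energy: minus the number of repeated definition words.

    A word occurring in n of the chosen definitions contributes n - 1,
    so configurations whose definitions overlap more have lower energy.
    """
    counts = {}
    for word, sense in zip(words, choice):
        for tok in content_words(SENSES[word][sense]):
            counts[tok] = counts.get(tok, 0) + 1
    return -sum(c - 1 for c in counts.values() if c > 1)

def anneal(words, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search sense assignments for the whole sentence at once."""
    rng = random.Random(seed)
    choice = [0] * len(words)          # start from the first listed senses
    current = energy(words, choice)
    best, best_e = list(choice), current
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(words))
        n = len(SENSES[words[i]])
        if n > 1:
            candidate = list(choice)
            candidate[i] = rng.choice([s for s in range(n) if s != choice[i]])
            e = energy(words, candidate)
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature falls.
            if e <= current or rng.random() < math.exp((current - e) / t):
                choice, current = candidate, e
                if current < best_e:
                    best, best_e = list(choice), current
        t *= cooling
    return best, best_e

best, best_e = anneal(["bank", "deposit", "money"])
```

In this toy inventory the financial senses of "bank" and "deposit" share the most definition words with "money", so the lowest-energy assignment picks sense index 1 for both.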
International Journal of Medical Informatics | 2010
Gondy Leroy; Stephen Helmreich; James R. Cowie
PURPOSE Willingness and ability to learn from health information in text are crucial for people to be informed and make better medical decisions. These two user characteristics are influenced by the perceived and actual difficulty of text. Our goal is to find text features that are indicative of perceived and actual difficulty so that barriers to reading can be lowered and understanding of information increased. METHODS We systematically manipulated three text characteristics: overall sentence structure (active, passive, extraposed-subject, or sentential-subject), noun phrase complexity (simple or complex), and function word density (high or low). These are more fine-grained metrics for evaluating text than the commonly used readability formulas. We measured perceived difficulty with individual sentences by asking consumers to choose the easiest and most difficult version of a sentence. We measured actual difficulty with entire paragraphs by posing multiple-choice questions to measure understanding and retention of information in easy and difficult versions of the paragraphs. RESULTS Based on a study with 86 participants, we found that low noun phrase complexity and high function word density lead to sentences being perceived as simpler. In sentences with passive, sentential-subject, or extraposed-subject structure, both main and interaction effects were significant (all p<.05). In active sentences, only noun phrase complexity mattered (p<.001). For the same group of participants, simplification of entire paragraphs based on these three linguistic features had only a small effect on understanding (p=.99) and no effect on retention of information. CONCLUSIONS Using grammatical text features, we could measure and improve the perceived difficulty of text. In contrast to expectations based on readability formulas, these grammatical manipulations had limited effects on actual difficulty and so were insufficient to simplify the text and improve understanding.
Future work will include semantic measures and overall text composition and their effects on perceived and actual difficulty. LIMITATIONS These results are limited to grammatical features of text. The studies also used only one task, a question-answering task, to measure understanding of information.
Machine Translation | 2002
Marjorie McShane; Sergei Nirenburg; James R. Cowie; Ron Zacharski
This paper describes Expedition, an environment designed to facilitate the quick ramp-up of MT systems from practically any alphabetic language (L) into English. The central component of Expedition is a knowledge-elicitation system that guides a linguistically naive bilingual speaker through the process of describing L in terms of its ecological, morphological, grammatical, lexical, and transfer information. Expedition also includes a module for converting the elicited information into the format expected by the underlying MT system and an MT engine that relies on both the elicited knowledge and resident knowledge about English. The Expedition environment is integrated using a configuration and control system. Expedition represents an innovative approach to answering the need for rapid-configuration MT by preparing an MT system in which the only missing link is information about L, which is elicited in a structured fashion such that it can be directly exploited by the system. In this paper we report on the current state of Expedition with an emphasis on the knowledge elicitation system.
Proceedings of the TIPSTER Text Program: Phase I | 1993
James R. Cowie; Louise Guthrie; Wang Jin; William C. Ogden; James Pustejovsky; Rong Wang; Takahiro Wakao; Scott Waterman; Yorick Wilks
Diderot is an information extraction system built at CRL and Brandeis University over the past two years. It was produced as part of our efforts in the Tipster project. The same overall system architecture has been used for English and Japanese and for the micro-electronics and joint venture domains.
MUC4 '92 Proceedings of the 4th conference on Message understanding | 1992
James R. Cowie; Louise Guthrie; Yorick Wilks; James Pustejovsky; Scott Waterman
Through their involvement in the Tipster project the Computing Research Laboratory at New Mexico State University and the Computer Science Department at Brandeis University are developing a method for identifying articles of interest and extracting and storing specific kinds of information from large volumes of Japanese and English texts. We intend that the method be general and extensible. The techniques involved are not explicitly tied to these two languages nor to a particular subject area. Development for Tipster has been going on since September, 1992.
Proceedings of the TIPSTER Text Program: Phase III | 1998
James R. Cowie; Eugene Ludovik; Hugo Molina-Salgado
We discuss those techniques which, in the opinion of the authors, are needed to support robust automatic summarization. Many of these methods are already incorporated in a multi-lingual summarization system, MINDS, developed at CRL. The approach is sentence selection, but includes techniques to improve coherence and also to perform sentence reduction. Our methods are in distinct contrast to those approaches to summarization by deep analysis of a document followed by text generation.
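As a rough illustration of what summarization by sentence selection means, the sketch below scores each sentence by the average document frequency of its longer words and keeps the top `k` in their original order. The scoring rule, the length-based word filter, and the function name `summarize` are assumptions made for this example; the abstract does not describe how MINDS scores sentences, and the coherence and sentence-reduction steps it mentions are not modelled here.

```python
import re
from collections import Counter

def summarize(text, k=2):
    """Return the k highest-scoring sentences, in document order."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]

    # Treat words longer than three letters as content words (an assumption
    # standing in for a proper stopword list).
    freq = Counter(tok for toks in tokenized for tok in toks if len(tok) > 3)

    def score(toks):
        content = [t for t in toks if len(t) > 3]
        return sum(freq[t] for t in content) / len(content) if content else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:k])          # restore original sentence order
    return [sentences[i] for i in keep]

text = ("Summarization systems select sentences. "
        "The weather was nice. "
        "Robust summarization systems select important sentences.")
summary = summarize(text, k=2)
```

Keeping the selected sentences in document order is one simple way to preserve some coherence in the extract.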
MUC5 '93 Proceedings of the 5th conference on Message understanding | 1993
James R. Cowie; Louise Guthrie; Wang Jin; Rong Wang; Takahiro Wakao; James Pustejovsky; Scott Waterman
This report describes the major developments over the last six months in completing the Diderot information extraction system for the MUC-5 evaluation. Diderot is an information extraction system built at CRL and Brandeis University over the past two years. It was produced as part of our efforts in the Tipster project. The same overall system architecture has been used for English and Japanese and for the micro-electronics and joint venture domains. The past history of the system is discussed and the operation of its major components described. A summary of scores at the 24-month workshop is given and the performance of the system on the texts selected for the system walkthrough is discussed.
Human Language Technology | 1994
Yorick Wilks; James Pustejovsky; James R. Cowie
The Computing Research Laboratory at New Mexico State University, in collaboration with Brandeis University, was one of four sites selected to develop systems to extract relevant information automatically from English and Japanese texts. When we started, neither site had been involved in message understanding or information extraction. CRL had extensive experience in multilingual natural language processing and in the use of machine-readable dictionaries for system building; Brandeis had developed a theory of lexical semantics and preliminary methods for deriving this lexical information from corpora. Thus, our approach focused on applying new techniques to the information extraction task. In the last two years we have developed information extraction software for five different subject area/language pairs.
MUC4 '92 Proceedings of the 4th conference on Message understanding | 1992
James R. Cowie; Louise Guthrie; Yorick Wilks; James Pustejovsky
The Computing Research Laboratory (New Mexico State University) and the Computer Science Department (Brandeis University) are collaborating on the development of a system (DIDEROT) to perform data extraction for the Tipster project. This system is still far from fully developed, but as many of the techniques being used are domain-independent and, in many cases, language-independent, we have assembled them in a preliminary manner to produce a prototype system (MucBruce), which handles the MUC-4 texts.
American Medical Informatics Association Annual Symposium | 2008
Gondy Leroy; Stephen Helmreich; James R. Cowie; Trudi Miller; Wei Zheng