
Publications


Featured research published by Graeme Hirst.


Computational Linguistics | 2006

Evaluating WordNet-based Measures of Lexical Semantic Relatedness

Alexander Budanitsky; Graeme Hirst

The quantification of lexical semantic relatedness has many applications in NLP, and many different measures have been proposed. We evaluate five of these measures, all of which use WordNet as their central resource, by comparing their performance in detecting and correcting real-word spelling errors. An information-content-based measure proposed by Jiang and Conrath is found superior to those proposed by Hirst and St-Onge, Leacock and Chodorow, Lin, and Resnik. In addition, we explain why distributional similarity is not an adequate proxy for lexical semantic relatedness.
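
Most of the measures compared here (all but Hirst–St-Onge) have standard implementations in NLTK, so the comparison is easy to reproduce in outline. A minimal sketch, assuming the WordNet and information-content data have been downloaded:

    import nltk
    from nltk.corpus import wordnet as wn, wordnet_ic

    # One-time setup: nltk.download('wordnet'); nltk.download('wordnet_ic')
    brown_ic = wordnet_ic.ic('ic-brown.dat')  # information content from the Brown corpus

    dog, cat = wn.synset('dog.n.01'), wn.synset('cat.n.01')
    print(dog.jcn_similarity(cat, brown_ic))  # Jiang-Conrath, the paper's best performer
    print(dog.lin_similarity(cat, brown_ic))  # Lin
    print(dog.res_similarity(cat, brown_ic))  # Resnik
    print(dog.lch_similarity(cat))            # Leacock-Chodorow (path-based)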


Computational Linguistics | 2002

Near-synonymy and lexical choice

Philip Edmonds; Graeme Hirst

We develop a new computational model for representing the fine-grained meanings of near-synonyms and the differences between them. We also develop a lexical-choice process that can decide which of several near-synonyms is most appropriate in a particular situation. This research has direct applications in machine translation and text generation. We first identify the problems of representing near-synonyms in a computational lexicon and show that no previous model adequately accounts for near-synonymy. We then propose a preliminary theory to account for near-synonymy, relying crucially on the notion of granularity of representation, in which the meaning of a word arises out of a context-dependent combination of a context-independent core meaning and a set of explicit differences to its near-synonyms. That is, near-synonyms cluster together. We then develop a clustered model of lexical knowledge, derived from the conventional ontological model. The model cuts off the ontology at a coarse grain, thus avoiding an awkward proliferation of language-dependent concepts in the ontology, yet maintaining the advantages of efficient computation and reasoning. The model groups near-synonyms into subconceptual clusters that are linked to the ontology. A cluster differentiates near-synonyms in terms of fine-grained aspects of denotation, implication, expressed attitude, and style. The model is general enough to account for other types of variation, for instance, in collocational behavior. An efficient, robust, and flexible fine-grained lexical-choice process is a consequence of a clustered model of lexical knowledge. To make it work, we formalize criteria for lexical choice as preferences to express certain concepts with varying indirectness, to express attitudes, and to establish certain styles. The lexical-choice process itself works on two tiers: between clusters and between near-synonyms within a cluster. We describe our prototype implementation of the system, called I-Saurus.
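
To make the clustered model concrete, here is a hypothetical sketch of a subconceptual cluster as a data structure; the field names and sample entries are illustrative assumptions, not I-Saurus's actual representation:

    from dataclasses import dataclass, field

    @dataclass
    class NearSynonym:
        word: str
        denotation: dict = field(default_factory=dict)  # fine-grained denotational nuances
        attitude: str = "neutral"                       # expressed attitude, e.g. "pejorative"
        style: dict = field(default_factory=dict)       # stylistic dimensions, e.g. formality

    @dataclass
    class Cluster:
        core: str      # link to a single coarse-grained ontology concept
        members: list  # the NearSynonym entries the cluster differentiates

    # Illustrative cluster; the nuances shown are assumptions, not the paper's data.
    error_cluster = Cluster(core="Generic-Error", members=[
        NearSynonym("error"),
        NearSynonym("mistake", style={"formality": "low"}),
        NearSynonym("blunder", denotation={"implies": "stupidity"}, attitude="pejorative"),
    ])

The cluster plays the role the abstract describes: one coarse link into the ontology, with all the language-dependent fine distinctions kept inside the cluster rather than multiplying concepts in the ontology itself.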


Natural Language Engineering | 2005

Correcting real-word spelling errors by restoring lexical cohesion

Graeme Hirst; Alexander Budanitsky

Spelling errors that happen to result in a real word in the lexicon cannot be detected by a conventional spelling checker. We present a method for detecting and correcting many such errors by identifying tokens that are semantically unrelated to their context and are spelling variations of words that would be related to the context. Relatedness to context is determined by a measure of semantic distance initially proposed by Jiang and Conrath (1997). We tested the method on an artificial corpus of errors; it achieved recall of 23–50% and precision of 18–25%.
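
A minimal sketch of the detection-and-correction loop under simplifying assumptions: spelling variation is taken as edit distance 1, `lexicon` is a set of known words, and `relatedness` stands in for the Jiang–Conrath-based measure; the threshold is illustrative:

    import string

    def edits1(word):
        """All strings one edit away (deletes, transposes, replaces, inserts)."""
        letters = string.ascii_lowercase
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
        inserts = [a + c + b for a, b in splits for c in letters]
        return set(deletes + transposes + replaces + inserts)

    def find_malapropisms(tokens, lexicon, relatedness, threshold=0.1):
        """Flag tokens unrelated to their context that have a related spelling variant."""
        corrections = []
        for i, tok in enumerate(tokens):
            context = tokens[:i] + tokens[i + 1:]
            if max((relatedness(tok, c) for c in context), default=0.0) >= threshold:
                continue  # the token coheres with its context; leave it alone
            for var in edits1(tok) & lexicon:  # real-word spelling variations only
                if max((relatedness(var, c) for c in context), default=0.0) >= threshold:
                    corrections.append((i, tok, var))  # suspected real-word error
                    break
        return corrections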


Journal of the Association for Information Science and Technology | 1978

Discipline impact factors: A method for determining core journal lists

Graeme Hirst

A method of determining core journals for a discipline, using data from the Journal Citation Reports to generate discipline impact factors, is described.
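
The abstract leaves the computation implicit; a hedged formalization, treating the JCR data as simple lookups rather than a real API:

    def discipline_impact_factor(journal, core_journals, cites, articles):
        """Citations to `journal` counted only from the discipline's core list,
        normalized by the journal's article count over the same census window.
        `cites[(src, dst)]` and `articles[journal]` are assumed JCR-derived tables."""
        incoming = sum(cites.get((src, journal), 0) for src in core_journals)
        return incoming / articles[journal]

Ranking candidate journals by this factor would then yield a core list for the discipline.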


Literary and Linguistic Computing | 2007

Bigrams of Syntactic Labels for Authorship Discrimination of Short Texts

Graeme Hirst; Ol’ga Feiguina

We present a method for authorship discrimination that is based on the frequency of bigrams of syntactic labels that arise from partial parsing of the text. We show that this method, alone or combined with other classification features, achieves a high accuracy on discrimination of the work of Anne and Charlotte Brontë, which is very difficult to do by traditional methods. Moreover, high accuracies are achieved even on fragments of text little more than 200 words long.
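
A rough sketch of the feature extraction, substituting NLTK part-of-speech tags for the partial-parser labels the paper actually uses:

    from collections import Counter
    import nltk

    # One-time setup: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
    def label_bigram_features(text):
        """Relative frequencies of adjacent syntactic-label pairs."""
        labels = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        bigrams = Counter(zip(labels, labels[1:]))
        total = sum(bigrams.values()) or 1
        return {bg: n / total for bg, n in bigrams.items()}

These frequency vectors would then feed any standard classifier to discriminate between candidate authors.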


Speech Communication | 1994

Repairing conversational misunderstandings and non-understandings

Graeme Hirst; Susan Weber McRoy; Peter A. Heeman; Philip Edmonds; Diane Horton

Participants in a discourse sometimes fail to understand one another, but, when aware of the problem, collaborate upon or negotiate the meaning of a problematic utterance. To address non-understanding, we have developed two plan-based models of collaboration in identifying the correct referent of a description: one covers situations where both conversants know of the referent, and the other covers situations, such as direction-giving, where the recipient does not. In the models, conversants use the mechanisms of refashioning, suggestion, and elaboration to collaboratively refine a referring expression until it is successful. To address misunderstanding, we have developed a model that combines intentional and social accounts of discourse to support the negotiation of meaning. The approach extends intentional accounts by using expectations deriving from social conventions in order to guide interpretation. Reflecting the inherent symmetry of the negotiation of meaning, all our models can act as both speaker and hearer, and can play both the role of the conversant who is not understood or misunderstood and the role of the conversant who fails to understand.


Empirical Methods in Natural Language Processing | 2008

Computing Word-Pair Antonymy

Saif Mohammad; Bonnie J. Dorr; Graeme Hirst

Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually-created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The approach is evaluated on a set of closest-opposite questions, obtaining a precision of over 80%. Along the way, we discuss what humans consider antonymous and how antonymy manifests itself in utterances.
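
A heavily hedged sketch of how the two signals might combine; `category`, `contrasting`, and `pmi` are assumed stand-ins for the thesaurus structure and corpus statistics, not the authors' actual resources:

    def antonymy_score(w1, w2, category, contrasting, pmi):
        """Score a word pair as antonymous when their thesaurus categories
        contrast and the words co-occur more than chance (antonyms do)."""
        c1, c2 = category[w1], category[w2]
        if (c1, c2) not in contrasting and (c2, c1) not in contrasting:
            return 0.0  # no semantic contrast between the categories
        return max(pmi(w1, w2), 0.0)  # strength from corpus co-occurrence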


Empirical Methods in Natural Language Processing | 2006

Distributional measures of concept-distance: A task-oriented evaluation

Saif Mohammad; Graeme Hirst

We propose a framework to derive the distance between concepts from distributional measures of word co-occurrences. We use the categories in a published thesaurus as coarse-grained concepts, allowing all possible distance values to be stored in a concept–concept matrix roughly 0.01% of the size of that created by existing measures. We show that the newly proposed concept-distance measures outperform traditional distributional word-distance measures in the tasks of (1) ranking word pairs in order of semantic distance, and (2) correcting real-word spelling errors. In the latter task, of all the WordNet-based measures, only that proposed by Jiang and Conrath outperforms the best distributional concept-distance measures.
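
A minimal sketch of the matrix construction, with thesaurus categories as concepts; `word2cat` is an assumed word-to-category lookup:

    import numpy as np

    def category_cooccurrence(tokens, word2cat, num_cats, window=5):
        """Count category-category co-occurrences within +/- `window` words."""
        M = np.zeros((num_cats, num_cats))
        for i, w in enumerate(tokens):
            if w not in word2cat:
                continue
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i and tokens[j] in word2cat:
                    M[word2cat[w], word2cat[tokens[j]]] += 1
        return M

    def concept_distance(M, c1, c2):
        """Cosine distance between two categories' co-occurrence profiles."""
        a, b = M[c1], M[c2]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return 1.0 - (a @ b / denom if denom else 0.0)

The size claim follows from simple arithmetic: with on the order of a thousand thesaurus categories versus a hundred thousand word types, the matrix is about (10^3 / 10^5)^2 = 0.01% of the size of a word-word matrix.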


International Conference on Computational Linguistics | 2008

Real-word spelling correction with trigrams: a reconsideration of the Mays, Damerau, and Mercer model

L. Amber Wilcox-O'Hearn; Graeme Hirst; Alexander Budanitsky

The trigram-based noisy-channel model of real-word spelling-error correction that was presented by Mays, Damerau, and Mercer in 1991 has never been adequately evaluated or compared with other methods. We analyze the advantages and limitations of the method, and present a new evaluation that enables a meaningful comparison with the WordNet-based method of Hirst and Budanitsky. The trigram method is found to be superior, even on content words. We then show that optimizing over sentences gives better results than variants of the algorithm that optimize over fixed-length windows.
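
A sketch of the model as the abstract describes it: each word is kept with probability alpha, with the remaining mass spread over its real-word spelling variations; `variants` and `trigram_logprob` are assumed helpers, and spreading the mass over the observed word's variants is a simplification:

    from math import log

    def mdm_correct(sentence, variants, trigram_logprob, alpha=0.99):
        """Choose between the sentence as typed and every one-word variant,
        scoring each by trigram LM log-probability plus a channel term."""
        best, best_score = sentence, trigram_logprob(sentence) + log(alpha)
        for i, w in enumerate(sentence):
            vs = variants(w)  # real words one typo away from w
            for v in vs:
                cand = sentence[:i] + [v] + sentence[i + 1:]
                score = trigram_logprob(cand) + log((1 - alpha) / len(vs))
                if score > best_score:
                    best, best_score = cand, score
        return best

Scoring every candidate over the whole sentence, as above, corresponds to the sentence-level optimization that the paper finds superior to fixed-length windows.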


Archive | 1997

Authoring and Generating Health-Education Documents That Are Tailored to the Needs of the Individual Patient

Graeme Hirst; Chrysanne DiMarco; Eduard H. Hovy; Kimberley Parsons

Health-education documents can be much more effective in achieving patient compliance if they are customized for individual readers. For this purpose, a medical record can be thought of as an extremely detailed user model of a reader of such a document. The HealthDoc project is developing methods for producing health-information and patient-education documents that are tailored to the individual personal and medical characteristics of the patients who receive them. Information from an on-line medical record or from a clinician will be used as the primary basis for deciding how best to fit the document to the patient. In this paper, we describe our research on three aspects of the project: the kinds of tailoring that are appropriate for health-education documents; the nature of a tailorable master document, and how it can be created; and the linguistic problems that arise when a tailored instance of the document is to be generated.
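
A hypothetical sketch of a tailorable master document as conditional text segments; the format is illustrative, not HealthDoc's actual representation:

    # Each segment pairs text with a predicate over the patient's medical record.
    master_document = [
        ("Take this medication twice daily with food.", lambda rec: True),
        ("Because you are taking insulin, monitor your blood sugar closely.",
         lambda rec: "insulin" in rec["medications"]),
        ("Smoking will slow your recovery; consider a cessation program.",
         lambda rec: rec["smoker"]),
    ]

    def tailor(master, record):
        """Select the segments applicable to this patient."""
        return " ".join(text for text, applies in master if applies(record))

    print(tailor(master_document, {"medications": ["insulin"], "smoker": False}))

Naive concatenation like this is exactly where the linguistic problems mentioned above arise: independently selected segments must still cohere as a single text when the tailored instance is generated.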

Collaboration


Dive into Graeme Hirst's collaborations.

Top Co-Authors

Tong Wang

University of Toronto


Adam Hammond

San Diego State University


Susan Weber McRoy

University of Wisconsin–Milwaukee
