Saif Mohammad | Researchain

Publication

Featured research published by Saif Mohammad.

north american chapter of the association for computational linguistics | 2009

Using Citations to Generate Surveys of Scientific Paradigms

Saif Mohammad; Bonnie J. Dorr; Melissa Egan; Ahmed Hassan; Pradeep Muthukrishan; Vahed Qazvinian; Dragomir R. Radev; David M. Zajic

The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role.

empirical methods in natural language processing | 2008

Computing Word-Pair Antonymy

Saif Mohammad; Bonnie J. Dorr; Graeme Hirst

Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually-created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The approach is evaluated on a set of closest-opposite questions, obtaining a precision of over 80%. Along the way, we discuss what humans consider antonymous and how antonymy manifests itself in utterances.

empirical methods in natural language processing | 2006

Distributional measures of concept-distance: A task-oriented evaluation

Saif Mohammad; Graeme Hirst

We propose a framework to derive the distance between concepts from distributional measures of word co-occurrences. We use the categories in a published thesaurus as coarse-grained concepts, allowing all possible distance values to be stored in a concept-concept matrix roughly 0.01% the size of that created by existing measures. We show that the newly proposed concept-distance measures outperform traditional distributional word-distance measures in the tasks of (1) ranking word pairs in order of semantic distance, and (2) correcting real-word spelling errors. In the latter task, of all the WordNet-based measures, only that proposed by Jiang and Conrath outperforms the best distributional concept-distance measures.
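As a rough illustration of the idea in this abstract, the sketch below computes the distance between two words from the distributional profiles of the thesaurus categories that list them. It is not the authors' implementation: the toy thesaurus, the co-occurrence counts, the use of cosine as the distributional measure, and the rule of taking the closest pair of concepts are all assumptions made for the example.

```python
import math
from itertools import product

# Hypothetical toy thesaurus: each word maps to the coarse categories that list it.
THESAURUS = {
    "bank": ["FINANCE", "RIVER"],
    "money": ["FINANCE"],
    "shore": ["RIVER"],
}

# Hypothetical concept profiles: co-occurrence strengths between a category
# and words seen near its members in a corpus (numbers are made up).
CONCEPT_PROFILES = {
    "FINANCE": {"loan": 8.0, "cash": 6.0, "water": 1.0},
    "RIVER": {"water": 9.0, "boat": 5.0, "cash": 1.0},
}


def cosine(p, q):
    """Cosine similarity between two sparse co-occurrence profiles."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def concept_distance(c1, c2):
    """Distance between two categories; one cell of a concept-concept matrix."""
    return 1.0 - cosine(CONCEPT_PROFILES[c1], CONCEPT_PROFILES[c2])


def word_distance(w1, w2):
    """Assumed rule: two words are as close as their closest pair of concepts."""
    return min(
        concept_distance(a, b)
        for a, b in product(THESAURUS[w1], THESAURUS[w2])
    )
```

Because distances are stored per category pair rather than per word pair, the table grows with the number of thesaurus categories instead of the vocabulary size, which is the source of the size reduction the abstract mentions.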

empirical methods in natural language processing | 2009

Estimating Semantic Distance Using Soft Semantic Constraints in Knowledge-Source -- Corpus Hybrid Models

Yuval Marton; Saif Mohammad; Philip Resnik

Strictly corpus-based measures of semantic distance conflate co-occurrence information pertaining to the many possible senses of target words. We propose a corpus-thesaurus hybrid method that uses soft constraints to generate word-sense-aware distributional profiles (DPs) from coarser concept DPs (derived from a Roget-like thesaurus) and sense-unaware traditional word DPs (derived from raw text). Although it uses a knowledge source, the method is not vocabulary-limited: if the target word is not in the thesaurus, the method falls back gracefully on the word's co-occurrence information. This allows the method to access valuable information encoded in a lexical resource, such as a thesaurus, while still being able to effectively handle domain-specific terms and named entities. Experiments on word-pair ranking by semantic distance show the new hybrid method to be superior to others.

international conference on computational linguistics | 2003

Guaranteed pre-tagging for the Brill tagger

Saif Mohammad; Ted Pedersen

This paper describes and evaluates a simple modification to the Brill Part-of-Speech Tagger. In its standard distribution the Brill Tagger allows manual assignment of a part-of-speech tag to a word prior to tagging. However, it may change it to another tag during processing. We suggest a change that guarantees that the pre-tag remains unchanged and ensures that it is used throughout the tagging process. Our method of guaranteed pre-tagging is appropriate when the tag of a word is known for certain, and is intended to help improve the accuracy of tagging by providing a reliable anchor or seed around which to tag.
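The behaviour described in this abstract can be sketched with a toy transformation-based tagger in which pre-tagged words are locked, so no contextual rule may overwrite them. This is only a minimal sketch and not the Brill Tagger's code: the lexicon, the rule format, and the function names `initial_tags` and `apply_rule` are hypothetical.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, Optional[str]]  # (word, pre-assigned tag or None)

# Hypothetical toy lexicon giving the most likely tag for each known word.
LEXICON = {"the": "DT", "can": "MD", "rusted": "VBD", "old": "JJ"}


def initial_tags(tokens: List[Token]) -> Tuple[List[str], List[bool]]:
    """Assign starting tags; a pre-tag is used as given and marked as locked."""
    tags, locked = [], []
    for word, pre_tag in tokens:
        if pre_tag is not None:
            tags.append(pre_tag)
            locked.append(True)
        else:
            tags.append(LEXICON.get(word.lower(), "NN"))
            locked.append(False)
    return tags, locked


def apply_rule(tags: List[str], locked: List[bool],
               from_tag: str, to_tag: str, prev_tag: str) -> List[str]:
    """Contextual rule: change from_tag to to_tag when the previous tag is prev_tag.

    Locked positions are skipped, so a pre-tag is never changed, but it still
    serves as context for rules that fire on neighbouring words.
    """
    out = list(tags)
    for i in range(1, len(tags)):
        if locked[i]:
            continue
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out
```

For example, with the rule "change MD to NN after DT", the untagged input `The can rusted` has `can` retagged as NN, whereas an input in which `can` was pre-tagged MD keeps that tag.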

meeting of the association for computational linguistics | 2007

Tor, TorMd: Distributional Profiles of Concepts for Unsupervised Word Sense Disambiguation

Saif Mohammad; Graeme Hirst; Philip Resnik

Words in the context of a target word have long been used as features by supervised word-sense classifiers. Mohammad and Hirst (2006a) proposed a way to determine the strength of association between a sense or concept and co-occurring words, the distributional profile of a concept (DPC), without the use of manually annotated data. We implemented an unsupervised naive Bayes word sense classifier using these DPCs that was best or within one percentage point of the best unsupervised systems in the Multilingual Chinese-English Lexical Sample Task (task #5) and the English Lexical Sample Task (task #17). We also created a simple PMI-based classifier to attempt the English Lexical Substitution Task (task #10); however, its performance was poor.

conference of the european chapter of the association for computational linguistics | 2006