
Publication


Featured research published by Martin Chodorow.


Human Language Technology | 1994

Using a semantic concordance for sense identification

George A. Miller; Martin Chodorow; Shari Landes; Claudia Leacock; Robert G. Thomas

This paper proposes benchmarks for systems of automatic sense identification. A textual corpus in which open-class words had been tagged both syntactically and semantically was used to explore three statistical strategies for sense identification: a guessing heuristic, a most-frequent heuristic, and a co-occurrence heuristic. When no information about sense-frequencies was available, the guessing heuristic using the numbers of alternative senses in WordNet was correct 45% of the time. When statistics for sense-frequencies were derived from a semantic concordance, the assumption that each word is used in its most frequently occurring sense was correct 69% of the time; when that figure was calculated for polysemous words alone, it dropped to 58%. And when a co-occurrence heuristic took advantage of prior occurrences of words together in the same sentences, little improvement was observed. The semantic concordance is still too small to estimate the potential limits of a co-occurrence heuristic.
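
As a rough illustration of the most-frequent and guessing heuristics described above, the sketch below looks up a word's first-listed (most frequent) WordNet sense, or picks one at random. It uses NLTK's WordNet interface, which orders synsets by tagged frequency; this is a minimal sketch under that assumption, not the paper's code, and the function names are invented for illustration.

    # Minimal sketch of the most-frequent-sense and guessing heuristics
    # (illustration only, not the paper's code). Assumes NLTK and its WordNet
    # corpus are installed (nltk.download('wordnet')).
    import random
    from nltk.corpus import wordnet as wn

    def most_frequent_sense(word, pos=None):
        # NLTK lists synsets in order of tagged frequency, so the first entry
        # serves as the most-frequent-sense guess.
        synsets = wn.synsets(word, pos=pos)
        return synsets[0] if synsets else None

    def random_sense(word, pos=None):
        # The guessing heuristic: choose uniformly among the word's senses.
        synsets = wn.synsets(word, pos=pos)
        return random.choice(synsets) if synsets else None

    print(most_frequent_sense("bank", wn.NOUN))  # e.g. Synset('bank.n.01')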


Computers in Human Behavior | 2002

Stumping e-rater: Challenging the validity of automated essay scoring

Donald E. Powers; Jill Burstein; Martin Chodorow; Mary E. Fowles; Karen Kukich

For this study, various parties were invited to “challenge” e-rater (an automated essay scorer that relies on natural language processing techniques) by composing essays in response to Graduate Record Examinations (GRE®) Writing Assessment prompts with the intention of undermining its scoring capability. Specifically, using detailed information about e-rater's approach to essay scoring, writers tried to “trick” the computer-based system into assigning scores that were higher or lower than deserved. E-rater's automated scores on these “problem essays” were compared with scores given by two trained human readers, and the difference between the scores constituted the standard for judging the extent to which e-rater was fooled. Challengers were differentially successful in writing problematic essays. As a whole, they were more successful in tricking e-rater into assigning scores that were too high than in duping it into awarding scores that were too low. The study provides information on ways in which e-rater, and perhaps other automated essay scoring systems, may fail to provide accurate evaluations if used as the sole method of scoring in high-stakes assessments. The results suggest possible avenues for improving automated scoring methods.
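
As a simple illustration of the comparison described in the abstract (not the study's actual analysis code), the sketch below computes the gap between an automated score and the average of two human reader scores; positive values indicate the system was tricked into scoring too high. The function name and the example numbers are hypothetical.

    # Illustrative sketch only: the discrepancy measure implied by the abstract,
    # comparing an automated score against the mean of two human reader scores.
    def score_discrepancy(automated_score, human_scores):
        human_standard = sum(human_scores) / len(human_scores)
        return automated_score - human_standard

    # Hypothetical example: e-rater assigns 5.5, the two readers assign 3.0 and 3.5.
    print(score_discrepancy(5.5, [3.0, 3.5]))  # 2.25 -> scored higher than deserved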


Journal of Educational Computing Research | 2002

Comparing the Validity of Automated and Human Scoring of Essays

Donald E. Powers; Jill Burstein; Martin Chodorow; Mary E. Fowles; Karen Kukich

Automated, or computer-based, scoring represents one promising possibility for improving the cost effectiveness (and other features) of complex performance assessments (such as direct tests of writing skill) that require examinees to construct responses rather than select them from a set of multiple choices. Indeed, significant advances have been made in applying natural language processing techniques to the automatic scoring of essays. Thus far, most of the validation of automated scoring has focused appropriately (but too narrowly, we contend) on the correspondence between computer-generated scores and those assigned by human readers. Far less effort has been devoted to assessing the relation of automated scores to independent indicators of examinees' writing skills. This study examined the relationship of scores from a graduate-level writing assessment to several independent, non-test indicators of examinees' writing skills, both for automated scores and for scores assigned by trained human readers. The extent to which automated and human scores exhibited similar relations with the non-test indicators was taken as evidence of the degree to which the two methods of scoring reflect similar aspects of writing proficiency. Analyses revealed significant, but modest, correlations between the non-test indicators and each of the two methods of scoring. These relations were somewhat weaker for automated scores than for scores awarded by human readers. Overall, however, the results provide some evidence of the validity of one specific procedure for automated scoring.
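
The validation logic described above can be sketched as comparing how strongly each scoring method correlates with an external, non-test indicator of writing skill. The snippet below is only an illustration with invented placeholder data and variable names; it is not the study's data or analysis code.

    # Illustration only: correlate automated and human scores with a non-test
    # indicator of writing skill, using invented placeholder data.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)
    indicator = rng.normal(size=200)                           # e.g. a non-test writing indicator
    human_scores = 0.4 * indicator + rng.normal(size=200)      # scores from trained human readers
    automated_scores = 0.3 * indicator + rng.normal(size=200)  # automated (e-rater) scores

    r_human, _ = pearsonr(indicator, human_scores)
    r_auto, _ = pearsonr(indicator, automated_scores)
    print(f"human r = {r_human:.2f}, automated r = {r_auto:.2f}")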


Archive | 1998

Combining local context and WordNet similarity for word sense identification

Claudia Leacock; Martin Chodorow


Archive | 1998

System and method for computer-based automatic essay scoring

Jill Burstein; Lisa C. Braden-Harder; Martin Chodorow; Bruce Kaplan; Karen Kukich; Chi Lu; Donald A. Rock; Susanne Wolff


Archive | 2002

Methods for automated essay analysis

Jill Burstein; Daniel Marcu; Vyacheslav Andreyev; Martin Chodorow; Claudia Leacock


Archive | 1994

Filling in a sparse training space for word sense identification

Claudia Leacock; Martin Chodorow


Archive | 2009

Systems and methods for identifying collocation errors in text

Yoko Futagi; Paul Deane; Martin Chodorow


Archive | 2014

Round-Trip Translation for Automated Grammatical Error Correction

Nitin Madnani; Joel R. Tetreault; Martin Chodorow


Archive | 2013

System and method for identifying organizational elements in argumentative or persuasive discourse

Nitin Madnani; Michael Heilman; Joel R. Tetreault; Martin Chodorow

Collaboration


Dive into Martin Chodorow's collaborations.

Top Co-Authors


Claudia Leacock

Educational Testing Service
