Publication


Featured research published by Karen Kukich.


Meeting of the Association for Computational Linguistics | 1998

Automated Scoring Using A Hybrid Feature Identification Technique

Jill Burstein; Karen Kukich; Susanne Wolff; Chi Lu; Martin Chodorow; Lisa C. Braden-Harder; Mary Dee Harris

This study exploits statistical redundancy inherent in natural language to automatically predict scores for essays. We use a hybrid feature identification method, including syntactic structure analysis, rhetorical structure analysis, and topical analysis, to score essay responses from test-takers of the Graduate Management Admissions Test (GMAT) and the Test of Written English (TWE). For each essay question, a stepwise linear regression analysis is run on a training set (a sample of human-scored essay responses) to extract a weighted set of predictive features for each test question. Score prediction for cross-validation sets is calculated from the set of predictive features. Exact or adjacent agreement between the Electronic Essay Rater (e-rater) score predictions and human rater scores ranged from 87% to 94% across the 15 test questions.
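The scoring-and-agreement idea in this abstract can be illustrated with a minimal sketch. This is not e-rater's implementation: the feature values, weights, and 1-6 score scale below are invented for illustration, and a fixed weight vector stands in for the full stepwise regression fit; e-rater's real features come from syntactic, rhetorical, and topical analysis.

```python
def predict_score(features, weights, bias=0.0):
    """Weighted sum of per-essay feature values, rounded and clamped
    to a hypothetical 1-6 essay score scale."""
    raw = bias + sum(w * f for w, f in zip(weights, features))
    return max(1, min(6, round(raw)))

def agreement_rate(predicted, human, tolerance=1):
    """Fraction of essays where the automated score is within
    `tolerance` points of the human score (tolerance=1 gives the
    exact-or-adjacent agreement reported in the abstract)."""
    hits = sum(1 for p, h in zip(predicted, human) if abs(p - h) <= tolerance)
    return hits / len(human)

# Hypothetical held-out essays: feature vectors and human scores.
weights = [0.8, 1.2, 0.5]  # would come from the regression fit
essays = [[1.0, 2.0, 1.5], [0.5, 1.0, 0.8], [2.0, 3.0, 2.5]]
human = [4, 2, 6]

predicted = [predict_score(e, weights) for e in essays]
print(predicted)                    # [4, 2, 6]
print(agreement_rate(predicted, human))  # 1.0
```

A separate weight vector per test question, as the abstract describes, would simply mean fitting and storing one `weights` list per prompt.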


Computers in Human Behavior | 2002

Stumping e-rater: challenging the validity of automated essay scoring

Donald E. Powers; Jill Burstein; Martin Chodorow; Mary E. Fowles; Karen Kukich

For this study, various parties were invited to “challenge” e-rater—an automated essay scorer that relies on natural language processing techniques—by composing essays in response to Graduate Record Examinations (GRE®) Writing Assessment prompts with the intention of undermining its scoring capability. Specifically, using detailed information about e-rater's approach to essay scoring, writers tried to “trick” the computer-based system into assigning scores that were higher or lower than deserved. E-rater's automated scores on these “problem essays” were compared with scores given by two trained, human readers, and the difference between the scores constituted the standard for judging the extent to which e-rater was fooled. Challengers were differentially successful in writing problematic essays. As a whole, they were more successful in tricking e-rater into assigning scores that were too high than in duping e-rater into awarding scores that were too low. The study provides information on ways in which e-rater, and perhaps other automated essay scoring systems, may fail to provide accurate evaluations if used as the sole method of scoring in high-stakes assessments. The results suggest possible avenues for improving automated scoring methods.


Meeting of the Association for Computational Linguistics | 2000

The role of centering theory's rough-shift in the teaching and evaluation of writing skills

Eleni Miltsakaki; Karen Kukich

Existing software systems for automated essay scoring can provide NLP researchers with opportunities to test certain theoretical hypotheses, including some derived from Centering Theory. In this study we employ ETS's e-rater essay scoring system to examine whether local discourse coherence, as defined by a measure of Rough-Shift transitions, might be a significant contributor to the evaluation of essays. Our positive results indicate that Rough-Shifts do indeed capture a source of incoherence, one that has not been closely examined in the Centering literature. These results not only justify Rough-Shifts as a valid transition type, but they also support the original formulation of Centering as a measure of discourse continuity even in pronominal-free text.
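Where Rough-Shift sits among the standard Centering transition types (in the Brennan, Friedman, and Pollard formulation) can be sketched in a few lines. This is only an illustration of the classification step: computing the backward-looking center (Cb) and preferred center (Cp) of real utterances requires full discourse analysis, which is omitted here.

```python
def transition(cb_prev, cb_curr, cp_curr):
    """Classify the Centering transition into the current utterance,
    given the previous and current backward-looking centers (Cb) and
    the current preferred center (Cp)."""
    if cb_curr == cb_prev:
        # Same topic entity carried forward.
        return "Continue" if cb_curr == cp_curr else "Retain"
    # Topic entity changed between utterances.
    return "Smooth-Shift" if cb_curr == cp_curr else "Rough-Shift"

# Hypothetical discourse entities for illustration.
print(transition("Mary", "John", "John"))   # Smooth-Shift
print(transition("Mary", "John", "Susan"))  # Rough-Shift
```

Under this scheme, a text whose Cb keeps changing while never matching the Cp accumulates Rough-Shifts, and the proportion of such transitions is the incoherence signal the study feeds into e-rater's evaluation.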


Archive | 1998

System and method for computer-based automatic essay scoring

Jill Burstein; Lisa C. Braden-Harder; Martin Chodorow; Bruce Kaplan; Karen Kukich; Chi Lu; Donald A. Rock; Susanne Wolff


Natural Language Engineering | 2004

Evaluation of text coherence for electronic essay scoring systems

Eleni Miltsakaki; Karen Kukich


Journal of Educational Computing Research | 2002

Comparing the Validity of Automated and Human Scoring of Essays.

Donald E. Powers; Jill Burstein; Martin Chodorow; Mary E. Fowles; Karen Kukich


ETS Research Report Series | 1998

COMPUTER ANALYSIS OF ESSAY CONTENT FOR AUTOMATED SCORE PREDICTION: A PROTOTYPE AUTOMATED SCORING SYSTEM FOR GMAT ANALYTICAL WRITING ASSESSMENT ESSAYS

Jill Burstein; Lisa C. Braden-Harder; Martin Chodorow; Shuyi Hua; Bruce Kaplan; Karen Kukich; Chi Lu; James Nolan; Don Rock; Susanne Wolff


Meeting of the Association for Computational Linguistics | 1983

Design and implementation of a knowledge-based report generator

Karen Kukich


ETS Research Report Series | 2001

STUMPING E-RATER: CHALLENGING THE VALIDITY OF AUTOMATED ESSAY SCORING

Donald E. Powers; Jill Burstein; Martin Chodorow; Mary E. Fowles; Karen Kukich


Educational Technology & Society | 1998

Computer Analysis of Essay Content for Automated Score Prediction

Jill Burstein; Lisa C. Braden-Harder; Martin Chodorow; Shuyi Hua; Bruce Kaplan; Karen Kukich; Chi Lu; James Nolan

Collaboration


Dive into Karen Kukich's collaboration.

Top Co-Authors

Chi Lu

Princeton University

Eleni Miltsakaki

University of Pennsylvania