Language Testing | 2019

The effect of training and rater differences on oral proficiency assessment


Abstract


Because judgments of non-native speech are closely tied to social biases, oral proficiency ratings are susceptible to error arising from rater background and social attitudes. In the present study we seek first to estimate the variance in novice raters’ assessments of L2 spoken English that is attributable to rater background and attitudinal variables. Second, we examine the effect of minimal training in reducing the potency of those trait-irrelevant rater factors. Accordingly, we examined the relative impact of rater differences on TOEFL iBT® speaking scores. Eighty-two untrained raters judged 112 speech samples produced by TOEFL® examinees. Findings revealed that approximately 20% of the variance in untrained raters’ scores was attributable, at least in part, to their background and attitudinal factors. The strongest predictor was the raters’ own native speaker status. However, minimal online training dramatically reduced the impact of rater background and attitudinal variables for a subsample of high- and low-severity raters. The findings suggest that brief and user-friendly rater-training sessions offer the promise of mitigating rater bias, at least in the short run. This procedure can be adopted in assessment and other related fields of applied linguistics.

Volume 36
Pages 481 - 504
DOI 10.1177/0265532219849522
Language English
Journal Language Testing
