Sara Cushing Weigle
Georgia State University
Publications
Featured research published by Sara Cushing Weigle.
Language Testing | 1998
Sara Cushing Weigle
This article describes a study conducted to explore differences in rater severity and consistency among inexperienced and experienced raters both before and after rater training. Sixteen raters (eight experienced and eight inexperienced) rated overlapping subsets of essays from a total sample of 60 essays before and after rater training in the context of an operational administration of UCLA’s English as a Second Language Placement Examination (ESLPE). A three-part scale was used, comprising content, rhetorical control, and language. Ratings were analysed using FACETS, a multi-faceted Rasch analysis program that provides estimates of rater severity on a linear scale as well as fit statistics, which are indicators of rater consistency. The analysis showed that the inexperienced raters tended to be both more severe and less consistent in their ratings than the experienced raters before training. After training, the differences between the two groups of raters were less pronounced; however, significant differences in severity were still found among raters, although consistency had improved for most raters. These results provide support for the notion that rater training is more successful in helping raters give more predictable scores (i.e., intra-rater reliability) than in getting them to give identical scores (i.e., inter-rater reliability).
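As a rough illustration of what "rater severity" and "rater consistency" mean in this context, the sketch below is a deliberately simplified stand-in for the FACETS many-facet Rasch analysis used in the study: it treats severity as a rater's average gap from the consensus score on the essays they rated and consistency as their correlation with that consensus. The ratings matrix and both measures are hypothetical assumptions for illustration, not the study's data or method.

```python
# Simplified, hypothetical stand-in for the FACETS many-facet Rasch analysis:
# FACETS places examinee ability, rater severity, and scale thresholds on a common
# logit scale and reports fit statistics; here we only approximate the two ideas.
import numpy as np

# Hypothetical data: rows = 4 raters, columns = 6 essays, NaN = essay not rated by that rater.
ratings = np.array([
    [4.0,    5.0, 3.0, np.nan, 4.0,    2.0],
    [3.0,    4.0, 3.0, 5.0,    np.nan, 2.0],
    [5.0,    6.0, 4.0, 6.0,    5.0,    np.nan],
    [np.nan, 5.0, 2.0, 4.0,    3.0,    1.0],
])

essay_means = np.nanmean(ratings, axis=0)  # consensus score for each essay

for r, row in enumerate(ratings):
    rated = ~np.isnan(row)                                # essays this rater actually scored
    severity = np.mean(essay_means[rated] - row[rated])   # positive = harsher than the consensus
    consistency = np.corrcoef(row[rated], essay_means[rated])[0, 1]
    print(f"Rater {r + 1}: severity {severity:+.2f}, consistency r = {consistency:.2f}")
```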
Language Testing | 1994
Sara Cushing Weigle
Several effects of training on composition raters have been hypothesized but not investigated empirically. This article presents an analysis of the verbal protocols of four inexperienced raters of ESL placement compositions scoring the same essays both before and after rater training. The verbal protocols show that training clarified the intended scoring criteria for raters, modified their expectations of student writing and provided a reference group of other raters with which raters could compare themselves, although agreement with peers was not an over-riding concern. These results are generally in accordance with hypothesized effects of rater training.
Language Testing | 2010
Sara Cushing Weigle
Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study approaches validity by comparing human and automated scores on responses to TOEFL® iBT Independent writing tasks with several non-test indicators of writing ability: student self-assessment, instructor assessment, and independent ratings of non-test writing samples. Automated scores were produced using e-rater®, developed by Educational Testing Service (ETS). Correlations between both human and e-rater scores and non-test indicators were moderate but consistent, providing criterion-related validity evidence for the use of e-rater along with human scores. The implications of the findings for the validity of automated scores are discussed.
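As a minimal sketch of the kind of criterion-related comparison described above, the example below computes Pearson correlations between human scores, automated scores, and one non-test indicator for the same examinees. The arrays, values, and variable names are hypothetical assumptions for illustration only, not the study's data or e-rater output.

```python
# Hypothetical scores for ten examinees; all values and names are illustrative only.
import numpy as np

human_scores  = np.array([3, 4, 4, 2, 5, 3, 4, 5, 2, 3])  # human ratings of the test essays
erater_scores = np.array([3, 4, 5, 2, 5, 3, 3, 5, 2, 4])  # automated scores for the same essays
instructor    = np.array([3, 5, 4, 2, 4, 3, 4, 5, 3, 3])  # non-test indicator: instructor assessment

# Pearson correlation of each scoring method with the external (non-test) criterion.
r_human  = np.corrcoef(human_scores, instructor)[0, 1]
r_erater = np.corrcoef(erater_scores, instructor)[0, 1]
print(f"human vs. instructor:   r = {r_human:.2f}")
print(f"e-rater vs. instructor: r = {r_erater:.2f}")
```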
TESOL Quarterly | 2003
Sara Cushing Weigle; Heather Boldt; Maria Ines Valsecchi
Archive | 2005
Sara Cushing Weigle
It is not uncommon to observe that, while virtually everyone is an expert at speaking their first language, expertise in writing is attained only rarely and only with great effort. Writing as a technology is quite recent in human history, and widespread literacy has only been accomplished in the past few centuries. Many languages do not have a writing system, and in other cases, the variety of the language that is used for writing differs widely from the variety that is used for oral communication. Even the majority of those who speak a standard language that is used for writing do not develop what might be called expertise. The situation of second language writers is vastly more complicated due to the variety of situations in which a second language is learned, the reasons for learning that language, the relative usefulness of writing in the L1 versus the L2, and whether an L2 learner is literate in the L1. The second language is frequently not acquired to the same extent as the first language, first language literacy influences the acquisition of L2 literacy in complex ways, and the use of writing in different L2 contexts differs widely. What does it mean, then, to be an expert writer, and to promote the development of expertise in second language contexts?
ETS Research Report Series | 2011
Sara Cushing Weigle
Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study addresses two validity-related issues regarding the use of e-rater® with the independent writing task on the TOEFL iBT® (Internet-based test). First, relationships between automated scores of iBT tasks and nontest indicators of writing ability were examined. This was followed by exploration of prompt-related differences in automated scores of essays written by the same examinees. Correlations between both human and e-rater scores and nontest indicators were moderate but consistent, with few differences between e-rater and human rater scores. E-rater was more consistent across prompts than individual human raters, although there were differences in scores across prompts for the individual features used to generate total e-rater scores.
Language Assessment Quarterly | 2013
Sara Cushing Weigle; WeiWei Yang; Megan Montee
Integrated reading/writing tasks are becoming more common in large-scale language tests. Much of the research on these tasks has focused on writing through reading; assessing reading through writing is a less explored area. In this article we describe a reading-into-writing task that is intended to measure both reading comprehension and language use on an academic English test. The task involves responding to short-answer questions (SAQs) that require examinees to use their own words to state the main idea of a text, draw inferences, or synthesize information across multiple texts. The article presents results of a two-part study addressing the validity of this method of assessing reading by investigating the cognitive processes involved in responding to SAQs. First, we present the results of a qualitative study of five nonnative English-speaking students, who provided verbal protocols as they read the texts and responded to the SAQs. Next, we present data from a larger sample of students focusing specifically on the cognitive processes used when reading the texts for the purpose of responding to SAQs. Implications of the study for the validity of this method of testing are discussed.
Archive | 2012
Sara Cushing Weigle; Megan Montee
The role of the rater is central in writing assessment. Integrated reading-writing assessment tasks are becoming more frequent in large-scale tests of writing as they are intended to more closely reflect language use in real-world academic settings than tasks that measure reading and writing as discrete skills. Such tasks raise questions about how test takers incorporate ideas and language from source texts and how source text borrowing is perceived by raters. While previous research has looked at patterns of textual borrowing in integrated writing tests, little research has examined how test raters perceive textual borrowing and how various kinds of textual borrowing affect their scores. This chapter presents the results of an exploratory study of rater perceptions of textual borrowing on a timed integrated writing test administered at a large public university in the United States. The methodology for the study included focus groups, stimulated recalls, and a rater survey.
Keywords: focus groups; integrated reading-writing assessment tasks; rater judgment task; stimulated recall interviews; textual borrowing; writing assessment
Assessing Writing | 1999
Sara Cushing Weigle
Assessing Writing | 2004
Sara Cushing Weigle