The impact of rater experience and essay quality on raters’ decision-making behaviors
Because assessing EFL students’ writing is a subjective process, detailed scoring rubrics are often used to help minimize rater inconsistency and produce reliable scores. However, focusing exclusively on the outcomes of the assessment task might obscure the processes by which raters arrive at their decisions. In this sense, think-aloud protocols can be considered an important tool for better understanding raters’ thought processes. This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of differing quality. The data were collected from 28 raters who were Turkish nationals and full-time employees in the English language departments of different state universities in Turkey. They were divided into three experience groups (low, n = 11; medium, n = 8; high, n = 9) based on their reported experience assessing EFL writing. Using a 10-point analytic rubric, each rater voice-recorded their thoughts while scoring 16 essays of differing quality. The data were transcribed and coded using a coding scheme adapted from Cumming, Kantor, and Powers (2002). The results revealed that raters used more interpretation strategies than judgment strategies in their assessments and employed more self-monitoring strategies than language-focused or rhetorical and ideational-focused strategies. Moreover, raters prioritized aspects of style, grammar, and mechanics when rating low-quality essays but emphasized rhetoric and their general impressions of the text when rating high-quality essays. Furthermore, the medium- and high-experienced groups displayed more similarities to each other in their decision-making behaviors than to the low-experienced group. The findings suggest that raters’ scoring behaviors might evolve as they become more experienced and that more experienced raters might rely more on their own judgments and less on the scoring criteria. As such, this research provides implications for developing strategy-based rater training programs, which might help to increase consistency across raters of different experience levels.
Sahan, Özgür
6dd60c34-883f-4d29-9886-cb8aa07f718a
9 March 2019
Sahan, Özgür (2019) The impact of rater experience and essay quality on raters’ decision-making behaviors. American Association for Applied Linguistics (AAAL) Conference, Sheraton Hotel, Atlanta, United States. 09 - 12 Mar 2019.
Record type: Conference or Workshop Item (Paper)
This record has no associated files available for download.
More information
Published date: 9 March 2019
Venue - Dates:
American Association for Applied Linguistics (AAAL) Conference, Sheraton Hotel, Atlanta, United States, 2019-03-09 - 2019-03-12
Identifiers
Local EPrints ID: 457798
URI: http://eprints.soton.ac.uk/id/eprint/457798
PURE UUID: 216c716f-b0a4-4a84-aaa4-09e09994b31b
Catalogue record
Date deposited: 16 Jun 2022 17:03
Last modified: 19 Dec 2023 03:10