A Hierarchical Rater Model Approach for Integrating Automated Essay Scoring Models

Aron Fink*, Sebastian Gombert, Tuo Liu, Hendrik Drachsler, Andreas Frey

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Essay writing tests, integral in many educational settings, demand significant resources for manual scoring. Automated essay scoring (AES) can alleviate this by automating the process, thereby reducing human effort. However, the multitude of AES models, each varying in its features and scoring approach, complicates the selection of a single optimal model, especially when diverse content-related aspects are evaluated across multiple rating items. Therefore, we propose a hierarchical rater model-based approach to integrate predictions from multiple AES models while accounting for their distinct scoring behaviors. We investigated its performance on data from a university essay writing test. The proposed method achieved accuracy comparable to that of the best individual AES model. This result is promising because the method additionally reduced the amount of differential item functioning between human and automated scoring and thus established a higher degree of measurement invariance than the individual AES models.
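The abstract does not specify the authors' estimation procedure. As a rough illustration of the underlying idea, the sketch below treats each AES model as a "rater" with its own severity (bias) and variability, and recovers a latent essay score with simple EM-style updates under a normal model. All data, variable names, and the scale are hypothetical; this is a minimal sketch, not the authors' implementation.

```python
# Illustrative sketch: integrating several AES "raters" via a simple
# normal hierarchical rater model with per-rater bias and noise.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 essays scored by 3 AES models on a 0-10 scale.
n_essays, n_models = 200, 3
true_score = rng.normal(5.0, 1.5, size=n_essays)   # latent essay quality
bias = np.array([0.4, -0.6, 0.1])                   # per-model severity
noise_sd = np.array([0.8, 1.2, 0.5])                # per-model variability
scores = true_score[:, None] + bias + rng.normal(0.0, noise_sd, (n_essays, n_models))

# EM-style estimation: alternate between the latent scores (precision-weighted
# average of bias-corrected ratings) and the per-model bias and noise variance.
theta = scores.mean(axis=1)   # initial latent scores
b = np.zeros(n_models)        # initial severities
s2 = np.ones(n_models)        # initial noise variances
for _ in range(50):
    w = 1.0 / s2
    theta = ((scores - b) * w).sum(axis=1) / w.sum()
    resid = scores - theta[:, None]
    b = resid.mean(axis=0)
    b -= b.mean()             # identifiability: anchor severities to average zero
    s2 = ((resid - b) ** 2).mean(axis=0)

print("estimated severities:", np.round(b, 2))
print("estimated noise SDs:", np.round(np.sqrt(s2), 2))
print("corr(theta, true):", np.round(np.corrcoef(theta, true_score)[0, 1], 3))
```

In a full hierarchical rater model the latent score would itself be linked to an ability parameter via an IRT measurement model, and rater effects would typically be estimated with Bayesian methods rather than the closed-form normal updates used here for brevity.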

Original language: English
Pages (from-to): 209-218
Number of pages: 10
Journal: Zeitschrift für Psychologie / Journal of Psychology
Volume: 232
Issue number: 3
Early online date: 12 Jul 2024
DOIs
Publication status: Published - Jul 2024

Keywords

  • automated essay scoring
  • formative assessment
  • hierarchical rater model
  • natural language processing
  • transformer models
