ZIB Education

Semi-Automatic Coding of Text Responses in Large-Scale Assessments

Nico Andersen, ZIB-associated Researcher, DIPF

When tests measure performance in a large sample, evaluating open-ended text responses is very time-consuming. The aim of this qualification work is to develop a method that minimizes this effort. Each response is represented mathematically as an n-dimensional vector. Similar responses are then clustered so that prototypical answers can be identified, and human coders classify these prototypical responses. Assuming that semantically similar answers also receive similar codes, the system supports coders by automatically coding semantically similar responses in the background, taking statistical uncertainty into account. Thus, not all responses have to be evaluated manually. The method’s performance is simulated using text responses to reading literacy items from the Programme for International Student Assessment (PISA) 2012 and 2015.
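
The workflow described above (vectorize, cluster, code the prototypes, propagate codes) can be illustrated with a minimal Python sketch. It is not the project's implementation: the bag-of-words vectors, the greedy leader clustering, the two similarity thresholds, the example responses, and the placeholder codes are all assumptions chosen for illustration. In particular, a fixed similarity cut-off only stands in for the statistical uncertainty handling mentioned in the text.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words counts: a crude stand-in for the n-dimensional text vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def leader_clusters(vectors, threshold):
    """Greedy leader clustering: the first member of each cluster is its prototype."""
    prototypes = []   # indices of prototype responses
    assignment = []   # (prototype index, similarity to prototype) per response
    for i, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for p in prototypes:
            sim = cosine(vec, vectors[p])
            if sim > best_sim:
                best, best_sim = p, sim
        if best is None or best_sim < threshold:
            prototypes.append(i)          # too dissimilar: start a new cluster
            assignment.append((i, 1.0))
        else:
            assignment.append((best, best_sim))
    return prototypes, assignment


def propagate_codes(assignment, human_codes, auto_threshold):
    """Copy a prototype's human code to close neighbours; None means 'code manually'."""
    return [human_codes[p] if sim >= auto_threshold else None for p, sim in assignment]


# Hypothetical responses, invented for this sketch (not PISA data).
responses = [
    "the author wants to warn the reader",
    "the author wants to warn readers",
    "it is about a dog",
    "about a dog",
    "no idea",
]
vectors = [vectorize(r) for r in responses]
prototypes, assignment = leader_clusters(vectors, threshold=0.5)

# Placeholder for the human step: a coder would assign a code to each prototype.
human_codes = {p: f"code_{k}" for k, p in enumerate(prototypes)}
codes = propagate_codes(assignment, human_codes, auto_threshold=0.8)
```

In this sketch only the three prototypes would need a human judgment. One further response inherits its prototype's code automatically, and one falls below the stricter threshold and is returned as `None`, meaning it is left for manual coding.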


Further work is planned to follow up on the question of how human coders can be supported in evaluating text responses and how the process of automatic coding can be optimized.

Supervisor: Prof. Dr. Frank Goldhammer

