ZIB Education

Investigation of mode and setting effects incorporating the test-taking process using log and process data

Dr. Ulf Kröhne, ZIB-associated researcher, DIPF

The habilitation project deals with the design of computer-based testing in large-scale assessments (LSAs). It addresses the challenge of maintaining the validity of score interpretations with respect to trend estimates (using the example of PISA) and with respect to longitudinal comparisons (using the case of the National Educational Panel Study, NEPS). A technique developed within this project collects log data from paper-based LSAs using digital pens, so that process indicators describing the test-taking process can also be included in the comparison between forms of test administration.

Theoretical background: In contrast to other invariance questions such as Differential Item Functioning (DIF), mode effects (Kroehne & Martens, 2011) and setting effects (Kroehne, Gnambs & Goldhammer, 2019) can be identified through the experimental assignment of persons to administration conditions. Therefore, it is possible to maintain the interpretation of scores after a mode change, even if only partial invariance is present (Buerger, Kroehne, & Goldhammer, 2016).

Method development: For the analysis of log and process data, a methodological framework based on the reconstruction of test-taking processes using finite state machines was developed (Kroehne & Goldhammer, 2018) and implemented in an R package (LogFSM). This framework supports the comparison of low-level features and derived process indicators even when the available log events differ, as they do when paper-based testing with digital pens is compared with computer-based testing.
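The idea of reconstructing the test-taking process with a finite state machine can be illustrated with a minimal Python sketch. It is not the LogFSM API and not the authors' implementation: the state names, the event names, the two transition tables, and the example logs are all hypothetical and chosen only for illustration. The sketch shows how two modes with different raw log events can be mapped onto the same states, so that the same process indicator (time spent per state) is obtained for both.

```python
from collections import defaultdict

# Hypothetical transition tables: (current_state, event) -> next_state.
# Both modes share the same states but use mode-specific event names.
CBA_TRANSITIONS = {
    ("before_item", "page_shown"): "reading",
    ("reading", "option_clicked"): "responding",
    ("responding", "option_clicked"): "responding",
    ("reading", "next_clicked"): "finished",
    ("responding", "next_clicked"): "finished",
}
PEN_TRANSITIONS = {
    ("before_item", "booklet_opened"): "reading",
    ("reading", "stroke_started"): "responding",
    ("responding", "stroke_started"): "responding",
    ("reading", "booklet_closed"): "finished",
    ("responding", "booklet_closed"): "finished",
}


def time_in_states(events, transitions, start="before_item"):
    """Replay time-sorted (timestamp_seconds, event_name) pairs through the
    state machine and return the total time spent in each visited state."""
    totals = defaultdict(float)
    state = start
    last_time = None
    for timestamp, name in events:
        if last_time is not None:
            # Time since the previous event is credited to the current state.
            totals[state] += timestamp - last_time
        # Events without a defined transition leave the state unchanged.
        state = transitions.get((state, name), state)
        last_time = timestamp
    return dict(totals)


# Made-up example logs for one item in each mode.
cba_log = [(0.0, "page_shown"), (12.5, "option_clicked"), (15.0, "next_clicked")]
pen_log = [(0.0, "booklet_opened"), (20.0, "stroke_started"),
           (21.0, "stroke_started"), (26.0, "booklet_closed")]

print(time_in_states(cba_log, CBA_TRANSITIONS))
print(time_in_states(pen_log, PEN_TRANSITIONS))
```

Because both transition tables end in the same states, the resulting indicators are named identically and can be compared across modes, even though the underlying events (clicks versus pen strokes) differ.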

Selected results: Using the PISA reading domain as an example, it was shown that the prerequisite of construct equivalence, which is required for maintaining the interpretation of scores, can be subjected to an attempt at empirical falsification (Kroehne, Hahnel & Goldhammer, 2019). These analyses of the national add-on study to PISA showed an increase in item difficulties for computer-based items. The log and process data indicate, however, that this increase can be explained by differences in the self-selected test-taking speed (Kroehne, Hahnel & Goldhammer, 2019). Accordingly, the impact of the mode change on the trend estimate, as found in national analyses for PISA 2015 (Robitzsch, Lüdtke, Goldhammer, Kroehne & Köller, 2020), should also be interpreted in the light of possible mediators, e.g., test-taking speed. Especially for short text answers, the mode change also led to qualitative differences in responses (Zehner, Kroehne, Hahnel & Goldhammer, 2020). Finally, data from an experimental comparison within the NEPS showed that mode and setting effects are also reflected in differences in rapid guessing rates, which can be interpreted as a measure of low test-taking engagement (Kroehne, Deribo & Goldhammer, 2020). According to these findings, computerized testing under standardized conditions can lead to test-taking with less rapid guessing for at least one of the two investigated domains.
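As a rough illustration of how a rapid guessing rate can be derived from response times, the following Python sketch flags responses given faster than a time threshold and reports their share per administration condition. The fixed threshold, the condition labels, and the response times are assumptions made up for this example; the text above does not state how rapid guessing was identified in the cited study.

```python
def rapid_guessing_rate(response_times, threshold_seconds):
    """Share of responses given faster than the (assumed) time threshold."""
    if not response_times:
        raise ValueError("response_times must not be empty")
    rapid = sum(1 for t in response_times if t < threshold_seconds)
    return rapid / len(response_times)


# Made-up response times in seconds for two hypothetical conditions.
times_by_condition = {
    "condition_a": [1.2, 14.0, 0.8, 22.5],
    "condition_b": [9.0, 1.5, 17.0, 30.0],
}

# Hypothetical threshold: responses under 3 seconds count as rapid guesses.
rates = {
    condition: rapid_guessing_rate(times, threshold_seconds=3.0)
    for condition, times in times_by_condition.items()
}
print(rates)
```

Comparing such rates between experimentally assigned conditions is one simple way to express differences in test-taking engagement as a single number per condition.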



Buerger, S., Kroehne, U., & Goldhammer, F. (2016). The transition to computer-based testing in large-scale assessments: Investigating (partial) measurement invariance between modes. Psychological Test and Assessment Modeling, 58(4), 597–616. [online available]

Kroehne, U., Buerger, S., Hahnel, C., & Goldhammer, F. (2019). Construct Equivalence of PISA Reading Comprehension Measured with Paper‐Based and Computer‐Based Assessments. Educational Measurement: Issues and Practice, 38(3), 97–111. https://doi.org/10.1111/emip.12280

Kroehne, U., Deribo, T., & Goldhammer, F. (2020). Rapid Guessing Rates Across Administration Mode and Test Setting. Psychological Test and Assessment Modeling, 62(2), 147–177. [online available]

Kroehne, U., & Goldhammer, F. (2018). How to conceptualize, represent, and analyze log data from technology-based assessments? A generic framework and an application to questionnaire items. Behaviormetrika, 45, 527–563. https://doi.org/10.1007/s41237-018-0063-y

Kroehne, U., & Martens, T. (2011). Computer-based competence tests in the national educational panel study: The challenge of mode effects. Zeitschrift für Erziehungswissenschaft, 14(S2), 169–186. https://doi.org/10.1007/s11618-011-0185-4

Kroehne, U., Gnambs, T., & Goldhammer, F. (2019). Disentangling Setting and Mode Effects for Online Competence Assessment. In H.-P. Blossfeld & H.-G. Roßbach (Eds.), Education as a Lifelong Process (pp. 171–193). https://doi.org/10.1007/978-3-658-23162-0_10

Kroehne, U., Hahnel, C., & Goldhammer, F. (2019). Invariance of the Response Processes Between Gender and Modes in an Assessment of Reading. Frontiers in Applied Mathematics and Statistics, 5, 2. https://doi.org/10.3389/fams.2019.00002

Robitzsch, A., Lüdtke, O., Goldhammer, F., Kroehne, U., & Köller, O. (2020). Reanalysis of the German PISA data: A comparison of different approaches for trend estimation with a particular emphasis on mode effects. Frontiers in Psychology, 11, 884. http://dx.doi.org/10.3389/fpsyg.2020.00884

Zehner, F., Kroehne, U., Hahnel, C., & Goldhammer, F. (2020). PISA Reading: Mode Effects Unveiled in Short Text Responses. Psychological Test and Assessment Modeling, 62(1), 85–105. [online available]


Mentor: Prof. Dr. Frank Goldhammer

