Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Manuel Reif is active.

Publication


Featured research published by Manuel Reif.


Educational Research and Evaluation | 2011

Analysing item position effects due to test booklet design within large-scale assessment

Christine Hohensinn; Klaus D. Kubinger; Manuel Reif; Eva Schleicher; Lale Khorramdel

For large-scale assessments, booklet designs that administer the same item at different positions within a booklet are usually used. The occurrence of position effects influencing item difficulty is therefore a crucial issue: not taking learning or fatigue effects into account would bias the estimated item difficulties. The occurrence of position effects is examined for a 4th-grade mathematical competence test of the Austrian Educational Standards by means of the linear logistic test model (LLTM). A small simulation study assesses the test power for this model. Overall, the LLTM without a modelled position effect yielded a good model fit. Therefore, no relevant global item position effect could be found for the analysed mathematical competence test.
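
As a rough illustration of the modelling approach described above (a sketch only, not the authors' exact specification): in the LLTM, each item difficulty is decomposed into a weighted sum of basic parameters, so a global position effect can be represented as one extra basic parameter weighted by the position at which the item appears in a booklet. All numbers below are hypothetical.

    # Illustrative sketch of an LLTM with a position parameter (hypothetical
    # weights and parameter values, not the authors' specification).
    import numpy as np

    def lltm_item_difficulty(W, eta):
        """beta = W @ eta, with W the (items x basic parameters) weight matrix."""
        return W @ eta

    def rasch_probability(theta, beta):
        """P(X_vi = 1) under the Rasch model for ability theta and difficulty beta."""
        return 1.0 / (1.0 + np.exp(-(theta - beta)))

    # Three items described by two content parameters plus one position
    # parameter weighted by the item's booklet position (last column).
    W = np.array([
        [1, 0, 1],   # item 1: content parameter 1, presented at position 1
        [0, 1, 5],   # item 2: content parameter 2, presented at position 5
        [1, 1, 9],   # item 3: both content parameters, presented at position 9
    ], dtype=float)
    eta = np.array([0.4, -0.2, 0.05])   # the last entry is the position effect
    beta = lltm_item_difficulty(W, eta)
    print(rasch_probability(theta=0.0, beta=beta))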


International Journal of Selection and Assessment | 2010

On Minimizing Guessing Effects on Multiple-Choice Items: Superiority of a Two Solutions and Three Distractors Item Format to a One Solution and Five Distractors Item Format

Klaus D. Kubinger; Stefana Holocher-Ertl; Manuel Reif; Christine Hohensinn; Martina Frebort

Multiple-choice response formats are troublesome, as an item is often scored as solved simply because the examinee may be lucky at guessing the correct option. Instead of pertinent Item Response Theory models, which take guessing effects into account, this paper considers a psycho-technological approach to re-conceptualizing multiple-choice response formats. The free-response format is compared with two different multiple-choice formats: a traditional format with a single correct response option and five distractors (‘1 of 6’), and another with five response options, three of them being distractors and two of them being correct (‘2 of 5’). For the latter format, an item is scored as mastered only if both correct response options and none of the distractors are marked. After the exclusion of a few items, the Rasch model analyses revealed appropriate fit for 188 items altogether. The resulting item-difficulty parameters were used for comparison. The multiple-choice format ‘1 of 6’ differs significantly from the multiple-choice format ‘2 of 5’, while the latter does not differ significantly from the free-response format. The lower difficulty of items ‘1 of 6’ suggests guessing effects.
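
The guessing argument can be made concrete with a short calculation (an illustration based on the formats described above, not taken from the paper): if an examinee marks the required number of options completely at random, the chance of producing the correct response pattern is 1/6 for the '1 of 6' format but only 1/10 for the '2 of 5' format.

    # Blind-guessing success probabilities for the two formats, assuming the
    # examinee marks the required number of options at random (illustration only).
    from math import comb

    p_1_of_6 = 1 / comb(6, 1)   # one correct option among six  -> 1/6
    p_2_of_5 = 1 / comb(5, 2)   # exactly the two correct options among five -> 1/10

    print(f"blind-guessing success, '1 of 6': {p_1_of_6:.3f}")
    print(f"blind-guessing success, '2 of 5': {p_2_of_5:.3f}")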


Algorithms from and for Nature and Life | 2013

Detecting Person Heterogeneity in a Large-Scale Orthographic Test Using Item Response Models

Christine Hohensinn; Klaus D. Kubinger; Manuel Reif

Achievement tests for students are constructed with the aim of measuring a specific competency uniformly for all examinees. This requires students to work on the items in a homogeneous way. The dichotomous logistic Rasch model is the model of choice for assessing these assumptions during test construction. However, it is also possible that various subgroups of the population either apply different strategies for solving the items or make specific types of mistakes, or that different items measure different latent traits. These assumptions can be evaluated with extensions of the Rasch model or other item response models. In this paper, the test construction of a new large-scale German orthographic test for eighth-grade students is presented. In the process of test construction and calibration, a pilot version was administered to 3,227 students in Austria. In the first step of analysis, items yielded a poor model fit to the dichotomous logistic Rasch model. Further analyses found homogeneous subgroups in the sample that are characterized by different orthographic error patterns.
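
One common way to formalise such latent subgroups is a mixture (mixed) Rasch model, in which item difficulties may differ between latent classes. The sketch below shows the class-marginal likelihood of a single response vector under that assumption; it is an illustration only, and the paper's exact model extensions may differ.

    # Sketch of a mixture (mixed) Rasch likelihood with class-specific item
    # difficulties (hypothetical values; not the paper's exact model).
    import numpy as np

    def rasch_prob(theta, beta):
        return 1.0 / (1.0 + np.exp(-(theta - beta)))

    def mixture_likelihood(x, theta, betas, weights):
        """Likelihood of response vector x for ability theta, marginalised over
        latent classes with difficulties betas[g] and class weights."""
        total = 0.0
        for pi_g, beta_g in zip(weights, betas):
            p = rasch_prob(theta, beta_g)
            total += pi_g * np.prod(p**x * (1 - p)**(1 - x))
        return total

    # Hypothetical two-class example with four items
    x = np.array([1, 0, 1, 1])
    betas = [np.array([-1.0, 0.0, 0.5, 1.0]),    # class 1 difficulties
             np.array([0.5, -0.5, 1.2, -0.2])]   # class 2 difficulties
    print(mixture_likelihood(x, theta=0.3, betas=betas, weights=[0.6, 0.4]))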


Educational Research and Evaluation | 2011

Branched adaptive testing with a Rasch-model-calibrated test: analysing item presentation's sequence effects using the Rasch-model-based LLTM

Klaus D. Kubinger; Manuel Reif; Takuya Yanagida

Item position effects provoke serious problems within adaptive testing. This is because different testees are necessarily presented with the same item at different presentation positions, as a consequence of which comparing their ability parameter estimates in the case of such effects would not be fair at all. In this article, a specific adaptive Rasch-model-calibrated test, AID 2 (Adaptive Intelligence Diagnosticum – Version 2.2; Kubinger, 2009a), is analyzed according to a suggestion of Kubinger (2008, 2009b): applying the Rasch-model-based linear logistic test model (LLTM) to test sequence effects of item presentation. AID 2's subtests are administered along a branched testing design; 5 of these subtests are now under consideration. Results showed no stringent trend of an item position effect for any specific subtest or generally for any combination of item subsets; to the extent that such an effect was established, it was not large enough to be of practical relevance.
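
In a branched testing design, examinees are routed to the next item group depending on their score on the previous group, which is why identical items end up at different presentation positions for different testees. The routing rule below is a purely hypothetical sketch; AID 2's actual branching design is not reproduced here.

    # Hypothetical sketch of a branched (multi-stage) routing rule; the group
    # labels and score thresholds are made up for illustration.
    def next_item_group(score, n_items, groups):
        """Route to an easier, same-level, or harder item group depending on the
        proportion of items solved in the current group."""
        proportion = score / n_items
        if proportion < 0.4:
            return groups["easier"]
        elif proportion > 0.7:
            return groups["harder"]
        return groups["same"]

    groups = {"easier": "group A", "same": "group B", "harder": "group C"}
    print(next_item_group(score=3, n_items=5, groups=groups))  # -> 'group B'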


Educational Research and Evaluation | 2011

Designing the test booklets for Rasch model calibration in a large-scale assessment with reference to numerous moderator variables and several ability dimensions

Klaus D. Kubinger; Christine Hohensinn; Sandra Hofer; Lale Khorramdel; Martina Frebort; Stefana Holocher-Ertl; Manuel Reif; Philipp Sonnleitner

In large-scale assessments, it usually does not occur that every item of the applicable item pool is administered to every examinee. Within item response theory (IRT), in particular the Rasch model (Rasch, 1960), this is not really a problem because item calibration works nevertheless: the different test booklets only need to be conceptualized according to a connected incomplete block design. Yet connectedness of such a design should best be fulfilled severalfold, since deletion of some items in the course of the item pool's IRT calibration may become necessary. The real challenge, however, is to meet constraints determined by numerous moderator variables such as different response formats and several topics of content – all the more so if several ability dimensions are under consideration, the testing duration is strongly limited, or individual scoring and feedback is an issue. In this article, we offer a report of how to deal with the resulting problems. Experience is based on the governmental project of the Austrian Educational Standards (Kubinger et al., 2007).
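
Connectedness of an incomplete block design can be checked by treating items as nodes of a graph and linking every pair of items that appear together in at least one booklet; joint Rasch calibration requires this graph to be connected. The sketch below uses made-up booklets, not the actual design of the Austrian Educational Standards project.

    # Check connectedness of an incomplete booklet design by linking all items
    # that appear together in at least one booklet (hypothetical booklets).
    from itertools import combinations

    def is_connected(booklets):
        items = set().union(*booklets)
        adjacency = {item: set() for item in items}
        for booklet in booklets:
            for a, b in combinations(booklet, 2):
                adjacency[a].add(b)
                adjacency[b].add(a)
        # simple graph traversal from an arbitrary item
        start = next(iter(items))
        seen, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            for neighbour in adjacency[current] - seen:
                seen.add(neighbour)
                frontier.append(neighbour)
        return seen == items

    booklets = [{1, 2, 3}, {3, 4, 5}, {5, 6, 1}]   # each booklet shares items with another
    print(is_connected(booklets))                  # -> True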


Psychology Science | 2008

Examining Item-Position Effects in Large-Scale Assessment Using the Linear Logistic Test Model

Christine Hohensinn; Klaus D. Kubinger; Manuel Reif; Stefana Holocher-Ertl; Lale Khorramdel; Martina Frebort


Psychological test and assessment modeling | 2012

Biased (conditional) parameter estimation of a Rasch model calibrated item pool administered according to a branched testing design

Klaus D. Kubinger; J. Steinfeld; Manuel Reif; Takuya Yanagida


Journal of applied measurement | 2014

On robustness and power of the likelihood-ratio test as a model test of the linear logistic test model.

Christine Hohensinn; Klaus D. Kubinger; Manuel Reif


Archive | 2012

Applying a construction rationale to a rule-based designed questionnaire using the Rasch model and LLTM

Manuel Reif


Archive | 2009

Does the position of correct answers in multiple-choice items influence item difficulty?

Philipp Sonnleitner; Christine Hohensinn; Klaus D. Kubinger; Lale Khorramdel; E. Schleicher; Manuel Reif

Collaboration


Dive into Manuel Reif's collaboration.
