Publications


Featured research published by Marie Wiberg.


Applied Psychological Measurement | 2010

Local observed-score equating with anchor-test designs

Willem J. van der Linden; Marie Wiberg

For traditional methods of observed-score equating with anchor-test designs, such as chain and poststratification equating, it is difficult to satisfy the criteria of equity and population invariance. Their equatings are therefore likely to be biased. The bias in these methods was evaluated against a simple local equating method in which the anchor-test score was used as a proxy of the proficiency measured by the test and the equating was conditional on this score. The results showed substantial bias for the two traditional methods under a variety of conditions but much smaller bias for the local method. In addition, unlike the traditional methods, the local method appeared to be quite robust with respect to changes in the difficulty and accuracy of the two tests that were equated. But like these methods, it appeared to be sensitive to a decrease in the accuracy of the anchor test as a proxy of the ability measured by the tests.
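
The paper's central idea, equating conditional on the anchor score, is easy to prototype. Below is a minimal R sketch (function and variable names are illustrative, not from the paper) that equates form X to form Y equipercentile-style within a single anchor-score group:

```r
# Local equating with the anchor score as a proxy for proficiency:
# equate form X to form Y separately within each anchor-score group.
local_equate <- function(x, ax, y, ay, x_new, a_new) {
  Fx    <- ecdf(x[ax == a_new])   # conditional CDF of X given anchor score
  y_ref <- y[ay == a_new]         # Y scores observed at the same anchor level
  # Map x_new to the Y score with the same conditional percentile rank
  quantile(y_ref, probs = Fx(x_new), names = FALSE, type = 4)
}
```

Applying this separately at each observed anchor score yields a family of conditional equating transformations rather than a single marginal one.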


Book | 2017

Applying test equating methods using R

Jorge González; Marie Wiberg

This book describes how to use test equating methods in practice. The non-commercial software R is used throughout the book to illustrate how to perform different equating methods when scores data ...
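
For a flavor of the workflow the book teaches, here is a minimal sketch using the CRAN equate package and its bundled ACT mathematics data; the package choice and calls are assumptions for illustration, and the book's own examples may differ:

```r
# Equipercentile equating with the equate package, using its bundled
# ACT mathematics data (two test forms scored on a common scale).
library(equate)
rx <- as.freqtab(ACTmath[, 1:2])          # form X score frequencies
ry <- as.freqtab(ACTmath[, c(1, 3)])      # form Y score frequencies
equate(rx, ry, type = "equipercentile")   # equated score conversion
```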


Applied Psychological Measurement | 2016

A note on the Poisson's binomial distribution in Item Response Theory

Jorge González; Marie Wiberg; Alina A. von Davier

The Poisson’s binomial (PB) is the probability distribution of the number of successes in independent but not necessarily identically distributed binary trials. The independent non-identically distributed case emerges naturally in the field of item response theory, where answers to a set of binary items are conditionally independent given the level of ability, but with different probabilities of success. In many applications, the number of successes represents the score obtained by individuals, and the compound binomial (CB) distribution has been used to obtain score probabilities. It is shown here that the PB and the CB distributions lead to equivalent probabilities. Furthermore, one of the proposed algorithms to calculate the PB probabilities coincides exactly with the well-known Lord and Wingersky (LW) algorithm for CBs. Surprisingly, we could not find any reference in the psychometric literature pointing to this equivalence. In a simulation study, different methods to calculate the PB distribution are compared with the LW algorithm. Providing an exact alternative to the traditional LW approximation for obtaining score distributions is a contribution to the field.
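
The LW recursion discussed here is compact enough to state directly. A minimal R sketch, assuming a vector p of per-item success probabilities at a fixed ability level:

```r
# Lord-Wingersky recursion: distribution of the number-correct score for
# independent items with success probabilities p (the Poisson's binomial).
lord_wingersky <- function(p) {
  f <- c(1 - p[1], p[1])                        # scores 0 and 1 after item 1
  for (i in seq_along(p)[-1]) {
    f <- c(f * (1 - p[i]), 0) + c(0, f * p[i])  # miss vs. hit item i
  }
  f                                             # f[k + 1] = P(score == k)
}

lord_wingersky(c(0.8, 0.6, 0.4))                # sums to 1 over scores 0..3
```

Each pass convolves the current score distribution with one more Bernoulli item; the result is exactly the Poisson's binomial distribution the note discusses.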


Journal of Integrative Psychology and Therapeutics | 2014

Insights into features of anxiety through multiple aspects of psychological time

Elisabeth Åström; Britt Wiberg; Anna Sircova; Marie Wiberg; Maria Grazia Carelli

Background: It is well-recognized that emotions and emotional disorders may alter the experience of time. Yet relatively little is known about different aspects of psychological time in relation to ...


International Journal of Testing | 2009

Differential Item Functioning in Mastery Tests: A comparison of three methods using real data

Marie Wiberg

The aim of this study was to examine log-linear modelling (LLM) in comparison with logistic regression (LR) and the Mantel-Haenszel (MH) test for detecting differential item functioning (DIF) in a mastery test. The three methods were chosen because they have similar components. The results showed fairly high matching percentages together with high correlations among the methods regarding the size of DIF. The MH approach yielded more conservative results than both LR and LLM, while LLM and LR were fairly consistent with each other. LLM has the advantage of dividing the test scores into intervals, which is of special interest in mastery tests; this partition of test scores was also tried with LR and MH, with different results.
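
As an illustration of the LR approach (a sketch on simulated data; the paper's exact model specifications and the LLM and MH details are not reproduced here), uniform DIF can be screened by testing a group effect after conditioning on the matching score:

```r
# Logistic-regression DIF screen for one item: compare a model with the
# matching score only against one that adds group membership.
set.seed(1)
n     <- 400
group <- factor(rep(c("ref", "focal"), each = n / 2))
total <- rbinom(n, 30, 0.6)                      # matching test score
resp  <- rbinom(n, 1, plogis(-3 + 0.15 * total)) # item simulated without DIF
m0 <- glm(resp ~ total, family = binomial)
m1 <- glm(resp ~ total + group, family = binomial)
anova(m0, m1, test = "Chisq")                    # uniform-DIF likelihood test
```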


Journal of Educational and Behavioral Statistics | 2003

An Optimal Design Approach to Criterion-Referenced Computerized Testing

Marie Wiberg

A criterion-referenced computerized test is expressed as a statistical hypothesis-testing problem, which allows it to be studied using the theory of optimal design. The power function of the statistical test is used as the criterion function when designing the test. A formal proof shows that all items should have the same characteristics: items with high discrimination, low guessing, and difficulty near the cut-off score yield the most powerful statistical test. An efficiency study shows how many more items are needed to achieve the same power when nonoptimal rather than optimal items are used.
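
For context, the abstract's characterization of optimal items agrees with the standard three-parameter logistic (3PL) item information function, stated below in the logistic metric; the paper itself argues via the power function of the hypothesis test, so this is a complementary view, not its proof:

```latex
% 3PL item information at ability \theta, for an item with discrimination a,
% difficulty b (entering through P), and guessing c; Q(\theta) = 1 - P(\theta).
I(\theta) \;=\; a^{2}
  \left(\frac{P(\theta) - c}{1 - c}\right)^{2}
  \frac{Q(\theta)}{P(\theta)}
```

Information at the cut-off score grows with a, shrinks with c, and peaks when b is near the cut-off, matching the item characteristics singled out in the proof.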


Applied Psychological Measurement | 2015

Kernel Equating Under the Non-Equivalent Groups With Covariates Design

Marie Wiberg; Kenny Bränberg

When equating two tests, the traditional approach is to use common test takers and/or common items. Here, the idea is to use variables correlated with the test scores (e.g., school grades and other test scores) as a substitute for common items in a non-equivalent groups with covariates (NEC) design. This is done within the framework of kernel equating, extending the method developed for post-stratification equating in the non-equivalent groups with anchor test design. Real data from a college admissions test were used to illustrate the design. The equated scores from the NEC design were compared with equated scores from an equivalent groups (EG) design, that is, equating with no covariates, as well as with equated scores when a constructed anchor test was used. The results indicate that the NEC design can produce lower standard errors than an EG design. When covariates were used together with an anchor test, the smallest standard errors were obtained over a large range of test scores. The finding that an EG design equating can be improved by adjusting for differences in test score distributions caused by differences in the distribution of covariates is useful in practice, because not all standardized tests have anchor tests.
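
The kernel equating framework used here continuizes each discrete score distribution before the equipercentile mapping. A minimal base-R sketch of the Gaussian-kernel continuization step (the full NEC-design machinery, e.g., as implemented in the authors' kequate package, involves much more):

```r
# Gaussian-kernel continuized CDF of a discrete score distribution.
# scores: possible score values; probs: their probabilities (summing to 1);
# h: bandwidth parameter.
kernel_cdf <- function(x, scores, probs, h) {
  mu <- sum(scores * probs)
  s2 <- sum((scores - mu)^2 * probs)
  a  <- sqrt(s2 / (s2 + h^2))     # shrinkage keeps mean and variance intact
  sapply(x, function(t)
    sum(probs * pnorm((t - a * scores - (1 - a) * mu) / (a * h))))
}
```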


Psychometrika | 2017

Item Response Theory Observed-Score Kernel Equating

Björn Andersson; Marie Wiberg

Item response theory (IRT) observed-score kernel equating is introduced for the non-equivalent groups with anchor test equating design using either chain equating or post-stratification equating. The equating function is treated in a multivariate setting and the asymptotic covariance matrices of IRT observed-score kernel equating functions are derived. Equating is conducted using the two-parameter and three-parameter logistic models with simulated data and data from a standardized achievement test. The results show that IRT observed-score kernel equating offers small standard errors and low equating bias under most settings considered.
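
The observed-score side of this method needs the model-implied score distribution, obtained by averaging the conditional (Lord-Wingersky) score distributions over the ability distribution. A sketch reusing lord_wingersky() from the Poisson's binomial entry above; the 2PL form and the standard-normal ability distribution are illustrative assumptions:

```r
# Marginal number-correct score distribution under a 2PL model, by
# numerical integration over a standard-normal ability distribution.
irt_score_dist <- function(a, b, theta = seq(-4, 4, length.out = 61)) {
  w <- dnorm(theta); w <- w / sum(w)    # normalized quadrature weights
  f <- numeric(length(a) + 1)
  for (k in seq_along(theta)) {
    p <- plogis(a * (theta[k] - b))     # 2PL success probabilities
    f <- f + w[k] * lord_wingersky(p)   # conditional score distribution
  }
  f                                     # f[s + 1] = P(score == s)
}
```

The resulting distributions for the two forms can then be continuized and equated with the kernel machinery sketched in the NEC-design entry above.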


Journal of Educational and Behavioral Statistics | 2017

A Strategy for Replacing Sum Scoring

James O. Ramsay; Marie Wiberg

This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and about 12% for a classroom test. A new parametric density function for ability estimates, the tilted scaled β, is used to resolve the nonidentifiability of the univariate test theory model. Item characteristic curves (ICCs) are represented as basis function expansions of their log-odds transforms. A parameter cascading method along with roughness penalties is used to estimate the corresponding log odds of the ICCs and is demonstrated to be sufficiently computationally efficient that it can support the analysis of large data sets.


Educational Research and Evaluation | 2012

Can a multidimensional test be evaluated with unidimensional item response theory?

Marie Wiberg

The aim of this study was to evaluate possible consequences of using unidimensional item response theory (UIRT) on a multidimensional college admission test. The test consists of 5 subscales and can be divided into two sections, that is, it can be considered both as a unidimensional and a multidimensional test. The test was examined with both UIRT and multidimensional IRT (MIRT). Simulations were used to examine item and ability parameter recovery when UIRT and MIRT models were used. The results obtained from the college admission test showed that although we get a better model fit when using MIRT instead of UIRT, the difference is small if we compare it with using a consecutive UIRT approach. The results from the simulations indicate that if the test only has between-item multidimensionality, it is probably not harmful to use UIRT instead of MIRT models.

Collaboration


Dive into Marie Wiberg's collaborations.

Top Co-Authors

Jorge González

Pontifical Catholic University of Chile