Marilyn S. Wingersky
Princeton University
Publications
Featured research published by Marilyn S. Wingersky.
Applied Psychological Measurement | 1984
Marilyn S. Wingersky; Frederic M. Lord
The sampling errors of maximum likelihood estimates of item response theory parameters are studied in the case when both person and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is investigated. Finally, the effect of anchor-test length on the standard error of item parameters is studied numerically for the situation, common in equating studies, when two groups of examinees each take a different test form together with the same anchor test. The results encourage the use of rectangular or bimodal ability distributions, and also the use of very short anchor tests.
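As a point of reference for the quantities discussed above, here is a minimal sketch of the three-parameter logistic (3PL) response function and an information-based asymptotic standard error for a maximum likelihood ability estimate, the simplest case of the standard errors the paper studies. The item parameters are hypothetical, and the 3PL form with D = 1.7 scaling is a standard convention rather than a detail taken from the paper.

```python
import numpy as np

def p_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

def item_information(theta, a, b, c, D=1.7):
    """Fisher information contributed by one 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c, D)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

# Hypothetical item parameters for a short test (illustration only).
a = np.array([1.0, 1.2, 0.8, 1.5])   # discriminations
b = np.array([-0.5, 0.0, 0.5, 1.0])  # difficulties
c = np.array([0.2, 0.2, 0.25, 0.2])  # lower asymptotes ("guessing")

theta = 0.3
test_info = item_information(theta, a, b, c).sum()
se_theta = 1.0 / np.sqrt(test_info)  # asymptotic SE of the ML ability estimate
print(f"Test information at theta={theta}: {test_info:.3f}, SE(theta) = {se_theta:.3f}")
```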
Applied Psychological Measurement | 1994
Rebecca Zwick; Dorothy T. Thayer; Marilyn S. Wingersky
Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel method of differential item functioning (DIF) analysis in computerized adaptive tests (CATs). Each simulated examinee received 25 items from a 75-item pool. A three-parameter logistic item response theory (IRT) model was assumed, and examinees were matched on expected true scores based on their CAT responses and estimated item parameters. The CAT-based DIF statistics were found to be highly correlated with DIF statistics based on nonadaptive administration of all 75 pool items and with the true magnitudes of DIF in the simulation. Average DIF statistics and average standard errors also were examined for items with various characteristics. Finally, a study was conducted of the accuracy with which the modified Mantel-Haenszel procedure could identify CAT items with substantial DIF using a classification system now implemented by some testing programs. These additional analyses provided further evidence that the CAT-based DIF procedures performed well. More generally, the results supported the use of IRT-based matching variables in DIF analysis. Index terms: adaptive testing, computerized adaptive testing, differential item functioning, item bias, item response theory.
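For readers unfamiliar with the Mantel-Haenszel machinery the abstract refers to, here is a minimal sketch of the common odds ratio across matched score strata and the MH D-DIF delta metric used in ETS-style classification. The counts are hypothetical, and the CAT-specific modification studied in the paper (matching on expected true scores from CAT responses) is not reproduced here.

```python
import numpy as np

def mantel_haenszel_d_dif(tables):
    """Mantel-Haenszel common odds ratio and MH D-DIF (delta metric).

    `tables` holds one 2x2 table of counts per matching stratum:
    ((A, B), (C, D)) = ((ref_right, ref_wrong), (focal_right, focal_wrong)).
    """
    num = sum(A * D / (A + B + C + D) for (A, B), (C, D) in tables)
    den = sum(B * C / (A + B + C + D) for (A, B), (C, D) in tables)
    alpha_mh = num / den
    # Negative MH D-DIF indicates DIF against the focal group.
    return alpha_mh, -2.35 * np.log(alpha_mh)

# Hypothetical counts in three score strata (illustration only).
tables = [((40, 10), (35, 15)),
          ((30, 20), (25, 25)),
          ((20, 30), (15, 35))]
alpha, d_dif = mantel_haenszel_d_dif(tables)
print(f"alpha_MH = {alpha:.3f}, MH D-DIF = {d_dif:.3f}")
```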
Applied Psychological Measurement | 1982
Isaac I. Bejar; Marilyn S. Wingersky
This paper reports a feasibility study of item response theory (IRT) as a means of equating the Test of Standard Written English (TSWE). The study focused on the possibility of pre-equating, that is, deriving the equating transformation prior to the final administration of the test. The three-parameter logistic model was postulated as the response model, and its fit was assessed at the item, subscore, and total-score levels. Minor problems were found at each of these levels, but on the whole the three-parameter model was found to portray the data well. The adequacy of the equating provided by IRT procedures was investigated in two TSWE forms. It was concluded that pre-equating does not appear to present problems beyond those inherent to IRT equating.
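The abstract does not spell out which IRT equating procedure was used; the sketch below illustrates one common choice, IRT true-score equating under the 3PL, in which a true score on form X is mapped through a common ability value to the form Y scale. The item parameters are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

def tcc(theta, a, b, c, D=1.7):
    """Test characteristic curve: expected number-right score under the 3PL."""
    return np.sum(c + (1 - c) / (1 + np.exp(-D * a * (theta - b))))

def true_score_equate(score_x, params_x, params_y):
    """Map a true score on form X to the form Y scale via a common theta.

    `score_x` must lie between sum(c) and the number of items on form X,
    the attainable range of the test characteristic curve.
    """
    theta = brentq(lambda t: tcc(t, *params_x) - score_x, -6.0, 6.0)
    return tcc(theta, *params_y)

# Hypothetical (a, b, c) parameters for two short forms (illustration only).
form_x = (np.array([1.0, 1.2, 0.9]), np.array([-0.3, 0.2, 0.8]), np.array([0.2, 0.2, 0.2]))
form_y = (np.array([0.8, 1.1, 1.3]), np.array([-0.1, 0.4, 0.6]), np.array([0.2, 0.2, 0.2]))
print(f"X true score 2.0 equates to Y true score {true_score_equate(2.0, form_x, form_y):.3f}")
```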
ETS Research Report Series | 1993
Rebecca Zwick; Dorothy T. Thayer; Marilyn S. Wingersky
Simulated data were used to investigate the performance of modified versions of the Mantel-Haenszel and standardization methods of differential item functioning (DIF) analysis in computer-adaptive tests (CATs). Each “examinee” received 25 items out of a 75-item pool. A three-parameter logistic item response model was assumed, and examinees were matched on expected true scores based on their CAT responses and on estimated item parameters. Both DIF methods performed well. The CAT-based DIF statistics were highly correlated with DIF statistics based on nonadaptive administration of all 75 pool items and with the true magnitudes of DIF in the simulation. DIF methods were also investigated for “pretest items,” for which item parameter estimates were assumed to be unavailable. The pretest DIF statistics were generally well-behaved and also had high correlations with the true DIF. The pretest DIF measures, however, tended to be slightly smaller in magnitude than their CAT-based counterparts. Also, in the case of the Mantel-Haenszel approach, the pretest DIF statistics tended to have somewhat larger standard errors than the CAT DIF statistics.
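To complement the Mantel-Haenszel sketch above, here is a minimal sketch of the standardization method's P-DIF index: a weighted mean difference in proportion correct between focal and reference groups across matched score strata. Weighting by focal-group counts is one common convention, not a detail confirmed by the abstract, and the per-stratum values are hypothetical.

```python
import numpy as np

def std_p_dif(p_focal, p_ref, n_focal):
    """Standardization P-DIF: focal-minus-reference difference in proportion
    correct, averaged over matched score strata with focal-group weights."""
    w = np.asarray(n_focal, dtype=float)
    diff = np.asarray(p_focal, dtype=float) - np.asarray(p_ref, dtype=float)
    return np.sum(w * diff) / w.sum()

# Hypothetical per-stratum proportions correct and focal counts (illustration only).
print(std_p_dif(p_focal=[0.55, 0.65, 0.80],
                p_ref=[0.60, 0.72, 0.85],
                n_focal=[50, 80, 40]))
```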
Educational and Psychological Measurement | 1969
Marilyn S. Wingersky; Diana M. Lees; Virginia Lennon; Frederic M. Lord
The program takes a frequency distribution of number-right test scores and produces (1) an estimated distribution of true scores for the group tested, computed on the assumption that the errors of measurement follow a certain compound binomial distribution, (2) the corresponding smoothed distribution of actual scores, and (3) a chi-square statistic for comparing the smoothed and actual distributions. All instructions and background information necessary for practical use of the program are given.
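As a rough illustration of the smoothing step described above, the sketch below computes the observed-score distribution implied by a discrete true-score distribution and a Pearson chi-square against observed frequencies. It simplifies the error model to a plain binomial rather than the compound binomial the program actually assumes, and all data are hypothetical.

```python
import numpy as np
from scipy.stats import binom

def smoothed_score_dist(n_items, true_props, weights):
    """Observed number-right distribution implied by a discrete true-score
    distribution under a simple binomial error model (a simplification of
    the compound binomial model the program assumes)."""
    scores = np.arange(n_items + 1)
    return sum(w * binom.pmf(scores, n_items, p)
               for p, w in zip(true_props, weights))

def chi_square(observed_freqs, fitted_probs):
    """Pearson chi-square comparing observed and smoothed distributions."""
    expected = fitted_probs * observed_freqs.sum()
    return np.sum((observed_freqs - expected) ** 2 / expected)

# Hypothetical data: 10-item test, observed frequencies of scores 0..10.
obs = np.array([2, 5, 9, 14, 20, 22, 18, 12, 7, 4, 2], dtype=float)
fitted = smoothed_score_dist(10, true_props=[0.3, 0.5, 0.7], weights=[0.3, 0.4, 0.3])
print(f"chi-square = {chi_square(obs, fitted):.2f}")
```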
Applied Psychological Measurement | 1984
Frederic M. Lord; Marilyn S. Wingersky
Journal of Educational Measurement | 1993
Robert J. Mislevy; Kathleen M. Sheehan; Marilyn S. Wingersky
Journal of Educational Measurement | 1995
Rebecca Zwick; Dorothy T. Thayer; Marilyn S. Wingersky
ETS Research Report Series | 1982
Frederic M. Lord; Marilyn S. Wingersky
British Journal of Mathematical and Statistical Psychology | 1991
Robert J. Mislevy; Marilyn S. Wingersky; Sidney H. Irvine; Peter L. Dann