Seock-Ho Kim
University of Wisconsin-Madison
Publication
Featured research published by Seock-Ho Kim.
Applied Psychological Measurement | 1993
Allan S. Cohen; Seock-Ho Kim; Frank B. Baker
Methods for detecting differential item functioning (DIF) have been proposed primarily for the item response theory dichotomous response model. Three measures of DIF for the dichotomous response model are extended to include Samejima’s graded response model: two measures based on area differences between item true score functions, and a χ² statistic for comparing differences in item parameters. An illustrative example is presented.
Applied Psychological Measurement | 1991
Seock-Ho Kim; Allan S. Cohen
The area between two item response functions is often used as a measure of differential item functioning under item response theory. This area can be measured over either an open interval (i.e., exact) or closed interval. Formulas are presented for computing the closed-interval signed and unsigned areas. Exact and closed-interval measures were estimated on data from a test with embedded items intentionally constructed to favor one group over another. No real differences in detection of these items were found between exact and closed-interval methods.
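The closed-interval signed and unsigned areas described above can be illustrated numerically. The following is a minimal sketch, not the closed-form formulas from the article: it assumes two-parameter logistic item response functions with the scaling constant D = 1.7 and approximates both areas with the trapezoid rule. The function names, the default interval [-4, 4], and the step count are hypothetical choices made for illustration.

```python
import math

D = 1.7  # conventional logistic scaling constant (assumed here)


def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def closed_interval_areas(ref, foc, lo=-4.0, hi=4.0, steps=4000):
    """Trapezoid-rule signed and unsigned areas between two 2PL curves.

    ref and foc are (a, b) tuples for the reference and focal groups.
    The signed area integrates (P_ref - P_foc); the unsigned area
    integrates its absolute value over [lo, hi].
    """
    h = (hi - lo) / steps
    signed = 0.0
    unsigned = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        diff = irf_2pl(theta, *ref) - irf_2pl(theta, *foc)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end weights
        signed += weight * diff * h
        unsigned += weight * abs(diff) * h
    return signed, unsigned


# Hypothetical item: equal discrimination, focal difficulty shifted by 0.5.
signed, unsigned = closed_interval_areas((1.0, 0.0), (1.0, 0.5))
```

When the two curves cross (unequal discriminations), the signed area can be near zero while the unsigned area is not, which is why both measures are usually reported.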
Applied Psychological Measurement | 1993
Allan S. Cohen; Seock-Ho Kim
The area between item response functions estimated in different samples is often used as a measure of differential item functioning (DIF). Under item response theory, this area should be 0, except for errors of measurement. This study examined the effectiveness of two statistical tests of this area (a Z test for exact signed area and a Z test for exact unsigned area) for different test length, sample size, proportion of DIF items on the test, and item parameter estimation conditions using the two-parameter model. Errors in detection made using these two statistics were compared with errors made using Lord’s χ². Differences between all three statistics were relatively small; however, the χ² statistic was more effective than either of the two Z tests at detecting simulated DIF. The Z test for the exact signed area was the least effective and was the most likely to result in false negative errors.
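A Z test for the exact signed area can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes the two-parameter model, for which the exact signed area reduces to the difference between the two difficulty estimates, and it assumes the two samples are independent and already on a common metric, so the sampling variances add. The function name and the input values are hypothetical.

```python
import math


def signed_area_z_test(b_ref, b_foc, var_b_ref, var_b_foc):
    """Z test for the exact signed area under the 2PL model (sketch).

    Assumes the signed area equals b_foc - b_ref and that the two
    difficulty estimates come from independent samples.
    Returns (area, z, two-sided p value).
    """
    area = b_foc - b_ref
    se = math.sqrt(var_b_ref + var_b_foc)
    z = area / se
    # Two-sided normal p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return area, z, p


# Hypothetical difficulty estimates and sampling variances.
area, z, p = signed_area_z_test(0.0, 0.5, 0.01, 0.01)
```

Because the signed area ignores differences in discrimination, a test like this cannot flag items whose curves cross with equal difficulty, which is consistent with the false negative pattern the abstract reports.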
Psychometrika | 1994
Seock-Ho Kim; Allan S. Cohen; Frank B. Baker; Michael J. Subkoviak; Tom Leonard
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters. Simulated data sets were analyzed via two joint and two marginal Bayesian estimation procedures. The marginal Bayesian estimation procedures yielded consistently smaller root mean square differences than the joint Bayesian estimation procedures for item and ability estimates. As the sample size and test length increased, the four Bayes procedures yielded essentially the same result.
Applied Psychological Measurement | 1992
Seock-Ho Kim; Allan S. Cohen
IRT models. To compute the DIF measures and the statistics to test the significance of the DIF measures, IRTDIF uses two files. One file contains sets of item parameter estimates; the other contains the sampling variance-covariance matrices. Significance levels (p values) are provided for Lord’s χ² and the exact area measures. When the sampling variance-covariance matrices are not available, the exact and closed-interval area measures are provided without statistical significance tests. The program was written in IBM Professional FORTRAN for IBM and compatible personal computers and uses subroutines taken from Numerical Recipes (Press, Flannery, Teukolsky, & Vetterling, 1986) to compute the percentage points of the incomplete gamma functions. Execution of the program requires a numerical coprocessor.
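The way item parameter estimates and sampling variance-covariance matrices combine into Lord’s χ² can be sketched for a single item. This is a minimal Python illustration, not IRTDIF's FORTRAN code or its file formats: it assumes the two-parameter model, so the statistic has 2 degrees of freedom and the p value has the closed form exp(-χ²/2) without needing an incomplete gamma routine. The function name and all input values are hypothetical.

```python
import math


def lords_chi_square_2pl(est_ref, est_foc, cov_ref, cov_foc):
    """Lord's chi-square for one 2PL item (illustrative sketch).

    est_* are (a, b) estimates; cov_* are 2x2 sampling
    variance-covariance matrices given as nested lists.
    Computes d' (S_ref + S_foc)^-1 d with d = est_ref - est_foc.
    """
    da = est_ref[0] - est_foc[0]
    db = est_ref[1] - est_foc[1]
    # Elements of the summed (symmetric) covariance matrix.
    s11 = cov_ref[0][0] + cov_foc[0][0]
    s12 = cov_ref[0][1] + cov_foc[0][1]
    s22 = cov_ref[1][1] + cov_foc[1][1]
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("summed covariance matrix is not positive definite")
    # Quadratic form using the explicit inverse of a 2x2 matrix.
    chi2 = (s22 * da * da - 2.0 * s12 * da * db + s11 * db * db) / det
    p_value = math.exp(-chi2 / 2.0)  # chi-square upper tail, valid for df = 2 only
    return chi2, p_value


# Hypothetical estimates: equal discrimination, difficulty differs by 0.5.
cov = [[0.01, 0.0], [0.0, 0.01]]
chi2, p_value = lords_chi_square_2pl((1.0, 0.0), (1.0, 0.5), cov, cov)
```

For models with a different number of compared parameters, the degrees of freedom change and the tail probability would need a general chi-square routine rather than the df = 2 shortcut used here.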
Applied Psychological Measurement | 1994
Seock-Ho Kim; Allan S. Cohen; Hae-Ok Kim
Type I error rates of Lord’s χ² test for differential item functioning were investigated using Monte Carlo simulations. Two- and three-parameter item response theory (IRT) models were used to generate 50-item tests for samples of 250 and 1,000 simulated examinees. Item parameters were estimated using two algorithms (marginal maximum likelihood estimation and marginal Bayesian estimation) for three IRT models (the three-parameter model, the three-parameter model with a fixed guessing parameter, and the two-parameter model). Proportions of significant χ²s at selected nominal α levels were compared to those from joint maximum likelihood estimation as reported by McLaughlin & Drasgow (1987). Type I error rates for the three-parameter model consistently exceeded theoretically expected values. Results for the three-parameter model with a fixed guessing parameter and for the two-parameter model were consistently lower than expected values at the α levels in this study. Index terms: differential item functioning, item response theory, Lord’s χ².
Applied Psychological Measurement | 1995
Seock-Ho Kim; Allan S. Cohen
The minimum χ² method for computing equating coefficients for tests with dichotomously scored items was extended to the case of Samejima’s graded response items. The minimum χ² method was compared with the test response function method (also referred to as the test characteristic curve method) in which the equating coefficients were obtained by matching the test response functions of the two tests. The minimum χ² method was much less demanding computationally and yielded equating coefficients that differed little from those obtained using the test response function approach. Index terms: equating, graded response model, item response theory, minimum χ² method, test response function method.
Journal of Educational Measurement | 1992
Seock-Ho Kim; Allan S. Cohen
Journal of Educational Measurement | 1995
Seock-Ho Kim; Allan S. Cohen; Tae-Hak Park
Journal of Educational Measurement | 1991
Allan S. Cohen; Seock-Ho Kim; Michael J. Subkoviak