Publication


Featured research published by Brian W. Junker.


Applied Psychological Measurement | 2001

Cognitive assessment models with few assumptions, and connections with nonparametric item response theory

Brian W. Junker; Klaas Sijtsma

Some usability and interpretability issues for single-strategy cognitive assessment models are considered. These models posit a stochastic conjunctive relationship between a set of cognitive attributes to be assessed and performance on particular items/tasks in the assessment. The models considered make few assumptions about the relationship between latent attributes and task performance beyond a simple conjunctive structure. An example shows that these models can be sensitive to cognitive attributes, even in data designed to fit the Rasch model well. Several stochastic ordering and monotonicity properties are considered that enhance the interpretability of the models. Simple data summaries are identified that inform about the presence or absence of cognitive attributes when the full computational power needed to estimate the models is not available.
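
For readers unfamiliar with this family of models, the sketch below illustrates a conjunctive item response function of the kind the abstract describes, with slip and guess parameters. The attribute vector, required-attribute vector, and parameter values are illustrative assumptions, not taken from the paper.

    import numpy as np

    def conjunctive_response_prob(alpha, q, slip, guess):
        # Probability of a correct response under a simple conjunctive model:
        # the examinee must possess ALL attributes the item requires (eta = 1),
        # tempered by a "slip" probability; otherwise success occurs only by "guess".
        # alpha: 0/1 vector of latent attributes; q: 0/1 vector of required attributes.
        eta = int(np.all(alpha[q == 1] == 1))
        return (1 - slip) if eta else guess

    # Illustrative example: an item requiring attributes 0 and 2.
    alpha = np.array([1, 0, 1])
    q = np.array([1, 0, 1])
    print(conjunctive_response_prob(alpha, q, slip=0.10, guess=0.20))  # -> 0.9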


Journal of Educational and Behavioral Statistics | 1999

A Straightforward Approach to Markov Chain Monte Carlo Methods for Item Response Models

Richard J. Patz; Brian W. Junker

This paper demonstrates Markov chain Monte Carlo (MCMC) techniques that are particularly well suited to complex models with item response theory (IRT) assumptions. MCMC may be thought of as a successor to the standard practice of first calibrating the items using EM methods and then taking the item parameters to be known and fixed at their calibrated values when proceeding with inference regarding the latent trait. In contrast to this two-stage EM approach, MCMC methods treat item and subject parameters at the same time; this allows us to incorporate standard errors of item estimates into trait inferences, and vice versa. We develop an MCMC methodology, based on Metropolis-Hastings sampling, that can be routinely implemented to fit novel IRT models, and we compare the algorithmic features of the Metropolis-Hastings approach to other approaches based on Gibbs sampling. For concreteness we illustrate the methodology using the familiar two-parameter logistic (2PL) IRT model; more complex models are treated in a subsequent paper (Patz & Junker, in press).
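
As a concrete illustration of the joint-sampling idea described above, here is a minimal random-walk Metropolis-Hastings sketch for the 2PL model that updates examinee and item parameters together. The priors, proposal scales, and single-block update are simplifying assumptions for exposition, not the paper's exact algorithm; practical implementations typically update persons and items in separate blocks within a Gibbs scheme and tune acceptance rates.

    import numpy as np

    rng = np.random.default_rng(0)

    def log_post(theta, a, b, X):
        # Unnormalized log posterior for the 2PL model, with illustrative priors:
        # theta_i ~ N(0,1), b_j ~ N(0,1), log a_j ~ N(0,1).
        p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
        loglik = np.sum(X * np.log(p) + (1 - X) * np.log1p(-p))
        logprior = -0.5 * (np.sum(theta**2) + np.sum(b**2) + np.sum(np.log(a)**2))
        return loglik + logprior

    def mh_2pl(X, n_iter=2000, step=0.05):
        # Random-walk Metropolis-Hastings over examinee and item parameters jointly;
        # the random walk for a_j is on log a_j, which keeps discriminations positive.
        n, J = X.shape
        theta, a, b = np.zeros(n), np.ones(J), np.zeros(J)
        cur = log_post(theta, a, b, X)
        draws = []
        for _ in range(n_iter):
            theta_new = theta + step * rng.standard_normal(n)
            a_new = a * np.exp(step * rng.standard_normal(J))
            b_new = b + step * rng.standard_normal(J)
            cand = log_post(theta_new, a_new, b_new, X)
            if np.log(rng.uniform()) < cand - cur:  # symmetric proposal in (theta, log a, b)
                theta, a, b, cur = theta_new, a_new, b_new, cand
            draws.append((theta.copy(), a.copy(), b.copy()))
        return draws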


Journal of Educational and Behavioral Statistics | 1999

Applications and Extensions of MCMC in IRT: Multiple Item Types, Missing Data, and Rated Responses.

Richard J. Patz; Brian W. Junker

Patz and Junker (1999) describe a general Markov chain Monte Carlo (MCMC) strategy, based on Metropolis-Hastings sampling, for Bayesian inference in complex item response theory (IRT) settings. They demonstrate the basic methodology using the two-parameter logistic (2PL) model. In this paper we extend their basic MCMC methodology to address issues such as non-response, designed missingness, multiple raters, guessing behavior and partial credit (polytomous) test items. We apply the basic MCMC methodology to two examples from the National Assessment of Educational Progress 1992 Trial State Assessment in Reading: (a) a multiple item format (2PL, 3PL, and generalized partial credit) subtest with missing response data; and (b) a sequence of rated, dichotomous short-response items, using a new IRT model called the generalized linear logistic test model (GLLTM).
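
To show one way designed missingness can enter such an analysis, the fragment below simply restricts the 2PL log-likelihood from the previous sketch to observed entries, treating the missingness as ignorable. This is a generic illustration under that assumption, not the paper's GLLTM or rater machinery.

    import numpy as np

    def log_lik_observed(theta, a, b, X, observed):
        # 2PL log-likelihood summed only over observed (person, item) cells.
        # `observed` is a boolean mask with the same shape as X; cells where it is
        # False (e.g., items never presented by design) contribute nothing.
        p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
        ll = np.where(observed, X * np.log(p) + (1 - X) * np.log1p(-p), 0.0)
        return ll.sum()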


Intelligent Tutoring Systems | 2006

Learning factors analysis – a general method for cognitive model evaluation and improvement

Hao Cen; Kenneth R. Koedinger; Brian W. Junker

A cognitive model is a set of production rules or skills encoded in intelligent tutors to model how students solve problems. It is usually generated by brainstorming and iterative refinement among subject experts, cognitive scientists, and programmers. In this paper we propose Learning Factors Analysis, a semi-automated method for improving a cognitive model that combines a statistical model, human expertise, and combinatorial search. We use this method to evaluate an existing cognitive model and to generate and evaluate alternative models. We present improved cognitive models and make suggestions for improving the intelligent tutor based on those models.
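
A minimal sketch of the statistical and search components described above: correctness is modeled by a logistic regression with student, skill, and skill-by-practice-opportunity terms, and alternative skill labelings (e.g., splits or merges proposed by experts) are compared by BIC. The column names ('correct', 'student', 'opportunity') and the exact model form are illustrative assumptions, not the paper's specification.

    import numpy as np
    import statsmodels.formula.api as smf

    def bic_of_skill_model(df, skill_col):
        # Logistic model: correct ~ student + skill + skill-by-practice-opportunity.
        # Lower BIC indicates a better trade-off of fit and complexity.
        formula = f"correct ~ C(student) + C({skill_col}) + C({skill_col}):opportunity"
        res = smf.logit(formula, data=df).fit(disp=0)
        return -2 * res.llf + len(res.params) * np.log(len(df))

    def search_skill_models(df, candidate_skill_columns):
        # Combinatorial-search step: score each candidate skill labeling
        # (a column of df assigning a skill to every observation) and keep the best.
        scores = {col: bic_of_skill_model(df, col) for col in candidate_skill_columns}
        best = min(scores, key=scores.get)
        return best, scores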


Psychometrika | 1997

Stochastic Ordering Using the Latent Trait and the Sum Score in Polytomous IRT Models.

Bas T. Hemker; Klaas Sijtsma; Ivo W. Molenaar; Brian W. Junker

In a restricted class of item response theory (IRT) models for polytomous items the unweighted total score has monotone likelihood ratio (MLR) in the latent trait θ. MLR implies two stochastic ordering (SO) properties, denoted SOM and SOL, which are both weaker than MLR, but very useful for measurement with IRT models. Therefore, these SO properties are investigated for a broader class of IRT models for which the MLR property does not hold. In this study, first a taxonomy is given for nonparametric and parametric models for polytomous items based on the hierarchical relationship between the models. Next, it is investigated which models have the MLR property and which have the SO properties. It is shown that all models in the taxonomy possess the SOM property. However, counterexamples illustrate that many models do not, in general, possess the even more useful SOL property.
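
For reference, the three properties can be stated as follows, with X_+ denoting the unweighted total score; this is a common notation, and the paper's exact formulations may differ in detail.

    \[
    \text{MLR:}\quad \frac{P(X_+ = s \mid \theta)}{P(X_+ = r \mid \theta)}
    \ \text{is nondecreasing in}\ \theta \quad \text{for all } s > r,
    \]
    \[
    \text{SOM:}\quad P(X_+ \ge s \mid \theta)\ \text{is nondecreasing in}\ \theta
    \quad \text{for every } s,
    \]
    \[
    \text{SOL:}\quad P(\theta > t \mid X_+ = s)\ \text{is nondecreasing in}\ s
    \quad \text{for every } t.
    \]
    % MLR implies both SOM and SOL; the converse implications do not hold in general.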


Journal of The Royal Statistical Society Series A-statistics in Society | 1999

Classical multilevel and Bayesian approaches to population size estimation using multiple lists

Stephen E. Fienberg; Matthew S. Johnson; Brian W. Junker

One of the major objections to the standard multiple‐recapture approach to population estimation is the assumption of homogeneity of individual ‘capture’ probabilities. Modelling individual capture heterogeneity is complicated by the fact that it shows up as a restricted form of interaction among lists in the contingency table cross‐classifying list memberships for all individuals. Traditional log‐linear modelling approaches to capture–recapture problems are well suited to modelling interactions among lists but ignore the special dependence structure that individual heterogeneity induces. A random‐effects approach, based on the Rasch model from educational testing and introduced in this context by Darroch and co‐workers and Agresti, provides one way to introduce the dependence resulting from heterogeneity into the log‐linear model; however, previous efforts to combine the Rasch‐like heterogeneity terms additively with the usual log‐linear interaction terms suggest that a more flexible approach is required. In this paper we consider both classical multilevel approaches and fully Bayesian hierarchical approaches to modelling individual heterogeneity and list interactions. Our framework encompasses both the traditional log‐linear approach and various elements from the full Rasch model. We compare these approaches on two examples, the first arising from an epidemiological study of a population of diabetics in Italy, and the second a study intended to assess the ‘size’ of the World Wide Web. We also explore extensions allowing for interactions between the Rasch and log‐linear portions of the models in both the classical and the Bayesian contexts.
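
In a common notation, the Rasch-type random-effects formulation referred to above models the probability that individual i appears on list j as shown below; the exact parameterization used in the paper may differ.

    \[
    \operatorname{logit} P(x_{ij} = 1 \mid \theta_i) \;=\; \theta_i + \beta_j ,
    \]
    % where \theta_i is an individual "catchability" effect and \beta_j a list effect.
    % Integrating over the distribution of \theta_i induces positive dependence among
    % lists, which shows up as a restricted pattern of interaction terms in the
    % log-linear model for the 2^J contingency table of capture histories.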


Journal of Educational and Behavioral Statistics | 2002

The Hierarchical Rater Model for Rated Test Items and its Application to Large-Scale Educational Assessment Data

Richard J. Patz; Brian W. Junker; Matthew S. Johnson; Louis T. Mariano

Open-ended or “constructed” student responses to test items have become a stock component of standardized educational assessments. Digital imaging of examinee work now enables a distributed rating process to be flexibly managed, and allocation designs that involve as many as six or more ratings for a subset of responses are now feasible. In this article we develop Patz’s (1996) hierarchical rater model (HRM) for polytomous item response data scored by multiple raters, and show how it can be used to scale examinees and items, to model aspects of consensus among raters, and to model individual rater severity and consistency effects. The HRM treats examinee responses to open-ended items as unobserved discrete variables, and it explicitly models the “proficiency” of raters in assigning accurate scores as well as the proficiency of examinees in providing correct responses. We show how the HRM “fits in” to the generalizability theory framework that has been the traditional tool of analysis for rated item response data, and give some relationships between the HRM, the design effects correction of Bock, Brennan and Muraki (1999), and the rater bundle model of Wilson and Hoskens (2002). Using simulated and real data, we compare the HRM to the conventional IRT Facets model for rating data (e.g., Linacre, 1989; Engelhard, 1994, 1996), and we explore ways that information from HRM analyses may improve the quality of the rating process.
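
One common way to write the two stages of such a hierarchical rater model is sketched below: an "ideal" rating for each response follows a polytomous IRT model given examinee proficiency, and each observed rating is a noisy version of that ideal rating with rater-specific bias and consistency. The notation and the particular form of the rater stage are assumptions for illustration, not necessarily the paper's exact parameterization.

    \[
    \xi_{ij} \mid \theta_i \ \sim\ \text{a polytomous IRT model (e.g., partial credit)},
    \]
    \[
    P(X_{ijr} = x \mid \xi_{ij} = \xi)\ \propto\
    \exp\!\left( -\frac{\left[\, x - (\xi + \phi_r) \,\right]^2}{2\psi_r^2} \right),
    \]
    % where \phi_r captures rater r's severity/leniency and \psi_r the rater's
    % (in)consistency; a small \psi_r concentrates ratings near the ideal category.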


Applied Psychological Measurement | 2000

Latent and manifest monotonicity in item response models

Brian W. Junker; Klaas Sijtsma

The monotonicity of item response functions (IRF) is a central feature of most parametric and nonparametric item response models. Monotonicity allows items to be interpreted as measuring a trait, and it allows for a general theory of nonparametric inference for traits. This theory is based on monotone likelihood ratio and stochastic ordering properties. Thus, confirming the monotonicity assumption is essential to applications of nonparametric item response models. The results of two methods of evaluating monotonicity are presented: regressing individual item scores on the total test score and on the ‘rest’ score, which is obtained by omitting the selected item from the total test score. It was found that the item-total regressions of some familiar dichotomous item response models with monotone IRFs exhibited nonmonotonicities that persist as the test length increased. However, item-rest regressions never exhibited nonmonotonicities under the nonparametric monotone unidimensional item response model. The implications of these results for exploratory analysis of dichotomous item response data and the application of these results to polytomous item response data are discussed.
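
A minimal sketch of the two manifest regressions discussed above, for a 0/1 response matrix with rows as examinees and columns as items; the function and variable names are illustrative, not the paper's code.

    import numpy as np

    def item_score_regression(X, j, use_rest=True):
        # Estimate P(X_j = 1 | summary score): the summary is either the rest score
        # (total minus item j) or the total test score including item j.
        summary = X.sum(axis=1) - (X[:, j] if use_rest else 0)
        levels = np.unique(summary)
        props = np.array([X[summary == s, j].mean() for s in levels])
        return levels, props

    def violates_monotonicity(props, tol=0.0):
        # A (sample) nonmonotonicity is any decrease along increasing score levels.
        return bool(np.any(np.diff(props) < -tol))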


Psychometrika | 1996

Polytomous IRT models and monotone likelihood ratio of the total score

Bas T. Hemker; Klaas Sijtsma; Ivo W. Molenaar; Brian W. Junker

In a broad class of item response theory (IRT) models for dichotomous items the unweighted total score has monotone likelihood ratio (MLR) in the latent trait θ. In this study, it is shown that for polytomous items MLR holds for the partial credit model and a trivial generalization of this model. MLR does not necessarily hold if the slopes of the item step response functions vary over items, item steps, or both. MLR holds neither for Samejima's graded response model, nor for nonparametric versions of these three polytomous models. These results are surprising in the context of Grayson's and Huynh's results on MLR for nonparametric dichotomous IRT models, and suggest that establishing stochastic ordering properties for nonparametric polytomous IRT models will be much harder.
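
For concreteness, the partial credit model for an item j with response categories 0, ..., m_j can be written as follows; this is a standard parameterization, and the paper's notation may differ.

    \[
    P(X_j = x \mid \theta) \;=\;
    \frac{\exp\!\left( \sum_{v=1}^{x} (\theta - \delta_{jv}) \right)}
         {\sum_{y=0}^{m_j} \exp\!\left( \sum_{v=1}^{y} (\theta - \delta_{jv}) \right)},
    \qquad x = 0, 1, \dots, m_j,
    \]
    % with the empty sum for x = 0 taken to be zero. The result cited above is that
    % the total score X_+ = \sum_j X_j has monotone likelihood ratio in \theta under
    % this model, but not in general when slopes vary over items or item steps.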


Applied Psychological Measurement | 2001

Nonparametric Item Response Theory in Action: An Overview of the Special Issue

Brian W. Junker; Klaas Sijtsma

Although most item response theory (IRT) applications and related methodologies involve model fitting within a single parametric IRT (PIRT) family [e.g., the Rasch (1960) model or the three-parameter logistic model (3PLM; Lord, 1980)], nonparametric IRT (NIRT) research has been growing in recent years. Three broad motivations for the development and continued interest in NIRT can be identified:

1. To identify a commonality among PIRT and IRT-like models, model features [e.g., local independence (LI), monotonicity of item response functions (IRFs), unidimensionality of the latent variable] should be characterized, and it should be discovered what happens when models satisfy only weakened versions of these features. Characterizing successful and unsuccessful inferences under these broad model features can be attempted in order to understand how IRT models aggregate information from data. All this can be done with NIRT.

2. Any model applied to data is likely to be incorrect. When a family of PIRT models has been shown (or is suspected) to fit poorly, a more flexible family of NIRT models often is desired. These NIRT models have been used to: (1) assess violations of LI due to nuisance traits (e.g., latent variable multidimensionality) or the testing context influencing test performance (e.g., speededness and question wording); (2) clarify questions about the sources and effects of differential item functioning; (3) provide a flexible context in which to develop methodology for establishing the most appropriate number of latent dimensions underlying a test; and (4) serve as alternatives to PIRT models in tests of fit.

3. In psychological and sociological research, when it is necessary to develop a new questionnaire or measurement instrument, there often are fewer examinees and items than are desired for fitting PIRT models in large-scale educational testing. NIRT provides tools that are easy to use in small samples. It can identify items that scale together well (i.e., follow a particular set of NIRT assumptions). NIRT also identifies several subscales with simple structure among the scales, if the items do not form a single unidimensional scale.

Collaboration


Dive into Brian W. Junker's collaborations.

Top Co-Authors

Brian Tobin

Carnegie Mellon University


Fabrizio Lecci

Carnegie Mellon University


Jack Mostow

Carnegie Mellon University
