Russell G. Almond
Florida State University
Publications
Featured research published by Russell G. Almond.
Measurement: Interdisciplinary Research & Perspectives | 2003
Robert J. Mislevy; Linda S. Steinberg; Russell G. Almond
In educational assessment, we observe what students say, do, or make in a few particular circumstances and attempt to infer what they know, can do, or have accomplished more generally. A web of inference connects the two. Some connections depend on theories and experience concerning the targeted knowledge in the domain, how it is acquired, and the circumstances under which people bring their knowledge to bear. Other connections may depend on statistical models and probability-based reasoning. Still others concern the elements and processes involved in test construction, administration, scoring, and reporting. This article describes a framework for assessment that makes explicit the interrelations among substantive arguments, assessment designs, and operational processes. The work was motivated by the need to develop assessments that incorporate purposes, technologies, and psychological perspectives that are not well served by familiar forms of assessments. However, the framework is equally applicable to analyzing existing assessments or designing new assessments within familiar forms.
Applied Psychological Measurement | 1999
Russell G. Almond; Robert J. Mislevy
Computerized adaptive testing (CAT) based on item response theory (IRT) is viewed from the perspective of graphical modeling (GM). GM provides methods for making inferences about multifaceted skills and knowledge, and for extracting data from complex performances. However, simply incorporating variables for all sources of variation is rarely successful. Thus, researchers must closely analyze the substance and structure of the problem to create more effective models. Researchers regularly employ sophisticated strategies to handle many sources of variability outside the IRT model. Relevant variables can play many roles without appearing in the operational IRT model per se, e.g., in validity studies, assembling tests, and constructing and modeling tasks. Some of these techniques are described from a GM perspective, as well as how to extend them to more complex assessment situations. Issues are illustrated in the context of language testing.
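As a concrete illustration (not drawn from the article), the core loop of a computerized adaptive test built on a two-parameter logistic IRT model can be sketched in a few lines of Python; the item bank, parameter values, and function names below are invented for illustration.

import numpy as np

def p_correct(theta, a, b):
    """2PL IRT: probability of a correct response given ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def select_next_item(theta_hat, item_bank, administered):
    """Maximum-information item selection: pick the unadministered item
    that is most informative at the current ability estimate."""
    candidates = [i for i in range(len(item_bank)) if i not in administered]
    return max(candidates,
               key=lambda i: item_information(theta_hat, *item_bank[i]))

# Illustrative item bank: (discrimination, difficulty) pairs.
item_bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.2)]
print(select_next_item(theta_hat=0.3, item_bank=item_bank, administered={0}))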
Language Testing | 2002
Robert J. Mislevy; Linda S. Steinberg; Russell G. Almond
In task-based language assessment (TBLA) language use is observed in settings that are more realistic and complex than in discrete skills assessments, and which typically require the integration of topical, social and/or pragmatic knowledge along with knowledge of the formal elements of language. But designing an assessment is not accomplished simply by determining the settings in which performance will be observed. TBLA raises questions of just how to design complex tasks, evaluate students’ performances and draw valid conclusions therefrom. This article examines these challenges from the perspective of ‘evidence-centred assessment design’. The main building blocks are student, evidence and task models, with tasks to be administered in accordance with an assembly model. We describe these models, show how they are linked and assembled to frame an assessment argument and illustrate points with examples from task-based language assessment.
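To make the building blocks more concrete, here is one possible rendering (hypothetical, not the article's own specification) of the student, evidence, task, and assembly models as linked data structures in Python; all field names are illustrative.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class StudentModel:
    """Latent proficiencies the assessment is meant to measure."""
    proficiencies: List[str]

@dataclass
class EvidenceModel:
    """Rules for evaluating a performance and updating proficiencies."""
    observables: List[str]             # what is scored in the work product
    measured_proficiencies: List[str]  # which student-model variables it informs

@dataclass
class TaskModel:
    """A template for the situation that elicits the performance."""
    stimulus_features: Dict[str, str]
    evidence_model: EvidenceModel

@dataclass
class AssemblyModel:
    """Constraints on how tasks are combined into an assessment form."""
    tasks: List[TaskModel]
    coverage_targets: Dict[str, int] = field(default_factory=dict)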
Applied Measurement in Education | 2002
Robert J. Mislevy; Linda S. Steinberg; F. Jay Breyer; Russell G. Almond; Lynn Johnson
Advances in cognitive psychology both deepen our understanding of how students gain and use knowledge and broaden the range of performances and situations we want to see to acquire evidence about their developing knowledge. At the same time, advances in technology make it possible to capture more complex performances in assessment settings by including, as examples, simulation, interactivity, and extended responses. The challenge is making sense of the complex data that result. This article concerns an evidence-centered approach to the design and analysis of complex assessments. We present a design framework that incorporates integrated structures for modeling knowledge and skills, designing tasks, and extracting and synthesizing evidence. The ideas are illustrated in the context of a project with the Dental Interactive Simulation Corporation (DISC), assessing problem solving in dental hygiene with computer-based simulations. After reviewing the substantive grounding of this effort, we describe the design rationale, statistical and scoring models, and operational structures for the DISC assessment prototype.
Computers in Human Behavior | 1999
Robert J. Mislevy; Linda S. Steinberg; F. Jay Breyer; Russell G. Almond; Lynn Johnson
To function effectively as a learning environment, a simulation system must present learners with situations in which they use relevant knowledge, skills, and abilities. To function effectively as an assessment, such a system must additionally be able to evoke and interpret observable evidence about targeted knowledge in a manner that is principled, defensible, and fitting to the purpose at hand (e.g. licensure, achievement testing, coached practice). This article concerns an evidence-centered approach to designing a computer-based performance assessment of problem solving. The application is a prototype licensure test, with supplementary feedback, for prospective use in the field of dental hygiene. We describe a cognitive task analysis designed to: (1) tap the knowledge hygienists use when they assess patients, plan treatments, and monitor progress; and (2) elicit behaviors that manifest this knowledge. After summarizing the results of the analysis, we discuss implications for designing student models, evidentiary structures, task frameworks, and simulation capabilities required for the proposed assessment.
Journal of Computational and Graphical Statistics | 1997
David Madigan; Krzysztof Mosurski; Russell G. Almond
Belief networks provide an important bridge between statistical modeling and expert systems. This article presents methods for visualizing probabilistic “evidence flows” in belief networks, thereby enabling belief networks to explain their behavior. Building on earlier research on explanation in expert systems, we present a hierarchy of explanations, ranging from simple colorings to detailed displays. Our approach complements parallel work on textual explanations in belief networks. Graphical-Belief, MathSoft Inc.'s belief network software, implements the methods.
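Graphical-Belief itself is not shown here; the sketch below illustrates, on an invented three-node toy network, the idea behind the simplest level of such an explanation hierarchy: coloring each node by how much its marginal probability shifts once a finding is entered. All probabilities and names are made up for illustration.

from itertools import product

# Tiny illustrative network A -> B -> C with binary variables (0/1).
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}   # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_c_given_b[b][c]

def joint(a, b, c):
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

def marginal(var, value, evidence=None):
    """P(var = value | evidence) by brute-force enumeration."""
    evidence = evidence or {}
    num = den = 0.0
    for a, b, c in product((0, 1), repeat=3):
        state = {"A": a, "B": b, "C": c}
        if any(state[k] != v for k, v in evidence.items()):
            continue
        p = joint(a, b, c)
        den += p
        if state[var] == value:
            num += p
    return num / den

# "Evidence flow" coloring weight: shift in each node's marginal
# once the finding C = 1 is entered.
for node in ("A", "B"):
    before = marginal(node, 1)
    after = marginal(node, 1, evidence={"C": 1})
    print(f"{node}: {before:.3f} -> {after:.3f} (shift {after - before:+.3f})")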
Uncertainty in Artificial Intelligence | 1994
David Madigan; Adrian E. Raftery; Jeremy York; Jeffrey M. Bradshaw; Russell G. Almond
We consider the problem of model selection for Bayesian graphical models, and embed it in the larger context of accounting for model uncertainty. Data analysts typically select a single model from some class of models, and then condition all subsequent inference on this model. However, this approach ignores model uncertainty, leading to poorly calibrated predictions: it will often be seen in retrospect that one’s uncertainty bands were not wide enough. The Bayesian analyst solves this problem by averaging over all plausible models when making inferences about quantities of interest. In many applications, however, because of the size of the model space and awkward integrals, this averaging will not be a practical proposition, and approximations are required. Here we examine the predictive performance of two recently proposed model averaging schemes. In the examples considered, both schemes outperform any single model that might reasonably have been selected.
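The averaging described here is standard Bayesian model averaging: posterior model probabilities weight each model's prediction. A minimal Python sketch, assuming equal prior model probabilities and invented log marginal likelihoods, shows how the weights mix per-model predictions.

import numpy as np

def bma_predict(log_marginal_likelihoods, per_model_predictions):
    """Bayesian model averaging with equal prior model probabilities:
    posterior model weights are proportional to marginal likelihoods,
    and the averaged prediction mixes the per-model predictions."""
    lml = np.asarray(log_marginal_likelihoods, dtype=float)
    weights = np.exp(lml - lml.max())   # stabilize before exponentiating
    weights /= weights.sum()
    return weights, weights @ np.asarray(per_model_predictions, dtype=float)

# Illustrative: three candidate models, each giving a predictive probability
# for some event of interest (all numbers invented).
weights, averaged = bma_predict([-102.3, -101.1, -104.8], [0.62, 0.55, 0.71])
print(weights, averaged)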
Educational and Psychological Measurement | 2007
Sandip Sinharay; Russell G. Almond
A cognitive diagnostic model uses information from educational experts to describe the relationships between item performances and posited proficiencies. When the cognitive relationships can be described using a fully Bayesian model, Bayesian model checking procedures become available. Checking models tied to cognitive theory of the domains provides feedback to educators about the underlying cognitive theory. This article suggests a number of graphics and statistics for diagnosing problems with cognitive diagnostic models expressed as Bayesian networks. The suggested diagnostics allow the authors to identify the inadequacy of an earlier cognitive diagnostic model and to hypothesize an improved model that provides better fit to the data.
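The article's own graphics and statistics are not reproduced here; the sketch below shows the generic posterior predictive check that Bayesian model-checking diagnostics of this kind typically build on, with an invented discrepancy measure and toy data.

import numpy as np

rng = np.random.default_rng(0)

def posterior_predictive_p(observed, posterior_draws, simulate, discrepancy):
    """Generic posterior predictive check: the proportion of posterior draws
    whose replicated data give a discrepancy at least as large as the
    observed one (a PPP-value near 0 or 1 signals misfit)."""
    obs_stat = discrepancy(observed)
    rep_stats = [discrepancy(simulate(theta)) for theta in posterior_draws]
    return np.mean([s >= obs_stat for s in rep_stats])

# Illustrative use: responses of 200 examinees to one item, with the item's
# success probability as the (toy) model parameter.
observed = rng.binomial(1, 0.55, size=200)
posterior_draws = rng.beta(110, 90, size=500)          # toy posterior for p
simulate = lambda p: rng.binomial(1, p, size=200)      # replicated data
discrepancy = lambda y: y.mean()                       # proportion correct
print(posterior_predictive_p(observed, posterior_draws, simulate, discrepancy))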
Journal of Educational and Behavioral Statistics | 2009
Russell G. Almond; Joris Mulder; Lisa Hemat; Duanli Yan
Bayesian network models offer a large degree of flexibility for modeling dependence among observables (item outcome variables) from the same task, which may be dependent. This article explores four design patterns for modeling locally dependent observations: (a) no context, which ignores dependence among observables; (b) compensatory context, which introduces a latent variable, context, to model task-specific knowledge and uses a compensatory model to combine it with the relevant proficiencies; (c) inhibitor context, which introduces the same latent context variable but uses an inhibitor (threshold) model to combine it with the relevant proficiencies; (d) compensatory cascading, which models each observable as dependent on the previous one in sequence. The article explores these four design patterns through experiments with simulated and real data. When the proficiency variable is categorical, a simple Mantel-Haenszel procedure can test for local dependence. Although local dependence can cause problems in calibration, if the models based on these design patterns are successfully calibrated to data, all of the design patterns appear to provide very similar inferences about the students. Based on these experiments, the simpler no-context design pattern appears more stable than the compensatory context model, while not significantly affecting the classification accuracy of the assessment. The cascading design pattern seems to pick up dependencies missed by the other models and should be explored in further research.
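As an illustration of the local-dependence check mentioned above, a Mantel-Haenszel common odds ratio for two dichotomous observables, stratified by a categorical proficiency variable, can be computed as follows; the data and variable names are invented.

import numpy as np

def mantel_haenszel_or(x, y, stratum):
    """Mantel-Haenszel common odds ratio for two 0/1 observables x and y,
    stratified by a categorical proficiency variable. Values far from 1
    suggest dependence between x and y beyond what proficiency explains."""
    num = den = 0.0
    for s in np.unique(stratum):
        xs, ys = x[stratum == s], y[stratum == s]
        n = len(xs)
        a = np.sum((xs == 1) & (ys == 1))
        b = np.sum((xs == 1) & (ys == 0))
        c = np.sum((xs == 0) & (ys == 1))
        d = np.sum((xs == 0) & (ys == 0))
        num += a * d / n
        den += b * c / n
    return num / den

# Illustrative data: two observables from one task, three proficiency levels.
rng = np.random.default_rng(1)
prof = rng.integers(0, 3, size=300)
p = 0.3 + 0.2 * prof
x = rng.binomial(1, p)
y = rng.binomial(1, np.clip(p + 0.15 * x, 0, 1))  # y leans on x: local dependence
print(mantel_haenszel_or(x, y, prof))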
International Journal of Approximate Reasoning | 2007
Russell G. Almond
For a number of situations, a Bayesian network can be split into a core network consisting of a set of latent variables describing the status of a system, and a set of fragments relating the status variables to observable evidence that could be collected about the system state. This situation arises frequently in educational testing, where the status variables represent the student proficiency and the evidence models (graph fragments linking competency variables to observable outcomes) relate to assessment tasks that can be used to assess that proficiency. The traditional approach to knowledge engineering in this situation would be to maintain a library of fragments, where the graphical structure is specified using a graphical editor and then the probabilities are entered using a separate spreadsheet for each node. If many evidence model fragments employ the same design pattern, a lot of repetitive data entry is required. As the parameter values that determine the strength of the evidence can be buried on interior screens of an interface, it can be difficult for a design team to get an impression of the total evidence provided by a collection of evidence models for the system variables, and to identify holes in the data collection scheme. A Q-matrix – an incidence matrix whose rows represent observable outcomes from assessment tasks and whose columns represent competency variables – provides the graphical structure of the evidence models. The Q-matrix can be augmented to provide details of relationship strengths and to provide a high-level overview of the kind of evidence available. The relationships among the status variables can be represented with an inverse covariance matrix; this is particularly useful in models from the social sciences, as often the domain experts’ knowledge about the system states comes from factor analyses and similar procedures that naturally produce covariance matrices. The representation of the model using matrices means that the bulk of the specification work can be done using a desktop spreadsheet program and does not require specialized software, facilitating collaboration with external experts. The design idea is illustrated with some examples from prior assessment design projects.
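As a concrete (invented) illustration of the matrix-as-specification idea, a small augmented Q-matrix laid out like a spreadsheet can be read directly into the edges of the evidence-model fragments; the proficiency names, observables, and strengths below are hypothetical.

import numpy as np

# Illustrative augmented Q-matrix: rows are observable outcomes, columns are
# proficiency variables; a nonzero entry links them, and its magnitude can
# carry the strength of the relationship.
proficiencies = ["ReadingComp", "Vocabulary", "Grammar"]
observables = ["Task1.Obs1", "Task1.Obs2", "Task2.Obs1"]
Q = np.array([
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.0, 1.0],
])

# Edges of the evidence-model fragments implied by the Q-matrix.
edges = [(proficiencies[j], observables[i], Q[i, j])
         for i in range(Q.shape[0]) for j in range(Q.shape[1])
         if Q[i, j] != 0.0]
for parent, child, strength in edges:
    print(f"{parent} -> {child} (strength {strength})")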