
Publication


Featured research published by Ronald K. Hambleton.


Biometrics | 1997

Handbook of Modern Item Response Theory

Wim J. van der Linden; Ronald K. Hambleton

Item response theory has become an essential component in the toolkit of every researcher in the behavioral sciences. It provides a powerful means to study individual responses to a variety of stimuli, and the methodology has been extended and developed to cover many different models of interaction. This volume presents a wide-ranging handbook to item response theory and its applications to educational and psychological testing. It will serve as both an introduction to the subject and a comprehensive reference volume for practitioners and researchers. It is organized into six major sections: the nominal categories model, models for response time or multiple attempts on items, models for multiple abilities or cognitive components, nonparametric models, models for nonmonotone items, and models with special assumptions. Each chapter in the book has been written by an expert on that particular topic, and the chapters have been carefully edited to ensure that a uniform style of notation and presentation is used throughout. As a result, all researchers whose work uses item response theory will find this an indispensable companion to their work, and it will be the subject's reference volume for many years to come.


Medical Care | 2007

Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the Patient-Reported Outcomes Measurement Information System (PROMIS)

Bryce B. Reeve; Ron D. Hays; Jakob B. Bjorner; Karon F. Cook; Paul K. Crane; Jeanne A. Teresi; David Thissen; Dennis A. Revicki; David J. Weiss; Ronald K. Hambleton; Honghu Liu; Richard Gershon; Steven P. Reise; Jin Shei Lai; David Cella

Background: The construction and evaluation of item banks to measure unidimensional constructs of health-related quality of life (HRQOL) is a fundamental objective of the Patient-Reported Outcomes Measurement Information System (PROMIS) project. Objectives: Item banks will be used as the foundation for developing short-form instruments and enabling computerized adaptive testing. The PROMIS Steering Committee selected 5 HRQOL domains for initial focus: physical functioning, fatigue, pain, emotional distress, and social role participation. This report provides an overview of the methods used in the PROMIS item analyses and proposed calibration of item banks. Analyses: Analyses include evaluation of data quality (e.g., logic and range checking, spread of response distribution within an item), descriptive statistics (e.g., frequencies, means), item response theory model assumptions (unidimensionality, local independence, monotonicity), model fit, differential item functioning, and item calibration for banking. Recommendations: Key analytic issues are summarized; recommendations are provided for future evaluations of item banks in HRQOL assessment.
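The monotonicity assumption listed among the analyses above can be screened by checking that the proportion endorsing an item does not decrease as the rest score (total score minus the item) increases. A minimal sketch in Python, with a hypothetical function name and simulated 0/1 data, not the actual PROMIS analysis code:

```python
def is_monotone(responses, item):
    # responses: list of per-person 0/1 score lists (simulated, illustrative data)
    # Screen: mean score on `item` should be non-decreasing in the rest score.
    rest_scores = [sum(row) - row[item] for row in responses]
    groups = {}
    for rest, row in zip(rest_scores, responses):
        groups.setdefault(rest, []).append(row[item])
    means = [sum(vals) / len(vals) for _, vals in sorted(groups.items())]
    return all(later >= earlier for earlier, later in zip(means, means[1:]))

# A Guttman-like response matrix (persons x items) passes trivially.
data = [[0, 0, 0],
        [1, 0, 0],
        [1, 1, 0],
        [1, 1, 1]]
```

In practice, nonparametric IRT tools perform this check with proper smoothing and significance testing; this sketch only illustrates the idea.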


European Journal of Psychological Assessment | 2001

The Next Generation of the ITC Test Translation and Adaptation Guidelines

Ronald K. Hambleton

Summary: The ITC Test Translation and Adaptation Guidelines have been available for nearly 7 years, and numerous researchers and practitioners have provided comments on their strengths and weaknesses. This paper addresses three goals: First, comments on the 22 guidelines that have resulted from the numerous field-tests and reviews are presented. Second, where possible, specific suggestions for revising the ITC Guidelines are described. Finally, three suggestions for essential research to improve the methodology associated with translating and adapting tests are presented.


Psicothema | 2013

Directrices para la traducción y adaptación de los tests: segunda edición [Guidelines for test translation and adaptation: second edition]

José Muñiz; Paula Elosua; Ronald K. Hambleton

Background: Adapting tests across cultures is a common practice that has increased in all evaluation areas in recent years. We live in an increasingly multicultural and multilingual world in which tests are used to support decision-making in educational, clinical, organizational and other areas, so the adaptation of tests becomes a necessity. The main goal of this paper is to present the second edition of the guidelines of the International Test Commission (ITC) for adapting tests across cultures. Method: A task force of six international experts reviewed the original guidelines proposed by the International Test Commission, taking into account the advances and developments of the field. Results: As a result of the revision, this new edition consists of twenty guidelines grouped into six sections: precondition, test development, confirmation, administration, score scales and interpretation, and documentation. The different sections are reviewed, and the possible sources of error influencing test translation and adaptation are analyzed. Conclusions: Twenty guidelines are proposed for translating and adapting tests across cultures. Finally, we discuss the future perspectives of the guidelines in relation to new developments in the field of psychological and educational assessment.


Review of Educational Research | 1978

Criterion-Referenced Testing and Measurement: A Review of Technical Issues and Developments

Ronald K. Hambleton; Hariharan Swaminathan; James Algina; Douglas Bill Coulson

Glaser (1963) and Popham and Husek (1969) were the first to introduce and to popularize the field of criterion-referenced testing. Their motive was to provide the kind of test score information needed to make a variety of individual and programmatic decisions arising in objectives-based instructional programs. Norm-referenced tests were seen as less than ideal for providing the desired kind of test score information. At present, students at all levels of education are taking criterion-referenced tests.


Social Indicators Research | 1998

Adapting Tests for Use in Multiple Languages and Cultures

Ronald K. Hambleton; Liane Patsula

There is a growing interest in using tests constructed and validated for use in one language and culture in other languages and cultures. Sometimes these tests, when adapted for use in a second language and culture, can further research and meet informational needs; at other times, cross-cultural comparative studies can be carried out. But, whatever the purpose for the test adaptations, questions arise concerning the validity of inferences from these adapted tests. The purposes of this paper are (1) to consider several advantages and disadvantages of adapting tests from one language and culture to another, (2) to review several sources of error or invalidity associated with adapting tests and to suggest ways to reduce those errors, and (3) to consider test adaptation advances in one rapidly emerging area of social research – quality of life measures.


Medical Care | 2006

Good practices for identifying differential item functioning. Commentary

Ronald K. Hambleton

The articles addressing differential item functioning (DIF) and factorial invariance in this special issue of Medical Care [1–9] are uniformly excellent, and readers will find that each article makes an important contribution to the measurement literature. The suggestion to have researchers apply various …


Applied Psychological Measurement | 1986

Assessing the Dimensionality of a Set of Test Items

Ronald K. Hambleton; Richard Rovinelli

This study compared four methods of determining the dimensionality of a set of test items: linear factor analysis, nonlinear factor analysis, residual analysis, and a method developed by Bejar (1980). Five artificial test datasets (for 40 items and 1,500 examinees) were generated to be consistent with the three-parameter logistic model and the assumption of either a one- or a two-dimensional latent space. Two variables were manipulated: (1) the correlation between the traits (r = .10 or r = .60) and (2) the percent of test items measuring each trait (50% measuring each trait, or 75% measuring the first trait and 25% measuring the second trait). While linear factor analysis in all instances overestimated the number of underlying dimensions in the data, nonlinear factor analysis with linear and quadratic terms led to correct determination of the item dimensionality in the three datasets where it was used. Both the residual analysis method and Bejar's method proved disappointing. These results suggest the need for extreme caution in using linear factor analysis, residual analysis, and Bejar's method until more investigations of these methods can confirm their adequacy. Nonlinear factor analysis appears to be the most promising of the four methods, but more experience in applying the method seems necessary before wide-scale use can be recommended.
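For reference, the three-parameter logistic model used to generate the simulated datasets gives the probability of a correct response as P(θ) = c + (1 − c) / (1 + exp(−Da(θ − b))). A minimal sketch of the standard formula, with the conventional scaling constant D = 1.7 (function name is illustrative, not from the study):

```python
import math

def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the three-parameter
    logistic (3PL) model: discrimination a, difficulty b, guessing c."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
```

At θ = b the probability is the midpoint between the guessing parameter c and 1; as θ grows it approaches 1, and as θ falls it approaches c.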


Applied Measurement in Education | 2004

Student Test Score Reports and Interpretive Guides: Review of Current Practices and Suggestions for Future Research

Dean P. Goodman; Ronald K. Hambleton

A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting student-level results on large-scale assessments were investigated. Recent student test score reports and interpretive guides from 11 states, three U.S. commercial testing companies, and two Canadian provinces were reviewed. On the basis of past score-reporting research, testing standards, and the requirements of the No Child Left Behind Act of 2001, a number of promising and potentially problematic features of these reports and guides are identified, and recommendations are offered to help enhance future score-reporting designs and to inform future research in this important area.


Educational and Psychological Measurement | 1992

The effect of sample size on the functioning of the Mantel-Haenszel statistic

Kathleen M. Mazor; Brian E. Clauser; Ronald K. Hambleton

The Mantel-Haenszel (MH) procedure has become one of the most popular procedures for detecting differential item functioning. Valid results with relatively small numbers of examinees are one of the advantages typically attributed to this procedure. In this study, examinee item responses were simulated to contain differentially functioning items, and then were analyzed at five sample sizes to compare detection rates. Results showed the MH procedure missed 25 to 30% of the differentially functioning items when groups of 2000 were used. When 500 or fewer examinees were retained in each group, more than 50% of the differentially functioning items were missed. The items most likely to be undetected were those which were most difficult, those with a small difference in item difficulty between the two groups, and poorly discriminating items.
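The Mantel-Haenszel procedure studied here pools 2×2 tables (group × correct/incorrect) across matched score levels into a common odds-ratio estimate, often reported on the ETS delta scale as MH D-DIF = −2.35 ln(α̂). A minimal sketch of those two standard formulas (the table layout and function names are illustrative):

```python
import math

def mh_alpha(tables):
    """Mantel-Haenszel common odds ratio.
    tables: one tuple (A, B, C, D) per matched score level, where
    A/B = reference group correct/incorrect and C/D = focal group
    correct/incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def mh_d_dif(alpha):
    # ETS delta metric; under the usual convention, negative values
    # flag items that disadvantage the focal group.
    return -2.35 * math.log(alpha)
```

When the odds of a correct response are equal for both groups at every score level, α̂ = 1 and MH D-DIF = 0.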

Collaboration


Top co-authors of Ronald K. Hambleton:

H. Jane Rogers (University of Massachusetts Amherst)
April L. Zenisky (University of Massachusetts Amherst)
Stephen G. Sireci (University of Massachusetts Amherst)
Dehui Xing (University of Massachusetts Amherst)
Linda L. Cook (University of Massachusetts Amherst)
Brian E. Clauser (National Board of Medical Examiners)
Daniel R. Eignor (University of Massachusetts Amherst)