Publication


Featured research published by Steven M. Downing.


Medical Education | 2003

Validity: on the meaningful interpretation of assessment data

Steven M. Downing

Context  All assessments in medical education require evidence of validity to be interpreted meaningfully. In contemporary usage, all validity is construct validity, which requires multiple sources of evidence; construct validity is the whole of validity, but has multiple facets. Five sources – content, response process, internal structure, relationship to other variables and consequences – are noted by the Standards for Educational and Psychological Testing as fruitful areas to seek validity evidence.


Applied Measurement in Education | 2002

A Review of Multiple-Choice Item-Writing Guidelines for Classroom Assessment

Thomas M. Haladyna; Steven M. Downing

A taxonomy of 31 multiple-choice item-writing guidelines was validated through a logical process that included two sources of evidence: the consensus achieved from reviewing what was found in 27 textbooks on educational testing and the results of 27 research studies and reviews published since 1990. This taxonomy is mainly intended for classroom assessment. Because textbooks have the potential to educate teachers and future teachers, textbook writers are encouraged to consider these findings in future editions of their textbooks. This taxonomy may also be useful for developing test items for large-scale assessments. Finally, research on multiple-choice item writing is discussed from both substantive and methodological viewpoints.


Medical Education | 2004

Reliability: on the reproducibility of assessment data

Steven M. Downing

Context  All assessment data, like other scientific experimental data, must be reproducible in order to be meaningfully interpreted.
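
Reproducibility in this sense is usually quantified with a reliability coefficient. The snippet below is a minimal, illustrative sketch of one common internal-consistency estimate, Cronbach's alpha; the score matrix and the function name cronbach_alpha are hypothetical and not drawn from the paper.

```python
# Illustrative only: a minimal internal-consistency (reliability) calculation.
# Rows of the hypothetical score matrix are examinees, columns are items.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an examinee-by-item score matrix."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinee total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 examinees answering 4 dichotomously scored items.
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(round(cronbach_alpha(scores), 2))  # prints 0.79 for this toy matrix
```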


Medical Education | 2004

Validity threats: overcoming interference with proposed interpretations of assessment data

Steven M. Downing; Thomas M. Haladyna

Context  Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity.


Educational and Psychological Measurement | 1993

How Many Options is Enough for a Multiple-Choice Test Item?

Thomas M. Haladyna; Steven M. Downing

Textbook writers often recommend four or five options per multiple-choice item, and most, if not all, testing programs in the United States also employ four or five options. Recent reviews of research on the desirable number of options for a multiple-choice test item reveal that three options may be suitable for most ability and achievement tests. A study of the frequency of acceptably performing distractors is reported. Results from three different testing programs support the conclusion that test items seldom contain more than three useful options. Consequently, testing program personnel and classroom teachers may be better served by using 2- or 3-option items instead of the typically recommended 4- or 5-option items.
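
To make the idea of an "acceptably performing" distractor concrete, the sketch below counts, for a single hypothetical item, how many incorrect options attract a non-trivial share of examinees. The 5% selection threshold, the function name functional_distractors, and the response data are assumptions for illustration, not the authors' reported criterion or results.

```python
# Illustrative sketch, not the authors' actual analysis: count "functional"
# distractors per item, assuming a distractor counts as functional if at
# least 5% of examinees select it.
from collections import Counter

def functional_distractors(responses, key, threshold=0.05):
    """responses: list of chosen options; key: the correct option."""
    counts = Counter(responses)
    n = len(responses)
    return [opt for opt, c in counts.items()
            if opt != key and c / n >= threshold]

# Hypothetical item: correct answer is "B"; options A-E, 100 examinees.
responses = list("B" * 60 + "A" * 20 + "C" * 15 + "D" * 4 + "E" * 1)
print(functional_distractors(responses, key="B"))  # ['A', 'C']: only two distractors work
```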


Medical Education | 2003

Item response theory: applications of modern test theory in medical education

Steven M. Downing

Context Item response theory (IRT) measurement models are discussed in the context of their potential usefulness in various medical education settings such as assessment of achievement and evaluation of clinical performance.
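
For readers unfamiliar with IRT, the sketch below shows the two-parameter logistic (2PL) item characteristic curve, one of the standard models the abstract refers to. The item parameters and the function name p_correct are hypothetical.

```python
# Minimal sketch of the 2PL item characteristic curve; parameter values are
# hypothetical and chosen only to illustrate the shape of the model.
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model:
    P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item with discrimination a = 1.2 and difficulty b = 0.5,
# evaluated for examinees of low, average, and high ability.
for theta in (-2.0, 0.0, 2.0):
    print(f"theta={theta:+.1f}  P(correct)={p_correct(theta, a=1.2, b=0.5):.2f}")
```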


Mount Sinai Journal of Medicine | 2009

Direct Observation in Medical Education: A Review of the Literature and Evidence for Validity

H. Barrett Fromme; Reena Karani; Steven M. Downing

In 2000, the Accreditation Council for Graduate Medical Education introduced a new initiative that substantively changed the method by which residency programs are evaluated. In this new competency-based approach to residency education, assessment of performance became a main area of interest, and direct observation was offered as a tool to assess knowledge and skills. Despite being an inherent part of medical education as faculty and learners work together in clinical experiences, direct observation has traditionally been an informal and underused assessment method across all specialties. Residents and students report rarely being observed during their educational process, even though they value the experience. Reasons for this include a lack of faculty time, a lack of faculty skills, a potentially stressful effect on the learner, and a perceived lack of validation of the assessment. This article examines the literature regarding the use of direct observation in medical education with a focus on validity evidence. We performed a PubMed search of articles pertaining to direct observation, using key words such as direct observation, performance observation, clinical observation, students, and residents. A subsequent search was conducted within known articles, focusing on variations of the term observation in article titles and introducing the concept of clinical competence. In conclusion, direct observation is a unique and useful tool in the assessment of medical students and residents. Assessing learners in natural settings offers the opportunity to see beyond what they know and into what they actually do, which is fundamentally essential to training qualified physicians. Although the literature identifies several threats to its validity as an assessment, it also demonstrates methods to minimize those threats. Based on current recommendations and the need for performance assessment in education, and with attention paid to development and design, direct observation can and should be included in medical education curricula.


Academic Medicine | 2008

Validity evidence for an OSCE to assess competency in systems-based practice and practice-based learning and improvement: a preliminary investigation.

Prathibha Varkey; Neena Natt; Timothy G. Lesnick; Steven M. Downing; Rachel Yudkowsky

Purpose To determine the psychometric properties and validity of an OSCE to assess the competencies of Practice-Based Learning and Improvement (PBLI) and Systems-Based Practice (SBP) in graduate medical education. Method An eight-station OSCE was piloted at the end of a three-week Quality Improvement elective for nine preventive medicine and endocrinology fellows at Mayo Clinic. The stations assessed performance in quality measurement, root cause analysis, evidence-based medicine, insurance systems, team collaboration, prescription errors, Nolan's model, and negotiation. Fellows' performance in each of the stations was assessed by three faculty experts using checklists and a five-point global competency scale. A modified Angoff procedure was used to set standards. Evidence for the OSCE's validity, feasibility, and acceptability was gathered. Results Evidence for content and response process validity was judged as excellent by institutional content experts. Interrater reliability of scores ranged from 0.85 to 1 for most stations. Interstation correlation coefficients ranged from −0.62 to 0.99, reflecting case specificity. Implementation cost was approximately $255 per fellow. All faculty members agreed that the OSCE was realistic and capable of providing accurate assessments. Conclusions The OSCE provides an opportunity to systematically sample the different subdomains of Quality Improvement. Furthermore, the OSCE provides an opportunity for the demonstration of skills rather than the testing of knowledge alone, thus making it a potentially powerful assessment tool for SBP and PBLI. The study OSCE was well suited to assess SBP and PBLI. The evidence gathered through this study lays the foundation for future validation work.
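
The modified Angoff procedure mentioned above can be illustrated with a minimal sketch of the underlying logic: each judge estimates the probability that a borderline examinee would earn each checklist point, and the cut score is the mean of the judges' expected borderline totals. The ratings below are hypothetical and are not the study's data or its exact procedure.

```python
# Minimal sketch of Angoff-style standard setting with hypothetical ratings.
import numpy as np

# 3 judges x 5 checklist items: each value is a judge's estimated probability
# that a borderline examinee would earn that checklist point.
ratings = np.array([
    [0.7, 0.6, 0.8, 0.5, 0.9],
    [0.6, 0.6, 0.7, 0.4, 0.8],
    [0.8, 0.5, 0.9, 0.6, 0.9],
])

# Each judge's expected borderline score is the sum of their item estimates;
# the cut score is the mean of those expected scores across judges.
judge_expected_scores = ratings.sum(axis=1)
cut_score = judge_expected_scores.mean()
print(f"Cut score: {cut_score:.2f} of {ratings.shape[1]} checklist points")
```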


Archives of Surgery | 2011

Resident self-other assessor agreement: influence of assessor, competency, and performance level

Pamela A. Lipsett; Ilene Harris; Steven M. Downing

OBJECTIVES To review the literature on self-assessment in the context of resident performance and to determine the correlation between self-assessment across competencies in high- and low-performing residents and assessments performed by raters from a variety of professional roles (peers, nurses, and faculty). DESIGN Retrospective analysis of prospectively collected anonymous self-assessments and multiprofessional (360°) performance assessments by competency and overall. SETTING University-based academic general surgical program. PARTICIPANTS Sixty-two residents rotating in general surgery. MAIN OUTCOME MEASURES Mean difference for each self-assessment dyad (self-peer, self-nurse, and self-attending physician) by resident performance quartile, adjusted for measurement error, correlation coefficients, and summed differences across all competencies. RESULTS Irrespective of the self-other dyad, residents asked to rate their global performance overestimated their skills. Residents in the upper quartile underestimated their specific skills, while those in the lowest-performing quartile overestimated their abilities when compared with faculty, peers, and especially nurse raters. Moreover, overestimation was greatest in competencies related to interpersonal skills, communication, teamwork, and professionalism. CONCLUSIONS The rater, the level of performance, and the competency being assessed all influence the comparison between residents' self-assessments and those of other raters. Self-assessment of competencies related to behavior may be inaccurate when compared with assessments by raters from various professions. Residents in the lowest-performing quartile are least able to identify their weaknesses. These data have important implications for residents, program directors, and the public, and suggest a need for strategies that help the lowest-performing residents recognize areas in need of improvement.


Medical Education | 2005

Threats to the validity of clinical teaching assessments: what about rater error?

Steven M. Downing

Collaboration


Dive into Steven M. Downing's collaboration.

Top Co-Authors

Rachel Yudkowsky
University of Illinois at Chicago

Georges Bordage
University of Illinois at Chicago

Ara Tekian
University of Illinois at Chicago

Alan Schwartz
University of Illinois at Chicago

Ilene Harris
University of Illinois at Chicago