Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Thomas J. Beckman is active.

Publication


Featured research published by Thomas J. Beckman.


Academic Medicine | 2014

Standards for Reporting Qualitative Research: A Synthesis of Recommendations

Bridget O'Brien; Ilene Harris; Thomas J. Beckman; Darcy A. Reed; David A. Cook

Purpose: Standards for reporting exist for many types of quantitative research, but currently none exist for the broad spectrum of qualitative research. The purpose of the present study was to formulate and define standards for reporting qualitative research while preserving the requisite flexibility to accommodate various paradigms, approaches, and methods.

Method: The authors identified guidelines, reporting standards, and critical appraisal criteria for qualitative research by searching PubMed, Web of Science, and Google through July 2013; reviewing the reference lists of retrieved sources; and contacting experts. Specifically, two authors reviewed a sample of sources to generate an initial set of items that were potentially important in reporting qualitative research. Through an iterative process of reviewing sources, modifying the set of items, and coding all sources for items, the authors prepared a near-final list of items and descriptions and sent this list to five external reviewers for feedback. The final items and descriptions included in the reporting standards reflect this feedback.

Results: The Standards for Reporting Qualitative Research (SRQR) consists of 21 items. The authors define and explain key elements of each item and provide examples from recently published articles to illustrate ways in which the standards can be met.

Conclusions: The SRQR aims to improve the transparency of all aspects of qualitative research by providing clear standards for reporting qualitative research. These standards will assist authors during manuscript preparation, editors and reviewers in evaluating a manuscript for potential publication, and readers when critically appraising, applying, and synthesizing study findings.


Medical Education | 2007

Quality of reporting of experimental studies in medical education: a systematic review

David A. Cook; Thomas J. Beckman; Georges Bordage

Objective: To determine the prevalence of essential elements of reporting in experimental studies in medical education.


Journal of General Internal Medicine | 2004

How Reliable Are Assessments of Clinical Teaching? A Review of the Published Instruments

Thomas J. Beckman; Amit K. Ghosh; David A. Cook; Patricia J. Erwin; Jayawant N. Mandrekar

Background: Learner feedback is the primary method for evaluating clinical faculty, despite few existing standards for measuring learner assessments.

Objective: To review the published literature on instruments for evaluating clinical teachers and to summarize themes that will aid in developing universally appealing tools.

Design: Searching 5 electronic databases revealed over 330 articles. Excluded were reviews, editorials, and qualitative studies. Twenty-one articles describing instruments designed for evaluating clinical faculty by learners were found. Three investigators studied these papers and tabulated characteristics of the learning environments and validation methods. Salient themes among the evaluation studies were determined.

Main Results: Many studies combined evaluations from both outpatient and inpatient settings and some authors combined evaluations from different learner levels. Wide ranges in numbers of teachers, evaluators, evaluations, and scale items were observed. The most frequently encountered statistical methods were factor analysis and determining internal consistency reliability with Cronbach’s α. Less common methods were the use of test-retest reliability, interrater reliability, and convergent validity between validated instruments. Fourteen domains of teaching were identified and the most frequently studied domains were interpersonal and clinical-teaching skills.

Conclusions: Characteristics of teacher evaluations vary between educational settings and between different learner levels, indicating that future studies should utilize more narrowly defined study populations. A variety of validation methods including temporal stability, interrater reliability, and convergent validity should be considered. Finally, existing data support the validation of instruments comprised solely of interpersonal and clinical-teaching domains.
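
The internal consistency statistic mentioned here, Cronbach's α, can be illustrated with a short sketch. The ratings matrix and helper function below are hypothetical and are not taken from any of the reviewed instruments; this is a minimal illustration of the formula only.

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha for an (observations x items) matrix of scores."""
    ratings = np.asarray(ratings, dtype=float)
    n_items = ratings.shape[1]
    item_var = ratings.var(axis=0, ddof=1)        # variance of each scale item
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (n_items / (n_items - 1)) * (1 - item_var.sum() / total_var)

# Hypothetical data: 6 learners rating one clinical teacher on 4 scale items (1-5).
ratings = [
    [4, 5, 4, 5],
    [3, 4, 3, 4],
    [5, 5, 5, 5],
    [2, 3, 2, 3],
    [4, 4, 4, 5],
    [3, 3, 4, 4],
]
print(f"Cronbach's alpha = {cronbach_alpha(ratings):.2f}")
```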


Journal of General Internal Medicine | 2005

What is the Validity Evidence for Assessments of Clinical Teaching?

Thomas J. Beckman; David A. Cook; Jayawant N. Mandrekar

Background: Although a variety of validity evidence should be utilized when evaluating assessment tools, a review of teaching assessments suggested that authors pursue a limited range of validity evidence.

Objectives: To develop a method for rating validity evidence and to quantify the evidence supporting scores from existing clinical teaching assessment instruments.

Design: A comprehensive search yielded 22 articles on clinical teaching assessments. Using standards outlined by the American Psychological and Education Research Associations, we developed a method for rating the 5 categories of validity evidence reported in each article. We then quantified the validity evidence by summing the ratings for each category. We also calculated weighted κ coefficients to determine interrater reliabilities for each category of validity evidence.

Main Results: Content and Internal Structure evidence received the highest ratings (27 and 32, respectively, of 44 possible). Relation to Other Variables, Consequences, and Response Process received the lowest ratings (9, 2, and 2, respectively). Interrater reliability was good for Content, Internal Structure, and Relation to Other Variables (κ range 0.52 to 0.96, all P values <.01), but poor for Consequences and Response Process.

Conclusions: Content and Internal Structure evidence is well represented among published assessments of clinical teaching. Evidence for Relation to Other Variables, Consequences, and Response Process receives little attention, and future research should emphasize these categories. The low interrater reliability for Response Process and Consequences likely reflects the scarcity of reported evidence. With further development, our method for rating the validity evidence should prove useful in various settings.
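
The interrater reliabilities in this study are weighted κ coefficients. The sketch below shows that calculation with scikit-learn on entirely hypothetical ratings; the quadratic weighting scheme is an assumption for illustration, since the abstract does not specify which weights were used.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal ratings (0-4) of how thoroughly 10 articles report one
# validity category, scored independently by two raters.
rater_a = [3, 2, 4, 1, 0, 2, 3, 4, 1, 2]
rater_b = [3, 3, 4, 1, 1, 2, 2, 4, 0, 2]

# Quadratic weights penalize larger disagreements more heavily, which suits
# ordinal rating scales; "linear" is the other built-in weighting option.
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"Weighted kappa = {kappa:.2f}")
```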


Journal of General Internal Medicine | 2008

Predictive validity evidence for medical education research study quality instrument scores: Quality of submissions to JGIM's medical education special issue

Darcy A. Reed; Thomas J. Beckman; Scott M. Wright; Rachel B. Levine; David E. Kern; David A. Cook

Background: Deficiencies in medical education research quality are widely acknowledged. Content, internal structure, and criterion validity evidence support the use of the Medical Education Research Study Quality Instrument (MERSQI) to measure education research quality, but predictive validity evidence has not been explored.

Objective: To describe the quality of manuscripts submitted to the 2008 Journal of General Internal Medicine (JGIM) medical education issue and determine whether MERSQI scores predict editorial decisions.

Design and Participants: Cross-sectional study of original, quantitative research studies submitted for publication.

Measurements: Study quality measured by MERSQI scores (possible range 5–18).

Results: Of 131 submitted manuscripts, 100 met inclusion criteria. The mean (SD) total MERSQI score was 9.6 (2.6), range 5–15.5. Most studies used single-group cross-sectional (54%) or pre-post designs (32%), were conducted at one institution (78%), and reported satisfaction or opinion outcomes (56%). Few (36%) reported validity evidence for evaluation instruments. A one-point increase in MERSQI score was associated with editorial decisions to send manuscripts for peer review versus reject without review (OR 1.31, 95% CI 1.07–1.61, p = 0.009) and to invite revisions after review versus reject after review (OR 1.29, 95% CI 1.05–1.58, p = 0.02). MERSQI scores predicted final acceptance versus rejection (OR 1.32; 95% CI 1.10–1.58, p = 0.003). The mean total MERSQI score of accepted manuscripts was significantly higher than that of rejected manuscripts (10.7 [2.5] versus 9.0 [2.4], p = 0.003).

Conclusions: MERSQI scores predicted editorial decisions and identified areas of methodological strengths and weaknesses in submitted manuscripts. Researchers, reviewers, and editors might use this instrument as a measure of methodological quality.
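
The odds ratios per one-point MERSQI increase come from logistic regression of an editorial decision on the total score. A minimal sketch with statsmodels follows; the data and coefficients are simulated purely to show the calculation and do not reproduce the study's results.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated stand-in data: MERSQI totals for 100 manuscripts and a binary
# editorial decision (1 = sent for peer review, 0 = rejected without review).
mersqi = rng.uniform(5, 15.5, size=100)
p_review = 1 / (1 + np.exp(-(-3.5 + 0.27 * mersqi)))  # assumed true relationship
decision = rng.binomial(1, p_review)

X = sm.add_constant(pd.DataFrame({"mersqi": mersqi}))
fit = sm.Logit(decision, X).fit(disp=False)

# Exponentiating the slope gives the odds ratio per one-point score increase,
# and exponentiating its confidence bounds gives the 95% CI.
or_per_point = np.exp(fit.params["mersqi"])
ci_low, ci_high = np.exp(fit.conf_int().loc["mersqi"])
print(f"OR per MERSQI point = {or_per_point:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```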


Journal of General Internal Medicine | 2009

Effect of Rater Training on Reliability and Accuracy of Mini-CEX Scores: A Randomized, Controlled Trial

David A. Cook; Denise M. Dupras; Thomas J. Beckman; Kris G. Thomas; V. Shane Pankratz

Background: Mini-CEX scores assess resident competence. Rater training might improve mini-CEX score interrater reliability, but evidence is lacking.

Objective: To evaluate a rater training workshop using interrater reliability and accuracy.

Design: Randomized trial (immediate versus delayed workshop) and single-group pre/post study (randomized groups combined).

Setting: Academic medical center.

Participants: Fifty-two internal medicine clinic preceptors (31 randomized and 21 additional workshop attendees).

Intervention: The workshop included rater error training, performance dimension training, behavioral observation training, and frame of reference training using lecture, video, and facilitated discussion. The delayed group received no intervention until after posttest.

Measurements: Mini-CEX ratings at baseline (just before workshop for workshop group) and four weeks later using videotaped resident–patient encounters; mini-CEX ratings of live resident–patient encounters one year preceding and one year following the workshop; rater confidence using mini-CEX.

Results: Among 31 randomized participants, interrater reliabilities in the delayed group (baseline intraclass correlation coefficient [ICC] 0.43, follow-up 0.53) and workshop group (baseline 0.40, follow-up 0.43) were not significantly different (p = 0.19). Mean ratings were similar at baseline (delayed 4.9 [95% confidence interval 4.6–5.2], workshop 4.8 [4.5–5.1]) and follow-up (delayed 5.4 [5.0–5.7], workshop 5.3 [5.0–5.6]; p = 0.88 for interaction). For the entire cohort, rater confidence (1 = not confident, 6 = very confident) improved from mean (SD) 3.8 (1.4) to 4.4 (1.0), p = 0.018. Interrater reliability for ratings of live encounters (entire cohort) was higher after the workshop (ICC 0.34) than before (ICC 0.18), but the standard error of measurement was similar for both periods.

Conclusions: Rater training did not improve interrater reliability or accuracy of mini-CEX scores.

Clinical trials registration: clinicaltrials.gov identifier NCT00667940
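
Interrater reliability in this trial is expressed as intraclass correlation coefficients. The sketch below computes one common variant, ICC(2,1) (two-way random effects, absolute agreement, single rater), on hypothetical ratings; the abstract does not state which ICC model the authors used, so this choice is an assumption for illustration.

```python
import numpy as np

def icc2_1(scores):
    """ICC(2,1) for an (encounters x raters) matrix of mini-CEX style ratings."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    row_means = scores.mean(axis=1)   # per-encounter means
    col_means = scores.mean(axis=0)   # per-rater means

    ss_rows = k * ((row_means - grand_mean) ** 2).sum()
    ss_cols = n * ((col_means - grand_mean) ** 2).sum()
    ss_total = ((scores - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Shrout & Fleiss two-way random effects, absolute agreement, single rater.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical ratings of 6 videotaped encounters by 4 preceptors (1-9 scale).
scores = [
    [5, 6, 5, 6],
    [4, 4, 5, 4],
    [7, 8, 7, 7],
    [3, 4, 3, 4],
    [6, 6, 7, 6],
    [5, 5, 4, 5],
]
print(f"ICC(2,1) = {icc2_1(scores):.2f}")
```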


Medical Teacher | 2007

Developing scholarly projects in education: A primer for medical teachers

Thomas J. Beckman; David A. Cook

Boyer and Glassick's broad definition of and standards for assessing scholarship apply to all aspects of education. Research on the quality of published medical education studies also reveals fundamentally important elements to address. In this article a three-step approach to developing medical education projects is proposed: refine the scholarly question, identify appropriate designs and methods, and select outcomes. Refining the scholarly question requires careful attention to literature review, conceptual framework, and statements of problem and study intent. The authors emphasize the statement of study intent, which is a study's focal point, and the conceptual framework, which situates a project within a theoretical context and provides a means for interpreting the results. They then review study designs and methods commonly used in education projects. They conclude with outcomes, which should be distinguished from assessment methods and instruments, and are separated into Kirkpatrick's hierarchy of reaction, learning, behavior and results.


Advances in Health Sciences Education | 2010

Reflections on experimental research in medical education

David A. Cook; Thomas J. Beckman

As medical education research advances, it is important that education researchers employ rigorous methods for conducting and reporting their investigations. In this article we discuss several important yet oft neglected issues in designing experimental research in education. First, randomization controls for only a subset of possible confounders. Second, the posttest-only design is inherently stronger than the pretest–posttest design, provided the study is randomized and the sample is sufficiently large. Third, demonstrating the superiority of an educational intervention in comparison to no intervention does little to advance the art and science of education. Fourth, comparisons involving multifactorial interventions are hopelessly confounded, have limited application to new settings, and do little to advance our understanding of education. Fifth, single-group pretest–posttest studies are susceptible to numerous validity threats. Finally, educational interventions (including the comparison group) must be described in detail sufficient to allow replication.


Medical Teacher | 2003

Evaluating an instrument for the peer review of inpatient teaching

Thomas J. Beckman; Mark C. Lee; Charles H. Rohren; V. Shane Pankratz

The purpose of this study was to assess an instrument for the peer review of inpatient teaching at Mayo. The Mayo Teaching Evaluation Form (MTEF) is an instrument, based on the Stanford seven-category educational framework, which was developed for the peer review of inpatient teaching. The MTEF has 28 Likert-scaled items derived from the Stanford Faculty Development Program form (SFDP-26), the Mayo electronic evaluation form, and three additional items. In this study three physician-evaluators used the MTEF to evaluate 10 attending physicians on the Mayo general internal medicine hospital services. Cronbach's alphas were used to assess the internal consistency of the MTEF, and Kendall's coefficient of concordance was used to summarize the inter-rater reliability. Results of this study reveal that the MTEF is internally consistent, based on average ratings across all evaluators (Cronbach's alpha = 0.894). Stanford categories with the highest alphas are Self-Directed Learning, Learning Climate, Communication of Goals, and Evaluation. Categories with lower alphas are Feedback, Understanding and Retention, and Control of Teaching Session. Additionally, the majority of items on the MTEF show significant agreement across all evaluators, and teacher enthusiasm was among the most reliable items. In conclusion, the MTEF is overall internally consistent for the peer review of inpatient teaching at Mayo. Hence, the MTEF may be a useful element in the peer evaluation of teaching at our institution.
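
Kendall's coefficient of concordance (W) summarizes agreement among the evaluators across items. A minimal sketch with hypothetical ratings follows; the simple formula below omits the correction for tied ranks, which the actual analysis may have applied.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(ratings):
    """Kendall's coefficient of concordance W for a (raters x items) matrix."""
    ratings = np.asarray(ratings, dtype=float)
    m, n = ratings.shape                               # m raters, n items
    ranks = np.apply_along_axis(rankdata, 1, ratings)  # rank items within each rater
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12 * s / (m ** 2 * (n ** 3 - n))            # no tie correction

# Hypothetical MTEF-style ratings: 3 evaluators x 5 items (1-5 scale).
ratings = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 4],
]
print(f"Kendall's W = {kendalls_w(ratings):.2f}")
```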


Journal of General Internal Medicine | 2007

Validation of a method for assessing resident physicians' quality improvement proposals.

James L. Leenstra; Thomas J. Beckman; Darcy A. Reed; William C. Mundell; Kris G. Thomas; Bryan J. Krajicek; Stephen S. Cha; Joseph C. Kolars; Furman S. McDonald

Background: Residency programs involve trainees in quality improvement (QI) projects to evaluate competency in systems-based practice and practice-based learning and improvement. Valid approaches to assess QI proposals are lacking.

Objective: We developed an instrument for assessing resident QI proposals, the Quality Improvement Proposal Assessment Tool (QIPAT-7), and determined its validity and reliability.

Design: QIPAT-7 content was initially obtained from a national panel of QI experts. Through an iterative process, the instrument was refined, pilot-tested, and revised.

Participants: Seven raters used the instrument to assess 45 resident QI proposals.

Measurements: Principal factor analysis was used to explore the dimensionality of instrument scores. Cronbach's alpha and intraclass correlations were calculated to determine internal consistency and interrater reliability, respectively.

Results: QIPAT-7 items comprised a single factor (eigenvalue = 3.4), suggesting a single assessment dimension. Interrater reliability for each item (range 0.79 to 0.93) and internal consistency reliability among the items (Cronbach's alpha = 0.87) were high.

Conclusions: This method for assessing resident physician QI proposals is supported by content and internal structure validity evidence. QIPAT-7 is a useful tool for assessing resident QI proposals. Future research should determine the reliability of QIPAT-7 scores in other residency and fellowship training programs. Correlations should also be made between assessment scores and criteria for QI proposal success such as implementation of QI proposals, resident scholarly productivity, and improved patient outcomes.
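
The single-factor finding can be illustrated with an eigenvalue screen of the inter-item correlation matrix. This principal-components-style check is a simplification of the principal factor analysis the authors performed, and the data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated QIPAT-7-style data: 45 proposals scored on 7 items, all driven by
# one underlying "proposal quality" factor plus noise.
quality = rng.normal(size=(45, 1))
loadings = rng.uniform(0.6, 0.9, size=(1, 7))
scores = quality @ loadings + 0.5 * rng.normal(size=(45, 7))

# Eigenvalues of the inter-item correlation matrix: a single dominant
# eigenvalue far above the rest suggests one assessment dimension.
corr = np.corrcoef(scores, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
print("Eigenvalues:", np.round(eigenvalues, 2))
```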

Collaboration


Dive into Thomas J. Beckman's collaborations.

Top Co-Authors

Furman S. McDonald

American Board of Internal Medicine

Anthony R. Artino

Uniformed Services University of the Health Sciences
