Megan E. Welsh
University of Connecticut
Publications
Featured research published by Megan E. Welsh.
Journal of School Psychology | 2014
Amy M. Briesch; Hariharan Swaminathan; Megan E. Welsh; Sandra M. Chafouleas
Generalizability Theory (GT) offers increased utility for assessment research given the ability to concurrently examine multiple sources of variance, inform both relative and absolute decision making, and determine both the consistency and generalizability of results. Despite these strengths, assessment researchers within the fields of education and psychology have been slow to adopt and utilize a GT approach. This underutilization may be due to an incomplete understanding of the conceptual underpinnings of GT, the actual steps involved in designing and implementing generalizability studies, or some combination of both issues. The goal of the current article is therefore two-fold: (a) to provide readers with the conceptual background and terminology related to the use of GT and (b) to facilitate understanding of the range of issues that need to be considered in the design, implementation, and interpretation of generalizability and dependability studies. Given the relevance of this analytic approach to applied assessment contexts, there exists a need to ensure that GT is both accessible to, and understood by, researchers in education and psychology. Important methodological and analytical considerations are presented and implications for applied use are described.
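To make these ideas concrete, the sketch below estimates variance components for a hypothetical persons-by-raters (p x r) crossed design and projects relative (G) and absolute (Phi) coefficients for varying numbers of raters. The data, design, and effect sizes are invented for illustration and are not drawn from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical crossed design: 30 persons rated by 4 raters, one score per cell.
n_p, n_r = 30, 4
universe = rng.normal(10, 2, size=(n_p, 1))            # person "universe" scores
scores = universe + rng.normal(0, 1, size=(n_p, n_r))  # rater/error noise added

# Mean squares from the two-way ANOVA without replication.
grand = scores.mean()
ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_res = ((scores - grand) ** 2).sum() - ss_p - ss_r
ms_p = ss_p / (n_p - 1)
ms_r = ss_r / (n_r - 1)
ms_res = ss_res / ((n_p - 1) * (n_r - 1))

# Variance components implied by the expected mean squares.
var_res = ms_res                         # person-by-rater interaction plus error
var_p = max((ms_p - ms_res) / n_r, 0.0)  # universe-score (person) variance
var_r = max((ms_r - ms_res) / n_p, 0.0)  # rater (facet) variance

# Decision study: project coefficients for k raters. Rater variance counts
# against absolute (Phi) decisions but not relative (G) decisions.
for k in (1, 2, 4, 8):
    g = var_p / (var_p + var_res / k)
    phi = var_p / (var_p + (var_r + var_res) / k)
    print(f"k={k} raters: G={g:.3f}, Phi={phi:.3f}")
```

The decision-study loop illustrates the relative/absolute distinction the article emphasizes: adding raters raises both coefficients, but rater variance penalizes only Phi.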
Journal of School Psychology | 2013
Sandra M. Chafouleas; Stephen P. Kilgus; Rose Jaffery; T. Chris Riley-Tillman; Megan E. Welsh; Theodore J. Christ
The purpose of this study was to investigate how Direct Behavior Rating Single Item Scales (DBR-SIS) involving targets of academically engaged, disruptive, and respectful behaviors function in school-based screening assessment. Participants included 831 students in kindergarten through eighth grades who attended schools in the northeastern United States. Teachers provided behavior ratings for a sample of students in their classrooms on the DBR-SIS, the Behavioral and Emotional Screening System (Kamphaus & Reynolds, 2007), and the Student Risk Screening Scale (Drummond, 1994). Given variations in rating procedures to accommodate scheduling differences across grades, analyses were conducted separately for elementary school and middle school grade levels. Results suggested that the recommended cut scores, the combination of behavior targets, and the resulting conditional probability indices varied depending on grade level grouping (lower elementary, upper elementary, middle). For example, for the lower elementary grouping, a combination of disruptive behavior (cut score = 2) and academically engaged behavior (cut score = 8) was considered to offer the best balance among indices of diagnostic accuracy, whereas cut scores of 1 for disruptive behavior and 8 for academically engaged behavior were recommended for the upper elementary grouping, and cut scores of 1 and 9, respectively, were suggested for the middle school grouping. Generally, the DBR-SIS cut scores considered optimal for screening used single or combined targets including academically engaged behavior and disruptive behavior, offering a reasonable balance of indices for sensitivity (.51-.90), specificity (.47-.83), negative predictive power (.94-.98), and positive predictive power (.14-.41). The single target of respectful behavior performed poorly across all grade level groups, and DBR-SIS targets performed better in the elementary school than in the middle school groups. Overall, results indicated that disruptive behavior is highly important in evaluating risk status in lower grade levels and that academically engaged behavior becomes more pertinent as students reach higher grade levels. Limitations, future directions, and implications are discussed.
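As a companion to the diagnostic accuracy indices reported above, the sketch below shows how sensitivity, specificity, and predictive power are computed once a cut score is applied to screening ratings. The simulated 0-10 ratings, base rate, and cut scores are hypothetical, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 0-10 disruptive-behavior ratings for 500 students, plus a
# binary at-risk criterion (e.g., from an established screener).
n = 500
at_risk = rng.random(n) < 0.15
ratings = np.clip(np.round(np.where(at_risk,
                                    rng.normal(4.0, 2.0, n),
                                    rng.normal(1.0, 1.5, n))), 0, 10)

def diagnostic_indices(scores, criterion, cut):
    """Flag students rated at or above `cut` and compare with the criterion."""
    flagged = scores >= cut
    tp = np.sum(flagged & criterion)    # true positives
    fp = np.sum(flagged & ~criterion)   # false positives
    fn = np.sum(~flagged & criterion)   # false negatives
    tn = np.sum(~flagged & ~criterion)  # true negatives
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "positive predictive power": tp / (tp + fp),
            "negative predictive power": tn / (tn + fn)}

# Raising the cut trades sensitivity for specificity, as in the abstract.
for cut in (1, 2, 3):
    print(cut, diagnostic_indices(ratings, at_risk, cut))
```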
Educational Assessment | 2007
Jerome V. D'Agostino; Megan E. Welsh; Nina M. Corson
The accuracy of achievement test score inferences largely depends on the sensitivity of scores to instruction focused on tested objectives. Sensitivity requirements are particularly challenging for standards-based assessments because a variety of plausible instructional differences across classrooms must be detected. For this study, we developed a new method for capturing the alignment between how teachers bring standards to life in their classrooms and how the standards are defined on a test. Teachers were asked to report the degree to which they emphasized the state's academic standards and to describe how they taught certain objectives from the standards. Two curriculum experts judged the alignment between how teachers brought the objectives to life in their classrooms and how the objectives were operationalized on the state test. Emphasis alone did not account for achievement differences among classrooms. The best predictors of classroom achievement were the match between how the standards were taught and tested, and the interaction between emphasis and match, indicating that test scores were sensitive to instruction of the standards, but in a narrow sense.
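A minimal sketch of the kind of analysis described here, assuming classroom-level data with teacher-reported emphasis and expert-judged match both scaled 0-1: it contrasts a model with emphasis alone against one adding match and the emphasis-by-match interaction. All variable names and values are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical classroom-level data: teacher-reported emphasis on the
# standards and expert-judged instruction-to-test match, both scaled 0-1.
n = 120
emphasis = rng.uniform(0, 1, n)
match = rng.uniform(0, 1, n)
achievement = 50 + 2 * match + 6 * emphasis * match + rng.normal(0, 3, n)
df = pd.DataFrame({"achievement": achievement,
                   "emphasis": emphasis,
                   "match": match})

# Emphasis alone versus match plus the emphasis-by-match interaction.
m1 = smf.ols("achievement ~ emphasis", data=df).fit()
m2 = smf.ols("achievement ~ emphasis * match", data=df).fit()
print(f"R^2, emphasis only:          {m1.rsquared:.3f}")
print(f"R^2, with match interaction: {m2.rsquared:.3f}")
print(m2.params)
```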
School Psychology Quarterly | 2015
Faith G. Miller; Daniel Cohen; Sandra M. Chafouleas; T. Chris Riley-Tillman; Megan E. Welsh; Gregory A. Fabiano
The purpose of this study was to examine the relations among teacher-implemented screening measures used to identify social, emotional, and behavioral risk. To this end, 5 screening options were evaluated: (a) Direct Behavior Rating - Single Item Scales (DBR-SIS), (b) Social Skills Improvement System - Performance Screening Guide (SSiS), (c) Behavioral and Emotional Screening System - Teacher Form (BESS), (d) office discipline referrals (ODRs), and (e) school nomination methods. The sample included 1,974 students (52% female, 93% non-Hispanic, 81% White) who were assessed three times per year by their teachers. Findings indicated that teacher ratings using standardized rating measures (DBR-SIS, BESS, and SSiS) identified a larger proportion of students as at risk than did ODRs or school nomination methods. Further, risk identification varied by screening option, such that a large percentage of students were inconsistently identified depending on the measure used. Results further indicated weak to strong correlations between screening options. The relation between broad behavioral indicators and mental health screening was also explored by examining classification accuracy indices. Teacher ratings using the DBR-SIS and SSiS correctly identified between 81% and 91% of the sample as at risk using the BESS as a criterion. As less conservative measures of risk, the DBR-SIS and SSiS identified more students as at risk relative to other options. Results highlight the importance of considering the aims of the assessment when selecting broad screening measures to identify students in need of additional support.
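The sketch below illustrates, with simulated flags, the kinds of comparisons reported here: the proportion identified by each screener, classification accuracy of one screener against another used as the criterion, and correlations between screening options. The screener names merely label hypothetical columns; the thresholds and error model are assumptions, not the study's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)

# Risk flags for 1,000 students from three screeners, generated from one
# latent risk dimension so the measures correlate but disagree at the margin.
n = 1000
latent = rng.normal(size=n)
flags = pd.DataFrame({
    "DBR_SIS": latent + rng.normal(0, 0.8, n) > 0.8,  # less conservative
    "SSiS":    latent + rng.normal(0, 0.8, n) > 0.9,
    "BESS":    latent + rng.normal(0, 0.8, n) > 1.2,  # used as criterion
})

# Proportion identified as at risk by each screening option.
print(flags.mean())

# Classification accuracy of DBR_SIS against the BESS criterion.
sens = (flags["DBR_SIS"] & flags["BESS"]).sum() / flags["BESS"].sum()
print(f"BESS-identified students also flagged by DBR_SIS: {sens:.2f}")

# Correlations between the binary screening decisions (phi coefficients).
print(flags.astype(float).corr())
```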
Journal of School Psychology | 2014
Stephen P. Kilgus; T. Chris Riley-Tillman; Sandra M. Chafouleas; Theodore J. Christ; Megan E. Welsh
The purpose of this study was to evaluate the utility of Direct Behavior Rating Single Item Scale (DBR-SIS) targets of disruptive, engaged, and respectful behavior within school-based universal screening. Participants included 31 first-, 25 fourth-, and 23 seventh-grade teachers and their 1,108 students, sampled from 13 schools across three geographic regions (Northeast, Southeast, and Midwest). Each teacher rated approximately 15 of their students across three measures, including the DBR-SIS, the Behavioral and Emotional Screening System (Kamphaus & Reynolds, 2007), and the Student Risk Screening Scale (Drummond, 1994). Moderate to high bivariate correlations and area under the curve statistics supported the concurrent validity and diagnostic accuracy of the DBR-SIS. Receiver operating characteristic curve analyses indicated that although the respectful behavior cut score recommended for screening remained constant across grade levels, cut scores varied for disruptive behavior and academically engaged behavior. Specific cut scores for first grade included 2 or less for disruptive behavior, 7 or greater for academically engaged behavior, and 9 or greater for respectful behavior. In fourth and seventh grades, cut scores changed to 1 or less for disruptive behavior and 8 or greater for academically engaged behavior, and remained the same for respectful behavior. Findings indicated that disruptive behavior was particularly appropriate for use in screening at first grade, whereas academically engaged behavior was most appropriate at both fourth and seventh grades. Each set of cut scores was associated with acceptable sensitivity (.79-.87), specificity (.71-.82), and negative predictive power (.94-.96), but low positive predictive power (.43-.44). DBR-SIS multiple gating procedures, through which students were considered at risk overall only if they exceeded cut scores on two or more DBR-SIS targets, were also determined acceptable in first and seventh grades, as the use of both disruptive behavior and academically engaged behavior in defining risk yielded acceptable conditional probability indices. Overall, the current findings are consistent with previous research, yielding further support for the DBR-SIS as a universal screener. Limitations, implications for practice, and directions for future research are discussed.
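For readers unfamiliar with receiver operating characteristic analysis, this sketch shows one common way a screening cut score can be chosen from ROC output (maximizing Youden's J). Whether the study used this particular selection rule is not stated in the abstract, and the simulated engagement ratings are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(4)

# Hypothetical 0-10 academic engagement ratings (lower = greater risk) and a
# binary at-risk criterion from an established screener.
n = 1000
at_risk = rng.random(n) < 0.2
engagement = np.clip(np.round(np.where(at_risk,
                                       rng.normal(5.0, 2.0, n),
                                       rng.normal(8.0, 1.5, n))), 0, 10)

# Reverse-score so that larger values indicate greater risk.
risk_score = 10 - engagement
print(f"AUC = {roc_auc_score(at_risk, risk_score):.2f}")

# One common cut-score rule: maximize Youden's J = sensitivity + specificity - 1.
fpr, tpr, thresholds = roc_curve(at_risk, risk_score)
best = thresholds[np.argmax(tpr - fpr)]
print(f"flag students with engagement ratings of {10 - best:.0f} or lower")
```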
School Psychology Quarterly | 2012
Stephen P. Kilgus; Sandra M. Chafouleas; T. Chris Riley-Tillman; Megan E. Welsh
This study presents an evaluation of the diagnostic accuracy and concurrent validity of Direct Behavior Rating Single Item Scales for use in school-based behavior screening of second-grade students. Results indicated that each behavior target was a moderately to highly accurate predictor of behavioral risk. Optimal universal screening cut scores were also identified for each scale, with results supporting reduced false positive rates through the simultaneous use of multiple scales.
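The false-positive reduction from combining scales can be illustrated with a small simulation: if two screens err somewhat independently, requiring elevations on both trims the false-positive rate, at some cost to sensitivity. The operating characteristics below are assumed, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(5)

# 800 hypothetical students, 15% truly at risk.
n = 800
at_risk = rng.random(n) < 0.15

def screen(sensitivity, false_positive_rate):
    """Simulate one binary screen with the given operating characteristics."""
    return np.where(at_risk,
                    rng.random(n) < sensitivity,
                    rng.random(n) < false_positive_rate)

disruptive = screen(0.85, 0.25)
engaged = screen(0.85, 0.25)

def fp_rate(flags):
    return np.sum(flags & ~at_risk) / np.sum(~at_risk)

# Requiring elevations on both scales trims false positives when the two
# screens' errors are partly independent.
both = disruptive & engaged
print(f"single-scale false positive rate: {fp_rate(disruptive):.2f}")
print(f"two-scale false positive rate:    {fp_rate(both):.2f}")
```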
Gifted Child Quarterly | 2013
D. Betsy McCoach; Karen E. Rambo; Megan E. Welsh
This Methodological Brief gives an overview of statistical methods used to gauge academic growth and discusses issues surrounding the measurement of growth in gifted populations. To illustrate some of these issues, we describe a growth model that examines differences in summer lag between gifted and nongifted students. We also provide recommendations for educators and researchers who are interested in documenting the academic growth of gifted students.
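As a sketch of one such growth model, the code below fits a random-intercept piecewise model to simulated fall/spring scores, with separate slopes for cumulative school months and summer months; the summer-by-gifted interaction is the differential summer lag contrast the brief describes. All data-generating values are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)

# Simulated fall/spring scores over two school years for 200 students.
# Cumulative months of schooling and of summer at each testing occasion:
occasions = [(0, 0), (9, 0), (9, 3), (18, 3)]
n = 200
gifted = rng.random(n) < 0.2
rows = []
for i in range(n):
    ability = rng.normal(60 if gifted[i] else 50, 5)
    for school, summer in occasions:
        # School months add points; summers cost non-gifted students more.
        summer_loss = 0.1 if gifted[i] else 0.5
        mu = ability + 1.0 * school - summer_loss * summer
        rows.append({"student": i, "gifted": int(gifted[i]),
                     "school": school, "summer": summer,
                     "score": mu + rng.normal(0, 2)})
df = pd.DataFrame(rows)

# Random-intercept growth model with separate school and summer slopes;
# the summer-by-gifted interaction captures differential summer lag.
result = smf.mixedlm("score ~ school + summer * gifted",
                     df, groups=df["student"]).fit()
print(result.summary())
```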
Journal of Advanced Academics | 2011
Megan E. Welsh
States and districts are under increasing pressure to evaluate the effectiveness of their teachers and to ensure that all students receive high-quality instruction. This article describes some of the challenges associated with current effectiveness approaches, including paper-and-pencil tests of pedagogical content knowledge, classroom observation systems, and value-added models. It proposes development of a new teacher evaluation system using a virtual reality environment and describes how innovations in educational measurement and technology can be used to develop an improved teacher effectiveness measure.
The Journal of Environmental Education | 2007
Jerome V. D'Agostino; Kerry Schwartz; Adriana D. Cimetta; Megan E. Welsh
Although young people in 50 U.S. states and 21 countries learn about water resources through Project WET (Water Education for Teachers), few researchers have conducted summative evaluations of the program. The authors employed a partitioned, or differential, treatments design in which two groups of 6th-grade students received overlapping but unique lesson components. Using hierarchical linear modeling, the authors found that classrooms from both groups made similar pre- to posttest gains on a test of the common material, but each group outperformed the other on a test of the material unique to its own lessons.
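A minimal sketch of a two-level model in the spirit of this design, with students nested in classrooms and posttest scores on one group's unique material regressed on pretest and group membership; the sample sizes and effect sizes are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)

# 40 hypothetical classrooms split between two partitioned-treatment groups,
# scored on the material unique to group A's lessons.
rows = []
for c in range(40):
    group_a = c % 2 == 0
    class_effect = rng.normal(0, 2)                # classroom-level variation
    for _ in range(20):                            # 20 students per classroom
        pre = rng.normal(50, 8) + class_effect
        gain = rng.normal(8 if group_a else 2, 3)  # group A was taught this
        rows.append({"classroom": c, "group_a": int(group_a),
                     "pre": pre, "post": pre + gain})
df = pd.DataFrame(rows)

# Two-level model: students nested in classrooms, posttest regressed on
# pretest and group membership, with a random classroom intercept.
result = smf.mixedlm("post ~ pre + group_a", df, groups=df["classroom"]).fit()
print(result.summary())
```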
Review of Educational Research | 2016
Susan M. Brookhart; Thomas R. Guskey; Alex J. Bowers; James H. McMillan; Jeffrey K. Smith; Lisa F. Smith; Michael T. Stevens; Megan E. Welsh
Grading refers to the symbols assigned to individual pieces of student work or to composite measures of student performance on report cards. This review of over 100 years of research on grading considers five types of studies: (a) early studies of the reliability of grades, (b) quantitative studies of the composition of K–12 report card grades, (c) survey and interview studies of teachers’ perceptions of grades, (d) studies of standards-based grading, and (e) grading in higher education. Early 20th-century studies generally condemned teachers’ grades as unreliable. More recent studies of the relationships of grades to tested achievement and survey studies of teachers’ grading practices and beliefs suggest that grades assess a multidimensional construct containing both cognitive and noncognitive factors reflecting what teachers value in student work. Implications for future research and for grading practices are discussed.