
Publication


Featured research published by José Felipe Martínez.


Educational Assessment | 2009

Classroom Assessment Practices, Teacher Judgments, and Student Achievement in Mathematics: Evidence from the ECLS

José Felipe Martínez; Brian M. Stecher; Hilda Borko

In this study we use data from the Early Childhood Longitudinal Study (ECLS) third- and fifth-grade samples to investigate teacher judgments of student achievement, the extent to which they offer a similar picture of student mathematics achievement compared to standardized test scores, and whether classroom assessment practices moderate the relationship between the two measures. Results indicate that teacher ratings correlate strongly with standardized test scores; however, this relationship varies considerably across teachers, and this variation is associated with certain classroom assessment practices. Furthermore, the evidence suggests that teachers evaluate student performance not in absolute terms but relative to other students in the school, and that they may adjust their grading for some students, perhaps based on perceived differences in need and/or ability.


Journal of Early Adolescence | 2015

Measuring Effective Teacher-Student Interactions From a Student Perspective: A Multi-Level Analysis

Jason T. Downer; Megan W. Stuhlman; Jonathan Schweig; José Felipe Martínez; Erik A. Ruzek

This study applies multi-level analysis to student reports of effective teacher-student interactions in 50 upper elementary school classrooms (N = 594 fourth- and fifth-grade students). Observational studies suggest that teacher-student interactions fall into three domains: Emotional Support, Classroom Organization, and Instructional Support. Results of multi-level confirmatory factor analyses indicated that a three-factor model fits between- and within-classroom variability in students’ reports reasonably well. Multi-level regressions provide some evidence of criterion validity, with student reports at the classroom level related to parallel observations. Both classroom- and student-level student report data were associated with students’ reading proficiency and disciplinary referrals. Findings are discussed in terms of implications for future research on student reports of classroom interactions and their practical utility in teacher evaluation and feedback systems.


School Effectiveness and School Improvement | 2012

Consequences of omitting the classroom in multilevel models of schooling: an illustration using opportunity to learn and reading achievement

José Felipe Martínez

Statisticians have shown the theoretical extent of parameter distortion when the classroom level is ignored in multilevel analyses of schooling. This article illustrates the practical consequences of omitting the classroom for inferences drawn about the extent and mechanisms of schooling effects, using the relationship between reading achievement, opportunity to learn, and student composition. Findings indicate that omitting the classroom level inflates estimates of school-level variance, while at the same time underestimating the overall extent of variance related to schooling effects. Classrooms also moderate the effects of educational opportunities. Finally, compositional effects typically conceptualized at the school level may be best defined at the classroom level. Ignoring classroom nesting in the analysis thus not only underestimates the overall impact of schools but presents a distorted picture of the mechanisms through which the schooling environment influences student achievement. Special attention is paid to how modeling choices may inform education policy efforts.
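The practical consequence described in this abstract can be reproduced with a short simulation. The variance components, sample sizes, and simple method-of-moments estimators below are illustrative assumptions, not the article's data or models:

```python
import numpy as np

# Simulated three-level data: students nested in classrooms nested in
# schools, with known (hypothetical) variance components.
rng = np.random.default_rng(0)
S, K, M = 2000, 3, 25                      # schools, classrooms/school, students/classroom
var_school, var_class, var_student = 1.0, 1.0, 4.0

school = rng.normal(0, var_school ** 0.5, S)
clazz = rng.normal(0, var_class ** 0.5, (S, K))
student = rng.normal(0, var_student ** 0.5, (S, K, M))
y = school[:, None, None] + clazz[:, :, None] + student

# Two-level analysis (classroom level ignored): method-of-moments
# estimate of the between-school variance component.
school_means = y.mean(axis=(1, 2))
pooled_within = y.reshape(S, K * M).var(axis=1, ddof=1).mean()
between_2lvl = school_means.var(ddof=1) - pooled_within / (K * M)

# Three-level analysis: estimate and remove the classroom component
# before attributing variance to schools.
class_means = y.mean(axis=2)                              # (S, K)
between_class = class_means.var(axis=1, ddof=1).mean()    # ~ var_class + var_student/M
between_3lvl = class_means.mean(axis=1).var(ddof=1) - between_class / K

print(between_2lvl, between_3lvl)
```

With these settings the two-level estimate drifts toward var_school + var_class/K (about 1.33) while the three-level estimate recovers the true school component (about 1.0), illustrating how the omitted classroom level inflates apparent school-level variance.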


Educational Evaluation and Policy Analysis | 2009

A Longitudinal Investigation of the Relationship between Teachers’ Self-Reports of Reform-Oriented Instruction and Mathematics and Science Achievement

Vi-Nhuan Le; J. R. Lockwood; Brian M. Stecher; Laura S. Hamilton; José Felipe Martínez

In the past two decades, several major initiatives were launched to improve mathematics and science education. One prominent feature in these efforts was a new approach to teaching mathematics and science, referred to as reform-oriented teaching. Although past studies suggest this approach may improve student achievement, the relationships between reform-oriented pedagogy and achievement were weak. The weak relationships may be partially attributable to the limited time frame in which reform-oriented teaching was examined (typically a 1-year period). This study explored the relationship between mathematics and science achievement and reform-oriented teaching over a 3-year period. Results suggested greater exposure to reform-oriented instruction was generally not significantly associated with higher student achievement but the effects became stronger with prolonged exposure to reform-oriented practices. Reform-oriented instruction showed stronger, positive relationships with open-ended measures than with multiple-choice tests in both mathematics and science and with problem-solving skills than with procedural skills in mathematics.


Educational Assessment | 2007

Relationships among Measures as Empirical Evidence of Validity: Incorporating Multiple Indicators of Achievement and School Context

Pete Goldschmidt; José Felipe Martínez; David Niemi; Eva L. Baker

In this article we examine empirical evidence on the criterion, predictive, transfer, and fairness aspects of validity of a large-scale language arts performance assessment, referred to as the Performance Assignment (PA). We use multilevel models to avoid biased inferences that might result from the naturally nested data. Specifically, we examine the relationships of the assessment with the Stanford Achievement Test, 9th Edition and the California High School Exit Examination. The results indicate that the measures are related, that students demonstrate a degree of transfer, and that the language arts PA is fairer than comparison assessments. The results are robust to various model specifications and demonstrate that benefits do not accrue to all students equally.


Educational Assessment | 2012

Conceptual, Methodological, and Policy Issues in the Study of Teaching: Implications for Improving Instructional Practice at Scale

Richard Correnti; José Felipe Martínez

Education research and policy are increasingly converging around the notion that improving education outcomes system-wide will entail a large-scale effort to improve instruction inside the classroom. Instruction is one of the critical factors mediating the relationship between education policy and student outcomes, and thus an accurate sense of what goes on inside classrooms is needed to understand not only “what works” in education, but also “how” (Raudenbush & Sadoff, 2008). With a renewed focus on the critical role of instruction in classrooms as a determinant of student achievement, educational research efforts seeking to isolate the components of effective teachers (e.g., the Gates Foundation’s Measures of Effective Teaching Study) and effective teaching (e.g., Hiebert & Morris, 2012, described this important distinction) are again becoming prominent. In parallel, many districts and states are developing systems of teacher evaluation (Baker et al., 2010) that include among the key indicators measures of instruction (or classroom practice more generally) intended to help open the black box of the classroom. Although instruction is at the center of most conceptual models of educational quality and effectiveness, research and policy efforts often have focused on educational outcomes, paying less attention to examining educational processes, both as mediators and as important outcomes themselves. It is important to note that this is rather unlikely to result from competing ideas about whether gaining knowledge about teaching is important. Rather, it reflects the challenges facing researchers interested in rigorously studying instructional practice and in organizing …


Educational Assessment | 2012

Measuring Classroom Assessment Practice Using Instructional Artifacts: A Validation Study of the QAS Notebook

José Felipe Martínez; Hilda Borko; Brian M. Stecher; Rebecca Luskin; Matt Kloser

We report the results of a pilot validation study of the Quality Assessment in Science Notebook, a portfolio-like instrument for measuring teacher assessment practices in middle school science classrooms. A statewide sample of 42 teachers collected 2 notebooks during the school year, corresponding to science topics taught in the fall and spring. Each notebook was scored on 9 dimensions of assessment practice by 3 trained raters. Our analysis investigated the reliability and validity of notebook ratings, with particular emphasis on identifying key sources of error in the ratings. The results suggest that variation in teacher practice across notebooks (i.e., over time) was more important than idiosyncratic rater inconsistencies as a source of error in the scores. The validity results point to a dominant factor underlying the ratings and some predictive power of notebook ratings on student achievement. We discuss implications of the results for measuring assessment practice through artifacts, drawing conceptual and methodological lessons about our model of assessment practice, the consistency of raters, and the estimation of variance over time with classroom-based measures of instruction.


Journal of Psychoeducational Assessment | 2010

Rating Performance Assessments of Students With Disabilities: A Study of Reliability and Bias

Ann M. Mastergeorge; José Felipe Martínez

Inclusion of students with disabilities in district-wide and state assessments is mandated by federal regulations, and teachers sometimes play an important role in rating these students’ work. In this study, trained teachers rated student proficiency in performance assessments in language arts and mathematics in third, fifth, and ninth grades. The scores assigned by teacher raters to students with and without disabilities in an initial blind rating were compared with the ratings assigned on a second occasion, when raters were aware of each student’s disability status. A series of generalizability studies was used to determine if there are differences in the patterns of variability across groups and whether rater bias may play a role in these differences. Although knowledge of a student’s disability status did not increase or decrease the scores assigned by raters on average, the findings point to differences in the sources of variability across groups and specifically to greater inconsistency when rating papers from students with disabilities. The findings suggest that individual teachers may behave differently when scoring students with disabilities. A survey was also used to investigate raters’ perceptions of their own and other teachers’ bias when grading papers of students with disabilities. Implications for decision making in rating assessments are discussed.


Educational Evaluation and Policy Analysis | 2016

Approaches for Combining Multiple Measures of Teacher Performance: Reliability, Validity, and Implications for Evaluation Policy

José Felipe Martínez; Jonathan Schweig; Pete Goldschmidt

A key question facing teacher evaluation systems is how to combine multiple measures of complex constructs into composite indicators of performance. We use data from the Measures of Effective Teaching (MET) study to investigate the measurement properties of composite indicators obtained under various conjunctive, disjunctive (or complementary), and weighted (or compensatory) models. We find that accuracy varies across models and cut-scores and that models with similar accuracy may yield different teacher classifications. Accuracy and consistency are greatest if composites are constructed to maximize reliability and lowest if they seek to optimally predict student test scores. We discuss the implications of the results for the validity of inferences about the performance of individual teachers, and more generally for the design of teacher evaluation systems.
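The three families of combination rules compared in this study can be sketched in a few lines. The measure names, weights, and cut-scores below are hypothetical placeholders, not values from the MET data:

```python
# Sketch of three ways to combine multiple measures of teacher
# performance into a composite or a pass/fail classification.
# Measure names, weights, and cut-scores are hypothetical.

def weighted_composite(scores, weights):
    """Weighted (compensatory) model: a weighted average, so strength
    on one measure can offset weakness on another."""
    total = sum(weights.values())
    return sum(scores[m] * w for m, w in weights.items()) / total

def conjunctive_pass(scores, cuts):
    """Conjunctive model: the teacher must clear the cut-score on
    every measure."""
    return all(scores[m] >= c for m, c in cuts.items())

def disjunctive_pass(scores, cuts):
    """Disjunctive (complementary) model: clearing the cut-score on
    any one measure suffices."""
    return any(scores[m] >= c for m, c in cuts.items())

scores = {"observation": 0.62, "student_survey": 0.48, "value_added": 0.71}
cuts = {"observation": 0.5, "student_survey": 0.5, "value_added": 0.5}
weights = {"observation": 0.5, "student_survey": 0.25, "value_added": 0.25}

print(weighted_composite(scores, weights))  # 0.6075
print(conjunctive_pass(scores, cuts))       # False (fails student_survey)
print(disjunctive_pass(scores, cuts))       # True
```

The example teacher illustrates why models with similar overall accuracy can classify individual teachers differently: the compensatory average lets the strong value-added score offset the weak survey score, while the conjunctive rule fails anyone below a single cut.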


Educational Assessment | 2007

Language Arts Performance Assignments: Generalizability Studies of Local and Central Ratings

José Felipe Martínez; Pete Goldschmidt; David Niemi; Eva L. Baker; Roxanne M. Sylvester

We conducted generalizability studies to examine the extent to which ratings of language arts performance assignments, administered in a large, diverse, urban district to students in second through ninth grades, result in reliable and precise estimates of true student performance. The results highlight three important points when considering the use of performance assessments in large-scale settings: (a) Rater training may significantly impact reliability; (b) simple rater agreement indices do not provide enough information to assess the reliability of inferences about true student achievement; and (c) assessments adequate for relative judgments of student performance do not necessarily provide sufficient precision for absolute criterion-referenced decisions.
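The generalizability (G-study) machinery referred to here can be sketched for the simplest case, a fully crossed persons-by-raters design. The ratings below are made up for illustration, and the expected-mean-square formulas are the standard one-facet G-theory estimators, not anything specific to this article:

```python
import numpy as np

# Hypothetical ratings: rows = students (persons), cols = raters,
# in a fully crossed persons-x-raters design.
X = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 2, 3],
    [1, 2, 1],
], dtype=float)
n_p, n_r = X.shape

grand = X.mean()
person_means = X.mean(axis=1)
rater_means = X.mean(axis=0)

# Mean squares from the two-way layout without replication.
ms_p = n_r * np.sum((person_means - grand) ** 2) / (n_p - 1)
ms_r = n_p * np.sum((rater_means - grand) ** 2) / (n_r - 1)
resid = X - person_means[:, None] - rater_means[None, :] + grand
ms_pr = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

# Variance components via expected mean squares.
var_pr = ms_pr                           # person-x-rater interaction + error
var_p = max((ms_p - ms_pr) / n_r, 0.0)   # true person (student) variance
var_r = max((ms_r - ms_pr) / n_p, 0.0)   # rater leniency/severity variance

# Relative G coefficient (norm-referenced decisions) and absolute
# phi coefficient (criterion-referenced decisions) for n_r raters.
g_rel = var_p / (var_p + var_pr / n_r)
phi = var_p / (var_p + (var_r + var_pr) / n_r)
print(g_rel, phi)
```

The gap between the two coefficients mirrors point (c) above: rater leniency (var_r) counts against absolute decisions only, so scores reliable enough for rank-ordering students need not be precise enough for criterion-referenced ones. Note too that a high percent-agreement statistic would say nothing about either coefficient, which is point (b).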

Collaboration


Top co-authors of José Felipe Martínez:

Pete Goldschmidt, California State University
David Niemi, University of California
Eva L. Baker, University of California
Rebecca Luskin, University of California
Matthew Kloser, University of Notre Dame