Kadriye Ercikan
University of British Columbia
Publications
Featured research published by Kadriye Ercikan.
Educational Researcher | 2006
Kadriye Ercikan; Wolff-Michael Roth
In education research, a polar distinction is frequently made to describe and produce different kinds of research: quantitative versus qualitative. In this article, the authors argue against that polarization and the associated polarization of the “subjective” and the “objective,” and they question the attribution of generalizability to only one of the poles. The purpose of the article is twofold: (a) to demonstrate that this polarization is not meaningful or productive for education research, and (b) to propose an integrated approach to education research inquiry. The authors sketch how such integration might occur by adopting a continuum instead of a dichotomy of generalizability. They then consider how that continuum might be related to the types of research questions asked, and they argue that the questions asked should determine the modes of inquiry that are used to answer them.
International Journal of Educational Research | 1998
Kadriye Ercikan
Ideally, a single common form of a test would be used for international assessments. However, because the test is administered in different countries, it must be translated into the languages of those countries. This article explores the application of a statistical method for examining the effect of translation on the equivalence of test items and the comparability of test scores. The method is used to identify poorly translated items in an international assessment administered in two languages and to examine how problems in translation affect the comparability of scores.
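The abstract does not name the specific statistical method used. As an illustrative aside, one classic item-level screen for translation DIF is the Mantel-Haenszel procedure; the sketch below is a minimal, hedged implementation of that general idea (the function and array names are assumptions, not the study's code).

```python
import numpy as np

def mantel_haenszel_delta(item, total, group):
    """Mantel-Haenszel DIF statistic for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : total test scores, used as the matching variable
    group : 0 = reference language group, 1 = focal (translated) group
    """
    num = den = 0.0
    for k in np.unique(total):            # stratify examinees by total score
        m = total == k
        n = m.sum()
        a = np.sum((group[m] == 0) & (item[m] == 1))   # reference, correct
        b = np.sum((group[m] == 0) & (item[m] == 0))   # reference, incorrect
        c = np.sum((group[m] == 1) & (item[m] == 1))   # focal, correct
        d = np.sum((group[m] == 1) & (item[m] == 0))   # focal, incorrect
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        return np.nan                     # odds ratio undefined in this sample
    return -2.35 * np.log(num / den)      # ETS delta scale; |delta| >= 1.5 is
                                          # conventionally treated as large DIF
```

Items with a large absolute delta would be candidates for the kind of translation review the study describes.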
International Journal of Testing | 2002
Kadriye Ercikan
This article describes and discusses strategies used in disentangling sources of differential item functioning (DIF) in multilanguage assessments where multiple factors are expected to be causing DIF. Three strategies are used for identifying adaptation and curricular differences as sources of DIF: (a) judgmental reviews of all items by multiple bilingual translators, (b) cross-validation of DIF in multiple groups, and (c) examination of the distribution of DIF items by topic. Based on the judgmental reviews, 27% of the mathematics DIF items and 37% of the science DIF items were interpreted as being due to adaptation-related differences. Most of these interpretations were also supported by the cross-validation analyses. Clustering of DIF items by topic supported curricular differences as an interpretation for only small portions of the DIF items: approximately 23% of the mathematics DIF items and 13% of the science DIF items.
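As an aside on strategy (b): cross-validating DIF amounts to retaining only items whose DIF flags replicate across independent samples or groups. A minimal sketch, with hypothetical flag arrays rather than the study's data:

```python
import numpy as np

def cross_validated_dif(flags_by_sample):
    """Keep only items flagged as DIF in every independent sample.

    flags_by_sample : list of boolean arrays, one per sample,
                      each of length n_items (True = flagged as DIF)
    """
    return np.all(np.vstack(flags_by_sample), axis=0)

# Hypothetical flags from two independent examinee samples
sample_a = np.array([True, False, True, True])
sample_b = np.array([True, False, False, True])
print(cross_validated_dif([sample_a, sample_b]))  # [ True False False  True]
```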
Applied Measurement in Education | 2004
Kadriye Ercikan; Mark J. Gierl; Tanya McCreith; Gautam Puhan; Kim Koh
This research examined the degree of comparability, and sources of incomparability, of the English and French versions of reading, mathematics, and science tests administered as part of a survey of achievement in Canada. The results point to substantial psychometric differences between the 2 language versions. Approximately 18% to 36% of the items were identified as functioning differentially for the 2 language groups. Large proportions of these differential item functioning (DIF) items, 36% to 100% across age groups and content areas, were attributed to adaptation-related differences. A smaller proportion, 27% to 33% of the DIF items, was attributed to curricular differences. The remaining 24% to 49% of the DIF items could not be attributed to either of the 2 sources considered in the study.
Child Care Health and Development | 2009
Y. J. Martinez; Kadriye Ercikan
BACKGROUND: Survival rates of children with chronic illness are at an all-time high. Up to 98% of children suffering from a chronic illness that might have been considered fatal in the past now reach early adulthood, and it is estimated that as many as 30% of school-aged children are affected by a chronic illness. For this population of children, the prevalence of educational and psychological problems is nearly double that of the general population. METHODS: This study investigated the educational and psychological effects of childhood chronic illness among 1512 Canadian children (ages 10-15 years). It was a retrospective analysis of data from the National Longitudinal Survey of Children and Youth, taking a cross-sectional look at the relationships between childhood chronic illness, performance on a Mathematics Computation Exercise (MCE) and ratings on an Anxiety and Emotional Disorder (AED) scale. RESULTS: When AED ratings and educational handicaps were controlled for, children identified with chronic illnesses still performed more weakly on the MCE. Chronic illness did not appear to be related to children's AED ratings. The regression analysis indicated that community type and illness were the strongest predictors of MCE scores. CONCLUSIONS: The core research implications of this study concern measurement issues that need to be addressed in future large-scale studies. The clinical implications concern the need for co-ordinated services between the home, hospital and school settings so that services and programmes focus on the ecology of the child who is ill.
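As a hedged illustration of what "controlling for" AED ratings means in the regression analysis, here is a sketch with simulated data and hypothetical variable names (these are not the NLSCY variables, and the model is not the study's exact specification):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data for the variables named in the abstract
rng = np.random.default_rng(2)
n = 300
df = pd.DataFrame({
    "mce": rng.normal(50, 10, n),              # Mathematics Computation Exercise score
    "chronic_illness": rng.integers(0, 2, n),  # 1 = child has a chronic illness
    "aed": rng.normal(0, 1, n),                # Anxiety and Emotional Disorder rating
    "rural": rng.integers(0, 2, n),            # community type
})

# "Controlling for" AED means including it as a covariate, so the illness
# coefficient reflects the MCE difference at comparable AED levels.
model = smf.ols("mce ~ chronic_illness + aed + rural", data=df).fit()
print(model.params)
```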
Archive | 2003
Robert J. Mislevy; Mark Wilson; Kadriye Ercikan; Naomi Chudowsky
“Validity, reliability, comparability, and fairness are not just measurement issues, but social values that have meaning and force outside of measurement wherever evaluative judgments and decisions are made” (Messick, 1994, p. 2).
International Journal of Testing | 2005
Kadriye Ercikan; Kim Koh
The objective of this research was to examine the comparability of the constructs assessed by the English and French versions of the Third International Mathematics and Science Study (TIMSS). The differences in constructs observed in our analyses indicate serious limitations of using TIMSS results for comparisons based on overall performance in mathematics and science; in particular, large differences between constructs were observed between the U.S. and French scales. These limitations affect the rank ordering of the performance of countries such as the United States and France, as well as research using TIMSS data to compare factors associated with performance. The results point to differences in the constructs assessed by TIMSS in different countries and to the importance of empirical evidence of construct comparability before TIMSS results can be meaningfully used for research.
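The abstract does not detail how construct comparability was evaluated. One common check in this literature is Tucker's congruence coefficient between factor loadings estimated separately in each language group; the sketch below, with hypothetical loadings, is offered only as an illustration of that general idea, not as the study's procedure:

```python
import numpy as np

def tucker_congruence(a, b):
    """Tucker's congruence coefficient between two factor-loading vectors;
    values near 1.0 suggest the groups' factors are essentially the same."""
    return (a @ b) / np.sqrt((a @ a) * (b @ b))

# Hypothetical loadings for the same five items in two language groups
english = np.array([0.71, 0.65, 0.58, 0.62, 0.55])
french = np.array([0.69, 0.40, 0.60, 0.35, 0.57])
print(round(tucker_congruence(english, french), 3))
```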
Applied Measurement in Education | 2006
Sharon Mendes-Barnett; Kadriye Ercikan
This study contributes to understanding sources of gender differential item functioning (DIF) on mathematics tests. It focused on identifying sources of DIF and differential bundle functioning for boys and girls on the British Columbia Principles of Mathematics Exam (Grade 12), using a confirmatory SIBTEST approach based on a multidimensional model. Problem solving as a content area was confirmed as a source of gender DIF in favor of boys when items are presented as story problems or when the problems are non-context-specific. The patterns and relations content area produced a mixture of confirmed sources of DIF, with some subtopics favoring girls and some favoring boys. In contrast to what might be expected from previous gender DIF research, this study did not find geometry to be a source of gender DIF. All of the higher-cognitive-level items favored boys. High levels of DIF in favor of girls were detected on the bundle of computation items in which no equations were provided in the question.
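SIBTEST itself includes a regression correction of the matching scores; as a hedged aside, the sketch below shows only the simpler, uncorrected standardization idea behind bundle DIF, with hypothetical inputs, and is not the confirmatory procedure used in the study:

```python
import numpy as np

def standardized_bundle_dif(bundle_score, matching, group):
    """Weighted mean difference in bundle scores between matched groups.

    bundle_score : each examinee's total on the item bundle
    matching     : matching score (e.g., score on the rest of the test)
    group        : 0 = reference (e.g., boys), 1 = focal (e.g., girls)
    """
    diffs, weights = [], []
    for k in np.unique(matching):                       # match on test score
        m = matching == k
        ref = bundle_score[m & (group == 0)]
        foc = bundle_score[m & (group == 1)]
        if len(ref) and len(foc):
            diffs.append(foc.mean() - ref.mean())
            weights.append(len(foc))                    # weight by focal counts
    return np.average(diffs, weights=weights)           # > 0 favors focal group
```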
Applied Measurement in Education | 2002
Kadriye Ercikan; Marc W. Julian
In this study we examined how the accuracy of classifying student performance into proficiency-level scores varies with the number of proficiency levels and with measurement accuracy. We also examined how classification accuracy varies across ability levels given different numbers of proficiency levels based on the same test and the same set of cut scores. The simulation results indicate that classification accuracy decreased, on average, by 10% for an increase of 1 proficiency level, 20% for an increase of 2 proficiency levels, and 20% to 30% for an increase of 3 proficiency levels. In addition, classification accuracy varied by 10% to 20% for tests with reliabilities ranging between 0.70 and 0.93. The variability of classification accuracy across score ranges points to serious limitations on the interpretability of single indexes intended to represent classification accuracy. Suggestions are made for estimating classification accuracy for critical score ranges.
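As an illustration of the simulation logic described here (assumed parameters and a classical test theory error model, not the study's exact design), the sketch below shows why adding proficiency levels and lowering reliability both reduce classification accuracy:

```python
import numpy as np

def classification_accuracy(reliability, cuts, n=100_000, seed=0):
    """Proportion of simulated examinees whose observed proficiency level
    matches their true level, under a classical test theory model."""
    rng = np.random.default_rng(seed)
    true = rng.standard_normal(n)                       # true scores, variance 1
    err_sd = np.sqrt((1 - reliability) / reliability)   # var(E) = var(T)(1-rho)/rho
    observed = true + rng.normal(0, err_sd, n)
    return np.mean(np.digitize(true, cuts) == np.digitize(observed, cuts))

# More levels (more cut scores) reduce accuracy at a fixed reliability of 0.85
for cuts in ([0.0], [-0.5, 0.5], [-1.0, 0.0, 1.0]):
    print(len(cuts) + 1, "levels:", round(classification_accuracy(0.85, cuts), 3))
```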
International Journal of Testing | 2013
Maria Elena Oliveri; Kadriye Ercikan; Bruno D. Zumbo
In this study, we investigated differential item functioning (DIF) and its sources using a latent class (LC) modeling approach. Potential sources of LC DIF related to instruction and teacher-related variables were investigated substantively and with three statistical approaches: descriptive discriminant function analysis, multinomial logistic regression, and multilevel multinomial logistic regression. Results revealed that differential response patterns, as indicated by the identification of LCs, were most strongly associated with student achievement levels and teacher-related variables rather than with manifest characteristics such as gender, test language, and country, which are the focus of typical measurement comparability research. Findings from this study have important implications for measurement comparability and validity research. Evidence of within-group heterogeneity in the test data structure suggests that the identification of DIF and its sources may not apply to all examinees in a group and that measurement incomparability may be greater among groups that are not defined by manifest variables such as gender and ethnicity. The results suggest that measurement comparability research should examine alternative variables more closely related to the investigated construct.
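As a hedged sketch of the multinomial logistic regression step, the code below regresses latent class memberships on student- and teacher-related covariates; the data, variable names, and class labels are all simulated stand-ins, not the study's data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([
    rng.standard_normal(n),        # student achievement (standardized)
    rng.integers(0, 2, n),         # a manifest variable such as gender
    rng.standard_normal(n),        # a teacher-related variable
])
latent_class = rng.integers(0, 3, n)   # stand-in for estimated LC membership

# Multinomial logistic regression of class membership on the covariates;
# covariates with large coefficients best separate the latent classes.
model = LogisticRegression(max_iter=1000).fit(X, latent_class)
print(model.coef_)
```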