Publication


Featured research published by Margaret Wu.


Journal of Educational and Behavioral Statistics | 1997

Multilevel Item Response Models: An Approach to Errors in Variables Regression

Raymond J. Adams; Mark Wilson; Margaret Wu

In this article we show how certain analytic problems that arise when one attempts to use latent variables as outcomes in regression analyses can be addressed by taking a multilevel perspective on item response modeling. Under a multilevel, or hierarchical, perspective we cast the item response model as a within-student model and the student population distribution as a between-student model. Taking this perspective leads naturally to an extension of the student population model to include a range of student-level variables, and it invites the possibility of further extending the models to additional levels so that multilevel models can be applied with latent outcome variables. In the two-level case, the model that we employ is formally equivalent to the plausible value procedures that are used as part of the National Assessment of Educational Progress (NAEP), but we present the method for a different class of measurement models, and we use a simultaneous estimation method rather than two-step estimation. In our application of the models to the appropriate treatment of measurement error in the dependent variable of a between-student regression, we also illustrate the adequacy of some approximate procedures that are used in NAEP.
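The two-level structure described in the abstract can be sketched (in notation of my own choosing, not taken from the paper) as a within-student item response model paired with a between-student regression on student-level covariates:

```latex
% Within-student (measurement) model: probability that student n
% answers item i correctly, given ability \theta_n and difficulty \delta_i
P(x_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

% Between-student (population) model: latent ability regressed on
% student-level covariates \mathbf{w}_n
\theta_n = \boldsymbol{\beta}^{\top} \mathbf{w}_n + \varepsilon_n,
  \qquad \varepsilon_n \sim N(0, \sigma^2)
```

Estimating both levels simultaneously, rather than first estimating abilities and then regressing them on covariates, is what lets the approach treat measurement error in the latent outcome correctly.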


Educational and Psychological Measurement | 2012

The Rasch Rating Model and the Disordered Threshold Controversy

Raymond J. Adams; Margaret Wu; Mark Wilson

The Rasch rating (or partial credit) model is a widely applied item response model that is used to model ordinal observed variables that are assumed to collectively reflect a common latent variable. In the application of the model there is considerable controversy surrounding the assessment of fit. This controversy is most notable when the set of parameters that are associated with the categories of an item have estimates that are not ordered in value in the same order as the categories. Some consider this disordering to be inconsistent with the intended order of the response categories in a variable and often term it reversed deltas. This article examines a variety of derivations of the model to illuminate the controversy. The examination of the derivations shows that the so-called parameter disorder and order of the response categories are separate phenomena. When the data fit the Rasch rating model the response categories are ordered regardless of the (order of the) values of the parameter estimates. In summary, reversed deltas are not necessarily evidence of a problem. In fact the reversed deltas phenomenon is indicative of specific patterns in the relative numbers of respondents in each category. When there are preferences about such relative numbers in categories, the patterns of deltas may be a useful diagnostic.
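The abstract's central claim, that category ordering holds under the model regardless of whether the delta estimates are ordered, can be checked numerically. The sketch below (a minimal illustration, not code from the paper) computes partial credit model category probabilities for deliberately disordered deltas:

```python
import numpy as np

def pcm_category_probs(theta, deltas):
    """Category probabilities under the partial credit model.

    theta  : person ability (scalar)
    deltas : step parameters delta_1..delta_m (may be disordered)
    Returns an array of probabilities for categories 0..m.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum (= 0).
    numerators = np.exp(np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas)))))
    return numerators / numerators.sum()

# Reversed deltas (delta_2 < delta_1): the probabilities still form a proper
# distribution, and higher abilities still favour higher categories.
probs = pcm_category_probs(theta=0.5, deltas=[1.0, -1.0])
```

Even with reversed deltas, the expected score is monotonically increasing in ability, which is the sense in which the response categories remain ordered.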


Discourse: Studies in The Cultural Politics of Education | 2015

Leaning too far? PISA, policy and Australia's ‘top five’ ambitions

Radhika Gorur; Margaret Wu

Australia has declared its ambition to be within the ‘top five’ in the Programme for International Student Assessment (PISA) by 2025. So serious is it about this ambition that the Australian Government has incorporated it into the Australian Education Act 2013. Given this focus on PISA results and rankings, we go beyond average scores to take a close look at Australia's performance in PISA, examining rankings by different geographical units, by item content and by test completion. Based on this analysis and using data from interviews with measurement and policy experts, we show how uninformative and even misleading the ‘average performance scores’, on which the rankings are based, can be. We explore how a more nuanced understanding would point to quite different policy actions. After considering the PISA data and Australia's ‘top five’ ambition closely, we argue that neither the rankings nor such ambitions should be given much credence.


Journal of Informetrics | 2014

Estimating the accuracies of journal impact factor through bootstrap

Kuan Ming Chen; Tsung-Hau Jen; Margaret Wu

The journal impact factor (JIF) reported in journal citation reports has been used to represent the influence and prestige of a journal. Whereas the consideration of the stochastic nature of a statistic is a prerequisite for statistical inference, the estimation of JIF uncertainty is necessary yet unavailable for comparing the impact among journals. Using journals in the Database of Research in Science Education (DoRISE), the current study proposes bootstrap methods to estimate the JIF variability. The paper also provides a comprehensive exposition of the sources of JIF variability. The collections of articles in the year of interest and in the preceding years both contribute to JIF variability. In addition, the variability estimate differs depending on the way a database selects its journals for inclusion. In the bootstrap process, the nested structure of articles in a journal was accounted for to ensure that each bootstrap replication reflects the actual citation characteristics of articles in the journal. In conclusion, the proposed point and interval estimates of the JIF statistic are obtained and more informative inferences on the impact of journals can be drawn.
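The resampling idea in the abstract can be sketched in a few lines. The example below is a simplified illustration with hypothetical citation counts, not the paper's actual procedure: it treats the JIF as the mean citations per citable article and resamples whole articles with replacement, which is how the nested article-in-journal structure is preserved:

```python
import numpy as np

def bootstrap_jif(citations_per_article, n_boot=2000, seed=0):
    """Bootstrap point and interval estimates for a journal impact factor.

    citations_per_article: citations received this year by each article the
    journal published in the two preceding years. The JIF point estimate is
    the mean; a 95% percentile interval comes from resampling articles.
    """
    rng = np.random.default_rng(seed)
    arts = np.asarray(citations_per_article, dtype=float)
    point = arts.mean()
    # Each bootstrap replicate resamples whole articles with replacement,
    # then recomputes the mean citation rate.
    reps = rng.choice(arts, size=(n_boot, arts.size), replace=True).mean(axis=1)
    lo, hi = np.percentile(reps, [2.5, 97.5])
    return point, (lo, hi)

# Hypothetical citation counts for one journal's citable articles:
jif, ci = bootstrap_jif([0, 1, 1, 2, 3, 0, 5, 2, 1, 4])
```

The width of the resulting interval is the uncertainty estimate the abstract argues is missing from the single JIF figure reported in journal citation reports.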


Archive | 2012

Implications of International Studies for National and Local Policy in Mathematics Education

John A. Dossey; Margaret Wu

This chapter examines large-scale comparative studies of mathematics education focussed on student achievement in an attempt to explain how such investigations influence the formation and implementation of policies affecting mathematics education. In doing so, we review the nature of comparative studies and policy research. Bennett’s (1991) formulation of policy development and implementation is used in examining national reactions to the results of international studies. Focus is given to the degree to which mathematics educators and others have played major roles in determining related policy outcomes affecting curriculum and the development and interpretations of the assessment instruments and processes themselves.


Archive | 2012

Using Item Response Theory as a Tool in Educational Measurement

Margaret Wu

Item response theory (IRT) and classical test theory (CTT) are invaluable tools for the construction of assessment instruments and the measurement of student proficiencies in educational settings. However, the advantages of IRT over CTT are not always clear. This chapter uses an example item analysis to contrast IRT and CTT. It is hoped that the readers can gain a deeper understanding of IRT through comparisons of similarities and differences between IRT and CTT statistics. In particular, this chapter discusses item properties such as the difficulty and discrimination power of items, as well as person ability measures contrasting the weighted likelihood estimates and plausible values in non-technical ways. The main advantage of IRT over CTT is outlined through a discussion on the construction of a developmental scale on which individual students are located. Further, some limitations of both IRT and CTT are brought to light to guide the valid use of IRT and CTT results. Lastly, the IRT software program, ConQuest (Wu et al. ACERConQuest version 2: Generalised item response modelling software. Australian Council for Educational Research, Camberwell, 2007), is used to run the item analysis to illustrate some of the program’s functionalities.


Archive | 2012

Using User-Defined Fit Statistic to Analyze Two-Tier Items in Mathematics

Hak Ping Tam; Margaret Wu; Doris Ching Heung Lau; Magdalena Mo Ching Mok

The two-tier item is a relatively new item format and is gradually gaining popularity in some areas of educational research. In science education, a typical two-tier item is made up of two portions. The purpose of the first portion is to assess whether students could identify the correct concept with respect to the information stated in the item stem, while the second examines the reason they supplied to justify the option they chose in the first portion. Since the data thus collected are related in a certain way, they pose challenges regarding how analysis should be done to capture the relationship that exists between the two tiers. This chapter attempts to analyze such data by using a user-defined fit statistic within the Rasch approach. The kind of information that can be gathered will be illustrated by way of analyzing a data set in mathematics.


Archive | 1998

ACER ConQuest: generalised item response modelling software

Margaret Wu; Raymond J. Adams; Mark Wilson


Archive | 2007

Applying the Rasch Model to Psycho-Social Measurement: A Practical Approach

Margaret Wu; Raymond J. Adams


International Journal of Contemporary Educational Research | 2014

Evidence-based policy making in education

Margaret Wu

Collaboration


Margaret Wu's top co-authors.

Mark Wilson
University of California

Hak Ping Tam
National Taiwan Normal University

Kuan Ming Chen
National Taiwan Normal University

Tsung-Hau Jen
National Taiwan Normal University

Doris Ching Heung Lau
Hong Kong Institute of Education

John A. Dossey
Illinois State University