Publications


Featured research published by Paul E. Newton.


Assessment in Education: Principles, Policy & Practice | 2007

Clarifying the purposes of educational assessment

Paul E. Newton

This article concerns the importance of clarity in thinking and talking about certain core concepts of educational assessment. It begins by identifying three quite distinct interpretations of the term ‘assessment purpose’. It continues by challenging the supposed distinction between ‘formative’ and ‘summative’—arguing that the latter only applies to a kind of assessment result while the former only applies to a kind of use of assessment results. It ends by illustrating the wide range of uses to which assessment results might be put and stresses the importance of not concealing important distinctions by locating multiple discrete purposes within a small number of misleading categories.


Measurement: Interdisciplinary Research & Perspective | 2012

Clarifying the Consensus Definition of Validity

Paul E. Newton

The 1999 Standards for Educational and Psychological Testing defines validity as the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Although quite explicit, there are ways in which this definition lacks precision, consistency, and clarity. The history of validity has taught us that ambiguity risks oversimplification, misunderstanding, inadequate validation, and the inevitable potential for inappropriate interpretation and use of results. This article identifies ways in which the spirit of the Standards can be clarified, with the intention of reducing these risks. The article provides an elaboration of the consensus definition, invoking a narrow, technical sense of validity, unique to the professions of educational and psychological measurement and assessment; an assessment-based decision-making procedure is valid if the argument for interpreting assessment outcomes (under stated conditions and in terms of stated conclusions) as measures of the attribute entailed by the decision is sufficiently strong.


Research Papers in Education | 2000

Would the real gold standard please step forward

Jo-Anne Baird; Mike Cresswell; Paul E. Newton

Debate about public examination standards has been a consistent feature of educational assessment in Britain over the past few decades. The most frequently voiced concern has been that public examination standards have fallen over the years; for example, the so-called A level ‘gold standard’ may be slipping. In this paper we consider some of the claims which have been made about falling standards and argue that they reveal a variety of underlying assumptions about the nature of examination standards and what it means to maintain them. We argue that, because people disagree about these fundamental matters, examination standards can never be maintained to everyone’s satisfaction. We consider the practical implications of the various coexisting definitions of examination standards and their implications for the perceived fairness of the examinations. We raise the question of whether the adoption of a single definition of examination standards would be desirable in practice but conclude that it would not. It follows that examining boards can legitimately be required to defend their maintenance of standards against challenges from a range of possibly conflicting perspectives. This makes it essential for the boards to be open about the problematic nature of examination standards and the processes by which they are determined.


British Educational Research Journal | 2005

The public understanding of measurement inaccuracy

Paul E. Newton

Assessment agencies are increasingly facing pressure on two fronts; first, to increase transparency and openness and second, to improve public confidence. Yet, in relation to one of the central concepts of educational measurement—inherent error—many believe that increased public understanding is incompatible with public confidence: a general recognition of the true nature and extent of measurement inaccuracy would fatally undermine trust in the system. The present article is premised on a contrary proposal: not understanding measurement inaccuracy is a far greater threat than understanding it, since it will result in the system repeatedly being held to account for more than it can possibly deliver. As unrealistic expectations are unmet, so the system will appear to have failed; and this recurrent process will gradually erode public confidence. The article develops ethical and practical arguments in favour of educating the public about the inherent limitations of educational measurement. Primary amongst the ethical arguments is the proposal, from contemporary validity theory, that users who fail to understand measurement inaccuracy will be ill equipped to draw valid inferences from results.


Curriculum Journal | 2008

Alternative perspectives on learning outcomes: challenges for assessment

Richard Daugherty; Paul Black; Kathryn Ecclestone; Mary James; Paul E. Newton

In discussing the relationship between curriculum and assessment it is commonly argued that assessment should be aligned to curriculum or, alternatively, that they should be congruent with each other. This article explores that relationship in five educational contexts in the UK and in Europe, ranging across school education, workplace learning, vocational education and higher education. Four main themes are highlighted: construct definition, progression, assessment procedures, and system-level accountability. What emerges from the five case studies under review is a multi-layered process of knowledge being constructed in diverse ways at different levels in each context. The article concludes that, rather than thinking in terms of either alignment or congruence, these relationships are better understood in terms of non-linear systems embracing curriculum, pedagogy and assessment.


Assessment in Education: Principles, Policy & Practice | 2005

Examination standards and the limits of linking

Paul E. Newton

There is a tendency in the literature to characterize linking as equating done somewhat less rigorously. The ambiguity of this conception can lead to confusion amongst policy‐makers and members of the public and can result in the proliferation of comparability myths. As the constructs assessed by two tests decrease in similarity, so the difference between equating and linking becomes one of kind rather than degree. To help make sense of linking in different contexts, a general model is proposed, based upon the idea of a ‘linking construct’. This general model is used to define the limits of linking and to clarify what users and stakeholders need to know about linking and linked scores. Finally, a distinction is drawn between judgemental linking as a method (e.g., social moderation) and judgemental linking as a theory (i.e., the value judgement theory of linking). The latter presents a challenge to the general model, which is defended.


Journal of Education Policy | 2005

Threats to the professional understanding of assessment error

Paul E. Newton

A case study investigation was undertaken to identify threats to the professional understanding of assessment error which arise from accounts presented within the education press. Through a predominantly qualitative analysis of articles published in a leading education newspaper, during 2002 and 2003, it explored how assessment agencies in England were represented as responding to allegations of error. A number of threats to professional understanding were identified; in particular, the overarching threat that media reports may help to construct, and to maintain, a mythical image of assessment as a process which can and ought to be free from both measurement inaccuracy and human error. The results highlighted an underlying tension between the need to increase public understanding (of assessment error) and the need to retain public confidence (in our assessment systems). It was concluded that assessment agencies need to develop approaches to enhancing the public and professional understanding of assessment error, to counteract potentially misleading images from media reports.


Measurement: Interdisciplinary Research & Perspective | 2010

Thinking about Linking

Paul E. Newton

Despite over a century of aligning test and examination scales, the theory of linking has received relatively little attention. Recently, though, frameworks for classifying linking relationships have proliferated, both in England and the United States. Limitations of U.S. frameworks, particularly the idea that linking relationships ought to be classified along a continuum representing degree of similarity to equating, are highlighted by linking challenges faced in England. A new framework, which focuses more upon definitions of comparability than methods for linking standards, highlights alternative ways of thinking about linking.


Assessment in Education: Principles, Policy & Practice | 2016

The great validity debate

Paul E. Newton; Jo-Anne Baird

Validity is the most important term in the educational and psychological measurement lexicon. Measurement professionals are generally happy to agree about that. What they are less happy to agree about is what the term ought to mean. North American measurement professionals have negotiated a kind of consensus on this thorny issue, through the definition and description of validity in the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education [AERA, APA, and NCME], 2014). Yet, the status of this consensus is unclear, given continuing debate amongst scholars, and given the fact that all sorts of different definitions and descriptions can be found on the websites of measurement organisations within the USA and elsewhere, and within the pages of prominent textbooks. In short, there is no widespread professional consensus concerning the best way to use the term.

In 1997, Linda Crocker penned an editorial for the North American National Council on Measurement in Education (NCME) publication, Educational Measurement: Issues and Practice, entitled: The Great Validity Debate (Crocker, 1997, p. 4). Her editorial introduced a special issue of the journal devoted to a controversy which she described as having been ‘brewing in psychometric circles’ since the late 1980s. It concerned the significance of consequences for the concept of validity and pivoted, for many, around the issue of whether validation should be ‘regarded as a scientific, empirical enterprise or a sociopolitical process as well.’ She suggested that: ‘the prevailing argument in this debate will shape the nature of measurement practice and professional preparation for years to come.’

Well over a decade later, Newton and Shaw undertook an extensive review of the literature on validity, to provide a foundation for an introductory overview of the concept of validity (Newton & Shaw, 2014). Their research led them to conclude that no position in this debate had yet prevailed. Not only was the controversy over consequences still raging (e.g. Cizek, 2012), new controversies had arisen, including debate over the relationship between validity and truth (e.g. Borsboom & Markus, 2013; Borsboom, Mellenbergh, & van Heerden, 2004; Kane, 2013a, 2013b). In an attempt to explore potential for resolving these debates, Newton and Shaw organised a coordinated session at the 2014 NCME Annual Meeting, in Philadelphia, entitled: What is the Best Way to Use the Term ‘Validity’? The six focal papers at the heart of this new special issue began life in that session. Lorrie Shepard, a contributor to the original special issue (edited by Linda Crocker in 1997), contributed a ‘reflective overview’ to the session and agreed to provide a similar contribution in this new special issue. In the spirit of facilitating debate, we decided to introduce an element of peer commentary to the following pages. The six focal papers were prepared simultaneously and then circulated to a group of leading measurement professionals for commentary.


Measurement: Interdisciplinary Research & Perspective | 2012

Questioning the consensus definition of validity

Paul E. Newton

The focus article provided me with an opportunity to unpack the consensus definition of validity and to explore its implications in the light of recent debates. I proposed an elaboration of the consensus definition, which was intended to express the spirit of the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) with increased precision, highlighting a range of features including the following:

Collaboration


Paul E. Newton's frequent collaborators and their affiliations.
Top Co-Authors

Mary James (University of Cambridge)

Chris Whetton (National Foundation for Educational Research)

Stuart Shaw (University of Cambridge)