Showing 1 to 15 of 19 results
Peer reviewed
Wiliam, Dylan – Measurement: Interdisciplinary Research and Perspectives, 2013
In "How Is Testing Supposed to Improve Schooling?" Edward Haertel has proposed a framework for thinking about the mechanisms by which testing might improve the various educational processes undertaken in schools. The framework seems to the author to be quite general (he uses the word "general" here in its mathematical sense of including all cases)…
Descriptors: Educational Testing, Educational Improvement, Test Results, Test Use
Peer reviewed
Mori, Kazuo; Uchida, Akitoshi – Research in Education, 2012
Longitudinal change in the average Z scores for four groups of pupils sorted by quartiles was examined for its stability over three years. The data, collected from 1998 to 2009, was obtained from nine cohorts of Japanese junior high school pupils totaling 1,962 subjects. It showed illusionary declines among the mid-range pupils but improvements…
Descriptors: Foreign Countries, Junior High School Students, Cohort Analysis, Evaluation Problems
Sukin, Tia M. – ProQuest LLC, 2010
The presence of outlying anchor items is an issue faced by many testing agencies. The decision to retain or remove an item is a difficult one, especially when the content representation of the anchor set becomes questionable by item removal decisions. Additionally, the reason for the aberrancy is not always clear, and if the performance of the…
Descriptors: Simulation, Science Achievement, Sampling, Data Analysis
Wang, Shudong; Jiao, Hong; Jin, Ying; Thum, Yeow Meng – Online Submission, 2010
The vertical scales of large-scale achievement tests created by using item response theory (IRT) models are mostly based on cluster (or correlated) educational data in which students usually are clustered in certain groups or settings (classrooms or schools). While such application directly violated assumption of independent sample of person in…
Descriptors: Scaling, Achievement Tests, Data Analysis, Item Response Theory
Peer reviewed
Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010
Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Peer reviewed
Papay, John P. – American Educational Research Journal, 2011
Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…
Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests
Peer reviewed
Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010
This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…
Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis
Peer reviewed
Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010
"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…
Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques
Peer reviewed
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
Peer reviewed
von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010
The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…
Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria
Peer reviewed
Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…
Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)
Kozloff, Allison Burstein – ProQuest LLC, 2009
Comprehensive academic achievement tests are routinely used by school psychologists in psycho-educational assessment batteries to identify learning disabled students. A variety of assessment measures are used across age groups to determine if a discrepancy exists between academic achievement and intellectual functioning; however, among the most…
Descriptors: Intelligence, Educational Assessment, Academic Achievement, Achievement Tests
Peer reviewed
McRae, D. J. – Third Education Group Review, 2006
This article presents the author's comments on "'Lake Woebegone,' Twenty Years Later" by J. J. Cannell, MD. J. J. Cannell's article on the so-called "Lake Woebegone" effect for K-12 educational testing systems is mostly an historical account of technical issues and policy considerations that led in part to development…
Descriptors: Elementary Secondary Education, Educational Testing, Standardized Tests, Test Use
Peer reviewed
Newton, Paul E.; Whetton, Chris – Oxford Review of Education, 2005
One way to manage marking error, in a large-scale educational testing context, is to establish a mechanism through which appeals can be lodged. While, at one level, this seems to offer a straightforward technical solution to the problem of marking error, it can also result in unintended consequences, with political, social or educational…
Descriptors: Foreign Countries, Educational Testing, Testing Problems, Scoring
Peer reviewed
Wiliam, Dylan – Review of Research in Education, 2010
The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…
Descriptors: Educational Assessment, Validity, Inferences, Construct Validity