ERIC - Search Results

Showing 1 to 15 of 19 results

How Is Testing Supposed to Improve Schooling? Some Reflections

Peer reviewed

Wiliam, Dylan – Measurement: Interdisciplinary Research and Perspectives, 2013

In "How Is Testing Supposed to Improve Schooling?" Edward Haertel has proposed a framework for thinking about the mechanisms by which testing might improve the various educational processes undertaken in schools. The framework seems to the author to be quite general (he uses the word "general" here in its mathematical sense of including all cases)…

Descriptors: Educational Testing, Educational Improvement, Test Results, Test Use

The Leading Group Effect: Illusionary Declines in Scholastic Standard Scores of Mid-Range Japanese Junior High School Pupils

Peer reviewed

Mori, Kazuo; Uchida, Akitoshi – Research in Education, 2012

Longitudinal change in the average Z scores for four groups of pupils sorted by quartiles was examined for its stability over three years. The data, collected from 1998 to 2009, was obtained from nine cohorts of Japanese junior high school pupils totaling 1,962 subjects. It showed illusionary declines among the mid-range pupils but improvements…

Descriptors: Foreign Countries, Junior High School Students, Cohort Analysis, Evaluation Problems

Item Parameter Drift as an Indication of Differential Opportunity to Learn: An Exploration of Item Flagging Methods & Accurate Classification of Examinees

Sukin, Tia M. – ProQuest LLC, 2010

The presence of outlying anchor items is an issue faced by many testing agencies. The decision to retain or remove an item is a difficult one, especially when the content representation of the anchor set becomes questionable by item removal decisions. Additionally, the reason for the aberrancy is not always clear, and if the performance of the…

Descriptors: Simulation, Science Achievement, Sampling, Data Analysis

Investigating Effect of Ignoring Hierarchical Data Structures on Accuracy of Vertical Scaling Using Mixed-Effects Rasch Model

Wang, Shudong; Jiao, Hong; Jin, Ying; Thum, Yeow Meng – Online Submission, 2010

The vertical scales of large-scale achievement tests created by using item response theory (IRT) models are mostly based on cluster (or correlated) educational data in which students usually are clustered in certain groups or settings (classrooms or schools). While such applications directly violate the assumption of independent sampling of persons in…

Descriptors: Scaling, Achievement Tests, Data Analysis, Item Response Theory

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates across Outcome Measures

Peer reviewed

Papay, John P. – American Educational Research Journal, 2011

Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…

Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Linking through Improved Design, Not Redefinition: Commentary on Newton

Peer reviewed

Walker, Michael E. – Measurement: Interdisciplinary Research and Perspectives, 2010

"Linking" is a term given to a general class of procedures by which one represents scores X on one test or measure in terms of scores Y on another test or measure. A recent taxonomy by Holland and Dorans (2006; Holland, 2007) organizes the various types of links into three broad categories: prediction, scale aligning, and equating. In…

Descriptors: Foreign Countries, Test Construction, Test Validity, Measurement Techniques

What Constitutes Legitimate Causal Linking?

Peer reviewed

Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010

Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…

Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics

What Dictates the Meaning of Test Linking? A Reaction to "Thinking about Linking"

Peer reviewed

von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010

The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…

Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use

Peer reviewed

Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…

Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)

A Review of Academic Achievement Tests: Recommendations for Age Appropriate Administration

Kozloff, Allison Burstein – ProQuest LLC, 2009

Comprehensive academic achievement tests are routinely used by school psychologists in psycho-educational assessment batteries to identify learning disabled students. A variety of assessment measures are used across age groups to determine if a discrepancy exists between academic achievement and intellectual functioning; however, among the most…

Descriptors: Intelligence, Educational Assessment, Academic Achievement, Achievement Tests

Comments on "'Lake Woebegone,' Twenty Years Later" by J. J. Cannell, MD

Peer reviewed

McRae, D. J. – Third Education Group Review, 2006

This article presents the author's comments on "'Lake Woebegone,' Twenty Years Later" by J. J. Cannell, MD. J. J. Cannell's article on the so-called "Lake Woebegone" effect for K-12 educational testing systems is mostly an historical account of technical issues and policy considerations that led in part to development…

Descriptors: Elementary Secondary Education, Educational Testing, Standardized Tests, Test Use

The Effectiveness of Systems for Appealing against Marking Error

Peer reviewed

Newton, Paul E.; Whetton, Chris – Oxford Review of Education, 2005

One way to manage marking error, in a large-scale educational testing context, is to establish a mechanism through which appeals can be lodged. While, at one level, this seems to offer a straightforward technical solution to the problem of marking error, it can also result in unintended consequences, with political, social or educational…

Descriptors: Foreign Countries, Educational Testing, Testing Problems, Scoring

What Counts as Evidence of Educational Achievement? The Role of Constructs in the Pursuit of Equity in Assessment

Peer reviewed

Wiliam, Dylan – Review of Research in Education, 2010

The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…

Descriptors: Educational Assessment, Validity, Inferences, Construct Validity
