ERIC - Search Results

Showing all 6 results

Resolving and Re-Scoring Constructed Response Items in Mixed-Format Assessments: An Exploration of Three Approaches

Peer reviewed

Wind, Stefanie A.; Xu, Yangmeng – Educational Assessment, 2024

We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…

Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners

Differential Item Functioning Detection with the Mantel-Haenszel Procedure: The Effects of Matching Types and Other Factors

Peer reviewed

Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015

The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds, for example in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…

Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
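
The matching idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: it assumes the usual reading of thin matching as one stratum per observed total score, and the function name `mantel_haenszel_dif`, the record layout, and the group labels "ref" and "focal" are hypothetical. The rescaling by -2.35 is the ETS delta-scale convention (MH D-DIF), which is not stated in the abstract.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Return (alpha_MH, MH D-DIF) for one studied item.

    records: iterable of (group, total_score, correct), where group is
    "ref" or "focal" (hypothetical labels) and correct is a bool.
    """
    # One 2x2 table per total-score level: [A, B, C, D] =
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        if group == "ref":
            idx = 0 if correct else 1
        elif group == "focal":
            idx = 2 if correct else 3
        else:
            raise ValueError(f"unknown group: {group!r}")
        tables[score][idx] += 1

    # Common odds ratio: sum(A*D/N) / sum(B*C/N) over score levels.
    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined for these data")

    alpha = num / den
    # ETS delta scale: negative values indicate the item favors
    # the reference group.
    return alpha, -2.35 * math.log(alpha)


# Toy data: a single score level where the reference group does better.
toy = (
    [("ref", 1, True)] * 3 + [("ref", 1, False)]
    + [("focal", 1, True)] + [("focal", 1, False)] * 3
)
alpha, delta = mantel_haenszel_dif(toy)
print(alpha, delta)
```

With many score levels and few examinees, some strata contribute nothing to either sum; pooling adjacent score levels into wider strata is the obvious variation on this sketch.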

A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

Peer reviewed

Zwick, Rebecca – ETS Research Report Series, 2012

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…

Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods

Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates across Outcome Measures

Peer reviewed

Papay, John P. – American Educational Research Journal, 2011

Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…

Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests

Value-Added Measures of Education Performance: Clearing Away the Smoke and Mirrors. Policy Brief 10-4

Harris, Douglas N. – Policy Analysis for California Education, PACE (NJ3), 2010

In this policy brief, the author explores the problems with attainment measures when it comes to evaluating performance at the school level, and explores the best uses of value-added measures. These value-added measures, the author writes, are useful for sorting out-of-school influences from school influences or from teacher performance, giving…

Descriptors: Principals, Observation, Teacher Evaluation, Measurement Techniques

On the Virtues and Vices of the Standard Error of Measurement.

Peer reviewed

Williams, Richard H.; Zimmerman, Donald W. – Journal of Experimental Education, 1984

This paper provides a list of 10 salient features of the standard error of measurement, contrasting it to the reliability coefficient. It is concluded that the standard error of measurement should be regarded as a primary characteristic of a mental test. (Author/DWH)

Descriptors: Educational Testing, Error of Measurement, Evaluation Methods, Psychological Testing
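
For reference, the contrast drawn in the abstract above rests on the standard classical test theory relation between the two quantities. The formula below is the textbook definition, not a quotation from the paper, and the symbols ($\sigma_X$ for the observed-score standard deviation, $\rho_{XX'}$ for the reliability coefficient) are chosen here for illustration.

```latex
\[
  \mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}
\]
% Worked example with made-up numbers: \sigma_X = 15, \rho_{XX'} = 0.91
\[
  \mathrm{SEM} = 15 \sqrt{1 - 0.91} = 15 \times 0.3 = 4.5
\]
```

The standard error of measurement is expressed in score units, whereas the reliability coefficient is a unitless proportion that depends on the score variance of the group tested.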