ERIC - Search Results

Showing all 3 results

Effects of Using Double Ratings as Item Scores on IRT Proficiency Estimation

Peer reviewed

Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022

This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…

Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy

Estimating Variance Components from Sparse Data Matrices in Large-Scale Educational Assessments

Peer reviewed

DeMars, Christine – Applied Measurement in Education, 2015

In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…

Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory

Generalizability of Large-Scale Performance Assessments in Science: Promises and Problems.

Peer reviewed

Gao, Xiaohong; And Others – Applied Measurement in Education, 1994

This study provides empirical evidence about the sampling variability and generalizability (reliability) of a statewide performance assessment for grade six. Results for 600 students at individual and school levels indicate that task-sampling variability was the major source of measurement error. Rater-sampling variability was negligible. (SLD)

Descriptors: Achievement Tests, Educational Assessment, Elementary School Students, Error of Measurement