ERIC - Search Results

Showing all 5 results

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Evaluating the Consistency of Angoff-Based Cut Scores Using Subsets of Items within a Generalizability Theory Framework

Peer reviewed

Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015

The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine whether a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…

Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items

The Sampling Theory for the Intraclass Reliability Coefficient.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 1990

Sampling theory for the intraclass reliability coefficient, a Spearman-Brown extrapolation of alpha to a single measurement for each examinee, is less recognized and less cited than that of coefficient alpha. Techniques for constructing confidence intervals and testing hypotheses for the intraclass coefficient are presented. (SLD)

Descriptors: Hypothesis Testing, Measurement Techniques, Reliability, Sampling

Variability of Estimated Variance Components and Related Statistics in a Performance Assessment.

Peer reviewed

Gao, Xiaohong; Brennan, Robert L. – Applied Measurement in Education, 2001

Studied the sampling variability of estimated variance components using data collected over several years for a listening and writing performance assessment and evaluated the stability of estimated measurement precision. Results indicate that the estimated variance components varied from one year to another and suggest that the measurement…

Descriptors: Estimation (Mathematics), Generalizability Theory, Listening Comprehension Tests, Performance Based Assessment

Generalizability of Large-Scale Performance Assessments in Science: Promises and Problems.

Peer reviewed

Gao, Xiaohong; And Others – Applied Measurement in Education, 1994

This study provides empirical evidence about the sampling variability and generalizability (reliability) of a statewide performance assessment for grade six. Results for 600 students at individual and school levels indicate that task-sampling variability was the major source of measurement error. Rater-sampling variability was negligible. (SLD)

Descriptors: Achievement Tests, Educational Assessment, Elementary School Students, Error of Measurement
