ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	2

Descriptor

Foreign Countries	2
Generalizability Theory	2
Scoring	2
Test Items	2
Computation	1
Credentials	1
Cutting Scores	1
Difficulty Level	1
Group Discussion	1
High Stakes Tests	1
Interrater Reliability	1
Physicians	1
Probability	1
Program Effectiveness	1
Standard Setting (Scoring)	1

Source

Applied Measurement in…

Author

Bimpeh, Yaw	1
Chis, Liliana	1
Clauser, Brian E.	1
Harik, Polina	1
Harrison, Liz	1
Margolis, Melissa J.	1
McManus, I. C.	1
Mollon, Jennifer	1
Pointer, William	1
Smith, Ben Alexander	1
Williams, Simon	1

Publication Type

Journal Articles	2
Reports - Evaluative	2

Location

United Kingdom

Showing all 2 results

Evaluating Human Scoring Using Generalizability Theory

Peer reviewed

Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…

Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries

An Empirical Examination of the Impact of Group Discussion and Examinee Performance Information on Judgments Made in the Angoff Standard-Setting Procedure

Peer reviewed

Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009

Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…

Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring