ERIC - Search Results

Showing all 6 results

Evaluating Human Scoring Using Generalizability Theory

Peer reviewed

Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…

Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries

An Empirical Examination of the Impact of Group Discussion and Examinee Performance Information on Judgments Made in the Angoff Standard-Setting Procedure

Peer reviewed

Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009

Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…

Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring

Estimating Reliability under a Generalizability Theory Model for Test Scores Composed of Testlets.

Peer reviewed

Lee, Guemin; Frisbie, David A. – Applied Measurement in Education, 1999

Studied the appropriateness and implications of using a generalizability theory approach to estimating the reliability of scores from tests composed of testlets. Analyses of data from two national standardization samples suggest that manipulating the number of passages is a more productive way to obtain efficient measurement than manipulating the…

Descriptors: Generalizability Theory, Models, National Surveys, Reliability

The Precision of Measurements.

Peer reviewed

Kane, Michael – Applied Measurement in Education, 1996

This overview of the role of error and tolerance for error in measurement asserts that the generic precision associated with a measurement procedure is defined as the root mean square error, or standard error, in some relevant population. This view of precision is explored in several applications of measurement. (SLD)

Descriptors: Error of Measurement, Error Patterns, Generalizability Theory, Measurement Techniques

The Context of Context Effects.

Peer reviewed

Brennan, Robert L. – Applied Measurement in Education, 1992

A conceptual framework and heuristic model for considering the existence, magnitude, and consequences of context effects are presented through an extension of some generalizability theory concepts. Context effects are often misunderstood, and current measurement models have serious limitations for examining them. Their importance needs to be…

Descriptors: Adaptive Testing, Context Effect, Equated Scores, Equations (Mathematics)

Developing Criteria for Performance Assessments: The Missing Link.

Peer reviewed

Quellmalz, Edys S. – Applied Measurement in Education, 1991

It is proposed that criteria for evaluating the quality of performance should be defined, at least tentatively, during the initial design of a performance assessment. Six characteristics of sound criteria are (1) significance; (2) fidelity; (3) generalizability; (4) developmental appropriateness; (5) accessibility; and (6) utility. (SLD)

Descriptors: Child Development, Cognitive Tests, Educational Assessment, Evaluation Criteria
