Showing all 6 results
Peer reviewed
Schoen, Robert; Li, Lanrong; Yang, Xiaotong; Guven, Ahmet; Riddell, Claire – Society for Research on Educational Effectiveness, 2021
Many classroom-observation instruments have been developed (e.g., Gleason et al., 2017; Nava et al., 2019; Sawada et al., 2002), but a very small number of studies published in refereed journals have rigorously examined the quality of the ratings and the instrument using measurement models. For example, Gleason et al. developed a mathematics…
Descriptors: Item Response Theory, Models, Measurement, Mathematics Instruction
Peer reviewed
Lee, Guemin; Park, In-Yong – Asia Pacific Education Review, 2012
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Descriptors: Generalizability Theory, Simulation, Computation, Item Response Theory
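To make the overestimation concrete, here is a minimal sketch (my own illustration, not the authors' simulation design, with made-up sample and effect sizes): items within a testlet share a common "testlet effect", which item-based Cronbach's alpha counts as true-score variance, inflating reliability relative to alpha over testlet sum scores.

```python
# Simulate testlet data with within-testlet local dependence, then compare
# item-based and testlet-based Cronbach's alpha. All parameters invented.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_testlets, items_per_testlet = 500, 4, 5

theta = rng.normal(size=n_persons)                       # general ability
u = rng.normal(size=(n_persons, n_testlets))             # local dependence

columns = []
for t in range(n_testlets):                              # testlet-major order
    for _ in range(items_per_testlet):
        columns.append(theta + 0.8 * u[:, t] + rng.normal(size=n_persons))
X = np.column_stack(columns)                             # persons x items

def cronbach_alpha(data):
    k = data.shape[1]
    return k / (k - 1) * (1 - data.var(axis=0, ddof=1).sum()
                          / data.sum(axis=1).var(ddof=1))

# Collapse each testlet to a single sum score before estimating alpha.
sums = X.reshape(n_persons, n_testlets, items_per_testlet).sum(axis=2)
print(f"item-based alpha:    {cronbach_alpha(X):.3f}")   # inflated
print(f"testlet-based alpha: {cronbach_alpha(sums):.3f}")
```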
Peer reviewed
Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012
The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…
Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling
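The traditional approach the abstract describes can be pictured with a short sketch. The conversion table and cut score below are hypothetical, and only the traditional transfer is shown, not the authors' proposed alternative.

```python
# Transfer a modified-Angoff raw cut score through an existing raw-to-scale
# conversion table. Table values and the cut are made up for illustration.
import numpy as np

raw_scores   = np.array([0, 10, 20, 30, 40, 50])
scale_scores = np.array([100, 130, 170, 210, 240, 260])

raw_cut = 27.4   # panel's modified-Angoff cut, often fractional

# Interpolate between adjacent table entries to get the scaled cut.
scaled_cut = np.interp(raw_cut, raw_scores, scale_scores)
print(f"raw cut {raw_cut} -> scaled cut {scaled_cut:.1f}")   # 199.6
```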
Peer reviewed
Bock, R. Darrell; Brennan, Robert L.; Muraki, Eiji – Applied Psychological Measurement, 2002
In assessment programs where scores are reported for individual examinees, it is desirable to have responses to performance exercises graded by more than one rater. If more than one item on each test form is so graded, it is also desirable that different raters grade the responses of any one examinee. This gives rise to sampling designs in which…
Descriptors: Generalizability Theory, Test Items, Item Response Theory, Error of Measurement
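One way to picture the kind of sampling design the abstract alludes to is a simple spiral assignment, sketched below under assumed pool sizes; this is an illustration, not the authors' design.

```python
# Spiral assignment: each examinee's items go to distinct raters, and each
# item position rotates through the rater pool across examinees.
n_examinees, n_items, n_raters = 8, 3, 5   # requires n_items <= n_raters

assignment = {(e, i): (e + i) % n_raters
              for e in range(n_examinees) for i in range(n_items)}

for e in range(n_examinees):
    raters = [assignment[(e, i)] for i in range(n_items)]
    assert len(set(raters)) == n_items     # no rater sees an examinee twice
    print(f"examinee {e}: raters {raters}")
```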
Peer reviewed
Lunz, Mary E.; Schumacker, Randall E. – Journal of Outcome Measurement, 1997
Results and interpretations of data from a performance examination of 74 medical specialty certification candidates were compared across four methods of analysis: (1) traditional summary statistics; (2) inter-judge correlations; (3) generalizability theory; and (4) the multifaceted Rasch model. Advantages of the Rasch model are outlined. (SLD)
Descriptors: Comparative Analysis, Data Analysis, Generalizability Theory, Interrater Reliability
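As a small illustration of method (2), the sketch below correlates the columns of a made-up candidates-by-judges rating matrix. Note that a pairwise correlation is insensitive to an additive judge leniency, which surfaces only in the judge means; modeling such facets is among the advantages claimed for the Rasch approach.

```python
# Made-up ratings: a shared candidate signal plus judge-specific noise and
# an additive leniency that correlations cannot detect.
import numpy as np

rng = np.random.default_rng(1)
skill = rng.normal(size=50)                          # 50 candidates
ratings = np.column_stack([
    skill + rng.normal(scale=0.5, size=50) + bias
    for bias in (0.0, 0.3, -0.2, 0.1)                # judge leniencies
])

print(np.round(np.corrcoef(ratings, rowvar=False), 2))  # 4 x 4 inter-judge r
print(np.round(ratings.mean(axis=0), 2))  # leniency shows up only in means
```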
Linacre, John M. – 1993
Generalizability theory (G-theory) and many-facet Rasch measurement (Rasch) manage the variability inherent when raters rate examinees on test items. The purpose of G-theory is to estimate test reliability in a raw score metric. Unadjusted examinee raw scores are reported as measures. A variance component is estimated for the examinee…
Descriptors: Comparative Analysis, Equations (Mathematics), Estimation (Mathematics), Evaluators
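A compact worked sketch (my own, with invented variance magnitudes) of the G-theory computation the abstract names: variance components for a fully crossed persons-by-raters design estimated from ANOVA mean squares, plus a generalizability coefficient in the raw score metric.

```python
# Estimate person, rater, and residual variance components for a p x r
# design, then a relative G coefficient for a mean over n_r raters.
import numpy as np

rng = np.random.default_rng(2)
n_p, n_r = 100, 4
X = (rng.normal(scale=1.0, size=(n_p, 1))            # person effects
     + rng.normal(scale=0.4, size=(1, n_r))          # rater severity
     + rng.normal(scale=0.6, size=(n_p, n_r)))       # residual

grand = X.mean()
ss_p  = n_r * ((X.mean(axis=1) - grand) ** 2).sum()
ss_r  = n_p * ((X.mean(axis=0) - grand) ** 2).sum()
ss_pr = ((X - grand) ** 2).sum() - ss_p - ss_r
ms_p, ms_r, ms_pr = (ss_p / (n_p - 1), ss_r / (n_r - 1),
                     ss_pr / ((n_p - 1) * (n_r - 1)))

var_pr = ms_pr                         # interaction + error
var_p  = (ms_p - ms_pr) / n_r          # person variance component
var_r  = (ms_r - ms_pr) / n_p          # rater variance component

g = var_p / (var_p + var_pr / n_r)     # relative G coefficient
print(f"var_p={var_p:.3f}  var_r={var_r:.3f}  var_pr={var_pr:.3f}  G={g:.3f}")
```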