Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 12 |
Since 2006 (last 20 years) | 35 |
Descriptor
Generalizability Theory | 54 |
Test Items | 54 |
Error of Measurement | 15 |
Foreign Countries | 13 |
Scores | 13 |
Test Reliability | 13 |
Test Construction | 11 |
Difficulty Level | 10 |
Reliability | 10 |
Item Response Theory | 9 |
English (Second Language) | 8 |
Author
Solano-Flores, Guillermo | 5 |
Lee, Guemin | 3 |
Brennan, Robert L. | 2 |
Clauser, Brian E. | 2 |
Frisbie, David A. | 2 |
Harik, Polina | 2 |
Li, Min | 2 |
Margolis, Melissa J. | 2 |
Webb, Noreen M. | 2 |
Alonzo, Julie | 1 |
Anderson, Dan | 1 |
Publication Type
Journal Articles | 40 |
Reports - Research | 35 |
Reports - Evaluative | 16 |
Speeches/Meeting Papers | 9 |
Information Analyses | 3 |
Numerical/Quantitative Data | 3 |
Reports - Descriptive | 2 |
Opinion Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 10 |
Postsecondary Education | 7 |
Secondary Education | 5 |
Grade 8 | 4 |
Elementary Education | 3 |
Grade 5 | 3 |
Grade 7 | 3 |
Junior High Schools | 3 |
Middle Schools | 3 |
Grade 3 | 2 |
Grade 4 | 2 |
Audience
Researchers | 2 |
Location
Turkey | 3 |
Turkey (Ankara) | 3 |
United Kingdom | 2 |
Alabama | 1 |
California | 1 |
Colorado | 1 |
Germany | 1 |
Haiti | 1 |
Illinois (Chicago) | 1 |
Indiana | 1 |
Mexico | 1 |
Assessments and Surveys
Test of English as a Foreign… | 2 |
ACT Assessment | 1 |
National Assessment of… | 1 |
Program for International… | 1 |
Test of English for… | 1 |
Trends in International… | 1 |
United States Medical… | 1 |
Maria Bolsinova; Jesper Tijmstra; Leslie Rutkowski; David Rutkowski – Journal of Educational and Behavioral Statistics, 2024
Profile analysis is one of the main tools for studying whether differential item functioning can be related to specific features of test items. While relevant, profile analysis in its current form has two restrictions that limit its usefulness in practice: It assumes that all test items have equal discrimination parameters, and it does not test…
Descriptors: Test Items, Item Analysis, Generalizability Theory, Achievement Tests
Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
Custer, Michael; Kim, Jongpil – Online Submission, 2023
This study uses an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when applying the Masters' Partial Credit Model to polytomous items. Item data from the standardization of the Battelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…
Descriptors: Sample Size, Item Response Theory, Test Items, Computation
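The Masters' Partial Credit Model referenced in the entry above gives the probability of each score category from a person ability θ and the item's step difficulties. A minimal sketch, not taken from the article; the function name and parameterization are illustrative:

```python
import math

def pcm_probs(theta, deltas):
    """Category probabilities under Masters' Partial Credit Model.

    theta: person ability; deltas: step difficulties delta_1..delta_m.
    Returns probabilities for score categories 0..m.
    """
    # Cumulative numerators: psi_0 = 0, psi_k = sum_{j<=k} (theta - delta_j)
    psis = [0.0]
    for d in deltas:
        psis.append(psis[-1] + (theta - d))
    exps = [math.exp(p) for p in psis]
    total = sum(exps)  # normalizing constant over all categories
    return [e / total for e in exps]
```

For a two-step item with both step difficulties at 0, a person at θ = 0 is equally likely to score 0, 1, or 2; raising θ shifts probability mass toward the higher categories.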
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Deniz, Kaan Zulfikar; Ilican, Emel – International Journal of Assessment Tools in Education, 2021
This study aims to compare the G and Phi coefficients estimated by D studies for a measurement tool with the G and Phi coefficients obtained in real cases where items of differing difficulty levels were added, and to determine the conditions under which D studies estimate reliability coefficients closer to those observed in practice. The study group…
Descriptors: Generalizability Theory, Test Items, Difficulty Level, Test Reliability
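The G (relative) and Phi (absolute) coefficients that several of these entries compare can be sketched for the simplest persons × items crossed design. This is an illustrative sketch, not code from any of the studies: variance components come from the standard expected-mean-square equations, and the D-study step projects them to a hypothetical number of items:

```python
import numpy as np

def g_study(scores):
    """Variance components for a persons x items (p x i) crossed design.

    scores: 2-D array, rows = persons, columns = items.
    Returns (var_persons, var_items, var_residual).
    """
    n_p, n_i = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)
    i_means = scores.mean(axis=0)
    ss_p = n_i * ((p_means - grand) ** 2).sum()
    ss_i = n_p * ((i_means - grand) ** 2).sum()
    ss_res = ((scores - grand) ** 2).sum() - ss_p - ss_i
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))
    # Expected-mean-square solutions, truncated at zero
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_i, 0.0)
    var_i = max((ms_i - ms_res) / n_p, 0.0)
    return var_p, var_i, var_res

def d_study(var_p, var_i, var_res, n_items):
    """G (relative) and Phi (absolute) coefficients for a test of n_items."""
    g = var_p / (var_p + var_res / n_items)
    phi = var_p / (var_p + (var_i + var_res) / n_items)
    return g, phi
```

Because the Phi coefficient counts item variance as error while the G coefficient does not, Phi can never exceed G for the same design, which is why the two are reported side by side in D-study comparisons.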
Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020
What level of reliability can be achieved when open-ended questions are used in large-scale selection tests has become a critical question. One aim of the present study is to determine the reliability obtained when test-takers' answers to open-ended short-answer questions are scored by experts in…
Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability
Shin, Ji-young – Language Testing, 2022
With the present study I investigated the sources of score variance and dependability in a local oral English proficiency test for potential international teaching assistants (ITAs) across four first language (L1) groups, and suggested alternative test designs. Using generalizability theory, I examined the relative importance of L1s (i.e., Indian,…
Descriptors: Foreign Students, Language Tests, Language Proficiency, Oral Language
Kamis, Ömer; Dogan, C. Deha – Journal of Education and Learning, 2018
This research aimed to compare the G and Phi coefficients estimated in decision (D) studies in generalizability theory with those obtained in actual cases under the same conditions and similar facets, using a crossed design. The study was conducted with 120 students, six items, and 12 raters. An achievement test composed of six…
Descriptors: Generalizability Theory, Decision Making, Reliability, Computation
Sung, Kyung Hee; Noh, Eun Hee; Chon, Kyong Hee – Asia Pacific Education Review, 2017
With increased use of constructed response items in large scale assessments, the cost of scoring has been a major consideration (Noh et al. in KICE Report RRE 2012-6, 2012; Wainer and Thissen in "Applied Measurement in Education" 6:103-118, 1993). In response to the scoring cost issues, various forms of automated system for scoring…
Descriptors: Automation, Scoring, Social Studies, Test Items
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…
Descriptors: Anatomy, Science Tests, Test Items, Test Reliability
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Niño, Luis A.; Vázquez-Muñoz, Mariana – International Journal of Testing, 2015
Indicators of academic achievement for bilingual students can be inaccurate due to linguistic heterogeneity. For indigenous populations, language shift (the gradual replacement of one language by another) is a factor that can increase this heterogeneity and poses an additional challenge for valid testing. We investigated whether and how indigenous…
Descriptors: Foreign Countries, Maya (People), Preschool Children, Mathematics Tests
Teker, Gulsen Tasdelen; Dogan, Nuri – Educational Sciences: Theory and Practice, 2015
Reliability and differential item functioning (DIF) analyses were conducted on testlets displaying local item dependence in this study. The data set employed in the research was obtained from the answers given by 1,500 students to the 20 items in six testlets on the English Proficiency Exam administered by the School of Foreign Languages of a state…
Descriptors: Foreign Countries, Test Items, Test Bias, Item Response Theory