ERIC - Search Results

Showing 1 to 15 of 126 results

Direct Discrepancy Dynamic Fit Index Cutoffs for Arbitrary Covariance Structure Models

Peer reviewed

Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024

Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…

Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116

Paul T. von Hippel; Brendan A. Schuetze – Annenberg Institute for School Reform at Brown University, 2025

Researchers across many fields have called for greater attention to heterogeneity of treatment effects--shifting focus from the average effect to variation in effects between different treatments, studies, or subgroups. True heterogeneity is important, but many reports of heterogeneity have proved to be false, non-replicable, or exaggerated. In…

Descriptors: Educational Research, Replication (Evaluation), Generalizability Theory, Inferences

Extended Multivariate Generalizability Theory with Complex Design Structures

Peer reviewed

Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022

This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…

Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction

A Short Note on Optimizing Cost-Generalizability via a Machine-Learning Approach

Peer reviewed

Jiang, Zhehan; Shi, Dexin; Distefano, Christine – Educational and Psychological Measurement, 2021

The costs of an objective structured clinical examination (OSCE) are of concern to health profession educators globally. As OSCEs are usually designed under generalizability theory (G-theory) framework, this article proposes a machine-learning-based approach to optimize the costs, while maintaining the minimum required generalizability…

Descriptors: Artificial Intelligence, Generalizability Theory, Objective Tests, Foreign Countries

Comparison of G and Phi Coefficients Estimated in Generalizability Theory with Real Cases

Peer reviewed

Deniz, Kaan Zulfikar; Ilican, Emel – International Journal of Assessment Tools in Education, 2021

This study aims to compare the G and Phi coefficients as estimated by D studies for a measurement tool with the G and Phi coefficients obtained from real cases in which items of differing difficulty levels were added and also to determine the conditions under which the D studies estimated reliability coefficients closer to reality. The study group…

Descriptors: Generalizability Theory, Test Items, Difficulty Level, Test Reliability

Quantile Reliability: Beyond Global Estimates of Internal Consistency

Peer reviewed

Jeffrey Shero; Jessica Logan – Society for Research on Educational Effectiveness, 2024

Background/Context: Previous research in educational assessment has consistently emphasized the importance of reliability as a cornerstone of test quality. Traditional measures of reliability, such as test-retest and split-half reliability, offer a broad view of how internally consistent a measure is but overlook the variability in this internal…

Descriptors: Educational Assessment, Special Education, Students with Disabilities, Learning Disabilities

Conditional Standard Error of Measurement: Classical Test Theory, Generalizability Theory and Many-Facet Rasch Measurement with Applications to Writing Assessment

Peer reviewed

Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021

Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…

Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory

Validity. Improving Literacy Brief: Understanding Screening

Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019

Validity is broadly defined as how well something measures what it's supposed to measure. The reliability and validity of scores from assessments are two concepts that are closely knit together and feed into each other.

Descriptors: Screening Tests, Scores, Test Validity, Test Reliability

The Use of Open-Ended Questions in Large-Scale Tests for Selection: Generalizability and Dependability

Peer reviewed

Atilgan, Hakan; Demir, Elif Kübra; Ogretmen, Tuncay; Basokcu, Tahsin Oguz – International Journal of Progressive Education, 2020

It has become a critical question what the reliability level would be when open-ended questions are used in large-scale selection tests. One of the aims of the present study is to determine what the reliability would be in the event that the answers given by test-takers are scored by experts when open-ended short answer questions are used in…

Descriptors: Foreign Countries, Secondary School Students, Test Items, Test Reliability

Mastery Measurement in Mathematics and the Goldilocks Effect

Peer reviewed

Solomon, Benjamin G.; VanDerHeyden, Amanda M.; Solomon, Emily C.; Korzeniewski, Erika R.; Payne, Lexy L.; Campaña, Kayla V.; Dillon, Chasen R. – School Psychology, 2022

Math curriculum-based measurement (CBM) is an essential tool for multi-tiered systems of support decision making, but the reliability of math CBMs has received little research, particularly using more rigorous methods such as generalizability (G) theory. Math CBM is historically organized into two domains: mastery measures and general outcome…

Descriptors: Mathematics Tests, Mathematics Skills, Mathematics Achievement, Curriculum Based Assessment

(In)Stability of Test Scores

Peer reviewed

Merchant, Stefan; Rich, Jessica; Klinger, Don A. – Canadian Journal of Educational Administration and Policy, 2022

Both school and district administrators use the results of standardized, large-scale tests to inform decisions about the need for, or success of, educational programs and interventions. However, test results at the school level are subject to random fluctuations due to changes in cohort, test items, and other factors outside of the school's…

Descriptors: Standardized Tests, Foreign Countries, Generalizability Theory, Scores

The Reliability of Framework for Teaching Scores in Kindergarten

Peer reviewed

Patrick, Helen; French, Brian F.; Mantzicopoulos, Panayota – Journal of Psychoeducational Assessment, 2020

We evaluated the score stability of the Framework for Teaching (FFT), a prominent observation instrument used for teacher evaluation. Three raters each scored 200 reading and mathematics lessons taught by 20 kindergarten teachers. Using Generalizability theory analyses, we decomposed the FFT's Classroom Environment, Instruction, and Total scores…

Descriptors: Teacher Evaluation, Observation, Scores, Test Reliability

Structural Validity, Internal Consistency, and Rater Reliability of the Modified Barium Swallow Impairment Profile: Breaking Ground on a 52,726-Patient, Clinical Data Set

Peer reviewed

Clain, Alex E.; Alkhuwaiter, Munirah; Davidson, Kate; Martin-Harris, Bonnie – Journal of Speech, Language, and Hearing Research, 2022

Purpose: The purpose of this study was to extend the assessment of the psychometric properties of the Modified Barium Swallow Impairment Profile (MBSImP). Here, we re-examined structural validity and internal consistency using a large clinical-registry data set and formally examined rater reliability in a smaller data set. Method: This study…

Descriptors: Diagnostic Tests, Disability Identification, Physical Disabilities, Eating Disorders

Generalizability Theory in R

Peer reviewed

Huebner, Alan; Lucht, Marissa – Practical Assessment, Research & Evaluation, 2019

Generalizability theory is a modern, powerful, and broad framework used to assess the reliability, or dependability, of measurements. While there exist classic works that explain the basic concepts and mathematical foundations of the method, there is currently a lack of resources addressing computational resources for those researchers wishing to…

Descriptors: Generalizability Theory, Test Reliability, Computer Software, Statistical Analysis
