ERIC - Search Results

Showing 1 to 15 of 21 results

Direct Discrepancy Dynamic Fit Index Cutoffs for Arbitrary Covariance Structure Models

Peer reviewed

Daniel McNeish; Melissa G. Wolf – Structural Equation Modeling: A Multidisciplinary Journal, 2024

Despite the popularity of traditional fit index cutoffs like RMSEA ≤ 0.06 and CFI ≥ 0.95, several studies have noted issues with overgeneralizing traditional cutoffs. Computational methods have been proposed to avoid overgeneralization by deriving cutoffs specifically tailored to the characteristics…

Descriptors: Structural Equation Models, Cutting Scores, Generalizability Theory, Error of Measurement

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116

Paul T. von Hippel; Brendan A. Schuetze – Annenberg Institute for School Reform at Brown University, 2025

Researchers across many fields have called for greater attention to heterogeneity of treatment effects--shifting focus from the average effect to variation in effects between different treatments, studies, or subgroups. True heterogeneity is important, but many reports of heterogeneity have proved to be false, non-replicable, or exaggerated. In…

Descriptors: Educational Research, Replication (Evaluation), Generalizability Theory, Inferences

Conditional Standard Error of Measurement: Classical Test Theory, Generalizability Theory and Many-Facet Rasch Measurement with Applications to Writing Assessment

Peer reviewed

Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021

Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…

Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory

The Exchangeability of Brief Intelligence Tests for Children with Intellectual Giftedness: Illuminating Error Variance Components' Influence on IQs

Peer reviewed

Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017

This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…

Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Generalizability Theory and Classical Test Theory

Peer reviewed

Brennan, Robert L. – Applied Measurement in Education, 2011

Broadly conceived, reliability involves quantifying the consistencies and inconsistencies in observed scores. Generalizability theory, or G theory, is particularly well suited to addressing such matters in that it enables an investigator to quantify and distinguish the sources of inconsistencies in observed scores that arise, or could arise, over…

Descriptors: Generalizability Theory, Test Theory, Test Reliability, Item Response Theory

Using Multivariate Generalizability Theory to Assess the Effect of Content Stratification on the Reliability of a Performance Assessment

Peer reviewed

Keller, Lisa A.; Clauser, Brian E.; Swanson, David B. – Advances in Health Sciences Education, 2010

In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for lower reliability, and in particular, low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates…

Descriptors: Generalizability Theory, Test Reliability, Performance Based Assessment, Error of Measurement

Generalizability of Student Writing across Multiple Tasks: A Challenge for Authentic Assessment

Peer reviewed

Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012

Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…

Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests

Emotional Intelligence: The MSCEIT from the Perspective of Generalizability Theory

Peer reviewed

Follesdal, Hallvard; Hagtvet, Knut A. – Intelligence, 2009

The Mayer, Salovey, & Caruso Emotional Intelligence Test (MSCEIT) has been reported to provide reliable scores for the four-branch ability model of emotional intelligence [Mayer, J. D., Salovey, P., & Caruso, D. R. (2002). "Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT). User's manual." Toronto, Canada: Multi-Health…

Descriptors: Emotional Intelligence, Intelligence Tests, Adults, Error of Measurement

My Current Thoughts on Coefficient Alpha and Successor Procedures. CSE Report 643

Cronbach, Lee J. – Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2004

Where the accuracy of a measurement is important, whether for scientific or practical purposes, the investigator should evaluate how much random error affects the measurement. New research may not be necessary when a procedure has been studied enough to establish how much error it involves. But, with new measures, or measures being transferred…

Descriptors: Error of Measurement, Test Reliability, Generalizability Theory, Educational Research

The Role of Reliability in Criterion-Referenced Tests.

Peer reviewed

Kane, Michael T. – Journal of Educational Measurement, 1986

These analyses suggest that if a criterion-referenced test had a reliability (defined in terms of internal consistency) below 0.5, a simple a priori procedure would provide better estimates of students' universe scores than would individual observed scores. (Author/LMO)

Descriptors: Criterion Referenced Tests, Educational Research, Error of Measurement, Generalizability Theory

Error Sources Influencing Performance Assessment Reliability or Generalizability: A Meta Analysis.

Jiang, Ying Hong; And Others – 1997

As performance-based assessments have gained wider use, there are increasing concerns about their dependability. This study is a synthesis of existing studies regarding the reliability or generalizability of performance assessments. The meta-analysis involves summarizing, examining, and evaluating research findings. Articles on the dependability…

Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Judges

Assessing the Reliability of Criterion-Referenced Measures Used to Evaluate Health-Education Programs.

Peer reviewed

Schaeffer, Gary A.; And Others – Evaluation Review, 1986

The reliability of criterion-referenced tests (CRTs) used in health program evaluation can be conceptualized in different ways. Formulas are presented for estimating appropriate standard error of measurement (SEM) for CRTs. The SEM can be used in computing confidence intervals for domain score estimates and for a cut-score. (Author/LMO)

Descriptors: Accountability, Criterion Referenced Tests, Cutting Scores, Error of Measurement

Generalizability of Performance Assessment Measures on the Florida Teacher Certification Examinations.

Motika, Robert T. – 1997

Data from performance measures that were part of two foreign language teacher certification examinations were used in a generalizability study of the quality of their performance ratings. A total of 775 examinees from the Spanish K-12 and 192 examinees from the French K-12 subject area tests of the Florida Teacher Certification Examinations were…

Descriptors: Elementary Secondary Education, Error of Measurement, French, Generalizability Theory
