ERIC - Search Results

Showing all 8 results

How Not to Fool Ourselves about Heterogeneity of Treatment Effects. EdWorkingPaper No. 25-1116

Paul T. von Hippel; Brendan A. Schuetze – Annenberg Institute for School Reform at Brown University, 2025

Researchers across many fields have called for greater attention to heterogeneity of treatment effects--shifting focus from the average effect to variation in effects between different treatments, studies, or subgroups. True heterogeneity is important, but many reports of heterogeneity have proved to be false, non-replicable, or exaggerated. In…

Descriptors: Educational Research, Replication (Evaluation), Generalizability Theory, Inferences

The Exchangeability of Brief Intelligence Tests for Children with Intellectual Giftedness: Illuminating Error Variance Components' Influence on IQs

Peer reviewed

Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017

This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…

Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores

Emotional Intelligence: The MSCEIT from the Perspective of Generalizability Theory

Peer reviewed

Follesdal, Hallvard; Hagtvet, Knut A. – Intelligence, 2009

The Mayer, Salovey, & Caruso Emotional Intelligence Test (MSCEIT) has been reported to provide reliable scores for the four-branch ability model of emotional intelligence [Mayer, J. D., Salovey, P., & Caruso, D. R. (2002). "Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT). User's manual." Toronto, Canada: Multi-Health…

Descriptors: Emotional Intelligence, Intelligence Tests, Adults, Error of Measurement

From Validity Generalization to Meta-Analysis: The Development and Application of a New Research Integration Procedure.

Schmidt, Frank L. – 1985

This paper describes how work by the United States Office of Personnel Management on the generalizability of employment test validities led to the development of a widely applicable meta-analysis method. The method focuses strongly on estimating the true variance of study correlations and effect size. This validity generalization procedure has…

Descriptors: Effect Size, Error of Measurement, Estimation (Mathematics), Generalizability Theory
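The idea of estimating the true variance of study correlations can be illustrated with a minimal sketch. It assumes the commonly described "bare-bones" form of validity generalization, in which the expected sampling-error variance is subtracted from the sample-size-weighted observed variance. The function name `bare_bones_meta` and the input values are hypothetical, and the paper's full procedure may include further corrections that are not shown here.

```python
def bare_bones_meta(correlations, sample_sizes):
    """Sketch of a bare-bones estimate of the true variance of correlations.

    Assumption: only sampling error is corrected for; other artifacts
    (e.g., unreliability, range restriction) are ignored in this sketch.
    """
    total_n = sum(sample_sizes)
    k = len(correlations)
    pairs = list(zip(correlations, sample_sizes))

    # Sample-size-weighted mean correlation across studies.
    mean_r = sum(n * r for r, n in pairs) / total_n

    # Sample-size-weighted observed variance of the study correlations.
    var_obs = sum(n * (r - mean_r) ** 2 for r, n in pairs) / total_n

    # Expected sampling-error variance, using the average sample size.
    mean_n = total_n / k
    var_err = (1 - mean_r ** 2) ** 2 / (mean_n - 1)

    # Residual ("true") variance; negative estimates are set to zero.
    var_true = max(var_obs - var_err, 0.0)
    return {"mean_r": mean_r, "var_obs": var_obs,
            "var_err": var_err, "var_true": var_true}


# Hypothetical example: two studies with 100 participants each.
summary = bare_bones_meta([0.2, 0.4], [100, 100])
print(summary)
```

If `var_true` is close to zero, the observed spread in correlations is largely what sampling error alone would produce, which is the kind of evidence used to argue that a validity generalizes across settings.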

Setting Standards for Credentialing Examinations. An Update.

Peer reviewed

Meskauskas, John A. – Evaluation and the Health Professions, 1986

Two new indices of stability of content-referenced standard-setting results are presented, relating variability of judges' decisions to the variability of candidate scores and to the reliability of the test. These indices are used to indicate whether scores resulting from a standard-setting study are of sufficient precision. (Author/LMO)

Descriptors: Certification, Credentials, Error of Measurement, Generalizability Theory

Sampling Variability of Performance Assessments.

Peer reviewed

Shavelson, Richard J.; And Others – Journal of Educational Measurement, 1993

Evidence is presented on the generalizability and convergent validity of performance assessments using data from six studies of student achievement that sampled a wide range of measurement facets and methods. Results at individual and school levels indicate that task-sampling variability is the major source of measurement error. (SLD)

Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Generalizability Theory

An Application of Generalizability Theory to the Validation of a Behaviorally Anchored Role-Play Measure.

Espelage, Dorothy L.; Quittner, Alexandra L.; Kamps, Jodi – 1998

Generalizability theory (g-theory) was used, as an alternative to classical test theory, to evaluate measurement error in a behaviorally anchored role-play measure, highlighting the usefulness of this theory in instrument development. G-theory partitions an observed score into the universe score and error scores associated with separate sources of…

Descriptors: Behavior Patterns, Eating Disorders, Error of Measurement, Females
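The partition of an observed score into a universe score and separate error sources can be sketched for the simplest case. The example assumes a fully crossed persons x raters design with one score per cell. The abstract is truncated before it names the study's facets, so the rater facet, the function name `g_study_one_facet`, and the ratings are illustrative assumptions, not the authors' design or data.

```python
from statistics import mean


def g_study_one_facet(scores):
    """Estimate variance components for a crossed persons x raters design.

    `scores[p][r]` is the score rater r gave person p (one score per cell).
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    person_means = [mean(row) for row in scores]
    rater_means = [mean(row[r] for row in scores) for r in range(n_r)]

    # Mean squares from a two-way ANOVA without replication.
    ms_p = n_r * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in rater_means) / (n_r - 1)
    ms_res = sum(
        (scores[p][r] - person_means[p] - rater_means[r] + grand) ** 2
        for p in range(n_p)
        for r in range(n_r)
    ) / ((n_p - 1) * (n_r - 1))

    # Solve the expected mean squares; negative estimates are set to zero.
    var_res = ms_res                           # person x rater interaction + error
    var_p = max((ms_p - ms_res) / n_r, 0.0)    # universe-score (person) variance
    var_r = max((ms_r - ms_res) / n_p, 0.0)    # rater main-effect variance

    # Relative generalizability coefficient for a mean over n_r raters.
    rel_error = var_res / n_r
    total = var_p + rel_error
    g_coef = var_p / total if total > 0 else float("nan")
    return {"person": var_p, "rater": var_r, "residual": var_res, "g": g_coef}


# Hypothetical ratings: three persons, each scored by the same two raters.
ratings = [[1, 2], [2, 3], [3, 4]]
print(g_study_one_facet(ratings))
```

Unlike a single classical reliability coefficient, the separate components show which facet contributes most to error, which is what makes the approach useful during instrument development.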

Sampling Variability of Performance Assessments. Report on the Status of Generalizability Performance: Generalizability and Transfer of Performance Assessments. Project 2.4: Design Theory and Psychometrics for Complex Performance Assessment in Science.

Shavelson, Richard J.; And Others – 1993

In this paper, performance assessments are cast within a sampling framework. A performance assessment score is viewed as a sample of student performance drawn from a complex universe defined by a combination of all possible tasks, occasions, raters, and measurement methods. Using generalizability theory, the authors present evidence bearing on the…

Descriptors: Academic Achievement, Educational Assessment, Error of Measurement, Evaluators