Showing 1 to 15 of 38 results
Peer reviewed
Oliver Lüdtke; Alexander Robitzsch – Journal of Experimental Education, 2025
There is a longstanding debate on whether the analysis of covariance (ANCOVA) or the change score approach is more appropriate when analyzing non-experimental longitudinal data. In this article, we use a structural modeling perspective to clarify that the ANCOVA approach is based on the assumption that all relevant covariates are measured (i.e.,…
Descriptors: Statistical Analysis, Longitudinal Studies, Error of Measurement, Hierarchical Linear Modeling
Domingue, Benjamin W.; Trejo, Sam; Armstrong-Carter, Emma; Tucker-Drob, Elliot M. – Grantee Submission, 2020
Interest in the study of gene-environment interaction has recently grown due to the sudden availability of molecular genetic data--in particular, polygenic scores--in many long-running longitudinal studies. Identifying and estimating statistical interactions comes with several analytic and inferential challenges; these challenges are heightened…
Descriptors: Genetics, Environmental Influences, Scores, Interaction
Peer reviewed
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, s(e_X|θ) = CSEM; (2) the IRT-based conditional standard errors, s_irt(e_X|θ) = C…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
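The IRT-based conditional standard error the abstract compares has a simple closed form under the Rasch model: at fixed θ, the number-correct score has variance Σ P_i(θ)(1 − P_i(θ)). A minimal sketch, with made-up item difficulties:

```python
import math

def csem_number_correct(theta, difficulties):
    """IRT-based conditional SEM of the number-correct score at fixed theta.

    Under the Rasch model, P(correct) = 1 / (1 + exp(-(theta - b))) for an
    item of difficulty b, and Var(X | theta) is the sum of P * (1 - P).
    """
    ps = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
    return math.sqrt(sum(p * (1.0 - p) for p in ps))
```

For four items of difficulty 0, `csem_number_correct(0.0, [0.0] * 4)` is exactly 1.0 (each item contributes variance 0.25), and the CSEM shrinks as θ moves away from the item difficulties.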
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Peer reviewed
PDF on ERIC
Regional Educational Laboratory Mid-Atlantic, 2023
This Snapshot highlights key findings from a study that used Bayesian stabilization to improve the reliability (long-term stability) of subgroup proficiency measures that the Pennsylvania Department of Education (PDE) uses to identify schools for Targeted Support and Improvement (TSI) or Additional Targeted Support and Improvement (ATSI). The…
Descriptors: At Risk Students, Low Achievement, Error of Measurement, Measurement Techniques
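Bayesian stabilization of the kind the study describes amounts to shrinking each subgroup's proficiency rate toward a reference rate, with a weight that grows with subgroup size. A bare-bones sketch; the function name and `prior_strength` parameter are hypothetical and not PDE's actual procedure:

```python
def stabilize(observed_rate, reference_rate, n, prior_strength):
    """Shrink a subgroup proficiency rate toward a reference rate.

    The weight on the observed rate grows with subgroup size n, while
    prior_strength acts as the prior's effective sample size, so small,
    noisy subgroups are pulled strongly toward the reference rate.
    """
    w = n / (n + prior_strength)
    return w * observed_rate + (1.0 - w) * reference_rate
```

With `prior_strength` 40, a subgroup of 10 students observed at 0.20 against a reference of 0.50 is stabilized to 0.44, while a subgroup of 1,000 barely moves; damping small-sample noise this way is what makes year-to-year subgroup measures more stable for TSI/ATSI identification.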
Peer reviewed
Perry, Thomas – Research Papers in Education, 2019
A compositional effect is when pupil attainment is associated with the characteristics of their peers, over and above their own individual characteristics. Pupils at academically selective schools, for example, tend to out-perform similar-ability pupils who are educated with mixed-ability peers. Previous methodological studies however have shown…
Descriptors: Value Added Models, Correlation, Individual Characteristics, Peer Influence
Peer reviewed
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Peer reviewed
Lane, David; Oswald, Frederick L. – Educational Measurement: Issues and Practice, 2016
The educational literature, the popular press, and educated laypeople have all echoed a conclusion from the book "Academically Adrift" by Richard Arum and Josipa Roksa (which has now become received wisdom), namely, that 45% of college students showed no significant gains in critical thinking skills. Similar results were reported by…
Descriptors: College Students, Critical Thinking, Thinking Skills, Statistical Analysis
Peer reviewed
Balkin, Richard S. – Measurement and Evaluation in Counseling and Development, 2017
This article presents an overview of standards for demonstrating evidence of relationships with criteria as they pertain to instrument development, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…
Descriptors: Evidence, Academic Standards, Test Construction, Evaluation Criteria
Peer reviewed
Chiu, Ting-Wei; Camilli, Gregory – Applied Psychological Measurement, 2013
Guessing behavior is an issue discussed widely with regard to multiple choice tests. Its primary effect is on number-correct scores for examinees at lower levels of proficiency. This is a systematic error or bias, which increases observed test scores. Guessing also can inflate random error variance. Correction or adjustment for guessing formulas…
Descriptors: Item Response Theory, Guessing (Tests), Multiple Choice Tests, Error of Measurement
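The classical correction-for-guessing formula the abstract alludes to subtracts an estimate of lucky guesses from the number-right score: S = R − W/(k − 1) for k-option items. A minimal sketch (the function name is ours):

```python
def formula_score(num_right, num_wrong, num_options):
    """Classical correction for guessing on multiple-choice items.

    Assumes each wrong answer reflects a blind guess among num_options
    choices, so every (num_options - 1) wrong answers imply one lucky
    guess to subtract; omitted items are not penalized.
    """
    return num_right - num_wrong / (num_options - 1)
```

An examinee with 30 right and 10 wrong on five-option items scores 30 − 10/4 = 27.5. Note the abstract's caveat: such corrections address the systematic upward bias in expectation, not the extra random error variance that guessing adds.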
Peer reviewed
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
Peer reviewed
Kane, Michael – Journal of Educational Measurement, 2011
Errors don't exist in our data, but they serve a vital function. Reality is complicated, but our models need to be simple in order to be manageable. We assume that attributes are invariant over some conditions of observation, and once we do that we need some way of accounting for the variability in observed scores over these conditions of…
Descriptors: Error of Measurement, Scores, Test Interpretation, Testing
Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011
For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…
Descriptors: Scores, Reliability, Equated Scores, Test Construction
Peer reviewed
Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013
We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…
Descriptors: Measurement, Testing, Language Proficiency, Test Construction