Publication Date
In 2025 | 2 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 18 |
Descriptor
Error of Measurement | 21 |
Scores | 21 |
Testing | 21 |
Reliability | 7 |
Measurement | 6 |
Test Reliability | 5 |
Test Validity | 5 |
Academic Achievement | 4 |
Evaluation | 4 |
Test Interpretation | 4 |
Test Results | 4 |
Author
Kane, Michael | 2 |
Li, Min | 2 |
Solano-Flores, Guillermo | 2 |
Aucejo, Esteban | 1 |
Domingue, Benjamin W. | 1 |
Briggs, Derek C. | 1 |
Brockmann, Frank | 1 |
Chen, Hui-Mei | 1 |
Cronbach, Lee J. | 1 |
Doorey, Nancy A. | 1 |
Fletcher, Jack M. | 1 |
Publication Type
Journal Articles | 12 |
Reports - Research | 8 |
Reports - Descriptive | 7 |
Opinion Papers | 4 |
Reports - Evaluative | 2 |
Guides - General | 1 |
Numerical/Quantitative Data | 1 |
Speeches/Meeting Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 4 |
Postsecondary Education | 3 |
Elementary Education | 2 |
High Schools | 2 |
Secondary Education | 2 |
Elementary Secondary Education | 1 |
Grade 3 | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Intermediate Grades | 1 |
Audience
Administrators | 1 |
Counselors | 1 |
Policymakers | 1 |
Teachers | 1 |
Location
China (Beijing) | 1 |
North Carolina | 1 |
United States | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
ACT Assessment | 3 |
SAT (College Admission Test) | 1 |
Wechsler Adult Intelligence… | 1 |
Wechsler Intelligence Scale… | 1 |
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Aucejo, Esteban; Romano, Teresa; Taylor, Eric S. – Centre for Economic Performance, 2019
Performance evaluation may change employee effort and decisions in unintended ways, for example, in multitask jobs where the evaluation measure captures only a subset of (differentially weights) the job tasks. We show evidence of this multitask distortion in schools, with teachers allocating effort across students (tasks). Teachers are evaluated…
Descriptors: Teacher Evaluation, Student Evaluation, Mathematics Tests, Scores
Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018
Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Descriptors: Error of Measurement, Testing, Scores, Models
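As context for the entry above: ignoring measurement error in an observed composite attenuates effect estimates, which is what motivates the alternative methods the study evaluates. A minimal sketch of the textbook attenuation relations (for a single-predictor regression with error only in X; not the specific structural-equation corrections reviewed in the paper), assuming a composite X = T + E with reliability ρ_XX':

```latex
% Attenuation when the predictor composite X is measured with error
% (classical test theory; error E uncorrelated with the true score T):
X = T + E, \qquad
\beta_{\text{obs}} = \rho_{XX'}\,\beta_{\text{true}}, \qquad
r_{XY} = r_{T_X T_Y}\sqrt{\rho_{XX'}\,\rho_{YY'}}
```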
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
Han, Chao – Language Assessment Quarterly, 2016
As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…
Descriptors: Foreign Countries, Scores, English, Chinese
Kane, Michael – Journal of Educational Measurement, 2011
Errors don't exist in our data, but they serve a vital function. Reality is complicated, but our models need to be simple in order to be manageable. We assume that attributes are invariant over some conditions of observation, and once we do that we need some way of accounting for the variability in observed scores over these conditions of…
Descriptors: Error of Measurement, Scores, Test Interpretation, Testing
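For readers unfamiliar with the error-of-measurement framing in the entry above, a minimal sketch of the standard classical-test-theory identities (textbook relations, not Kane's specific argument):

```latex
% Observed score = true score + error; reliability and the standard
% error of measurement (SEM) follow from the variance decomposition:
X = T + E, \qquad
\sigma^2_X = \sigma^2_T + \sigma^2_E, \qquad
\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X}, \qquad
\mathrm{SEM} = \sigma_X\sqrt{1-\rho_{XX'}}
```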
Liaw, Lih-Jiun; Hsieh, Ching-Lin; Hsu, Miao-Ju; Chen, Hui-Mei; Lin, Jau-Hong; Lo, Sing-Kai – International Journal of Rehabilitation Research, 2012
The aim of this study is to determine the test-retest reproducibility of the seven-item Short-Form Berg Balance Scale (SFBBS) and the five-item Short-Form Postural Assessment Scale for Stroke Patients (SFPASS) in individuals with chronic stroke. Fifty-two chronic stroke patients from two rehabilitation departments were included in the study. Both…
Descriptors: Measurement, Measures (Individuals), Correlation, Patients
Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013
We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…
Descriptors: Measurement, Testing, Language Proficiency, Test Construction
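As a reference point for the generalizability-theory discussion above, a minimal sketch of the simplest one-facet crossed design (persons by raters); the language and context facets the authors discuss would enter as additional variance components:

```latex
% One-facet crossed p x r design: observed-score variance decomposes into
% person, rater, and residual components; the generalizability coefficient
% for relative decisions averages error over n_r raters.
\sigma^2(X_{pr}) = \sigma^2_p + \sigma^2_r + \sigma^2_{pr,e}, \qquad
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pr,e}/n_r}
```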
Sijtsma, Klaas – Psychometrika, 2009
This discussion paper argues that both the use of Cronbach's alpha as a reliability estimate and as a measure of internal consistency suffer from major problems. First, alpha always has a value, which cannot be equal to the test score's reliability given the inter-item covariance matrix and the usual assumptions about measurement error. Second, in…
Descriptors: Measurement, Error of Measurement, Scores, Computation
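For concreteness, a minimal sketch of how coefficient alpha is computed from an item-score matrix (the quantity whose use the paper critiques); the data below are simulated and purely illustrative:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data: 200 simulated respondents answering 5 noisy items
rng = np.random.default_rng(0)
true_score = rng.normal(size=(200, 1))
items = true_score + rng.normal(scale=1.0, size=(200, 5))
print(round(cronbach_alpha(items), 3))
```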
Kane, Michael – Educational Testing Service, 2010
The 12th annual William H. Angoff Memorial Lecture was presented by Dr. Michael T. Kane, the Educational Testing Service's (ETS) Samuel J. Messick Chair in Test Validity and former Director of Research at the National Conference of Bar Examiners. Dr. Kane argues that it is important for policymakers to recognize the impact of errors of measurement…
Descriptors: Error of Measurement, Scores, Public Policy, Test Theory
Brockmann, Frank – Council of Chief State School Officers, 2011
State testing programs today are more extensive than ever, and their results are required to serve more purposes and high-stakes decisions than one might have imagined. Assessment results are used to hold schools, districts, and states accountable for student performance and to help guide a multitude of important decisions. This report describes…
Descriptors: Accuracy, Measurement, Testing, Expertise
Fletcher, Jack M.; Stuebing, Karla K.; Hughes, Lisa C. – Journal of Psychoeducational Assessment, 2010
IQ test scores should be corrected for high stakes decisions that employ these assessments, including capital offense cases. If scores are not corrected, then diagnostic standards must change with each generation. Arguments against corrections, based on standards of practice, information present and absent in test manuals, and related issues,…
Descriptors: Testing, Mental Retardation, Validity, Intelligence Quotient
Doorey, Nancy A. – Council of Chief State School Officers, 2011
The work reported in this paper reflects a collaborative effort of many individuals representing multiple organizations. It began during a session at the October 2008 meeting of TILSA when a representative of a member state asked the group if any of their programs had experienced unexpected fluctuations in the annual state assessment scores, and…
Descriptors: Testing, Sampling, Expertise, Testing Programs
Tienken, Christopher H. – Educational Forum, 2011
Test score validity takes center stage in the debate over the use of high school exit exams. Scant literature addresses the amount of conditional standard error of measurement (CSEM) present in individual student results on high school exit exams. The purpose of this study is to fill a void in the literature and add a national review of the CSEM,…
Descriptors: High Schools, Exit Examinations, State Departments of Education, Error of Measurement
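As background for the CSEM discussion above, two common ways a conditional standard error of measurement is estimated; which estimator a given state program reports is not specified here, so these are illustrative only:

```latex
% Lord's binomial-error estimate for a number-correct score x on a k-item
% test, and the IRT-based conditional SEM from the test information function:
\mathrm{CSEM}(x) = \sqrt{\frac{x\,(k-x)}{k-1}}, \qquad
\mathrm{CSEM}(\hat\theta) = \frac{1}{\sqrt{I(\hat\theta)}}
```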