ERIC - Search Results

Showing 1 to 15 of 28 results

Comparison of Performance Measures Obtained from Foreign Language Tests According to Item Response Theory vs Classical Test Theory

Peer reviewed

Polat, Murat – International Online Journal of Education and Teaching, 2022

Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…

Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests

Measurement Error Correction Formula for Cluster-Level Group Differences in Cluster Randomized and Observational Studies

Peer reviewed

Cho, Sun-Joo; Preacher, Kristopher J. – Educational and Psychological Measurement, 2016

Multilevel modeling (MLM) is frequently used to detect cluster-level group differences in cluster randomized trial and observational studies. Group differences on the outcomes (posttest scores) are detected by controlling for the covariate (pretest scores) as a proxy variable for unobserved factors that predict future attributes. The pretest and…

Descriptors: Error of Measurement, Error Correction, Multivariate Analysis, Hierarchical Linear Modeling

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

A Comparison of Three Methods for Computing Scale Score Conditional Standard Errors of Measurement. ACT Research Report Series, 2013 (7)

Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013

Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…

Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling

A Control Systems Concept Inventory Test Design and Assessment

Peer reviewed

Bristow, M.; Erkorkmaz, K.; Huissoon, J. P.; Jeon, Soo; Owen, W. S.; Waslander, S. L.; Stubley, G. D. – IEEE Transactions on Education, 2012

Any meaningful initiative to improve the teaching and learning in introductory control systems courses needs a clear test of student conceptual understanding to determine the effectiveness of proposed methods and activities. The authors propose a control systems concept inventory. Development of the inventory was collaborative and iterative. The…

Descriptors: Diagnostic Tests, Concept Formation, Undergraduate Students, Engineering Education

Defensible Progress Monitoring Data for Medium- and High-Stakes Decisions

Peer reviewed

Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.; Clemens, Nathan H. – Journal of Special Education, 2012

Within a response to intervention model, educators increasingly use progress monitoring (PM) to support medium- to high-stakes decisions for individual students. For PM to serve these more demanding decisions requires more careful consideration of measurement error. That error should be calculated within a fixed linear regression model rather than…

Descriptors: Measurement, Computation, Response to Intervention, Regression (Statistics)

Errors of Measurement, Theory, and Public Policy. William H. Angoff Memorial Lecture Series

Kane, Michael – Educational Testing Service, 2010

The 12th annual William H. Angoff Memorial Lecture was presented by Dr. Michael T. Kane, ETS's (Educational Testing Service) Samuel J. Messick Chair in Test Validity and the former Director of Research at the National Conference of Bar Examiners. Dr. Kane argues that it is important for policymakers to recognize the impact of errors of measurement…

Descriptors: Error of Measurement, Scores, Public Policy, Test Theory

Linear Dependence of Gain Scores on Their Components Imposes Constraints on Their Use and Interpretation: Comment on "Are Simple Gain Scores Obsolete?"

Peer reviewed

Humphreys, Lloyd G. – Applied Psychological Measurement, 1996

The reliability of a gain is determined by the reliabilities of the components, the correlation between them, and their standard deviations. Reliability is not inherently low, but the components of gains in many investigations make low reliability likely and require caution in the use of gain scores. (SLD)

Descriptors: Achievement Gains, Change, Correlation, Error of Measurement

Commentary on the Commentaries of Collins and Humphreys.

Peer reviewed

Williams, Richard H.; Zimmerman, Donald W. – Applied Psychological Measurement, 1996

The critiques by L. Collins and L. Humphreys in this issue illustrate problems with the use of gain scores. Collins' examples show that familiar formulas for the reliability of differences do not reflect the precision of measures of change. Additional examples demonstrate flaws in the conventional approach to reliability. (SLD)

Descriptors: Achievement Gains, Change, Correlation, Error of Measurement

Classical Test Theory in Historical Perspective.

Peer reviewed

Traub, Ross E. – Educational Measurement: Issues and Practice, 1997

Classical test theory is founded on the proposition that measurement error, a random latent variable, is a component of the observed score random variable. This article traces the history of the development of classical test theory, beginning in the early 20th century. (SLD)

Descriptors: Educational History, Educational Testing, Error of Measurement, Psychometrics

Resolving Differences among Methods of Establishing Confidence Limits for Test Scores.

Peer reviewed

Glutting, Joseph J.; And Others – Educational and Psychological Measurement, 1987

This paper discusses the basic theory underlying confidence limits and presents reasons why psychologists should incorporate confidence ranges in their psychodiagnostic reports. Four methods for establishing confidence limits are compared. Three of the methods involve estimated true scores, and the fourth is the standard error of measurement…

Descriptors: Error of Measurement, Mathematical Formulas, Psychological Evaluation, Scores

Best Linear Prediction of Composite Universe Scores.

Peer reviewed

Jarjoura, David – Psychometrika, 1983

The problem of predicting universe scores for samples of examinees based on their responses to samples of items is treated. The measurement model categorizes items according to the cells of a table of test specifications, and the linear function derived for minimizing error variance in prediction uses responses to these categories. (Author/JKS)

Descriptors: Error of Measurement, Generalizability Theory, Item Sampling, Prediction

The Reliability of Sums and Differences of Test Scores: Some New Results and Anomalies.

Peer reviewed

Zimmerman, Donald W.; And Others – Journal of Experimental Education, 1981

Reliability coefficients of linear combinations of observed scores have anomalous properties which have led to difficulties in the investigation of difference scores and gain scores in test theory. Discrepancies between classical results and correct results obtained from more general formulas, which allow for correlated errors, are examined…

Descriptors: Error of Measurement, Mathematical Formulas, Mathematical Models, Scores

When Can Subscores Have Value? Research Report. ETS RR-05-08

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2005

In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…

Descriptors: Scores, Test Items, Error of Measurement, Computation

The Role of Weighting in the Use of Aggregate Scores.

Peer reviewed

Stevens, Joseph J.; Aleamoni, Lawrence M. – Educational and Psychological Measurement, 1986

Prior standardization of scores when an aggregate score is formed has been criticized. This article presents a demonstration of the effects of differential weighting of aggregate components that clarifies the need for prior standardization. The role of standardization in statistics and the use of aggregate scores in research are discussed.…

Descriptors: Correlation, Error of Measurement, Factor Analysis, Raw Scores
