ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	5

Descriptor

Comparative Analysis	6
Error of Measurement	6
Scaling	6
Equated Scores	4
Item Response Theory	4
College Entrance Examinations	2
Statistical Bias	2
Test Items	2
Test Theory	2
True Scores	2
Ability	1
Aptitude Tests	1
Bayesian Statistics	1
Cognitive Tests	1
Computation	1
Computer Simulation	1
Correlation	1
Critical Reading	1
Differences	1
Educational Testing	1
Evaluation Criteria	1
Generalizability Theory	1
Mathematics	1
Mathematics Tests	1
Methods	1
More ▼

Source

ETS Research Report Series	2
ACT, Inc.	1
Applied Measurement in…	1
Educational Measurement:…	1

Author

Moses, Tim	2
Cui, Zhongmin	1
Curley, Edward	1
Dorans, Neil	1
Fang, Yu	1
Fitzpatrick, Steven J.	1
Guo, Hongwen	1
Kim, Sooyeon	1
Kim, Stella Yun	1
Lee, Won-Chan	1
Liu, Jinghua	1
Morrison, Carol A.	1
Traynor, Anne	1
Woodruff, David	1
More ▼

Publication Type

Reports - Research	5
Journal Articles	4
Reports - Evaluative	1

Education Level

Higher Education	2
Postsecondary Education	2
High Schools	1
Secondary Education	1

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

ACT Assessment	1
SAT (College Admission Test)	1

What Works Clearinghouse Rating

Showing all 6 results Save | Export

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Direct link

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

Quantifying Error and Uncertainty Reductions in Scaling Functions: An ITEMS Module

Peer reviewed

Direct link

Moses, Tim – Educational Measurement: Issues and Practice, 2014

This module describes and extends X-to-Y regression measures that have been proposed for use in the assessment of X-to-Y scaling and equating results. Measures are developed that are similar to those based on prediction error in regression analyses but that are directly suited to interests in scaling and equating evaluations. The regression and…

Descriptors: Scaling, Regression (Statistics), Equated Scores, Comparative Analysis

A Comparison of Three Methods for Computing Scale Score Conditional Standard Errors of Measurement. ACT Research Report Series, 2013 (7)

Download full text

Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013

Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…

Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling

The Stability of the Score Scales for the "SAT Reasoning Test"™ from 2005 to 2010. Research Report. ETS RR-12-15

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Liu, Jinghua; Curley, Edward; Dorans, Neil – ETS Research Report Series, 2012

This study examines the stability of the "SAT Reasoning Test"™ score scales from 2005 to 2010. A 2005 old form (OF) was administered along with a 2010 new form (NF). A new conversion for OF was derived through direct equipercentile equating. A comparison of the newly derived and the original OF conversions showed that Critical Reading…

Descriptors: Aptitude Tests, Cognitive Tests, Thinking Skills, Equated Scores

Reliability and the Nonequivalent Groups with Anchor Test Design. Research Report. ETS RR-07-16

Peer reviewed
PDF on ERIC

Download full text

Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007

This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…

Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis

Direct and Indirect Equating: A Comparison of Four Methods Using the Rasch Model.

Download full text

Morrison, Carol A.; Fitzpatrick, Steven J. – 1992

An attempt was made to determine which item response theory (IRT) equating method results in the least amount of equating error or "scale drift" when equating scores across one or more test forms. An internal anchor test design was employed with five different test forms, each consisting of 30 items, 10 in common with the base test and 5…

Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Error of Measurement