ERIC - Search Results

Showing all 13 results

IRT Observed-Score Equating for Rater-Mediated Assessments Using a Hierarchical Rater Model

Peer reviewed

Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025

While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…

Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity

Comparison of Kernel Equating Methods under NEAT and NEC Designs

Peer reviewed

Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023

In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…

Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis

Some Methods and Evaluation for Linking and Equating with Small Samples

Peer reviewed

Peabody, Michael R. – Applied Measurement in Education, 2020

The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models will be provided. A brief discussion of the conditions required…

Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory

Equating with Small and Unbalanced Samples

Peer reviewed

Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020

Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…

Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores

Linking U.S. School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

Reardon, Sean F.; Kalogrides, Demetra; Ho, Andrew D. – Stanford Center for Education Policy Analysis, 2017

There is no comprehensive database of U.S. district-level test scores that is comparable across states. We describe and evaluate a method for constructing such a database. First, we estimate linear, reliability-adjusted linking transformations from state test score scales to the scale of the National Assessment of Educational Progress (NAEP). We…

Descriptors: School Districts, Scores, Statistical Distributions, Database Design

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Using the Kernel Method of Test Equating for Estimating the Standard Errors of Population Invariance Measures

Peer reviewed

Moses, Tim – Journal of Educational and Behavioral Statistics, 2008

Equating functions are supposed to be population invariant, meaning that the choice of subpopulation used to compute the equating function should not matter. The extent to which equating functions are population invariant is typically assessed in terms of practical difference criteria that do not account for equating functions' sampling…

Descriptors: Equated Scores, Error of Measurement, Sampling, Evaluation Methods

Measurement, Sampling, and Equating Errors in Large-Scale Assessments

Peer reviewed

Wu, Margaret – Educational Measurement: Issues and Practice, 2010

In large-scale assessments, such as state-wide testing programs, national sample-based assessments, and international comparative studies, there are many steps involved in the measurement and reporting of student achievement. There are always sources of inaccuracies in each of the steps. It is of interest to identify the source and magnitude of…

Descriptors: Testing Programs, Educational Assessment, Measures (Individuals), Program Effectiveness

The Effectiveness of Circular Equating as a Criterion for Evaluating Equating.

Peer reviewed

Wang, Tianyou; Hanson, Bradley A.; Harris, Deborah J. – Applied Psychological Measurement, 2000

Studied whether circular equating could provide an adequate measure of various types of equating error when applied to different equating methods under different equating designs. Analyses and simulations show that circular equating is generally invalid as a criterion to evaluate the adequacy of equating. (SLD)

Descriptors: Criteria, Equated Scores, Error of Measurement, Evaluation Methods

An Exploration of Kernel Equating Using SAT® Data: Equating to a Similar Population and to a Distant Population. Research Report. ETS RR-07-17

Peer reviewed

Liu, Jinghua; Low, Albert C. – ETS Research Report Series, 2007

This study applied kernel equating (KE) in two scenarios: equating to a very similar population and equating to a very different population, referred to as a distant population, using SAT® data. The KE results were compared to the results obtained from analogous classical equating methods in both scenarios. The results indicate that KE results are…

Descriptors: College Entrance Examinations, Equated Scores, Comparative Analysis, Evaluation Methods

Choice of Anchor Test in Equating. Research Report. ETS RR-06-35

Peer reviewed

Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006

It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…

Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level

Equating Scores from Adaptive to Linear Tests

Peer reviewed

van der Linden, Wim J. – Applied Psychological Measurement, 2006

Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Format, Equated Scores

Cook, Linda L.; Petersen, Nancy S. – 1986

This paper examines how various equating methods are affected by: (1) sampling error; (2) sample characteristics; and (3) characteristics of anchor test items. It reviews empirical studies that investigated the invariance of equating transformations, and it discusses empirical and simulation studies that focus on how the properties of anchor tests…

Descriptors: Educational Research, Equated Scores, Error of Measurement, Evaluation Methods
