ERIC - Search Results

Showing 1 to 15 of 110 results

The NEAT Equating via Chaining Random Forests in the Context of Small Sample Sizes: A Machine-Learning Method

Peer reviewed

Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023

The responses that are absent in the nonequivalent groups with anchor test (NEAT) design can be treated as a planned missing data scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…

Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence

Evaluating Six Approaches to Handling Zero-Frequency Scores under Equipercentile Equating

Peer reviewed

Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021

In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…

Descriptors: Equated Scores, Test Length, Sample Size, Methods

What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model

Peer reviewed

Fellinghauer, Carolina; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023

This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation…

Descriptors: True Scores, Equated Scores, Test Items, Sample Size

Effect of Missing Data on Test Equating Methods Under NEAT Design

Peer reviewed

Semih Asiret; Seçil Ömür Sünbül – International Journal of Psychology and Educational Studies, 2023

This study examined the effect of missing data of different patterns and sizes on test equating methods under the NEAT design across several factors. For this purpose, factors such as sample size, average difficulty level difference between the test forms, difference between the ability distribution,…

Descriptors: Research Problems, Data, Test Items, Equated Scores

Evaluating Population Invariance of Test Equating during the COVID-19 Pandemic

Peer reviewed

Li, Dongmei; Kapoor, Shalini – Educational Measurement: Issues and Practice, 2022

Population invariance is a desirable property of test equating which might not hold when significant changes occur in the test population, such as those brought about by the COVID-19 pandemic. This research aims to investigate whether equating functions are reasonably invariant when the test population is impacted by the pandemic. Based on…

Descriptors: Test Items, Equated Scores, COVID-19, Pandemics

Detecting Item Parameter Drift in Small Sample Rasch Equating

Peer reviewed

Daniel Jurich; Chunyan Liu – Applied Measurement in Education, 2023

Screening items for parameter drift helps protect against serious validity threats and ensure score comparability when equating forms. Although many high-stakes credentialing examinations operate with small sample sizes, few studies have investigated methods to detect drift in small sample equating. This study demonstrates that several newly…

Descriptors: High Stakes Tests, Sample Size, Item Response Theory, Equated Scores

Comparing Drift Detection Methods for Accurate Rasch Equating in Different Sample Sizes

Peer reviewed

Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023

Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…

Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items

Effect of Statistically Matching Equating Samples for Common-Item Equating. Research Report. ETS RR-21-02

Peer reviewed

Lu, Ru; Kim, Sooyeon – ETS Research Report Series, 2021

This study evaluated the impact of subgroup weighting for equating through a common-item anchor. We used data from a single test form to create two research forms for which the equating relationship was known. The results showed that equating was most accurate when the new form and reference form samples were weighted to be similar to the target…

Descriptors: Equated Scores, Weighted Scores, Raw Scores, Test Items

A New Statistic for Selecting the Smoothing Parameter for Polynomial Loglinear Equating under the Random Groups Design

Peer reviewed

Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2020

Smoothing is designed to yield smoother equating results that can reduce random equating error without introducing very much systematic error. The main objective of this study is to propose a new statistic and to compare its performance to the performance of the Akaike information criterion and likelihood ratio chi-square difference statistics in…

Descriptors: Equated Scores, Statistical Analysis, Error of Measurement, Criteria

Detection of Outliers in Anchor Items Using Modified Rasch Fit Statistics

Peer reviewed

Liu, Chunyan; Jurich, Daniel; Morrison, Carol; Grabovsky, Irina – Applied Measurement in Education, 2021

The existence of outliers in the anchor items can be detrimental to the estimation of examinee ability and undermine the validity of score interpretation across forms. However, in practice, anchor item performance can become distorted due to various reasons. This study compares the performance of modified "INFIT" and "OUTFIT"…

Descriptors: Equated Scores, Test Items, Item Response Theory, Difficulty Level

Effectiveness of Equating at the Passing Score for Exams with Small Sample Sizes

Peer reviewed

Wolkowitz, Amanda A.; Wright, Keith D. – Journal of Educational Measurement, 2019

This article explores the amount of equating error at a passing score when equating scores from exams with small sample sizes. This article focuses on equating using classical test theory methods of Tucker linear, Levine linear, frequency estimation, and chained equipercentile equating. Both simulation and real data studies were used in the…

Descriptors: Error Patterns, Sample Size, Test Theory, Test Bias

The Effect of Chance Success on Equalization Error in Test Equation Based on Classical Test Theory

Peer reviewed

Koçak, Duygu – International Journal of Progressive Education, 2020

The aim of this study was to determine the effect of chance success on test equating. For this purpose, artificially generated data sets with sample sizes of 500 and 1000 were equated using linear and equipercentile equating methods. In the simulated data, a total of four cases were created with no…

Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size

Some Methods and Evaluation for Linking and Equating with Small Samples

Peer reviewed

Peabody, Michael R. – Applied Measurement in Education, 2020

The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models will be provided. A brief discussion of the conditions required…

Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory

Effect of Sample Size on Common Item Equating Using the Dichotomous Rasch Model

Peer reviewed

O'Neill, Thomas R.; Gregg, Justin L.; Peabody, Michael R. – Applied Measurement in Education, 2020

This study addresses equating issues with varying sample sizes using the Rasch model by examining how sample size affects the stability of item calibrations and person ability estimates. A resampling design was used to create 9 sample size conditions (200, 100, 50, 45, 40, 35, 30, 25, and 20), each replicated 10 times. Items were recalibrated…

Descriptors: Sample Size, Equated Scores, Item Response Theory, Raw Scores

Effect of Item Parameter Drift in Mixed Format Common Items on Test Equating

Peer reviewed

Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022

The aim of the study was to examine the effect of parameter drift in mixed-format common items (e.g., multiple-choice and essay items) on test equating performed with the common item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…

Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
