Publication Date
In 2025 | 0
Since 2024 | 1
Since 2021 (last 5 years) | 1
Since 2016 (last 10 years) | 4
Since 2006 (last 20 years) | 12
Descriptor
Difficulty Level | 19
Equated Scores | 19
Simulation | 19
Test Items | 12
Item Response Theory | 10
Error of Measurement | 8
Comparative Analysis | 6
Sample Size | 5
Correlation | 4
Test Construction | 4
Test Format | 4
Author
Holland, Paul | 2
Li, Yuan H. | 2
Sinharay, Sandip | 2
Yen, Wendy M. | 2
Antal, Judit | 1
Asiret, Semih | 1
Beyerlein, Michael M. | 1
Bogan, Evelyn Doody | 1
Bramley, Tom | 1
Carvajal-Espinoza, Jorge E. | 1
Curry, Allen R. | 1
Publication Type
Reports - Research | 12
Journal Articles | 11
Reports - Evaluative | 4
Dissertations/Theses -… | 3
Speeches/Meeting Papers | 3
Numerical/Quantitative Data | 1
Education Level
Higher Education | 1
Postsecondary Education | 1
Assessments and Surveys
Test of English as a Foreign… | 1
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine the effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy of a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
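The small-sample method named in the abstract above, chained linear equating, links forms through the anchor by matching means and standard deviations at each step. A minimal sketch under the NEAT design (the toy score arrays below are illustrative, not from the study):

```python
import numpy as np

def chained_linear_equate(x_p, a_p, a_q, y_q):
    """Chain two linear links: form X -> anchor A estimated in group P,
    then anchor A -> form Y estimated in group Q (a minimal sketch)."""
    x_p, a_p, a_q, y_q = (np.asarray(v, dtype=float) for v in (x_p, a_p, a_q, y_q))

    def linear_link(u, v):
        # map a score from the scale of u to the score on the
        # scale of v that has the same z-score
        return lambda s: v.mean() + v.std() * (s - u.mean()) / u.std()

    x_to_a = linear_link(x_p, a_p)   # X -> anchor, using group P
    a_to_y = linear_link(a_q, y_q)   # anchor -> Y, using group Q
    return lambda x: a_to_y(x_to_a(x))
```

With toy data, a form-X score one standard deviation above the group-P mean maps to the form-Y score one standard deviation above the group-Q mean, which is what the chained z-score links guarantee.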
Asiret, Semih; Sünbül, Seçil Ömür – Educational Sciences: Theory and Practice, 2016
This study aimed to compare equating methods for the random groups design with small samples, considering factors such as sample size, the difference in difficulty between forms, and the guessing parameter. It also investigated which method gives better results under which conditions. In this study, 5,000 dichotomous simulated data…
Descriptors: Equated Scores, Sample Size, Difficulty Level, Guessing (Tests)
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
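The miditest idea described above, an anchor with the same mean difficulty as the full test but a reduced spread of item difficulties, can be illustrated numerically by shrinking a minitest-style anchor's difficulties toward their mean (the difficulty values and the 0.5 shrinkage factor are illustrative only, not from the study):

```python
import numpy as np

# illustrative Rasch-style item difficulties for a 15-item minitest anchor
mini_b = np.linspace(-2.0, 2.0, 15)

# miditest-style anchor: same mean difficulty, half the difficulty spread
midi_b = mini_b.mean() + 0.5 * (mini_b - mini_b.mean())

print(round(abs(midi_b.mean() - mini_b.mean()), 10))  # mean difficulty preserved: 0.0
print(round(midi_b.std() / mini_b.std(), 6))          # spread ratio: 0.5
```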
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2014
The purpose of this study was to investigate the potential impact of misrouting under a 2-stage multistage test (MST) design, which includes 1 routing and 3 second-stage modules. Simulations were used to create a situation in which a large group of examinees took each of the 3 possible MST paths (high, middle, and low). We compared differences in…
Descriptors: Comparative Analysis, Difficulty Level, Scores, Test Wiseness
Lee, Eunjung – ProQuest LLC, 2013
The purpose of this research was to compare the equating performance of various equating procedures for multidimensional tests. To examine the various equating procedures, simulated data sets were used that were generated based on a multidimensional item response theory (MIRT) framework. Various equating procedures were examined, including…
Descriptors: Equated Scores, Tests, Comparative Analysis, Item Response Theory
Carvajal-Espinoza, Jorge E. – ProQuest LLC, 2011
The Non-Equivalent groups with Anchor Test (NEAT) design is a widely used equating design in large-scale testing that involves two groups that need not be of equal ability. Group P takes form X together with a set of anchor items A, and group Q takes form Y together with the same anchor items A. One of the most commonly used equating methods in…
Descriptors: Sample Size, Equated Scores, Psychometrics, Measurement
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
Sunnassee, Devdass – ProQuest LLC, 2011
Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…
Descriptors: Test Length, Test Format, Sample Size, Simulation
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that an anchor test used in equating should be a miniature version (or "minitest") of the tests to be equated; that is, the anchor test should be proportionally representative of the two tests in content and statistical characteristics. This paper examines the scientific foundation of this belief, especially…
Descriptors: Test Items, Equated Scores, Correlation, Tests
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level

Sunathong, Surintorn; Schumacker, Randall E.; Beyerlein, Michael M. – Journal of Applied Measurement, 2000
Studied five factors that can affect the equating of scores from two tests onto a common score scale through the simulation and equating of 4,860 item data sets. Findings indicate three statistically significant two-way interactions for common item length and test length, item difficulty standard deviation and item distribution type, and item…
Descriptors: Difficulty Level, Equated Scores, Interaction, Item Response Theory
Yen, Wendy M. – 1982
Test scores that are not perfectly reliable cannot be strictly equated unless they are strictly parallel. This fact implies that tau equivalence can be lost if an equipercentile equating is applied to observed scores that are not strictly parallel. Thirty-six simulated data sets are produced to simulate equating tests with different difficulties…
Descriptors: Difficulty Level, Equated Scores, Latent Trait Theory, Methods
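Equipercentile equating, the method examined in the Yen study above, maps each form-X score to the form-Y score with the same percentile rank. A minimal unsmoothed sketch (operational versions add presmoothing and careful tail handling; the data in the usage note are illustrative):

```python
import numpy as np

def equipercentile_equate(scores_x, scores_y):
    """Return sorted form-X scores paired with their form-Y equivalents,
    found by matching percentile ranks (no smoothing; a minimal sketch)."""
    x = np.sort(np.asarray(scores_x, dtype=float))
    y = np.asarray(scores_y, dtype=float)
    # mid-rank percentile of each X score within its own group
    ranks = (np.arange(1, len(x) + 1) - 0.5) / len(x)
    # invert the Y score distribution at those percentile ranks
    return x, np.quantile(y, ranks)
```

For example, if form-Y scores run uniformly 10 points above form-X scores, the median X score equates exactly to the median Y score, and interior scores recover the 10-point shift approximately.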
Li, Yuan H.; Griffith, William D.; Tam, Hak P. – 1997
This study explores the relative merits of a potentially useful item response theory (IRT) linking design: using a single set of anchor items with fixed common item parameters (FCIP) during the calibration process. An empirical study was conducted to investigate the appropriateness of this linking design using 6 groups of students taking 6 forms…
Descriptors: Ability, Difficulty Level, Equated Scores, Error of Measurement