ERIC - Search Results

Showing 1 to 15 of 24 results

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

A Comparison of Kernel Equating and Item Response Theory Equating Methods

Peer reviewed

Akin-Arikan, Çigdem; Gelbal, Selahattin – Eurasian Journal of Educational Research, 2021

Purpose: This study aims to compare the performances of Item Response Theory (IRT) equating and kernel equating (KE) methods based on equating errors (RMSD) and standard error of equating (SEE) using the anchor item nonequivalent groups design. Method: Within this scope, a set of conditions, including ability distribution, type of anchor items…

Descriptors: Equated Scores, Item Response Theory, Test Items, Statistical Analysis

Subscore Equating and Profile Reporting

Peer reviewed

Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020

The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…

Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level

Does Comparative Judgement of Scripts Provide an Effective Means of Maintaining Standards in Mathematics? Research Report

Benton, Tom; Leech, Tony; Hughes, Sarah – Cambridge Assessment, 2020

In the context of examinations, the phrase "maintaining standards" usually refers to any activity designed to ensure that it is no easier (or harder) to achieve a given grade in one year than in another. Specifically, it tends to mean activities associated with setting examination grade boundaries. Benton et al. (2020) describe a method…

Descriptors: Mathematics Tests, Equated Scores, Comparative Analysis, Difficulty Level

The Effect of Mini and Midi Anchor Tests on Test Equating

Peer reviewed

Arikan, Çigdem Akin – International Journal of Progressive Education, 2018

The main purpose of this study is to compare the test forms to the midi anchor test and the mini anchor test performance based on item response theory. The research was conducted using simulated data generated from the Rasch model. In order to equate two test forms, the anchor item nonequivalent groups design (internal anchor test) was…

Descriptors: Equated Scores, Comparative Analysis, Item Response Theory, Tests

Equating without an Anchor for Nonequivalent Groups of Examinees

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015

An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…

Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring

Situations Where It Is Appropriate to Use Frequency Estimation Equipercentile Equating

Peer reviewed

Guo, Hongwen; Oh, Hyeonjoo J.; Eignor, Daniel – Journal of Educational Measurement, 2013

In operational equating situations, frequency estimation equipercentile equating is considered only when the old and new groups have similar abilities. The frequency estimation assumptions are investigated in this study under various situations from both the levels of theoretical interest and practical use. It shows that frequency estimation…

Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items

Equated Pooled Booklet Method in DIF Testing

Peer reviewed

Cheng, Ying; Chen, Peihua; Qian, Jiahe; Chang, Hua-Hua – Applied Psychological Measurement, 2013

Differential item functioning (DIF) analysis is an important step in the data analysis of large-scale testing programs. Nowadays, many such programs endorse matrix sampling designs to reduce the load on examinees, such as the balanced incomplete block (BIB) design. These designs pose challenges to the traditional DIF analysis methods. For example,…

Descriptors: Test Bias, Equated Scores, Test Items, Effect Size

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

An Application of Reverse Engineering to Automatic Item Generation: A Proof of Concept Using Automatically Generated Figures

Lorié, William A. – Online Submission, 2013

A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…

Descriptors: Numeracy, Mathematical Concepts, Mathematical Logic, Difficulty Level

A Study of Frequency Estimation Equipercentile Equating When There Are Large Ability Differences. Research Report. ETS RR-09-45

Peer reviewed

Guo, Hongwen; Oh, Hyeonjoo J. – ETS Research Report Series, 2009

In operational equating, frequency estimation (FE) equipercentile equating is often excluded from consideration when the old and new groups have a large ability difference. This convention may, in some instances, cause the exclusion of one competitive equating method from the set of methods under consideration. In this report, we study the…

Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items

Population Invariance of Vertical Scaling Results

Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012

The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…

Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests

Evaluating the Effects of Differences in Group Abilities on the Tucker and the Levine Observed-Score Methods for Common-Item Nonequivalent Groups Equating. ACT Research Report Series 2010-1

Chen, Hanwei; Cui, Zhongmin; Zhu, Rongchun; Gao, Xiaohong – ACT, Inc., 2010

The most critical feature of a common-item nonequivalent groups equating design is that the average score difference between the new and old groups can be accurately decomposed into a group ability difference and a form difficulty difference. Two widely used observed-score linear equating methods, the Tucker and the Levine observed-score methods,…

Descriptors: Equated Scores, Groups, Ability Grouping, Difficulty Level

The Effects of Different Types of Anchor Tests on Observed Score Equating. Research Report. ETS RR-09-41

Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009

This study explores the use of a different type of anchor, a "midi anchor," that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor." The impact of different anchors on observed score equating was evaluated and compared with respect to systematic…

Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement

Investigating the Effectiveness of a Synthetic Linking Function on Small Sample Equating. Research Report. ETS RR-07-37

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – ETS Research Report Series, 2007

The synthetic function, which is a weighted average of the identity (the trivial linking function for forms that are known to be completely parallel) and a traditional equating method, has been proposed as an alternative for performing linking with very small samples (Kim, von Davier, & Haberman, 2006). The purpose of the present study was to…

Descriptors: Equated Scores, Sample Size, Statistical Analysis, Licensing Examinations (Professions)
