Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Akin-Arikan, Çigdem; Gelbal, Selahattin – Eurasian Journal of Educational Research, 2021
Purpose: This study aims to compare the performances of Item Response Theory (IRT) equating and kernel equating (KE) methods based on equating errors (RMSD) and standard error of equating (SEE) using the anchor item nonequivalent groups design. Method: Within this scope, a set of conditions, including ability distribution, type of anchor items…
Descriptors: Equated Scores, Item Response Theory, Test Items, Statistical Analysis
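The two evaluation criteria named in this abstract, RMSD and SEE, are routinely computed from simulation replications. A minimal sketch (the function name and toy data are illustrative, not from the paper), using the decomposition RMSD² = bias² + SEE²:

```python
import numpy as np

def equating_criteria(reps, criterion):
    """Evaluate an equating method over simulation replications.

    reps      : (n_replications, n_score_points) equated scores
    criterion : (n_score_points,) true/criterion equivalents
    Returns per-score-point bias, SEE, and RMSD.
    """
    err = reps - criterion            # error of each replication at each point
    bias = err.mean(axis=0)           # systematic error
    see = err.std(axis=0)             # random error (ddof=0 so RMSD^2 = bias^2 + SEE^2)
    rmsd = np.sqrt((err ** 2).mean(axis=0))
    return bias, see, rmsd

# Tiny illustrative run: two replications, two score points
bias, see, rmsd = equating_criteria(np.array([[1.0, 3.0], [3.0, 5.0]]),
                                    np.array([2.0, 2.0]))
```

The `ddof=0` choice makes the bias/SEE/RMSD identity exact, which is a convenient internal check in equating simulations.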
Koçak, Duygu – International Journal of Progressive Education, 2020
The aim of this study was to determine the effect of chance success (guessing) on test equating. For this purpose, artificially generated data sets with sample sizes of 500 and 1,000 were equated using linear equating and equipercentile equating methods. In the simulated data, a total of four cases were created with no…
Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size
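For readers unfamiliar with the two methods compared in this study, here is a hedged sketch of classical linear and equipercentile observed-score equating (function names and data are illustrative, not from the paper):

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Linear equating: map x so that form X's mean/SD match form Y's."""
    mu_x, sd_x = scores_x.mean(), scores_x.std(ddof=1)
    mu_y, sd_y = scores_y.mean(), scores_y.std(ddof=1)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

def equipercentile_equate(x, scores_x, scores_y):
    """Equipercentile equating: map x to the form-Y score with the same
    percentile rank (mid-percentile convention, interpolated quantile)."""
    p = (scores_x < x).mean() + 0.5 * (scores_x == x).mean()
    return np.quantile(scores_y, p)

# Illustrative use with simulated score distributions
rng = np.random.default_rng(0)
form_x = np.round(rng.normal(50, 10, 1000))
form_y = np.round(rng.normal(55, 9, 1000))
y_equiv_linear = linear_equate(60.0, form_x, form_y)
y_equiv_equip = equipercentile_equate(60.0, form_x, form_y)
```

With smooth, similar-shaped distributions the two methods agree closely; they diverge when the score distributions differ in shape, which is one reason simulation studies like this one compare them under varying conditions.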
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy of a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
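Chained linear equating, the small-sample method compared here, maps a form-X cut-score to form Y through the anchor test in two linear links; a minimal sketch under a nonequivalent-groups anchor design (function names and data are illustrative, not the author's code):

```python
import numpy as np

def linear_link(mu_from, sd_from, mu_to, sd_to, score):
    """One linear link: match means and SDs of the two score scales."""
    return mu_to + (sd_to / sd_from) * (score - mu_from)

def chained_linear_cutscore(cut_x, x1, a1, a2, y2):
    """Chain a form-X cut-score to form Y through anchor A:
    X -> A estimated in group 1 (took X and A),
    A -> Y estimated in group 2 (took A and Y)."""
    a_score = linear_link(x1.mean(), x1.std(ddof=1),
                          a1.mean(), a1.std(ddof=1), cut_x)
    return linear_link(a2.mean(), a2.std(ddof=1),
                       y2.mean(), y2.std(ddof=1), a_score)
```

Because each link is estimated from a different group, small samples inflate the variability of both links, which is exactly the trade-off against judgemental standard setting that the simulation examines.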
Schoen, Robert C.; Yang, Xiaotong; Paek, Insu – Grantee Submission, 2018
This report provides evidence of the substantive and structural validity of the Knowledge for Teaching Elementary Fractions Test. Field-test data were gathered with a sample of 241 elementary educators, including teachers, administrators, and instructional support personnel, in spring 2017, as part of a larger study involving a multisite…
Descriptors: Psychometrics, Pedagogical Content Knowledge, Mathematics Tests, Mathematics Instruction
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
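The examinee-sampling source of variability described in this abstract is commonly quantified by resampling; a bootstrap sketch of the SEE for a simple linear equating function (illustrative, not the authors' procedure, which additionally considers item sampling):

```python
import numpy as np

def linear_equate(x, sx, sy):
    """Linear equating of score x from form X to the form-Y scale."""
    return sy.mean() + (sy.std(ddof=1) / sx.std(ddof=1)) * (x - sx.mean())

def bootstrap_see(x, scores_x, scores_y, n_boot=500, seed=0):
    """Bootstrap SEE at score point x: resample examinees with
    replacement and recompute the equating conversion each time."""
    rng = np.random.default_rng(seed)
    reps = []
    for _ in range(n_boot):
        bx = rng.choice(scores_x, size=scores_x.size, replace=True)
        by = rng.choice(scores_y, size=scores_y.size, replace=True)
        reps.append(linear_equate(x, bx, by))
    return np.std(reps, ddof=1)
```

The SEE shrinks roughly with the square root of the sample size; treating the common items as fixed, as the abstract notes, means this examinee resampling is typically the only variance source captured.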
Liu, Jinghua; Sinharay, Sandip; Holland, Paul; Feigenbaum, Miriam; Curley, Edward – Educational and Psychological Measurement, 2011
Two different types of anchors are investigated in this study: a mini-version anchor and an anchor with a smaller spread of difficulty than the tests to be equated. The latter is referred to as a midi anchor. The impact of these two different types of anchors on observed score equating is evaluated and compared with respect to systematic error…
Descriptors: Equated Scores, Test Items, Difficulty Level, Statistical Bias
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009
This study explores the use of a different type of anchor, a "midi anchor", that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor". The impact of different anchors on observed score equating were evaluated and compared with respect to systematic…
Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level

Green, Donald Ross; And Others – Applied Measurement in Education, 1989
Potential benefits of using item response theory in test construction are evaluated using the experience and evidence accumulated during nine years of using a three-parameter model in the development of major achievement batteries. Topics addressed include error of measurement, test equating, item bias, and item difficulty. (TJH)
Descriptors: Achievement Tests, Computer Assisted Testing, Difficulty Level, Equated Scores
Li, Yuan H.; Griffith, William D.; Tam, Hak P. – 1997
This study explores the relative merits of a potentially useful item response theory (IRT) linking design: using a single set of anchor items with fixed common item parameters (FCIP) during the calibration process. An empirical study was conducted to investigate the appropriateness of this linking design using 6 groups of students taking 6 forms…
Descriptors: Ability, Difficulty Level, Equated Scores, Error of Measurement
Li, Yuan H.; Lissitz, Robert W. – Journal of Educational Measurement, 2004
The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function without examinees' responses to test items, and the empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates can be obtained when the same set of items is…
Descriptors: Test Items, Computation, Item Response Theory, Error of Measurement
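For the Rasch special case, the asymptotic SE of an item-difficulty estimate follows directly from Fisher information and indeed requires no observed responses, only the examinee abilities; a minimal sketch (assumed Rasch model; names are illustrative, not the authors' notation):

```python
import numpy as np

def rasch_item_se(b, thetas):
    """Asymptotic SE of a Rasch item-difficulty estimate.

    Fisher information for difficulty b is sum_i P_i(1 - P_i), where
    P_i = P(correct | theta_i, b); no response data are needed.
    """
    p = 1.0 / (1.0 + np.exp(-(thetas - b)))
    info = np.sum(p * (1.0 - p))
    return 1.0 / np.sqrt(info)
```

For example, 100 examinees all located at the item's difficulty (P = 0.5 each) give information 25 and hence SE = 0.2; information, and thus precision, falls off as abilities move away from the item's difficulty.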
Curry, Allen R.; And Others – 1978
The efficacy of employing subsets of items from a calibrated item pool to estimate the Rasch model person parameters was investigated. Specifically, the degree of invariance of Rasch model ability-parameter estimates was examined across differing collections of simulated items. The ability-parameter estimates were obtained from a simulation of…
Descriptors: Career Development, Difficulty Level, Equated Scores, Error of Measurement
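Estimating Rasch person parameters from a subset of calibrated items, as in this study, can be sketched as Newton-Raphson ML estimation with the item difficulties held fixed (illustrative code; note that ML ability is undefined for perfect or zero raw scores):

```python
import numpy as np

def rasch_ability(responses, b, n_iter=20):
    """ML estimate of Rasch ability theta given fixed item difficulties b.

    responses : 0/1 vector of scored item responses
    b         : calibrated item difficulties (same length)
    Newton-Raphson on the log-likelihood; in the Rasch model the raw
    score r is a sufficient statistic for theta.
    """
    r = responses.sum()
    theta = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))
        grad = r - p.sum()             # score function: observed - expected score
        hess = -np.sum(p * (1.0 - p))  # information (expected = observed for Rasch)
        theta -= grad / hess
    return theta
```

Invariance of this kind of estimate across different item subsets drawn from the same calibrated pool is precisely the property the study examines.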