Showing all 10 results
Peer reviewed
Direct link
Bjermo, Jonas; Miller, Frank – Applied Measurement in Education, 2021
In recent years, interest in measuring growth in student ability in various subjects between different grades in school has increased. Good precision in the estimated growth is therefore important. This paper aims to compare estimation methods and test designs with respect to precision and bias of the estimated growth of mean ability…
Descriptors: Scaling, Ability, Computation, Test Items
Peer reviewed
Direct link
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2016
This article describes an approach to test scoring, referred to as "delta scoring" (D-scoring), for tests with dichotomously scored items. The D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the…
Descriptors: Scoring, Equated Scores, Test Items, Measurement
Peer reviewed
PDF on ERIC
Ali, Usama S.; Walker, Michael E. – ETS Research Report Series, 2014
Two methods are currently in use at Educational Testing Service (ETS) for equating observed item difficulty statistics. The first method involves the linear equating of item statistics in an observed sample to reference statistics on the same items. The second method, or the item response curve (IRC) method, involves the summation of conditional…
Descriptors: Difficulty Level, Test Items, Equated Scores, Causal Models
Peer reviewed
Direct link
Guo, Hongwen; Oh, Hyeonjoo J.; Eignor, Daniel – Journal of Educational Measurement, 2013
In operational equating situations, frequency estimation equipercentile equating is considered only when the old and new groups have similar abilities. The frequency estimation assumptions are investigated in this study under various conditions, from the standpoints of both theoretical interest and practical use. The study shows that frequency estimation…
Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items
Peer reviewed
Direct link
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Peer reviewed
Direct link
Ye, Meng; Xin, Tao – Educational and Psychological Measurement, 2014
The authors explored the effects of drifting common items on vertical scaling within the higher order framework of item parameter drift (IPD). The results showed that if IPD occurred between a pair of test levels, the scaling performance started to deviate from the ideal state, as indicated by bias of scaling. When there were two items drifting…
Descriptors: Scaling, Test Items, Equated Scores, Achievement Gains
Peer reviewed
Direct link
Miller, G. Edward; Fitzpatrick, Steven J. – Educational and Psychological Measurement, 2009
Incorrect handling of item parameter drift during the equating process can result in equating error. If the item parameter drift is due to construct-irrelevant factors, then inclusion of these items in the estimation of the equating constants can be expected to result in equating error. On the other hand, if the item parameter drift is related to…
Descriptors: Equated Scores, Computation, Item Response Theory, Test Items
Peer reviewed
PDF on ERIC
Guo, Hongwen; Oh, Hyeonjoo J. – ETS Research Report Series, 2009
In operational equating, frequency estimation (FE) equipercentile equating is often excluded from consideration when the old and new groups have a large ability difference. This convention may, in some instances, exclude a competitive equating method from the set of methods under consideration. In this report, we study the…
Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items
Michaelides, Michalis P. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2006
Consistent behavior is a desirable characteristic that common items are expected to have when administered to different groups. Findings from the literature have established that items do not always behave in consistent ways; item indices and IRT item parameter estimates of the same items differ when obtained from different administrations…
Descriptors: Equated Scores, Test Items, Item Response Theory, Evaluation Methods
Peer reviewed
Direct link
Li, Yuan H.; Lissitz, Robert W. – Journal of Educational Measurement, 2004
The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function without examinees' responses to test items, and the empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates can be obtained when the same set of items is…
Descriptors: Test Items, Computation, Item Response Theory, Error of Measurement