ERIC - Search Results

Showing all 8 results

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

A Comparison of Estimation Techniques for IRT Models with Small Samples

Peer reviewed

Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019

The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…

Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
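For orientation, the four parameters named in the abstract above can be tied together with the textbook form of the three-parameter logistic (3PL) model. This is a minimal sketch, not code from the article: the function name `p_correct_3pl` and the argument names are illustrative assumptions, and the optional scaling constant (often written D = 1.7) is omitted.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the textbook 3PL model.

    theta: person ability
    a:     item discrimination
    b:     item difficulty
    c:     pseudo-chance (lower asymptote)

    Names are illustrative assumptions; no scaling constant D is applied.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# When ability equals difficulty, the probability sits halfway
# between the pseudo-chance level and 1.
p = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

In this formulation, setting `c` to 0 gives the two-parameter model, and additionally holding `a` constant across items gives a one-parameter model.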

Centering, Scale Indeterminacy, and Differential Item Functioning Detection in Hierarchical Generalized Linear and Generalized Linear Mixed Models

Peer reviewed

Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013

In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…

Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items

Parameter Recovery and Classification Accuracy under Conditions of Testlet Dependency: A Comparison of the Traditional 2PL, Testlet, and Bi-Factor Models

Peer reviewed

Koziol, Natalie A. – Applied Measurement in Education, 2016

Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…

Descriptors: Classification, Accuracy, Comparative Analysis, Models

An Analytic Comparison of Effect Sizes for Differential Item Functioning

Peer reviewed

Demars, Christine E. – Applied Measurement in Education, 2011

Three types of effect sizes for DIF are described in this exposition: log of the odds-ratio (differences in log-odds), differences in probability-correct, and proportion of variance accounted for. Using these indices involves conceptualizing the degree of DIF in different ways. This integrative review discusses how these measures are impacted in…

Descriptors: Effect Size, Test Bias, Probability, Difficulty Level
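Two of the three indices named in the abstract above can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions, not the article's procedure: the function names are hypothetical, and the inputs are assumed to be probabilities of a correct response for reference-group and focal-group examinees already matched on ability. The proportion-of-variance index is not covered here.

```python
import math


def log_odds_ratio(p_ref, p_focal):
    """Difference in log-odds of a correct response (reference minus focal).

    Both inputs are assumed to be probabilities strictly between 0 and 1.
    """
    return math.log(p_ref / (1.0 - p_ref)) - math.log(p_focal / (1.0 - p_focal))


def prob_difference(p_ref, p_focal):
    """Difference in probability-correct (reference minus focal)."""
    return p_ref - p_focal


# Two hypothetical items with the same odds ratio of 4 between groups:
mid_item = (0.8, 0.5)           # focal-group probability near the middle
easy_item = (36.0 / 37.0, 0.9)  # focal-group probability near the ceiling

for p_ref, p_focal in (mid_item, easy_item):
    print(log_odds_ratio(p_ref, p_focal), prob_difference(p_ref, p_focal))
```

Both hypothetical items have a log-odds difference of about 1.39, yet the probability difference is about 0.30 for the first and about 0.07 for the second, which is one way the indices conceptualize the degree of DIF differently.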

The Utility of a Modified One-Parameter IRT Model with Small Samples.

Peer reviewed

Barnes, Laura L. B.; Wise, Steven L. – Applied Measurement in Education, 1991

One-parameter and three-parameter item response theory (IRT) model estimates were compared with estimates obtained from two modified one-parameter models that incorporated a constant nonzero guessing parameter. Using small-sample simulation data (50, 100, and 200 simulated examinees), modified 1-parameter models were most effective in estimating…

Descriptors: Ability, Achievement Tests, Comparative Analysis, Computer Simulation

The Effects of Test Difficulty Manipulation in Computerized Adaptive Testing and Self-Adapted Testing.

Peer reviewed

Ponsoda, Vicente; Olea, Julio; Rodriguez, Maria Soledad; Revuelta, Javier – Applied Measurement in Education, 1999

Compared easy and difficult versions of self-adapted tests (SAT) and computerized adaptive tests. No significant differences were found among the tests for estimated ability or posttest state anxiety in studies with 187 Spanish high school students, although other significant differences were found. Discusses implications for interpreting test…

Descriptors: Ability, Adaptive Testing, Comparative Analysis, Computer Assisted Testing

Does the Use of Test Assembly Procedures Proposed in Legislation Make Any Difference in Test Properties and in the Test Performance of Black and White Test Takers?

Peer reviewed

Marco, Gary L. – Applied Measurement in Education, 1988

Four simulated mathematical and verbal test forms were produced by test assembly procedures proposed in legislative bills in California and New York in 1986 to minimize differences between majority and minority scores. Item response theory analyses of data for about 22,000 Black and 28,000 White high-school students were conducted. (SLD)

Descriptors: Black Students, College Entrance Examinations, Comparative Analysis, Culture Fair Tests