ERIC - Search Results

Showing all 13 results

Assessing Dimensionality of IRT Models Using Traditional and Revised Parallel Analyses

Peer reviewed

Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023

Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…

Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines

There Are Many Greater Lower Bounds than Cronbach's [alpha]: A Monte Carlo Simulation Study

Peer reviewed

Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023

A Monte Carlo simulation study was conducted to examine the performance of [alpha], [lambda]2, [lambda][subscript 4], [lambda][subscript 2], [omega][subscript T], GLB[subscript MRFA], and GLB[subscript Algebraic] coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…

Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Routing Strategies and Optimizing Design for Multistage Testing in International Large-Scale Assessments

Peer reviewed

Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019

This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…

Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory

Monte Carlo Simulation in Item Response Theory Applications Using SAS

Peer reviewed

Ames, Allison J.; Leventhal, Brian C.; Ezike, Nnamdi C. – Measurement: Interdisciplinary Research and Perspectives, 2020

Data simulation and Monte Carlo simulation studies are important skills for researchers and practitioners of educational and psychological measurement, but there are few resources on the topic specific to item response theory. Even fewer resources exist on the statistical software techniques to implement simulation studies. This article presents…

Descriptors: Monte Carlo Methods, Item Response Theory, Simulation, Computer Software

Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

Peer reviewed

Yao, Lihua – Psychometrika, 2012

Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…

Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing

Computerized Adaptive Testing with the Zinnes and Griggs Pairwise Preference Ideal Point Model

Peer reviewed

Stark, Stephen; Chernyshenko, Oleksandr S. – International Journal of Testing, 2011

This article delves into a relatively unexplored area of measurement by focusing on adaptive testing with unidimensional pairwise preference items. The use of such tests is becoming more common in applied non-cognitive assessment because research suggests that this format may help to reduce certain types of rater error and response sets commonly…

Descriptors: Test Length, Simulation, Adaptive Testing, Item Analysis

Simulated and Empirical Studies of Flexilevel Testing in Air Force Technical Training Courses. Final Report for Period 1 May 1975-30 April 1977.

Harris, Dickie A.; Penell, Roger J. – 1977

This study used a series of simulations to answer questions about the efficacy of adaptive testing raised by empirical studies. The first study showed that for reasonably high entry points, parameters estimated from paper-and-pencil test protocols cross-validated remarkably well to groups actually tested at a computer terminal. This suggested that…

Descriptors: Adaptive Testing, Computer Assisted Testing, Cost Effectiveness, Difficulty Level

Some Results on the Robustness of Latent Trait Models.

Hambleton, Ronald K.; Cook, Linda L. – 1978

The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…

Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis

Test Speededness under Number-Right Scoring: An Analysis of the Test of English as a Foreign Language.

Bejar, Isaac I. – 1985

The Test of English as a Foreign Language (TOEFL) was used in this study, which attempted to develop a new methodology for assessing the speededness of right-scored tests. Traditional procedures of assessing speededness have assumed that the test is scored under formula-scoring instructions; this approach is not always appropriate. In this study,…

Descriptors: College Entrance Examinations, English (Second Language), Estimation (Mathematics), Evaluation Methods

An Adaptive Testing Strategy for Achievement Test Batteries. Research Report 77-6.

Brown, Joel M.; Weiss, David J. – 1977

An adaptive testing strategy is described for achievement tests covering multiple content areas. The strategy combines adaptive item selection both within and between the subtests in the multiple-subtest battery. A real-data simulation was conducted to compare the results from adaptive testing and from conventional testing, in terms of test…

Descriptors: Achievement Tests, Adaptive Testing, Branching, Comparative Analysis

Evaluations of Implied Orders as a Basis for Tailored Testing Using Simulations. Technical Report No. 4.

Cliff, Norman; And Others – 1977

TAILOR is a computer program that uses the implied orders concept as the basis for computerized adaptive testing. The basic characteristics of TAILOR, which does not involve pretesting, are reviewed here and two studies of it are reported. One is a Monte Carlo simulation based on the four-parameter Birnbaum model and the other uses a matrix of…

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Difficulty Level
