ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	12
Since 2017 (last 10 years)	22
Since 2007 (last 20 years)	77

Descriptor

Comparative Analysis	135
Test Items	135
Simulation	112
Item Response Theory	69
Computer Assisted Testing	32
Item Analysis	30
Models	26
Adaptive Testing	25
Sample Size	25
Computer Simulation	23
Statistical Analysis	23
Difficulty Level	21
Evaluation Methods	21
Scores	21
Correlation	16
Equated Scores	16
Error of Measurement	16
Mathematical Models	16
Test Bias	16
Test Format	16
Factor Analysis	15
Test Construction	14
Test Length	14
Ability	13
Computation	13
More ▼

Publication Type

Journal Articles	98
Reports - Research	87
Reports - Evaluative	36
Speeches/Meeting Papers	24
Dissertations/Theses -…	9
Reports - Descriptive	4
Information Analyses	1
Numerical/Quantitative Data	1
Tests/Questionnaires	1

Education Level

Elementary Secondary Education	4
Higher Education	4
Secondary Education	4
Postsecondary Education	3
Grade 12	2
High Schools	2
Elementary Education	1
Grade 4	1
Intermediate Grades	1

Audience

Researchers

Location

Botswana	1
Honduras	1
Japan	1
Netherlands	1
Tunisia	1

Laws, Policies, & Programs

Assessments and Surveys

Trends in International…	5
National Assessment of…	3
Program for International…	3
ACT Assessment	2
Advanced Placement…	1
Armed Services Vocational…	1
Law School Admission Test	1
Raven Advanced Progressive…	1
SAT (College Admission Test)	1
Test of English as a Foreign…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 135 results Save | Export

IRT Linking Methods for the Bifactor Model with Mixed Format Tests

Peer reviewed

Direct link

Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025

This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…

Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis

Analyzing Polytomous Test Data: A Comparison between an Information-Based IRT Model and the Generalized Partial Credit Model

Peer reviewed

Direct link

Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024

Item response theory (IRT) models the relationship between the possible scores on a test item against a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…

Descriptors: Item Response Theory, Test Items, Models, Scoring

Comparison of Item Response Theory Ability and Item Parameters According to Classical and Bayesian Estimation Methods

Peer reviewed
PDF on ERIC

Download full text

Eray Selçuk; Ergül Demir – International Journal of Assessment Tools in Education, 2024

This research aims to compare the ability and item parameter estimations of Item Response Theory according to Maximum likelihood and Bayesian approaches in different Monte Carlo simulation conditions. For this purpose, depending on the changes in the priori distribution type, sample size, test length, and logistics model, the ability and item…

Descriptors: Item Response Theory, Item Analysis, Test Items, Simulation

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Direct link

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Direct link

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

The Study of the Effect of Item Parameter Drift on Ability Estimation Obtained from Adaptive Testing under Different Conditions

Peer reviewed
PDF on ERIC

Download full text

Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022

Item parameter drift (IPD) is the systematic differentiation of parameter values of items over time due to various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…

Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement

Hybrid Maximum Clique Algorithm Using Parallel Integer Programming for Uniform Test Assembly

Peer reviewed

Direct link

Fuchimoto, Kazuma; Ishii, Takatoshi; Ueno, Maomi – IEEE Transactions on Learning Technologies, 2022

Educational assessments often require uniform test forms, for which each test form has equivalent measurement accuracy but with a different set of items. For uniform test assembly, an important issue is the increase of the number of assembled uniform tests. Although many automatic uniform test assembly methods exist, the maximum clique algorithm…

Descriptors: Simulation, Efficiency, Test Items, Educational Assessment

Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests

Peer reviewed

Direct link

Huang, Hung-Yu – Educational and Psychological Measurement, 2023

The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…

Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Direct link

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Direct link

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Direct link

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Direct link

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Investigation of Equating Error in Tests with Differential Item Functioning

Peer reviewed
PDF on ERIC

Download full text

Yurtçu, Meltem; Güzeller, Cem Oktay – International Journal of Assessment Tools in Education, 2018

In this study purposes to indicate the effect of the number of DIF items and the distribution of DIF items in these forms, which be equalized on equating error. Mean-mean, mean-standard deviation, Haebara and Stocking-Lord Methods used in common item design equal groups as equalization methods. The study included six different simulation…

Descriptors: Error Patterns, Test Items, Item Analysis, Simulation

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Direct link

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Testing Latent Variable Distribution Fit in IRT Using Posterior Residuals

Peer reviewed

Direct link

Monroe, Scott – Journal of Educational and Behavioral Statistics, 2021

This research proposes a new statistic for testing latent variable distribution fit for unidimensional item response theory (IRT) models. If the typical assumption of normality is violated, then item parameter estimates will be biased, and dependent quantities such as IRT score estimates will be adversely affected. The proposed statistic compares…

Descriptors: Item Response Theory, Simulation, Scores, Comparative Analysis

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Applied Psychological…	19
Journal of Educational…	15
Educational and Psychological…	14
Applied Measurement in…	10
ETS Research Report Series	10
ProQuest LLC	9
International Journal of…	3
International Journal of…	3
Journal of Educational and…	3
Multivariate Behavioral…	3
Psychometrika	3
IEEE Transactions on Learning…	2
Practical Assessment,…	2
Asia Pacific Education Review	1
British Journal of…	1
Education and Information…	1
Educational Measurement:…	1
Hacettepe University Journal…	1
Journal of Science Education…	1
Large-scale Assessments in…	1
Learning and Instruction	1
Pearson	1
Perspectives in Education	1
Quality Assurance in…	1
Structural Equation Modeling	1
More ▼

Chang, Hua-Hua	5
Cohen, Allan S.	3
Nandakumar, Ratna	3
Reckase, Mark D.	3
Stocking, Martha L.	3
Cho, Sun-Joo	2
Dodd, Barbara G.	2
Hsu, Tse-Chi	2
Ishii, Takatoshi	2
Kamata, Akihito	2
Kim, Seock-Ho	2
Kirisci, Levent	2
Kolen, Michael J.	2
Koziol, Natalie A.	2
Lee, Won-Chan	2
McKinley, Robert L.	2
Narayanan, Pankaja	2
Oshima, T. C.	2
Paek, Insu	2
Parshall, Cynthia G.	2
Penfield, Randall D.	2
Schumacker, Randall E.	2
Smith, Richard M.	2
Swaminathan, H.	2
More ▼