Publication Date
In 2025: 0
Since 2024: 2
Since 2021 (last 5 years): 6
Since 2016 (last 10 years): 8
Since 2006 (last 20 years): 10
Descriptor
Item Analysis: 12
Sample Size: 12
Test Length: 12
Test Items: 9
Item Response Theory: 7
Comparative Analysis: 6
Monte Carlo Methods: 5
Error of Measurement: 4
Goodness of Fit: 4
Simulation: 4
Accuracy: 3
Source
Applied Measurement in…: 3
Educational and Psychological…: 2
Measurement:…: 2
ETS Research Report Series: 1
Eurasian Journal of…: 1
Journal of Educational…: 1
Author
Lee, Won-Chan: 2
Cohen, Allan S.: 1
Ames, Allison J.: 1
Ansley, Timothy N.: 1
Chon, Kyong Hee: 1
Dorans, Neil J.: 1
Ezike, Nnamdi C.: 1
Goodrich, J. Marc: 1
Guo, Hongwen: 1
Huang, Feifei: 1
Hutten, Leah R.: 1
Publication Type
Reports - Research: 11
Journal Articles: 10
Speeches/Meeting Papers: 2
Reports - Evaluative: 1
Education Level
Higher Education: 1
Postsecondary Education: 1
Audience
Researchers: 1
Assessments and Surveys
Iowa Tests of Basic Skills: 1
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of the α, λ₂, λ₄, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
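The classical coefficients named in this abstract have simple sample formulas. The sketch below illustrates two of them, coefficient α and Guttman's λ₂, on simulated one-factor data; the data generator, loadings, and sample size are illustrative assumptions, not the study's actual simulation design.

```python
import numpy as np

def coefficient_alpha(x):
    """Coefficient alpha for an n_persons x k_items score matrix."""
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def guttman_lambda2(x):
    """Guttman's lambda-2: replaces part of alpha's numerator with a
    term based on squared inter-item covariances; never below alpha."""
    k = x.shape[1]
    c = np.cov(x, rowvar=False)
    off = c - np.diag(np.diag(c))          # off-diagonal covariances
    total_var = x.sum(axis=1).var(ddof=1)
    return (off.sum() + np.sqrt(k / (k - 1) * (off ** 2).sum())) / total_var

# Illustrative one-factor data: 500 persons, 10 items (assumed values)
rng = np.random.default_rng(1)
theta = rng.normal(size=(500, 1))            # common factor
x = 0.7 * theta + rng.normal(size=(500, 10)) # items = loading * factor + noise
print(coefficient_alpha(x), guttman_lambda2(x))
```

By the Cauchy-Schwarz inequality, λ₂ ≥ α in every sample, which is one reason such studies compare the two.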
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
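The information criteria this study compares all have closed forms once a model's log-likelihood, parameter count, and sample size are known. A minimal sketch, assuming those three quantities are already in hand (the example log-likelihoods and parameter counts below are made up for illustration):

```python
import math

def info_criteria(loglik, n_params, n_obs):
    """AIC, corrected AIC, BIC, and consistent AIC for a fitted model."""
    aic = -2 * loglik + 2 * n_params
    aicc = aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    bic = -2 * loglik + n_params * math.log(n_obs)
    caic = -2 * loglik + n_params * (math.log(n_obs) + 1)
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "CAIC": caic}

# Hypothetical mixture-IRT fits: pick the class count minimizing a criterion
fits = {"1-class": (-5200.0, 20), "2-class": (-5100.0, 41), "3-class": (-5095.0, 62)}
for name, (ll, p) in fits.items():
    print(name, info_criteria(ll, p, n_obs=1000))
```

BIC and CAIC penalize extra latent classes more heavily than AIC as the sample grows, which is why simulation studies like this one find they select different class counts.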
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
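The characteristic curve linking idea behind IWCC/TWCC can be sketched compactly: find the scale constants A and B that minimize (weighted) squared differences between item characteristic curves on the two forms. The item parameters, quadrature grid, and the noiseless setup below are illustrative assumptions; the papers' contribution is weighting this criterion by item or test information, which the uniform weight here merely stands in for.

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL item characteristic curves for a theta grid (rows) x items (cols)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def haebara_loss(A, B, a_new, b_new, a_old, b_old, theta, w=1.0):
    """Haebara-type criterion: squared ICC differences after mapping
    new-form parameters onto the old-form scale (a/A, A*b + B)."""
    diff = p2pl(theta, a_new / A, A * b_new + B) - p2pl(theta, a_old, b_old)
    return np.sum(w * diff ** 2)

# Assumed old-form parameters; new form differs by true constants A=1.2, B=0.5
a_old = np.array([0.8, 1.2, 1.6, 2.0])
b_old = np.array([-1.0, -0.2, 0.4, 1.1])
a_new = a_old * 1.2
b_new = (b_old - 0.5) / 1.2

theta = np.linspace(-4, 4, 41)
# Coarse grid search for the linking constants (dependency-free sketch)
losses = [(haebara_loss(A, B, a_new, b_new, a_old, b_old, theta), A, B)
          for A in np.arange(0.5, 2.01, 0.01)
          for B in np.arange(-1.0, 1.01, 0.01)]
best_loss, A_hat, B_hat = min(losses)
print(A_hat, B_hat)
```

With noiseless parameters the criterion is minimized at the true constants; the papers' point is that with *estimated* parameters, down-weighting poorly estimated items improves recovery.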
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Tulek, Onder Kamil; Kose, Ibrahim Alper – Eurasian Journal of Educational Research, 2019
Purpose: This research investigates tests that include DIF items and tests that have been purified of DIF items. Ability estimates from the two versions are compared to determine whether the estimates are correlated. Method: The researcher used R 3.4.1 to compare the items;…
Descriptors: Test Items, Item Analysis, Item Response Theory, Test Length
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
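The Mantel-Haenszel procedure mentioned here pools 2x2 right/wrong tables for a reference and a focal group across matched score levels into a common odds ratio, which ETS reports on a delta scale. A minimal sketch, with made-up example tables (the real procedure matches on total score and applies a chi-square test as well):

```python
import math

def mantel_haenszel_dif(tables):
    """Common odds ratio and ETS delta-scale statistic from per-score-level
    2x2 tables given as (ref_right, ref_wrong, foc_right, foc_wrong)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    alpha_mh = num / den
    delta_mh = -2.35 * math.log(alpha_mh)   # ETS delta metric; 0 = no DIF
    return alpha_mh, delta_mh

# Hypothetical item with no DIF: identical odds in both groups at each level
null_tables = [(30, 10, 30, 10), (20, 20, 20, 20)]
# Hypothetical item favoring the reference group
dif_tables = [(40, 10, 25, 25)]
print(mantel_haenszel_dif(null_tables), mantel_haenszel_dif(dif_tables))
```

ETS conventionally flags |delta| < 1 as negligible (category A) and |delta| > 1.5 as large (category C) DIF.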
Ames, Allison J.; Leventhal, Brian C.; Ezike, Nnamdi C. – Measurement: Interdisciplinary Research and Perspectives, 2020
Data simulation and Monte Carlo simulation studies are important skills for researchers and practitioners of educational and psychological measurement, but there are few resources on the topic specific to item response theory. Even fewer resources exist on the statistical software techniques to implement simulation studies. This article presents…
Descriptors: Monte Carlo Methods, Item Response Theory, Simulation, Computer Software
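The core step in the kind of simulation study this article teaches is generating dichotomous responses from an IRT model. A minimal sketch under a 2PL model; the discrimination and difficulty values below are illustrative, not from the article:

```python
import numpy as np

def simulate_2pl(n_persons, a, b, seed=0):
    """Simulate 0/1 responses under a 2PL IRT model with N(0,1) abilities."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_persons)
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))  # persons x items
    return (rng.uniform(size=p.shape) < p).astype(int), theta

# Assumed item parameters: four items from easy (b=-1) to hard (b=1)
a = np.array([0.8, 1.0, 1.5, 2.0])
b = np.array([-1.0, 0.0, 0.5, 1.0])
x, theta = simulate_2pl(1000, a, b)
print(x.mean(axis=0))  # proportion correct per item
```

Wrapping this generator in a loop over conditions (sample size, test length, parameter values) and replications is the basic Monte Carlo design the article describes.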
Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N. – Applied Measurement in Education, 2013
Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included PARSCALE's G²,…
Descriptors: Test Format, Test Items, Item Analysis, Goodness of Fit
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
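Copying indexes of the kind compared here ask how surprising an observed number of matching responses is. The published indexes condition on estimated ability and item parameters; the sketch below is a deliberately simplified stand-in that treats matches as a plain binomial event, with an assumed per-item chance-match probability:

```python
from math import comb

def match_pvalue(n_items, n_matches, p_chance):
    """Upper-tail binomial probability of at least n_matches identical
    responses if two examinees answer independently. Simplified sketch:
    real copying indexes derive p_chance from IRT ability estimates."""
    return sum(comb(n_items, m) * p_chance ** m * (1 - p_chance) ** (n_items - m)
               for m in range(n_matches, n_items + 1))

# 40 matches on a 50-item test is far beyond chance at p_chance = 0.25
print(match_pvalue(50, 40, 0.25))
```

The abstract's point survives even this toy version: at small α levels the tail probability must be extreme before copying is flagged, so power suffers when the amount of copying is small.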
Lautenschlager, Gary J.; Park, Dong-Gun – 1987
The effects of variations in degree of range restriction and different subgroup sample sizes on the validity of several item bias detection procedures based on Item Response Theory (IRT) were investigated in a simulation study. The degree of range restriction for each of two subpopulations was varied by cutting the specified subpopulation ability…
Descriptors: Computer Simulation, Item Analysis, Latent Trait Theory, Mathematical Models
Hutten, Leah R. – 1979
The goodness of fit of raw test score data was compared using two latent trait models: the Rasch model and the Birnbaum three-parameter logistic model. Data were taken from various achievement tests and the Scholastic Aptitude Test (Verbal). A minimum sample size of 1,000 was required, and the minimum test length was 40 items. Results indicated that…
Descriptors: Ability Identification, Achievement Tests, College Entrance Examinations, Comparative Analysis