ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	17
Since 2006 (last 20 years)	41

Descriptor

Simulation	51
Test Items	51
Test Length	51
Item Response Theory	28
Sample Size	20
Comparative Analysis	13
Computer Assisted Testing	13
Error of Measurement	11
Models	11
Adaptive Testing	10
Correlation	10
Goodness of Fit	10
Item Analysis	9
Scores	9
Difficulty Level	8
Evaluation Methods	8
Test Bias	8
Test Construction	8
Accuracy	7
Computation	7
Guidelines	6
Nonparametric Statistics	6
Probability	6
Scoring	6
Ability	5
More ▼

Source

ProQuest LLC	10
Educational and Psychological…	9
Applied Psychological…	6
International Journal of…	4
Journal of Educational…	4
ETS Research Report Series	3
Measurement:…	2
Applied Measurement in…	1
Education Sciences	1
Education and Information…	1
Journal of Applied Measurement	1
Pearson	1
Quality Assurance in…	1
More ▼

Publication Type

Journal Articles	33
Reports - Research	33
Dissertations/Theses -…	10
Reports - Evaluative	8
Speeches/Meeting Papers	5

Education Level

Elementary Secondary Education	2
Early Childhood Education	1
Higher Education	1
Postsecondary Education	1
Preschool Education	1

Audience

Location

Taiwan

Laws, Policies, & Programs

Assessments and Surveys

Advanced Placement…	1
Armed Forces Qualification…	1
COMPASS (Computer Assisted…	1
Stanford Binet Intelligence…	1
Trends in International…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 51 results Save | Export

Assessing Dimensionality of IRT Models Using Traditional and Revised Parallel Analyses

Peer reviewed

Direct link

Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023

Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…

Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines

There Are Many Greater Lower Bounds than Cronbach's [alpha]: A Monte Carlo Simulation Study

Peer reviewed

Direct link

Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023

A Monte Carlo simulation study was conducted to examine the performance of [alpha], [lambda]2, [lambda][subscript 4], [lambda][subscript 2], [omega][subscript T], GLB[subscript MRFA], and GLB[subscript Algebraic] coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…

Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation

Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models

Peer reviewed

Direct link

Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023

To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…

Descriptors: Models, Item Response Theory, Test Items, Intervals

The Recovery of Correlation between Latent Abilities Using Compensatory and Noncompensatory Multidimensional IRT Models

Peer reviewed

Direct link

Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020

This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…

Descriptors: Item Response Theory, Models, Test Items, Simulation

Can Auxiliary Information Improve Rasch Estimation at Small Sample Sizes?

Direct link

Derek Sauder – ProQuest LLC, 2020

The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…

Descriptors: Item Response Theory, Sample Size, Computation, Test Length

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Direct link

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Direct link

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Direct link

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Routing Strategies and Optimizing Design for Multistage Testing in International Large-Scale Assessments

Peer reviewed

Direct link

Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019

This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…

Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory

The Effect of Person Misfit on Item Parameter Estimation and Classification Accuracy: A Simulation Study

Peer reviewed
PDF on ERIC

Download full text

Mousavi, Amin; Cui, Ying – Education Sciences, 2020

Often, important decisions regarding accountability and placement of students in performance categories are made on the basis of test scores generated from tests, therefore, it is important to evaluate the validity of the inferences derived from test results. One of the threats to the validity of such inferences is aberrant responding. Several…

Descriptors: Student Evaluation, Educational Testing, Psychological Testing, Item Response Theory

Monte Carlo Simulation in Item Response Theory Applications Using SAS

Peer reviewed

Direct link

Ames, Allison J.; Leventhal, Brian C.; Ezike, Nnamdi C. – Measurement: Interdisciplinary Research and Perspectives, 2020

Data simulation and Monte Carlo simulation studies are important skills for researchers and practitioners of educational and psychological measurement, but there are few resources on the topic specific to item response theory. Even fewer resources exist on the statistical software techniques to implement simulation studies. This article presents…

Descriptors: Monte Carlo Methods, Item Response Theory, Simulation, Computer Software

A Comparison of Score Aggregation Methods for Unidimensional Tests on Different Dimensions. Research Report. ETS RR-18-01

Peer reviewed
PDF on ERIC

Download full text

Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018

In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…

Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests

A Fair Comparison of the Performance of Computerized Adaptive Testing and Multistage Adaptive Testing

Direct link

Wang, Keyin – ProQuest LLC, 2017

The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same…

Descriptors: Comparative Analysis, Computer Assisted Testing, Adaptive Testing, Test Items

Identifying Aberrant Responding: Use of Multiple Measures

Direct link

Steinkamp, Susan Christa – ProQuest LLC, 2017

For test scores that rely on the accurate estimation of ability via an IRT model, their use and interpretation is dependent upon the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…

Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4

Cui, Ying	2
Paek, Insu	2
Schumacker, Randall E.	2
Sijtsma, Klaas	2
Wang, Wen-Chung	2
Wells, Craig S.	2
Ackerman, Terry	1
Ames, Allison J.	1
Andersson, Björn	1
Banks, Kathleen	1
Batinic, Bernad	1
Beyerlein, Michael M.	1
Bolt, Daniel M.	1
Cappaert, Kevin	1
Chalmers, Robert Philip	1
Chen, Shyh-Huei	1
Cheng, Ying	1
Chernyshenko, Oleksandr S.	1
Chien, Yuehmei	1
Choi, Youn-Jeng	1
Cliff, Norman	1
Cohen, Allan S.	1
Cook, Linda L.	1
Cui, Zhongmin	1
More ▼