Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 28 |
Since 2006 (last 20 years) | 89 |
Descriptor
Item Response Theory | 102 |
Statistical Analysis | 102 |
Simulation | 99 |
Test Items | 40 |
Models | 30 |
Comparative Analysis | 24 |
Computation | 23 |
Sample Size | 21 |
Scores | 18 |
Test Bias | 16 |
Error of Measurement | 15 |
More ▼ |
Source
Author
Sinharay, Sandip | 6 |
Choi, Seung W. | 3 |
Lu, Ying | 3 |
Chang, Hua-Hua | 2 |
Cho, Sun-Joo | 2 |
De Boeck, Paul | 2 |
DeMars, Christine E. | 2 |
Deng, Nina | 2 |
Finch, W. Holmes | 2 |
French, Brian F. | 2 |
Kelecioglu, Hülya | 2 |
More ▼ |
Publication Type
Journal Articles | 84 |
Reports - Research | 76 |
Reports - Evaluative | 14 |
Dissertations/Theses -… | 7 |
Speeches/Meeting Papers | 7 |
Reports - Descriptive | 3 |
Collected Works - Proceedings | 2 |
Education Level
Audience
Location
Turkey | 2 |
Afghanistan | 1 |
Canada | 1 |
Finland | 1 |
Florida | 1 |
France | 1 |
Illinois (Chicago) | 1 |
Singapore | 1 |
United Kingdom | 1 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Weese, James D.; Turner, Ronna C.; Liang, Xinya; Ames, Allison; Crawford, Brandon – Educational and Psychological Measurement, 2023
A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and…
Descriptors: Effect Size, Classification, Guidelines, Statistical Analysis
Cho, April E.; Wang, Chun; Zhang, Xue; Xu, Gongjun – Grantee Submission, 2020
Multidimensional Item Response Theory (MIRT) is widely used in assessment and evaluation of educational and psychological tests. It models the individual response patterns by specifying functional relationship between individuals' multiple latent traits and their responses to test items. One major challenge in parameter estimation in MIRT is that…
Descriptors: Item Response Theory, Mathematics, Statistical Inference, Maximum Likelihood Statistics
Olivera-Aguilar, Margarita; Rikoon, Samuel H.; Gonzalez, Oscar; Kisbu-Sakarya, Yasemin; MacKinnon, David P. – Educational and Psychological Measurement, 2018
When testing a statistical mediation model, it is assumed that factorial measurement invariance holds for the mediating construct across levels of the independent variable X. The consequences of failing to address the violations of measurement invariance in mediation models are largely unknown. The purpose of the present study was to…
Descriptors: Error of Measurement, Statistical Analysis, Factor Analysis, Simulation
Paul J. Walter; Edward Nuhfer; Crisel Suarez – Numeracy, 2021
We introduce an approach for making a quantitative comparison of the item response curves (IRCs) of any two populations on a multiple-choice test instrument. In this study, we employ simulated and actual data. We apply our approach to a dataset of 12,187 participants on the 25-item Science Literacy Concept Inventory (SLCI), which includes ample…
Descriptors: Item Analysis, Multiple Choice Tests, Simulation, Data Analysis
Craig, Brandon – ProQuest LLC, 2017
The purpose of this study was to determine if using a multistage approach for the empirical selection of anchor items would lead to more accurate DIF detection rates than the anchor selection methods proposed by Kopf, Zeileis, & Strobl (2015b). A simulation study was conducted in which the sample size, percentage of DIF, and balance of DIF…
Descriptors: Simulation, Sample Size, Item Response Theory, Item Analysis
Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Descriptors: Item Response Theory, Test Reliability, Test Items, Scores
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua – Applied Measurement in Education, 2017
Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. The three scaling procedures are considered: (a) concurrent…
Descriptors: Item Response Theory, Accuracy, Educational Assessment, Test Items
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017
This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.
Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017
Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…
Descriptors: Item Response Theory, Test Items, Bias, Computation
Kogar, Hakan – International Journal of Assessment Tools in Education, 2018
The aim of this simulation study, determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…
Descriptors: Simulation, Context Effect, Computation, Statistical Analysis
Hidalgo, Ma Dolores; Benítez, Isabel; Padilla, Jose-Luis; Gómez-Benito, Juana – Sociological Methods & Research, 2017
The growing use of scales in survey questionnaires warrants the need to address how does polytomous differential item functioning (DIF) affect observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the type I error and effect size of the independent samples t-test on the observed total scale scores. A…
Descriptors: Test Items, Test Bias, Item Response Theory, Surveys