Publication Date
In 2025: 1
Since 2024: 1
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 9
Since 2006 (last 20 years): 25
Descriptor
Simulation: 31
Item Response Theory: 15
Test Items: 15
Computer Assisted Testing: 10
Sample Size: 9
Test Bias: 8
Adaptive Testing: 7
Comparative Analysis: 7
Evaluation Methods: 6
Monte Carlo Methods: 6
Computation: 5
Source
International Journal of Testing: 31
Author
Ackerman, Terry: 1
Aksu Dunya, Beyza: 1
Andrews-Todd, Jessica: 1
Bartfay, Emma: 1
Bradshaw, Laine P.: 1
Brown, Richard S.: 1
Chen, Shyh-Huei: 1
Chernyshenko, Oleksandr S.: 1
Chu, Man-Wai: 1
Cohen, Allan S.: 1
Cui, Ying: 1
Publication Type
Journal Articles: 31
Reports - Research: 27
Reports - Evaluative: 4
Education Level
Elementary Education: 2
Grade 4: 2
Middle Schools: 2
Early Childhood Education: 1
Elementary Secondary Education: 1
Grade 3: 1
Grade 5: 1
Grade 8: 1
Intermediate Grades: 1
Junior High Schools: 1
Primary Education: 1
Assessments and Surveys
Program for International Student Assessment: 2
Trends in International Mathematics and Science Study: 1
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
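The Kim and Cole entry compares IRT linking methods under a common-item nonequivalent groups design. The sketch below shows only the basic linking step, using mean/mean linking for a unidimensional 2PL model rather than the multidimensional bifactor methods the study examines; all parameter values are invented for illustration.

```python
# Mean/mean common-item linking sketch (unidimensional 2PL, not the bifactor
# methods compared in the article). All numbers are made up.
import numpy as np

# Discrimination (a) and difficulty (b) estimates for the common items,
# calibrated separately on Form A (base scale) and Form B (new scale).
a_formA = np.array([1.2, 0.8, 1.5, 1.0])
b_formA = np.array([-0.5, 0.3, 1.1, 0.0])
a_formB = np.array([1.1, 0.7, 1.4, 0.9])
b_formB = np.array([-0.2, 0.6, 1.4, 0.3])

# Mean/mean linking constants for theta_A = A * theta_B + B.
A = a_formB.mean() / a_formA.mean()
B = b_formA.mean() - A * b_formB.mean()

# Place the Form B item parameters on the Form A scale.
b_formB_on_A = A * b_formB + B
a_formB_on_A = a_formB / A
print(f"A = {A:.3f}, B = {B:.3f}")
```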
Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…
Descriptors: Item Response Theory, Models, Test Items, Simulation
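The Fu et al. entry contrasts compensatory and noncompensatory multidimensional IRT models. A small sketch of the two response functions in the two-dimensional dichotomous case; the parameter values are illustrative, not taken from the article.

```python
# Two-dimensional compensatory vs. noncompensatory item response functions.
import numpy as np

def p_compensatory(theta1, theta2, a1, a2, d):
    # P = logistic(a1*theta1 + a2*theta2 + d): a deficit on one dimension
    # can be offset by strength on the other.
    z = a1 * theta1 + a2 * theta2 + d
    return 1.0 / (1.0 + np.exp(-z))

def p_noncompensatory(theta1, theta2, a1, a2, b1, b2):
    # P = product of per-dimension logistic terms: low ability on either
    # dimension keeps the response probability low.
    p1 = 1.0 / (1.0 + np.exp(-a1 * (theta1 - b1)))
    p2 = 1.0 / (1.0 + np.exp(-a2 * (theta2 - b2)))
    return p1 * p2

theta = (-1.0, 2.0)  # weak on dimension 1, strong on dimension 2
print(p_compensatory(*theta, a1=1.0, a2=1.0, d=0.0))                # fairly high
print(p_noncompensatory(*theta, a1=1.0, a2=1.0, b1=0.0, b2=0.0))    # stays low
```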
Andrews-Todd, Jessica; Kerr, Deirdre – International Journal of Testing, 2019
Collaborative problem solving (CPS) has been deemed a critical twenty-first century competency for a variety of contexts. However, less attention has been given to work aimed at the assessment and acquisition of such capabilities. Recently large scale efforts have been devoted toward assessing CPS skills, but there are no agreed upon guiding…
Descriptors: Cooperative Learning, Problem Solving, Student Evaluation, Evidence Based Practice
Cui, Ying; Guo, Qi; Leighton, Jacqueline P.; Chu, Man-Wai – International Journal of Testing, 2020
This study explores the use of the Adaptive Neuro-Fuzzy Inference System (ANFIS), a neuro-fuzzy approach, to analyze the log data of technology-based assessments to extract relevant features of student problem-solving processes, and develop and refine a set of fuzzy logic rules that could be used to interpret student performance. The log data that…
Descriptors: Inferences, Artificial Intelligence, Data Analysis, Computer Assisted Testing
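The Cui et al. entry applies ANFIS to assessment log data. The tiny sketch below conveys only the flavor of a fuzzy rule over two hypothetical log-data features; a real ANFIS learns its membership functions and rule weights from the data, and the feature names and thresholds here are assumptions.

```python
# Hand-written fuzzy rule over two illustrative log-data features
# (time on task, number of actions); not the learned ANFIS system itself.
import numpy as np

def tri(x, a, b, c):
    # Triangular membership function peaking at b.
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

def rule_strength(time_on_task, n_actions):
    # Rule: IF time is moderate AND actions are few THEN performance is high.
    mu_time_moderate = tri(time_on_task, 30.0, 90.0, 180.0)   # seconds
    mu_actions_few = tri(n_actions, 0.0, 5.0, 15.0)
    return min(mu_time_moderate, mu_actions_few)              # AND = min

print(rule_strength(time_on_task=75.0, n_actions=6.0))
```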
Aksu Dunya, Beyza – International Journal of Testing, 2018
This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…
Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing
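The Aksu Dunya entry simulates item parameter drift (IPD) affecting an examinee subgroup. A simplified version of that setup, assuming a fixed-length Rasch test rather than a CAT: responses for the affected subgroup are generated from drifted difficulties, but everyone is scored with the original parameters. The design values (40 items, 8 drifted items, 0.5-logit drift, 30% affected) are assumptions for illustration.

```python
# Simplified IPD simulation: drifted difficulties generate the data, but the
# original (pre-drift) difficulties are used for scoring.
import numpy as np

rng = np.random.default_rng(0)
n_items, n_persons = 40, 500
b = rng.normal(0.0, 1.0, n_items)                 # operational difficulties
drift_items = rng.choice(n_items, size=8, replace=False)
drift = 0.5                                       # drifted items become harder

theta_true = rng.normal(0.0, 1.0, n_persons)
affected = rng.random(n_persons) < 0.3            # 30% of examinees see drift

b_used = np.tile(b, (n_persons, 1))
b_used[np.ix_(affected, drift_items)] += drift

p = 1.0 / (1.0 + np.exp(-(theta_true[:, None] - b_used)))
x = (rng.random(p.shape) < p).astype(float)

# Score everyone with the original difficulties via grid-search MLE.
grid = np.linspace(-4, 4, 81)
logit = grid[None, :, None] - b[None, None, :]          # grid points x items
logp = (x[:, None, :] * -np.log1p(np.exp(-logit))
        + (1 - x[:, None, :]) * -np.log1p(np.exp(logit)))
theta_hat = grid[logp.sum(axis=2).argmax(axis=1)]

bias = theta_hat - theta_true
print(f"mean bias (affected examinees):   {bias[affected].mean():.3f}")
print(f"mean bias (unaffected examinees): {bias[~affected].mean():.3f}")
```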
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
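The Man et al. entry concerns detecting aberrant behavior such as answer copying or item preknowledge from response times. A minimal response-time screen, assuming a lognormal response-time model so that standardized log times identify unusually fast examinees; the simulated data, threshold, and flagging rule are illustrative and not the authors' statistic.

```python
# Flag examinees with consistently very fast (standardized) log response times.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 200, 30
log_rt = rng.normal(loc=3.5, scale=0.5, size=(n_persons, n_items))
log_rt[:5] -= 1.2            # five simulated examinees with preknowledge respond fast

item_mean = log_rt.mean(axis=0)
item_sd = log_rt.std(axis=0, ddof=1)
z = (log_rt - item_mean) / item_sd        # standardized log response times

# Flag examinees whose average z-score is extremely low (very fast overall).
person_speed = z.mean(axis=1)
flagged = np.where(person_speed < -1.5)[0]
print("flagged examinees:", flagged)
```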
Bradshaw, Laine P.; Madison, Matthew J. – International Journal of Testing, 2016
In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic…
Descriptors: Classification, Models, Simulation, Psychometrics
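The Bradshaw and Madison entry studies parameter invariance for diagnostic classification models. Below is a sketch of the DINA item response function, one common diagnostic model: an examinee holding every attribute the item requires responds correctly with probability 1 - slip, and otherwise with the guessing probability. Attribute patterns and parameters are illustrative.

```python
# DINA model item response function sketch.
import numpy as np

q_vector = np.array([1, 0, 1])        # item requires attributes 1 and 3
guess, slip = 0.2, 0.1

def p_correct(alpha, q, guess, slip):
    # eta = 1 only if the examinee has every attribute the item requires.
    eta = int(np.all(alpha >= q))
    return (1 - slip) ** eta * guess ** (1 - eta)

print(p_correct(np.array([1, 1, 1]), q_vector, guess, slip))  # master: 0.9
print(p_correct(np.array([1, 1, 0]), q_vector, guess, slip))  # non-master: 0.2
```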
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
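The Sen entry examines how non-normal ability distributions affect the number of latent classes extracted by the mixed Rasch model. A data-generation sketch for that kind of simulation: a single-population Rasch dataset whose ability distribution is skewed rather than normal, the condition under which spurious classes tend to appear. The distribution choice and item difficulties are assumptions.

```python
# Generate single-class Rasch data with a skewed (non-normal) ability distribution.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 1000, 20

# Skewed ability: a standardized chi-square draw.
theta = rng.chisquare(df=3, size=n_persons)
theta = (theta - theta.mean()) / theta.std()

b = np.linspace(-2, 2, n_items)                  # evenly spaced difficulties
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
responses = (rng.random(p.shape) < p).astype(int)
print(responses.shape, f"proportion correct = {responses.mean():.2f}")
```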
Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016
Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…
Descriptors: Simulation, International Programs, Adolescents, Student Evaluation
Oshima, T. C.; Wright, Keith; White, Nick – International Journal of Testing, 2015
Raju, van der Linden, and Fleer (1995) introduced a framework for differential functioning of items and tests (DFIT) for unidimensional dichotomous models. Since then, DFIT has been shown to be a quite versatile framework as it can handle polytomous as well as multidimensional models both at the item and test levels. However, DFIT is still limited…
Descriptors: Test Bias, Item Response Theory, Test Items, Simulation
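The Oshima et al. entry extends the DFIT framework. Below is a sketch of the item-level NCDIF index in the unidimensional 2PL case: the expected squared difference between focal- and reference-group item response functions, taken over the focal group's ability distribution. The item parameters are invented for illustration.

```python
# NCDIF sketch for a single 2PL item.
import numpy as np

rng = np.random.default_rng(3)

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))

# Focal-group item parameters differ slightly from the reference group's.
a_ref, b_ref = 1.0, 0.0
a_foc, b_foc = 1.0, 0.4

theta_focal = rng.normal(0.0, 1.0, 10_000)   # focal-group ability draws
d = p_2pl(theta_focal, a_foc, b_foc) - p_2pl(theta_focal, a_ref, b_ref)
ncdif = np.mean(d ** 2)
print(f"NCDIF = {ncdif:.4f}")
```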
Wei, Hua; Lin, Jie – International Journal of Testing, 2015
Out-of-level testing refers to the practice of assessing a student with a test that is intended for students at a higher or lower grade level. Although the appropriateness of out-of-level testing for accountability purposes has been questioned by educators and policymakers, incorporating out-of-level items in formative assessments for accurate…
Descriptors: Test Items, Computer Assisted Testing, Adaptive Testing, Instructional Program Divisions
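The Wei and Lin entry concerns out-of-level testing. The sketch below shows the psychometric rationale: Rasch item information peaks where difficulty matches ability, so items targeted to a low-performing student's level contribute more precision than on-grade items that are far too hard. The ability and difficulty values are illustrative.

```python
# Test information for on-level vs. out-of-level items for a low-ability student.
import numpy as np

def rasch_information(theta, b):
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return p * (1 - p)

theta_student = -1.5                          # a low-performing student
on_level = np.array([0.0, 0.5, 1.0])          # grade-level item difficulties
out_of_level = np.array([-2.0, -1.5, -1.0])   # lower-grade item difficulties

print(f"on-level test information:    {rasch_information(theta_student, on_level).sum():.3f}")
print(f"out-of-level test information: {rasch_information(theta_student, out_of_level).sum():.3f}")
```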
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012
Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…
Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement
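The Kruyen et al. entry asks how much shortening a test erodes decision quality. A quick illustration of the underlying trade-off via the Spearman-Brown formula: reliability falls and the standard error of measurement grows as items are removed. The baseline reliability of 0.90 is an assumed value, not taken from the article.

```python
# Reliability and SEM as a 30-item test is shortened (Spearman-Brown).
import math

def spearman_brown(rel_full, k):
    # Reliability after changing test length by factor k.
    return k * rel_full / (1 + (k - 1) * rel_full)

rel_30 = 0.90                     # assumed reliability of the full 30-item test
for n_items in (30, 15, 10, 5):
    k = n_items / 30
    rel = spearman_brown(rel_30, k)
    sem = math.sqrt(1 - rel)      # SEM in SD units of the observed-score scale
    print(f"{n_items:2d} items: reliability = {rel:.2f}, SEM = {sem:.2f} SD")
```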
In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013
The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…
Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods
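The In'nami and Koizumi entry evaluates sample size via Monte Carlo methods. The sketch below shows the Monte Carlo logic with a plain correlation standing in for the SEM (a real study would fit the structural model in each replication): simulate many datasets at a candidate N from an assumed population value and record how often the effect of interest is detected.

```python
# Monte Carlo power check for a correlation at several candidate sample sizes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
true_r, n_reps = 0.30, 2000

for n in (50, 100, 200):
    hits = 0
    for _ in range(n_reps):
        # Draw a bivariate sample with the assumed population correlation.
        x, y = rng.multivariate_normal([0, 0], [[1, true_r], [true_r, 1]], size=n).T
        r, p_value = stats.pearsonr(x, y)
        hits += p_value < 0.05
    print(f"N = {n:3d}: empirical power = {hits / n_reps:.2f}")
```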
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
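The Duong and von Davier entry examines whether equating results hold across subpopulations. Below is a sketch of one way to check population invariance, assuming a single-group design and simple linear equating: compute the equating function in each subgroup and in the total population, then compare them with a root mean squared difference. The score distributions and the index are illustrative, not the article's.

```python
# Population-invariance check: subgroup vs. total-population linear equating.
import numpy as np

rng = np.random.default_rng(5)

def linear_equate(x, x_scores, y_scores):
    # Linear equating: match the mean and SD of Form X to those of Form Y.
    return (x - x_scores.mean()) / x_scores.std() * y_scores.std() + y_scores.mean()

# Simulated number-correct scores for two subpopulations on each form.
x_g1, y_g1 = rng.normal(28, 6, 3000), rng.normal(30, 6, 3000)
x_g2, y_g2 = rng.normal(22, 7, 3000), rng.normal(25, 7, 3000)
x_all, y_all = np.concatenate([x_g1, x_g2]), np.concatenate([y_g1, y_g2])

grid = np.arange(0, 51)                       # raw-score points to compare
e_all = linear_equate(grid, x_all, y_all)
e_g1 = linear_equate(grid, x_g1, y_g1)
e_g2 = linear_equate(grid, x_g2, y_g2)

# Root mean squared difference between subgroup and total-population equating.
for name, e_sub in (("group 1", e_g1), ("group 2", e_g2)):
    print(f"{name}: RMSD = {np.sqrt(np.mean((e_sub - e_all) ** 2)):.2f}")
```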
Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D. – International Journal of Testing, 2012
Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…
Descriptors: Foreign Countries, Psychometrics, Test Bias, Test Items
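The Gattamorta et al. entry contrasts item-level and step-level measurement invariance for polytomous items. A sketch using the partial credit model, where each score category has its own step parameter, so invariance can fail at a single step while the remaining steps stay invariant. All parameter values are illustrative.

```python
# Partial credit model category probabilities with step-level DIF.
import numpy as np

def pcm_probs(theta, steps):
    # Category probabilities: P(X = k) proportional to exp(sum_{j<=k} (theta - step_j)).
    cumulative = np.concatenate([[0.0], np.cumsum(theta - np.asarray(steps))])
    expz = np.exp(cumulative - cumulative.max())
    return expz / expz.sum()

steps_ref = [-1.0, 0.0, 1.0]          # reference-group step parameters
steps_foc = [-1.0, 0.5, 1.0]          # focal group: DIF in step 2 only

theta = 0.0
print("reference:", pcm_probs(theta, steps_ref).round(3))
print("focal:    ", pcm_probs(theta, steps_foc).round(3))
```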