ERIC - Search Results

Showing 1 to 15 of 31 results

IRT Linking Methods for the Bifactor Model with Mixed Format Tests

Peer reviewed

Kim, Sohee; Cole, Ki Lynn – International Journal of Testing, 2025

This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…

Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis

The Recovery of Correlation between Latent Abilities Using Compensatory and Noncompensatory Multidimensional IRT Models

Peer reviewed

Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020

This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…

Descriptors: Item Response Theory, Models, Test Items, Simulation

Application of Ontologies for Assessing Collaborative Problem Solving Skills

Peer reviewed

Andrews-Todd, Jessica; Kerr, Deirdre – International Journal of Testing, 2019

Collaborative problem solving (CPS) has been deemed a critical twenty-first century competency for a variety of contexts. However, less attention has been given to work aimed at the assessment and acquisition of such capabilities. Recently large scale efforts have been devoted toward assessing CPS skills, but there are no agreed upon guiding…

Descriptors: Cooperative Learning, Problem Solving, Student Evaluation, Evidence Based Practice

Log Data Analysis with ANFIS: A Fuzzy Neural Network Approach

Peer reviewed

Cui, Ying; Guo, Qi; Leighton, Jacqueline P.; Chu, Man-Wai – International Journal of Testing, 2020

This study explores the use of the Adaptive Neuro-Fuzzy Inference System (ANFIS), a neuro-fuzzy approach, to analyze the log data of technology-based assessments to extract relevant features of student problem-solving processes, and develop and refine a set of fuzzy logic rules that could be used to interpret student performance. The log data that…

Descriptors: Inferences, Artificial Intelligence, Data Analysis, Computer Assisted Testing

Item Parameter Drift in Computer Adaptive Testing Due to Lack of Content Knowledge

Peer reviewed

Aksu Dunya, Beyza – International Journal of Testing, 2018

This study was conducted to analyze potential item parameter drift (IPD) impact on person ability estimates and classification accuracy when drift affects an examinee subgroup. Using a series of simulations, three factors were manipulated: (a) percentage of IPD items in the CAT exam, (b) percentage of examinees affected by IPD, and (c) item pool…

Descriptors: Adaptive Testing, Classification, Accuracy, Computer Assisted Testing

Response Time Based Nonparametric Kullback-Leibler Divergence Measure for Detecting Aberrant Test-Taking Behavior

Peer reviewed

Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018

Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…

Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating

Invariance Properties for General Diagnostic Classification Models

Peer reviewed

Bradshaw, Laine P.; Madison, Matthew J. – International Journal of Testing, 2016

In item response theory (IRT), the invariance property states that item parameter estimates are independent of the examinee sample, and examinee ability estimates are independent of the test items. While this property has long been established and understood by the measurement community for IRT models, the same cannot be said for diagnostic…

Descriptors: Classification, Models, Simulation, Psychometrics

Spurious Latent Class Problem in the Mixed Rasch Model: A Comparison of Three Maximum Likelihood Estimation Methods under Different Ability Distributions

Peer reviewed

Sen, Sedat – International Journal of Testing, 2018

Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…

Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics

Item Calibration Samples and the Stability of Achievement Estimates and System Rankings: Another Look at the PISA Model

Peer reviewed

Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016

Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…

Descriptors: Simulation, International Programs, Adolescents, Student Evaluation

Multiple-Group Noncompensatory Differential Item Functioning in Raju's Differential Functioning of Items and Tests

Peer reviewed

Oshima, T. C.; Wright, Keith; White, Nick – International Journal of Testing, 2015

Raju, van der Linden, and Fleer (1995) introduced a framework for differential functioning of items and tests (DFIT) for unidimensional dichotomous models. Since then, DFIT has been shown to be a quite versatile framework as it can handle polytomous as well as multidimensional models both at the item and test levels. However, DFIT is still limited…

Descriptors: Test Bias, Item Response Theory, Test Items, Simulation

Using Out-of-Level Items in Computerized Adaptive Testing

Peer reviewed

Wei, Hua; Lin, Jie – International Journal of Testing, 2015

Out-of-level testing refers to the practice of assessing a student with a test that is intended for students at a higher or lower grade level. Although the appropriateness of out-of-level testing for accountability purposes has been questioned by educators and policymakers, incorporating out-of-level items in formative assessments for accurate…

Descriptors: Test Items, Computer Assisted Testing, Adaptive Testing, Instructional Program Divisions

Test Length and Decision Quality in Personnel Selection: When Is Short Too Short?

Peer reviewed

Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012

Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…

Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement

Review of Sample Size for Structural Equation Models in Second Language Testing and Learning Research: A Monte Carlo Approach

Peer reviewed

In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013

The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…

Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods

Observed-Score Equating with a Heterogeneous Target Population

Peer reviewed

Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012

Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…

Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis

Modeling Item-Level and Step-Level Invariance Effects in Polytomous Items Using the Partial Credit Model

Peer reviewed

Gattamorta, Karina A.; Penfield, Randall D.; Myers, Nicholas D. – International Journal of Testing, 2012

Measurement invariance is a common consideration in the evaluation of the validity and fairness of test scores when the tested population contains distinct groups of examinees, such as examinees receiving different forms of a translated test. Measurement invariance in polytomous items has traditionally been evaluated at the item-level,…

Descriptors: Foreign Countries, Psychometrics, Test Bias, Test Items
