NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Researchers2
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 72 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple choice (MC) and mixed format tests within the common item nonequivalent group design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item against a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023
Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…
Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines
Peer reviewed Peer reviewed
Direct linkDirect link
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of [alpha], [lambda]2, [lambda][subscript 4], [lambda][subscript 2], [omega][subscript T], GLB[subscript MRFA], and GLB[subscript Algebraic] coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method to detect aberrant responses for educational assessments. There are many studies that investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Bengs, Daniel; Kroehne, Ulf; Brefeld, Ulf – Journal of Educational Measurement, 2021
By tailoring test forms to the test-taker's proficiency, Computerized Adaptive Testing (CAT) enables substantial increases in testing efficiency over fixed forms testing. When used for formative assessment, the alignment of task difficulty with proficiency increases the chance that teachers can derive useful feedback from assessment data. The…
Descriptors: Computer Assisted Testing, Formative Evaluation, Group Testing, Program Effectiveness
Peer reviewed Peer reviewed
Direct linkDirect link
Kim, Kyung Yong – Journal of Educational Measurement, 2020
New items are often evaluated prior to their operational use to obtain item response theory (IRT) item parameter estimates for quality control purposes. Fixed parameter calibration is one linking method that is widely used to estimate parameters for new items and place them on the desired scale. This article provides detailed descriptions of two…
Descriptors: Item Response Theory, Evaluation Methods, Test Items, Simulation
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
A new index of item discrimination power (IDP), dimension-corrected Somers' D (D2) is proposed. Somers' D is one of the superior alternatives for item-total- (Rit) and item-rest correlation (Rir) in reflecting the real IDP with items with scales 0/1 and 0/1/2, that is, up to three categories. D also reaches the extreme value +1 and -1 correctly…
Descriptors: Item Analysis, Correlation, Test Items, Simulation
Peer reviewed Peer reviewed
Direct linkDirect link
Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020
A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…
Descriptors: Simulation, Sample Size, Item Analysis, Scores
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
Changepoints are abrupt variations in a sequence of data in statistical inference. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
Peer reviewed Peer reviewed
Direct linkDirect link
Drabinová, Adéla; Martinková, Patrícia – Journal of Educational Measurement, 2017
In this article we present a general approach not relying on item response theory models (non-IRT) to detect differential item functioning (DIF) in dichotomous items with presence of guessing. The proposed nonlinear regression (NLR) procedure for DIF detection is an extension of method based on logistic regression. As a non-IRT approach, NLR can…
Descriptors: Test Items, Regression (Statistics), Guessing (Tests), Identification
Peer reviewed Peer reviewed
Direct linkDirect link
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a priori (EAP), modal a priori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Guo, Rui; Zheng, Yi; Chang, Hua-Hua – Journal of Educational Measurement, 2015
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the…
Descriptors: Item Response Theory, Test Items, Evaluation Methods, Equated Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Moothedath, Shana; Chaporkar, Prasanna; Belur, Madhu N. – Perspectives in Education, 2016
In recent years, the computerised adaptive test (CAT) has gained popularity over conventional exams in evaluating student capabilities with desired accuracy. However, the key limitation of CAT is that it requires a large pool of pre-calibrated questions. In the absence of such a pre-calibrated question bank, offline exams with uncalibrated…
Descriptors: Guessing (Tests), Computer Assisted Testing, Adaptive Testing, Maximum Likelihood Statistics
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Lu, Ying – ETS Research Report Series, 2017
For standard- or criterion-based assessments, the use of cut scores to indicate mastery, nonmastery, or different levels of skill mastery is very common. As part of performance summary, it is of interest to examine the percentage of examinees at or above the cut scores (PAC) and how PAC evolves across administrations. This paper shows that…
Descriptors: Cutting Scores, Evaluation Methods, Mastery Learning, Performance Based Assessment
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5