ERIC - Search Results

Publication Date

In 2025	0
Since 2024	5
Since 2021 (last 5 years)	15
Since 2016 (last 10 years)	30
Since 2006 (last 20 years)	61

Descriptor

Error of Measurement	82
Simulation	82
Test Items	82
Item Response Theory	46
Computer Assisted Testing	19
Item Analysis	18
Sample Size	18
Comparative Analysis	16
Adaptive Testing	15
Scores	14
Test Bias	14
Statistical Analysis	13
Difficulty Level	12
Equated Scores	12
Models	11
Statistical Bias	11
Test Length	11
Accuracy	10
Computation	10
Evaluation Methods	10
Psychometrics	10
Correlation	9
Foreign Countries	9
Regression (Statistics)	8
Scaling	8

Publication Type

Journal Articles	59
Reports - Research	57
Reports - Evaluative	15
Speeches/Meeting Papers	9
Dissertations/Theses -…	6
Reports - Descriptive	4
Numerical/Quantitative Data	2

Education Level

Elementary Secondary Education	4
Secondary Education	4
Elementary Education	3
Early Childhood Education	2
Grade 1	2
Grade 2	2
Grade 3	2
Higher Education	2
Postsecondary Education	2
Primary Education	2
Grade 9	1
High Schools	1
Junior High Schools	1
Middle Schools	1

Audience

Researchers

Location

Canada	1
Indonesia	1
Saudi Arabia	1
Taiwan	1

Assessments and Surveys

Program for International…	3
Trends in International…	3
Advanced Placement…	1
Armed Forces Qualification…	1
Behavioral Risk Factor…	1
Big Five Inventory	1
Cognitive Abilities Test	1
National Assessment of…	1

Showing 1 to 15 of 82 results

Multi-Group Regularized Gaussian Variational Estimation: Fast Detection of DIF

Peer reviewed

Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024

Data harmonization is an emerging approach to strategically combining data from multiple independent studies, enabling addressing new research questions that are not answerable by a single contributing study. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…

Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory

Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models

Peer reviewed

Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023

To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…

Descriptors: Models, Item Response Theory, Test Items, Intervals

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Relative Robustness of CDMs and (M)IRT in Measuring Growth in Latent Skills

Peer reviewed

Huang, Qi; Bolt, Daniel M. – Educational and Psychological Measurement, 2023

Previous studies have demonstrated evidence of latent skill continuity even in tests intentionally designed for measurement of binary skills. In addition, the assumption of binary skills when continuity is present has been shown to potentially create a lack of invariance in item and latent ability parameters that may undermine applications. In…

Descriptors: Item Response Theory, Test Items, Skill Development, Robustness (Statistics)

Robustness of Adaptive Measurement of Change to Item Parameter Estimation Error

Peer reviewed

Cooperman, Allison W.; Weiss, David J.; Wang, Chun – Educational and Psychological Measurement, 2022

Adaptive measurement of change (AMC) is a psychometric method for measuring intra-individual change on one or more latent traits across testing occasions. Three hypothesis tests--a Z test, likelihood ratio test, and score ratio index--have demonstrated desirable statistical properties in this context, including low false positive rates and high…

Descriptors: Error of Measurement, Psychometrics, Hypothesis Testing, Simulation

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

The Study of the Effect of Item Parameter Drift on Ability Estimation Obtained from Adaptive Testing under Different Conditions

Peer reviewed

Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022

Item parameter drift (IPD) is the systematic differentiation of parameter values of items over time due to various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…

Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning. EdWorkingPaper No. 23-868

Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Annenberg Institute for School Reform at Brown University, 2024

Longitudinal models of individual growth typically emphasize between-person predictors of change but ignore how growth may vary "within" persons because each person contributes only one point at each time to the model. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift…

Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development

Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning

Peer reviewed

Joshua B. Gilbert; James S. Kim; Luke W. Miratrix – Applied Measurement in Education, 2024

Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally…

Descriptors: Vocabulary Development, Item Response Theory, Test Items, Student Development

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Impact of Item Parameter Drift on Rasch Scale Stability in Small Samples over Multiple Administrations

Peer reviewed

Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020

Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…

Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Source

Educational and Psychological…	14
Applied Psychological…	9
Journal of Educational…	8
ProQuest LLC	6
Applied Measurement in…	5
ETS Research Report Series	5
Grantee Submission	2
International Journal of…	2
Journal of Educational and…	2
Structural Equation Modeling:…	2
American Institutes for…	1
Annenberg Institute for…	1
EURASIA Journal of…	1
Education and Information…	1
Educational Research and…	1
International Journal of…	1
Journal of Applied Measurement	1
Journal of Psychoeducational…	1
Large-scale Assessments in…	1
Measurement and Evaluation in…	1
Practical Assessment,…	1
Psychometrika	1
Research Matters	1
Sociological Methods &…	1

Author

Zwick, Rebecca	4
Wang, Wen-Chung	3
Yi, Qing	3
Ban, Jae-Chun	2
Chun Wang	2
Emons, Wilco H. M.	2
Gongjun Xu	2
Hanson, Bradley A.	2
Harris, Deborah J.	2
James S. Kim	2
Joshua B. Gilbert	2
Lee, Won-Chan	2
Li, Yuan H.	2
Lissitz, Robert W.	2
Luke W. Miratrix	2
Paek, Insu	2
Rutkowski, Leslie	2
Shih, Ching-Lin	2
Sijtsma, Klaas	2
Thayer, Dorothy T.	2
Tijmstra, Jesper	2
Weiss, David J.	2
Abulela, Mohammed A. A.	1
Aksu Dunya, Beyza	1