Publication Date
  In 2025: 1
  Since 2024: 3
  Since 2021 (last 5 years): 4
  Since 2016 (last 10 years): 10
  Since 2006 (last 20 years): 15
Descriptor
  Error of Measurement: 36
  Mathematical Models: 17
  Models: 17
  Scores: 13
  Item Response Theory: 9
  Simulation: 8
  Estimation (Mathematics): 7
  Sample Size: 7
  Statistical Analysis: 7
  Test Reliability: 7
  Test Validity: 7
Source
  Journal of Educational…: 36
Author
  Kolen, Michael J.: 2
  Woodruff, David: 2
  Akihito Kamata: 1
  Andersson, Björn: 1
  Carl Westine: 1
  Chon, Kyong Hee: 1
  Cornelis Potgieter: 1
  Dunbar, Stephen B.: 1
  Embretson, Susan: 1
  Emrick, John A.: 1
  Falk, Carl F.: 1
Publication Type
  Journal Articles: 32
  Reports - Research: 19
  Reports - Evaluative: 13
Education Level
  Grade 10: 1
  Grade 9: 1
  High Schools: 1
  Secondary Education: 1
Audience
  Researchers: 1
Location
  South Carolina: 1
  United Kingdom (Scotland): 1
Assessments and Surveys
  ACT Assessment: 1
  Comprehensive Tests of Basic…: 1
  Iowa Tests of Basic Skills: 1
  National Longitudinal Study…: 1
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
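The l_z statistic this abstract builds on is the standardized log-likelihood of a response pattern. A minimal numpy sketch under a 2PL model, with item parameters and ability treated as known (illustrative only; the paper's extension to item distractors is not reproduced here):

```python
import numpy as np

def lz_person_fit(u, a, b, theta):
    """Standardized log-likelihood person-fit statistic l_z under a
    2PL model with known item parameters. Large negative values flag
    aberrant response patterns."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # 2PL correct-response probabilities
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))    # observed log-likelihood
    mean = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))  # its expectation
    var = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)    # its variance
    return (l0 - mean) / np.sqrt(var)

# Ten items of increasing difficulty; an examinee at theta = 0 who gets
# the easy items right behaves as expected, while the reversed pattern
# (easy items wrong, hard items right) is classically aberrant.
a = np.ones(10)
b = np.linspace(-2, 2, 10)
consistent = (b < 0).astype(int)
aberrant = 1 - consistent
```

The consistent pattern yields l_z near or above zero; the reversed pattern is pushed strongly negative.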
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of the multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald tests using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
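The IRT-LR procedure named in the abstract rests on a generic likelihood-ratio comparison of nested models. A small sketch of that machinery, with hypothetical fitted log-likelihood values (the MGRM fitting itself is far beyond this snippet):

```python
from scipy import stats

def lr_test(loglik_constrained, loglik_full, df):
    """Generic likelihood-ratio test of nested IRT models, the machinery
    behind IRT-LR DIF detection: fit once with an item's parameters
    constrained equal across groups, once with them free, and compare."""
    lr = 2.0 * (loglik_full - loglik_constrained)
    return lr, stats.chi2.sf(lr, df)

# Hypothetical log-likelihoods: a fit improvement of 8.3 for 2 freed
# parameters is strong evidence of DIF for that item.
lr, p = lr_test(-5120.4, -5112.1, df=2)
```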
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
Cornelis Potgieter; Xin Qiao; Akihito Kamata; Yusuf Kara – Journal of Educational Measurement, 2024
As part of the effort to develop an improved oral reading fluency (ORF) assessment system, Kara et al. estimated the ORF scores based on a latent variable psychometric model of accuracy and speed for ORF data via a fully Bayesian approach. This study further investigates likelihood-based estimators for the model-derived ORF scores, including…
Descriptors: Oral Reading, Reading Fluency, Scores, Psychometrics
Wind, Stefanie A.; Sebok-Syer, Stefanie S. – Journal of Educational Measurement, 2019
When practitioners use modern measurement models to evaluate rating quality, they commonly examine rater fit statistics that summarize how well each rater's ratings fit the expectations of the measurement model. Essentially, this approach involves examining the unexpected ratings that each misfitting rater assigned (i.e., carrying out analyses of…
Descriptors: Measurement, Models, Evaluators, Simulation
Hong, Seong Eun; Monroe, Scott; Falk, Carl F. – Journal of Educational Measurement, 2020
In educational and psychological measurement, a person-fit statistic (PFS) is designed to identify aberrant response patterns. For parametric PFSs, valid inference depends on several assumptions, one of which is that the item response theory (IRT) model is correctly specified. Previous studies have used empirical data sets to explore the effects…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Error of Measurement
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on the classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than does the total score. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018
Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…
Descriptors: Comparative Analysis, Accuracy, Models, Sample Size
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
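Lord's Wald test compares an item's parameter estimates across groups against their joint sampling covariance. A hedged numpy/scipy sketch of the test statistic itself (the estimation approaches the article compares, MML and MCMC, are assumed to have already produced the inputs):

```python
import numpy as np
from scipy import stats

def wald_dif_test(params_ref, params_focal, cov_ref, cov_focal):
    """Lord-type Wald test: do an item's parameter estimates differ
    across reference and focal groups?  params_* are the per-group
    estimates (e.g., [a, b]) and cov_* their covariance matrices."""
    diff = np.asarray(params_ref, float) - np.asarray(params_focal, float)
    cov = np.asarray(cov_ref) + np.asarray(cov_focal)  # groups estimated independently
    w = diff @ np.linalg.solve(cov, diff)              # chi-square, df = len(diff)
    return w, stats.chi2.sf(w, df=len(diff))

# Hypothetical estimates: identical parameters give W = 0, p = 1, while
# a one-unit difficulty shift with small standard errors is flagged.
I2 = 0.01 * np.eye(2)
w_null, p_null = wald_dif_test([1.2, 0.3], [1.2, 0.3], I2, I2)
w_dif, p_dif = wald_dif_test([1.2, 0.3], [1.2, 1.3], I2, I2)
```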
Andersson, Björn – Journal of Educational Measurement, 2016
In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…
Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests
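Observed-score equipercentile equating, which this abstract extends to polytomous IRT-based distributions, matches percentile ranks across forms. A bare empirical-CDF sketch of the core idea (no presmoothing and no IRT model, unlike the article's approach):

```python
import numpy as np

def equipercentile_equate(x_scores, y_scores, grid):
    """Map each score in `grid` on form X to the form-Y score with the
    same percentile rank, via empirical CDFs. A bare-bones sketch;
    operational equating would smooth the score distributions first."""
    x_sorted = np.sort(x_scores)
    y_sorted = np.sort(y_scores)
    # percentile rank of each grid point in the X distribution
    ranks = np.searchsorted(x_sorted, grid, side="right") / len(x_sorted)
    # invert the Y distribution at those ranks
    return np.quantile(y_sorted, np.clip(ranks, 0, 1))

# Toy check: if form Y is simply form X shifted up by 5 points, the
# equating function should recover roughly that shift at every score.
rng = np.random.default_rng(0)
x = rng.normal(50, 10, 5000)
y = x + 5
eq = equipercentile_equate(x, y, np.array([40.0, 50.0, 60.0]))
```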
de la Torre, Jimmy; Lee, Young-Sun – Journal of Educational Measurement, 2013
This article used the Wald test to evaluate the item-level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G-DINA model. Results show that when the sample size is small and a…
Descriptors: Statistical Analysis, Test Items, Goodness of Fit, Error of Measurement
Shang, Yi – Journal of Educational Measurement, 2012
Growth models are used extensively in the context of educational accountability to evaluate student-, class-, and school-level growth. However, when error-prone test scores are used as independent variables or right-hand-side controls, the estimation of such growth models can be substantially biased. This article introduces a…
Descriptors: Error of Measurement, Statistical Analysis, Regression (Statistics), Simulation
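The bias described here is the classic errors-in-variables attenuation: an OLS slope on an error-prone score shrinks toward zero by the score's reliability. A minimal simulation of that effect with the textbook disattenuation correction (not the article's estimator):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
beta = 1.0                        # true effect of prior achievement on the outcome
true_score = rng.normal(0, 1, n)  # latent prior-year score
reliability = 0.8
# observed score = true score + noise sized to give the stated reliability
obs_score = true_score + rng.normal(0, np.sqrt((1 - reliability) / reliability), n)
outcome = beta * true_score + rng.normal(0, 1, n)

# OLS slope on the error-prone regressor: attenuated toward zero,
# converging to beta * reliability = 0.8 rather than beta = 1.0
naive_slope = np.cov(obs_score, outcome)[0, 1] / np.var(obs_score, ddof=1)
# classical disattenuation correction: divide by the (known) reliability
corrected_slope = naive_slope / reliability
```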
Yao, Lihua – Journal of Educational Measurement, 2010
In educational assessment, overall scores obtained by simply averaging a number of domain scores are sometimes reported. However, simply averaging the domain scores ignores the fact that different domains have different score points, that scores from those domains are related, and that at different score points the relationship between overall…
Descriptors: Educational Assessment, Error of Measurement, Item Response Theory, Scores
Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B. – Journal of Educational Measurement, 2010
In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…
Descriptors: Test Length, Goodness of Fit, Item Response Theory, Simulation

Gardner, P. L. – Journal of Educational Measurement, 1970
Descriptors: Error of Measurement, Mathematical Models, Statistical Analysis, Test Reliability