Showing 1 to 15 of 115 results
Peer reviewed
Direct link
Guo, Jinxin; Xu, Xin; Xin, Tao – Journal of Educational Measurement, 2023
Missingness due to not-reached items and omitted items has received much attention in the recent psychometric literature. Such missingness, if not handled properly, can lead to biased parameter estimation and inaccurate inferences about examinees, further eroding the validity of the test. This paper reviews some commonly used IRT-based…
Descriptors: Psychometrics, Bias, Error of Measurement, Test Validity
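The truncated abstract stops before naming the approaches, but model-based treatments in this literature commonly pair the measurement model with an IRT model for the missingness indicators (e.g., Holman & Glas, 2005). A sketch of that idea, not necessarily this paper's own taxonomy:

    P(d_{ij} = 1 \mid \xi_j) = \frac{\exp(\xi_j - \delta_i)}{1 + \exp(\xi_j - \delta_i)}

where d_ij indicates whether person j responded to item i, xi_j is a latent response propensity allowed to correlate with ability theta_j, and delta_i is an item-level missingness parameter; a nonzero correlation makes the missingness nonignorable.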
Peer reviewed
Direct link
Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to ensuring a fair assessment. Different approaches exist for estimating this relationship, each relying either on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
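For context, the hierarchical framework most commonly used to model speed and ability jointly (van der Linden, 2007) pairs an IRT model for responses with a lognormal model for response times; whether this paper adopts it cannot be read off the truncated abstract, so the sketch is illustrative only:

    \ln t_{ij} = \beta_i - \tau_j + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \alpha_i^{-2})

where t_ij is person j's response time on item i, beta_i is the item's time intensity, tau_j is person speed, and alpha_i is a time-discrimination parameter; the speed-ability relation is then the person-level correlation rho(theta_j, tau_j).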
Peer reviewed
Direct link
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
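The RDIF statistics that GRDIF generalizes are built on raw item-response residuals. As a rough sketch, with the exact standardization in the RDIF/GRDIF papers omitted:

    e_{ij} = u_{ij} - P_i(\hat{\theta}_j), \qquad \mathrm{RDIF}_R \propto \bar{e}_{iF} - \bar{e}_{iR}

where u_ij is the scored response, P_i the model-implied probability, and e-bar_iF, e-bar_iR the mean residuals of item i in the focal and reference groups; GRDIF extends the two-group contrast to simultaneous comparisons across multiple groups.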
Peer reviewed
Direct link
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l_z and l*_z person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
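For reference, the classical l_z statistic (Drasgow, Levine, & Williams, 1985) that this paper extends standardizes the log-likelihood of a response pattern under the fitted IRT model:

    l_0 = \sum_i \left[ u_i \ln P_i + (1 - u_i) \ln(1 - P_i) \right], \qquad
    l_z = \frac{l_0 - E(l_0)}{\sqrt{\mathrm{Var}(l_0)}}

    E(l_0) = \sum_i \left[ P_i \ln P_i + (1 - P_i) \ln(1 - P_i) \right], \qquad
    \mathrm{Var}(l_0) = \sum_i P_i (1 - P_i) \left[ \ln \frac{P_i}{1 - P_i} \right]^2

with P_i = P_i(theta); l*_z is Snijders's (2001) version corrected for using an estimated theta. Large negative values flag response patterns less likely than the model expects.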
Peer reviewed
Direct link
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
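The MGRM at issue specifies cumulative category probabilities through a logistic function of a weighted composite of latent traits; in its common slope-intercept form:

    P(X_{ij} \ge k \mid \boldsymbol{\theta}_j) = \frac{1}{1 + \exp[-(\mathbf{a}_i^{\top} \boldsymbol{\theta}_j + d_{ik})]}, \qquad
    P(X_{ij} = k) = P(X_{ij} \ge k) - P(X_{ij} \ge k + 1)

A simple structure constrains each item's slope vector a_i to load on a single dimension, while a nonsimple structure permits cross-loadings, which is the distinction the simulation conditions vary.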
Peer reviewed
Direct link
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
Peer reviewed
Direct link
Cornelis Potgieter; Xin Qiao; Akihito Kamata; Yusuf Kara – Journal of Educational Measurement, 2024
As part of the effort to develop an improved oral reading fluency (ORF) assessment system, Kara et al. estimated the ORF scores based on a latent variable psychometric model of accuracy and speed for ORF data via a fully Bayesian approach. This study further investigates likelihood-based estimators for the model-derived ORF scores, including…
Descriptors: Oral Reading, Reading Fluency, Scores, Psychometrics
Peer reviewed
Direct link
Almehrizi, Rashid S. – Journal of Educational Measurement, 2021
Estimates of various variance components, universe score variance, measurement error variances, and generalizability coefficients, like all statistics, are subject to sampling variability, particularly in small samples. Such variability is quantified traditionally through estimated standard errors and/or confidence intervals. The paper derived new…
Descriptors: Error of Measurement, Statistics, Design, Generalizability Theory
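For context, in a single-facet p x i design the generalizability coefficient whose sampling variability the paper quantifies is the ratio of universe score variance to universe score variance plus relative error variance:

    E\rho^2 = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_\delta^2}, \qquad \sigma_\delta^2 = \frac{\sigma_{pi}^2}{n_i}

where sigma^2_p is the universe (person) score variance, sigma^2_pi the person-by-item interaction plus residual variance, and n_i the number of items; each variance component is itself an estimate, which is why standard errors and confidence intervals for these quantities matter.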
Peer reviewed
Direct link
Lee, Sunbok – Journal of Educational Measurement, 2020
In the logistic regression (LR) procedure for differential item functioning (DIF), the parameters of LR have often been estimated using maximum likelihood (ML) estimation. However, ML estimation suffers from finite-sample bias. Furthermore, it can be substantially biased in the presence of rare event data. The bias of ML…
Descriptors: Regression (Statistics), Test Bias, Maximum Likelihood Statistics, Simulation
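The standard LR DIF procedure (Swaminathan & Rogers, 1990) that this paper revisits tests group effects after conditioning on a matching variable such as the total score. A minimal runnable sketch with simulated, illustrative data; the variable names and the simulation are assumptions, not the paper's:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 2000
    total = rng.normal(0, 1, n)                    # matching variable (e.g., total score)
    group = rng.integers(0, 2, n)                  # 0 = reference, 1 = focal
    p = 1 / (1 + np.exp(-(0.5 + 1.2 * total)))     # DIF-free item simulated here
    y = rng.binomial(1, p)                         # dichotomous item response

    # Nested models: M0 (total), M1 (+ group), M2 (+ total x group)
    X0 = sm.add_constant(total)
    X1 = sm.add_constant(np.column_stack([total, group]))
    X2 = sm.add_constant(np.column_stack([total, group, total * group]))
    m0 = sm.Logit(y, X0).fit(disp=0)
    m1 = sm.Logit(y, X1).fit(disp=0)
    m2 = sm.Logit(y, X2).fit(disp=0)

    lr_uniform = 2 * (m1.llf - m0.llf)             # ~ chi2(1): uniform DIF
    lr_nonuniform = 2 * (m2.llf - m1.llf)          # ~ chi2(1): nonuniform DIF

The bias concern the abstract raises arises because these ML estimates are biased in small samples and with rare events (very few correct or incorrect responses); penalized-likelihood corrections such as Firth's are the usual remedy in the LR literature, though the truncated abstract does not say which remedy this paper evaluates.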
Peer reviewed
Direct link
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2020
Smoothing is designed to yield smoother equating results that reduce random equating error without introducing much systematic error. The main objective of this study is to propose a new statistic and to compare its performance with that of the Akaike information criterion and likelihood ratio chi-square difference statistics in…
Descriptors: Equated Scores, Statistical Analysis, Error of Measurement, Criteria
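Polynomial log-linear presmoothing, the setting in which such selection statistics are usually compared, amounts to fitting Poisson regressions of score frequencies on powers of the score and choosing a degree. A runnable sketch, with data simulated purely for illustration:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    raw = rng.binomial(40, 0.6, size=2000)        # simulated raw scores, 0..40
    scores = np.arange(41)
    freqs = np.bincount(raw, minlength=41)        # observed score frequencies

    z = (scores - scores.mean()) / scores.std()   # standardize to stabilize high powers
    for degree in (2, 3, 4, 5):
        X = sm.add_constant(np.column_stack([z**k for k in range(1, degree + 1)]))
        fit = sm.GLM(freqs, X, family=sm.families.Poisson()).fit()
        print(degree, fit.aic)                    # smaller AIC = preferred degree

A degree-d fit preserves the first d moments of the observed distribution, so the selection statistic is effectively trading off moment preservation against smoothness.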
Peer reviewed
Direct link
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2020
In equating, smoothing techniques are frequently used to diminish sampling error. There are typically two types of smoothing: presmoothing and postsmoothing. For polynomial log-linear presmoothing, an optimum smoothing degree can be determined statistically based on the Akaike information criterion or chi-square difference criterion. For…
Descriptors: Equated Scores, Sampling, Error of Measurement, Statistical Analysis
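The two criteria named here compare nested log-linear models of adjacent degree; in the usual notation:

    \mathrm{AIC}(d) = -2 \ln L_d + 2(d + 1), \qquad
    \Delta G^2 = G^2(d) - G^2(d + 1) \sim \chi^2(1)

where L_d is the likelihood of the degree-d model and G^2 its likelihood-ratio fit statistic; the degree is increased until Delta-G^2 is nonsignificant or AIC stops decreasing. Postsmoothing parameters, by contrast, are traditionally chosen by visual inspection, which is what motivates statistical selection procedures for that side as well.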
Peer reviewed
Direct link
Wind, Stefanie A.; Sebok-Syer, Stefanie S. – Journal of Educational Measurement, 2019
When practitioners use modern measurement models to evaluate rating quality, they commonly examine rater fit statistics that summarize how well each rater's ratings fit the expectations of the measurement model. Essentially, this approach involves examining the unexpected ratings that each misfitting rater assigned (i.e., carrying out analyses of…
Descriptors: Measurement, Models, Evaluators, Simulation
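The rater fit statistics in question are typically the infit and outfit mean squares from Rasch-family models; with a standardized residual for each rating:

    z_{ni} = \frac{x_{ni} - E_{ni}}{\sqrt{W_{ni}}}, \qquad
    \mathrm{outfit} = \frac{1}{N} \sum_n z_{ni}^2, \qquad
    \mathrm{infit} = \frac{\sum_n W_{ni} z_{ni}^2}{\sum_n W_{ni}}

where E_ni and W_ni are the model-implied mean and variance of rating x_ni and the sums run over the N ratings a rater assigned. Values near 1 indicate ratings consistent with the model, and the follow-up analyses the abstract describes drill into the individual unexpected ratings of misfitting raters.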
Peer reviewed
Direct link
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods that take parameter estimation errors into account are proposed: the item-information-weighted (IWCC) and test-information-weighted (TWCC) characteristic curve methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
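Characteristic curve linking estimates the transformation constants A and B by minimizing a loss over the ability scale. In the Haebara-type form that IWCC/TWCC build on, the proposal amounts to replacing the implicit equal weighting with weights w_i(theta_q) derived from item or test information (the exact weights are defined in the paper):

    F(A, B) = \sum_q \sum_i w_i(\theta_q) \left[ P_i(\theta_q; \hat{a}_i, \hat{b}_i)
        - P_i\!\left(\theta_q; \frac{\hat{a}_i^*}{A},\; A \hat{b}_i^* + B\right) \right]^2

where starred parameters come from the other form or group and theta_q are quadrature points; down-weighting poorly estimated items is what reduces the influence of parameter estimation error on the linking constants.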
Peer reviewed
Direct link
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2020
The current study aims to evaluate the performance of three non-IRT procedures (i.e., normal approximation, Livingston-Lewis, and compound multinomial) for estimating classification indices when the observed score distribution shows atypical patterns: (a) bimodality, (b) structural (i.e., systematic) bumpiness, or (c) structural zeros (i.e., no…
Descriptors: Classification, Accuracy, Scores, Cutting Scores
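Of the three procedures, Livingston-Lewis is the one driven by an "effective test length," computed from the score range, mean, variance, and reliability:

    \tilde{n} = \frac{(\mu - X_{\min})(X_{\max} - \mu) - r_{XX'}\, \sigma_X^2}{\sigma_X^2 (1 - r_{XX'})}

The observed-score distribution is then treated as arising from n-tilde binomial trials with a beta-distributed proportional true score, which is precisely where atypical shapes such as bimodality, bumpiness, or structural zeros can strain the approximation.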
Peer reviewed
Direct link
Hong, Seong Eun; Monroe, Scott; Falk, Carl F. – Journal of Educational Measurement, 2020
In educational and psychological measurement, a person-fit statistic (PFS) is designed to identify aberrant response patterns. For parametric PFSs, valid inference depends on several assumptions, one of which is that the item response theory (IRT) model is correctly specified. Previous studies have used empirical data sets to explore the effects…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Error of Measurement