Showing all 8 results
Peer reviewed
Wolkowitz, Amanda A. – Journal of Educational Measurement, 2021
Decision consistency (DC) is the reliability of a classification decision based on a test score. In professional credentialing, the decision is often a high-stakes pass/fail decision. The current methods for estimating DC are computationally complex. The purpose of this research is to provide a computationally and conceptually simple method for…
Descriptors: Decision Making, Reliability, Classification, Scores
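For readers unfamiliar with the concept, decision consistency can be illustrated with a minimal simulation: the proportion of examinees who receive the same pass/fail decision on two parallel administrations of a test. The sketch below is a generic illustration only; the test length, cut score, and Beta proficiency distribution are arbitrary assumptions, and this is not the simplified estimation method the article proposes.

```python
import numpy as np

# Hypothetical illustration of decision consistency (DC): the probability
# that two parallel administrations classify the same examinee the same
# way (pass/fail) relative to a cut score. Generic simulation sketch,
# not the article's method.

rng = np.random.default_rng(0)

n_items = 50          # assumed test length
cut_score = 30        # assumed pass/fail cut score
n_examinees = 10_000  # simulated examinee pool

# Assume per-item success probabilities vary across examinees;
# Beta(8, 4) is an arbitrary choice for illustration.
p_true = rng.beta(8, 4, size=n_examinees)

# Two independent "parallel forms": each examinee answers n_items
# Bernoulli(p_true) items on each form (a simple binomial error model).
form_a = rng.binomial(n_items, p_true)
form_b = rng.binomial(n_items, p_true)

# DC = proportion of examinees receiving the same classification twice.
same_decision = (form_a >= cut_score) == (form_b >= cut_score)
print(f"Estimated decision consistency: {same_decision.mean():.3f}")
```

Under this simple binomial error model, DC rises as the cut score moves away from the bulk of the score distribution or as test length grows.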
Peer reviewed
Haberman, Shelby J. – Journal of Educational Measurement, 2020
Examples of the impact of statistical theory on assessment practice are provided from the perspective of a statistician trained in theoretical statistics who began to work on assessments. Goodness of fit of item-response models is examined in terms of restricted likelihood-ratio tests and generalized residuals. Minimum discriminant information…
Descriptors: Statistics, Goodness of Fit, Item Response Theory, Statistical Analysis
Peer reviewed
Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…
Descriptors: Interrater Reliability, Models, Observation, Measurement
Peer reviewed
Heritage, Margaret; Kingston, Neal M. – Journal of Educational Measurement, 2019
Classroom assessment and large-scale assessment have, for the most part, existed in mutual isolation. Some experts have felt this is for the best and others have been concerned that the schism limits the potential contribution of both forms of assessment. Margaret Heritage has long been a champion of best practices in classroom assessment. Neal…
Descriptors: Measurement, Psychometrics, Context Effect, Classroom Environment
Peer reviewed
de la Torre, Jimmy – Journal of Educational Measurement, 2008
Most model fit analyses in cognitive diagnosis assume that a Q matrix is correct after it has been constructed, without verifying its appropriateness. Consequently, any model misfit attributable to the Q matrix cannot be addressed and remedied. To address this concern, this paper proposes an empirically based method of validating a Q matrix used…
Descriptors: Matrices, Validity, Models, Evaluation Methods
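As background for the abstract above, a Q matrix simply records which attributes each item requires. The sketch below shows a hypothetical 5-item, 3-attribute Q matrix and the ideal response pattern it implies under a conjunctive (DINA-type) model; the article's empirical validation procedure is not reproduced here.

```python
import numpy as np

# Illustrative Q matrix for a 5-item, 3-attribute diagnostic test.
# Rows are items, columns are skills/attributes; Q[j, k] = 1 means
# item j requires attribute k. Items and attributes are hypothetical.
Q = np.array([
    [1, 0, 0],   # item 1: attribute A only
    [0, 1, 0],   # item 2: attribute B only
    [0, 0, 1],   # item 3: attribute C only
    [1, 1, 0],   # item 4: attributes A and B
    [1, 0, 1],   # item 5: attributes A and C
])

# Under a conjunctive (DINA-type) model, an examinee with attribute
# pattern alpha is expected to answer item j correctly only if they
# possess every attribute the item requires.
alpha = np.array([1, 1, 0])  # hypothetical examinee: has A and B, lacks C
expected_correct = (Q @ alpha) == Q.sum(axis=1)
print(expected_correct)      # [ True  True False  True False]
```

A misspecified entry in Q changes these expected patterns, which is why model misfit traceable to the Q matrix cannot be diagnosed if the matrix is simply assumed correct.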
Peer reviewed
Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008
A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…
Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level
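To make the "conditional variance" idea concrete, the hedged sketch below conditions on total score and, at each score level, replaces the familiar two-group (Reference vs. Focal) difference in item proportion-correct with the variance of the proportions across several groups. The simulated data and all parameter choices are hypothetical; this follows the abstract's general description, not the authors' exact estimator.

```python
import numpy as np

# Hypothetical sketch of a score-conditional variance view of DIF
# across multiple examinee groups. Not the article's estimator.

rng = np.random.default_rng(1)

n, n_groups, n_items = 6000, 3, 40
group = rng.integers(n_groups, size=n)
ability = rng.normal(size=n)

# Simulate one studied item with a small group effect, plus total score.
item_p = 1 / (1 + np.exp(-(ability - 0.1 * group)))
item = rng.binomial(1, item_p)
total = rng.binomial(n_items - 1, 1 / (1 + np.exp(-ability))) + item

# Two-group DIF is often summarized by the conditional difference
# p_ref(s) - p_focal(s); with G groups, use the variance of the G
# conditional proportions at each score level s instead.
cond_var, weights = [], []
for s in np.unique(total):
    mask = total == s
    props = [item[mask & (group == g)].mean()
             for g in range(n_groups)
             if (mask & (group == g)).sum() >= 30]  # skip sparse strata
    if len(props) == n_groups:
        cond_var.append(np.var(props))
        weights.append(mask.sum())

# Weight strata by their size to obtain a single summary index.
summary = np.average(cond_var, weights=weights)
print(f"Score-conditional variance DIF index: {summary:.4f}")
```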
Peer reviewed
Allen, Nancy L.; Holland, Paul W.; Thayer, Dorothy T. – Journal of Educational Measurement, 2005
Allowing students to choose the question(s) that they will answer from among several possible alternatives is often viewed as a mechanism for increasing fairness in certain types of assessments. The fairness of optional topic choice is not universally accepted, however, and various studies have assessed this question. We examine…
Descriptors: Test Theory, Test Items, Student Evaluation, Evaluation Methods
Peer reviewed
Mullis, Ina V. S. – Journal of Educational Measurement, 1992
An overview is given of the consensus process for development of the frameworks underlying the National Assessment of Educational Progress (NAEP) assessments, with emphasis on those for the 1990 and 1992 mathematics assessments, the 1992 reading assessment, and the 1994 science assessments. Innovative techniques for 1992 are described. (SLD)
Descriptors: Academic Standards, Content Validity, Educational Assessment, Elementary Secondary Education