Showing all 5 results
Peer reviewed
Huggins-Manley, Corinne; Raborn, Anthony W.; Jones, Peggy K.; Myers, Ted – Journal of Educational Measurement, 2024
The purpose of this study is to develop a nonparametric DIF method that (a) compares focal groups directly to the composite group that will be used to develop the reported test score scale, and (b) allows practitioners to explore for DIF related to focal groups stemming from multicategorical variables that constitute a small proportion of the…
Descriptors: Nonparametric Statistics, Test Bias, Scores, Statistical Significance
Peer reviewed
Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019
Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…
Descriptors: Rating Scales, Models, Evaluators, Data Collection
Peer reviewed
Stufflebeam, Daniel L. – Journal of Educational Measurement, 1971
Descriptors: Data Analysis, Educational Experiments, Evaluation Methods, Individual Differences
Peer reviewed
Powers, Stephen, et al. – Journal of Educational Measurement, 1983
The validity of the equipercentile hypothesis of the Title I Evaluation and Reporting System norm-referenced evaluation model was examined using 3,224 seventh- and ninth-grade students. Findings from confidence interval procedures contradicted the equipercentile hypothesis. There was a pattern of large gains for students not receiving any special…
Descriptors: Achievement Gains, Evaluation Methods, Evaluation Needs, Hypothesis Testing
Peer reviewed
Raymond, Mark R.; Viswesvaran, Chockalingam – Journal of Educational Measurement, 1993
Three variations of a least squares regression model are presented that are suitable for determining and correcting for rating error in designs in which examinees are evaluated by a subset of possible raters. Models are applied to ratings from 4 administrations of a medical certification examination in which 40 raters and approximately 115…
Descriptors: Error of Measurement, Evaluation Methods, Higher Education, Interrater Reliability