Showing all 5 results
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Peer reviewed
Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019
Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…
Descriptors: Rating Scales, Models, Evaluators, Data Collection
Peer reviewed
Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021
This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…
Descriptors: Music Education, Music Activities, Music, Performance
Peer reviewed
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
Peer reviewed
Youn, Soo Jung – Language Testing, 2015
This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…
Descriptors: Second Language Learning, Pragmatics, Validity, Mixed Methods Research