Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019
Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…
Descriptors: Rating Scales, Models, Evaluators, Data Collection
Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021
This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…
Descriptors: Music Education, Music Activities, Music, Performance
Lamprianou, Iasonas – Educational and Psychological Measurement, 2018
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…
Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation
Youn, Soo Jung – Language Testing, 2015
This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…
Descriptors: Second Language Learning, Pragmatics, Validity, Mixed Methods Research