ERIC - Search Results

Showing all 5 results

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

The Effects of Incomplete Rating Designs in Combination with Rater Effects

Peer reviewed

Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019

Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…

Descriptors: Rating Scales, Models, Evaluators, Data Collection

Cross-Validation and Application of a Scale Assessing School Band Performance

Peer reviewed

Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021

This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…

Descriptors: Music Education, Music Activities, Music, Performance

Investigation of Rater Effects Using Social Network Analysis and Exponential Random Graph Models

Peer reviewed

Lamprianou, Iasonas – Educational and Psychological Measurement, 2018

It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because of the high-stakes nature of many rating activities, the research community tends to continuously explore new…

Descriptors: Social Networks, Network Analysis, Comparative Analysis, Innovation

Validity Argument for Assessing L2 Pragmatics in Interaction Using Mixed Methods

Peer reviewed

Youn, Soo Jung – Language Testing, 2015

This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…

Descriptors: Second Language Learning, Pragmatics, Validity, Mixed Methods Research