Showing 1 to 15 of 21 results
Peer reviewed
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques
Peer reviewed
Li, Wentao – Reading and Writing: An Interdisciplinary Journal, 2022
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly impact the subsequent final score, and further, the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how…
Descriptors: Scoring Rubrics, Interrater Reliability, Writing Evaluation, Teaching Methods
Peer reviewed
Park, Yeonggwang; Cádiz, Manuel Díaz; Nagle, Kathleen F.; Stepp, Cara E. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: Assessment of strained voice quality is difficult due to the weak reliability of auditory-perceptual evaluation and lack of strong acoustic correlates. This study evaluated the contributions of relative fundamental frequency (RFF) and mid-to-high frequency noise to the perception of strain. Method: Stimuli were created using recordings of…
Descriptors: Acoustics, Audio Equipment, Auditory Perception, Correlation
Peer reviewed
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Peer reviewed
Rossin, Emily G.; Bergee, Martin J. – Journal of Research in Music Education, 2021
This is the sixth and culminating study in a series whose purpose has been to acquire a conceptual understanding of school band performance and to develop an assessment based on this understanding. With the present study, we cross-validated and applied a rating scale for school band performance. In the cross-validation phase, college students…
Descriptors: Music Education, Music Activities, Music, Performance
Peer reviewed
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Peer reviewed
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Peer reviewed
Lehan, Tara; Hussey, Heather; Mika, Eva – Journal of University Teaching and Learning Practice, 2016
Throughout the dissertation process, the chair and committee members provide feedback regarding quality to help the doctoral candidate to produce the highest-quality document and become an independent scholar. Nevertheless, results of previous research suggest that overall dissertation quality generally is poor. Because much of the feedback about…
Descriptors: Graduate Students, Doctoral Dissertations, Student Evaluation, Feedback (Response)
Peer reviewed
Virtanen, T. E.; Pakarinen, E.; Lerkkanen, M.-K.; Poikkeus, A.-M.; Siekkinen, M.; Nurmi, J.-E. – Journal of Early Adolescence, 2018
This study examined the reliability and validity of the Classroom Assessment Scoring System-Secondary (CLASS-S) in Finnish classrooms. Trained observers coded classroom interactions based on video recordings of 46 Grade 6 classrooms (450 cycles). Concurrent associations were investigated with respect to teacher self-ratings (e.g., efficacy beliefs…
Descriptors: Factor Analysis, Classroom Observation Techniques, Foreign Countries, Factor Structure
Benyon, Howard E., III. – ProQuest LLC, 2014
This policy analysis project focused on state-level education policy which lacks evaluator training as well as on requirements for research-based best practices. Due to federal mandates and funding as well as accountability to all stakeholders, states are adopting more rigorous evaluation systems. These high-stakes evaluation systems are putting…
Descriptors: Educational Policy, Policy Analysis, Evaluators, Professional Training
Goe, Laura; Holdheide, Lynn; Miller, Tricia – Center on Great Teachers and Leaders, 2014
Across the nation, states and districts are in the process of building better teacher evaluation systems that not only identify highly effective teachers but also systematically provide data and feedback that can be used to improve teacher practice. The "Practical Guide to Designing Comprehensive Teacher Evaluation Systems" is a tool…
Descriptors: Teacher Evaluation, Evaluators, Educational Change, Accountability
Matsugu, Sawako – ProQuest LLC, 2013
Understanding the sources of variance in speaking assessment is important in Japan where society's high demand for English speaking skills is growing. Three challenges threaten fair assessment of speaking. First, in Japanese university speaking courses, teachers are typically the only raters, but teachers' knowledge of their students may unfairly…
Descriptors: Foreign Countries, Oral Language, English (Second Language), Second Language Learning
Long, Haiying – ProQuest LLC, 2012
As one of the most widely used creativity assessment tools, the Consensual Assessment Technique (CAT) has been praised as a valid tool to assess creativity. In Amabile's (1982) seminal work, the inter-rater reliability was defined as construct validity of the CAT. During the past three decades, researchers followed this definition and…
Descriptors: Creativity, Reliability, Educational Research, Educational Researchers
Peer reviewed
McLeod, Bryce D.; Weisz, John R. – Journal of Clinical Child and Adolescent Psychology, 2010
Most everyday child and adolescent psychotherapy does not follow manuals that document the procedures. Consequently, usual clinical care has remained poorly understood and rarely studied. The Therapy Process Observational Coding System for Child Psychotherapy-Strategies scale (TPOCS-S) is an observational measure of youth psychotherapy procedures…
Descriptors: Interrater Reliability, Measures (Individuals), Psychotherapy, Depression (Psychology)
Goe, Laura; Holdheide, Lynn; Miller, Tricia – National Comprehensive Center for Teacher Quality, 2011
Across the nation, states and districts are in the process of building better teacher evaluation systems that not only identify highly effective teachers but also systematically provide data and feedback that can be used to improve teacher practice. "A Practical Guide to Designing Comprehensive Teacher Evaluation Systems" is a tool…
Descriptors: Feedback (Response), Teacher Effectiveness, Evaluators, Teacher Evaluation