Showing 76 to 90 of 2,743 results
Peer reviewed
Ilona Rinne – Assessment & Evaluation in Higher Education, 2024
It is widely acknowledged in research that common criteria and aligned standards do not result in consistent assessment of such a complex performance as the final undergraduate thesis. Assessment is determined by examiners' understanding of rubrics and their views on thesis quality. There is still a gap in the research literature about how…
Descriptors: Foreign Countries, Undergraduate Students, Teacher Education Programs, Evaluation Criteria
Scott F. Marion, Editor; James W. Pellegrino, Editor; Amy I. Berman, Editor – National Academy of Education, 2024
High-quality assessments are crucial to many aspects of the educational process. They can help policymakers monitor long-term educational trends, assist state educational agencies (SEAs) and local educational agencies (LEAs) in allocating resources and professional development opportunities, provide insights to teachers about how well students…
Descriptors: Educational Assessment, Educational Policy, Equal Education, Test Validity
Peer reviewed
PDF on ERIC
Yang Yang – Shanlax International Journal of Education, 2024
This paper explores the reliability of using ChatGPT in evaluating EFL writing by assessing its intra- and inter-rater reliability. Eighty-two compositions were randomly sampled from the Written English Corpus of Chinese Learners. These compositions were rated by three experienced raters with regard to 'language', 'content', and 'organization'.…
Descriptors: English (Second Language), Second Language Instruction, Writing (Composition), Evaluation Methods
Peer reviewed
Nicole D. Martin; Stephanie N. Baker; Madeline Haynes; Jayce R. Warner – Computer Science Education, 2024
Background and Context: As computer science (CS) education expands and the need for well-prepared CS teachers grows, understanding what motivates teachers to teach CS can help address challenges to recruiting, preparing, and retaining teachers. Objective: The goal of this work was to develop and validate a scale that measures teachers' motivation…
Descriptors: Computer Science Education, Teacher Motivation, Measurement Techniques, Construct Validity
Peer reviewed
Kazuya Saito; Adam Tierney – Studies in Second Language Acquisition, 2024
This article proposes a conceptual and measurement framework for postpubertal, L2 speech learning aptitude that is centered around domain-general auditory processing (i.e., representing spectral and temporal characteristics of sounds). To this end, we examine the construct and reliability of a battery of auditory processing tests by presenting the…
Descriptors: Second Language Learning, Auditory Tests, Auditory Perception, Listening Comprehension Tests
Peer reviewed
Ping-Lin Chuang – Language Testing, 2025
This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…
Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources
Peer reviewed
Bang Quan Zheng; Peter M. Bentler – Structural Equation Modeling: A Multidisciplinary Journal, 2025
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can…
Descriptors: Monte Carlo Methods, Structural Equation Models, Goodness of Fit, Robustness (Statistics)
Peer reviewed
Stacey Havlik; Peter Wiens; Arash Ghafoori; Melissa Jacobowitz; Kelly-Jo Sheback; Hannah Hudson – Journal of Education for Students Placed at Risk, 2025
While many teachers are unaware that students in their classes are experiencing homelessness, others may not know how to support students who are identified as lacking consistent housing (Wright et al., 2019). Thus, there is a critical need to better assess, understand, and enhance teachers' knowledge and attitudes toward homelessness. Therefore,…
Descriptors: Preservice Teachers, Preservice Teacher Education, Homeless People, Student Characteristics
Peer reviewed
Marianne Berg Halvorsen; Arvid Nikolai Kildahl; Sabine Kaiser; Brynhildur Axelsdottir; Michael G. Aman; Sissel Berge Helverschou – Journal of Autism and Developmental Disorders, 2025
In recent years, there has been a proliferation of instruments for assessing mental health (MH) among autistic people. This study aimed to review the psychometric properties of broadband instruments used to assess MH problems among autistic people. In accordance with the PRISMA guidelines (PROSPERO: CRD42022316571) we searched the APA PsycINFO via…
Descriptors: Psychometrics, Mental Health, Clinical Diagnosis, Evaluation Methods
Peer reviewed
Daryl Close – Journal of Academic Ethics, 2025
For decades, student ratings of university faculty have been used by administrators in high stakes faculty employment decisions such as tenure, promotion, contract renewal and reappointment, and merit pay. However, virtually no attention has been paid to the ethical questions of using ratings in employment decisions. Instead, the ratings…
Descriptors: Student Evaluation of Teacher Performance, Ethics, College Students, College Faculty
Peer reviewed
Jennifer Sdunzik; Ann M. Bessenbacher; Wilella D. Burgess; Asia M. Mohamud; Abdirisak Dalmar – American Journal of Evaluation, 2025
The success of development projects and evaluations hinges on having access to research protocols and methodologies that consider the needs and characteristics of stakeholders, subjects, and context while remaining rigorous and culturally sound. These efforts are often complicated by a dearth of tools that have been tested for validity and…
Descriptors: Foreign Countries, Program Evaluation, International Programs, Data Collection
Lambert, Richard G.; Holcomb, T. Scott; Bottoms, Bryndle L. – Center for Educational Measurement and Evaluation, 2021
The validity of the Kappa coefficient of chance-corrected agreement has been questioned when the prevalence of specific rating scale categories is low and agreement between raters is high. The researchers proposed the Lambda Coefficient of Rater-Mediated Agreement as an alternative to Kappa to address these concerns. Lambda corrects for chance…
Descriptors: Interrater Reliability, Teacher Evaluation, Test Validity, Evaluation Methods
Peer reviewed
Power, Jason Richard; Tanner, David – European Journal of Engineering Education, 2023
Self and peer assessments have been identified as effective strategies to develop a deeper understanding of complex concepts, enhance meta-cognitive capacity, and support learner self-efficacy. This study examines data related to peer and self-assessment exercises completed within a university engineering programme (n=61). Data related to…
Descriptors: Peer Evaluation, Self Evaluation (Individuals), Feedback (Response), Engineering Education
Peer reviewed
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of α, λ2, λ_4, λ_2, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Courtney M. Koletar – ProQuest LLC, 2024
For decades, evaluators have noted that it is difficult for stakeholders to accept negative evaluation results (Carter, 1971; Taut & Brauns, 2003). There is a need for additional research on evaluation to better understand when and why stakeholders reject negative or critical evaluation findings. Drawing on social identity theory (SIT), the…
Descriptors: Evaluation Methods, Interrater Reliability, Criticism, Positive Reinforcement