NotesFAQContact Us
Collection
Advanced
Search Tips
What Works Clearinghouse Rating
Showing 1 to 15 of 666 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Susan K. Johnsen – Gifted Child Today, 2025
The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…
Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Samuel D'Emanuele; Francesca Nardello; Fabrizio Garau; Diego Campaci; Federico Schena; Cantor Tarperi – Measurement in Physical Education and Exercise Science, 2025
The agreement between a wearable inertial sensor (GYKO, G) and the force platform (P) was assessed by evaluating "test-retest" and "inter-rater reliability." Thirty-eight subjects were enrolled; the selected indices of balance were investigated over foot positions and (un)stable conditions. Intraclass correlation coefficient…
Descriptors: Human Posture, Measurement Equipment, Interrater Reliability, Measurement Techniques
Peer reviewed Peer reviewed
Direct linkDirect link
Mazin T. Alqhazo; Tha’er Al-Kadi; Firas S. Alfwaress – Language, Speech, and Hearing Services in Schools, 2025
Purpose: The Stuttering Severity Instrument--Fourth Edition (SSI-4) is unavailable in Arabic language. The purpose of the current research is to translate the SSI-4 (Riley, 2009) into Arabic and to discuss its validity, as well as its intrajudge and interjudge reliability. Method: Archived videos of 28 school-aged children who stutter ranged in…
Descriptors: Arabic, Translation, Test Validity, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Marcus Messer; Neil C. C. Brown; Michael Kölling; Miaojing Shi – ACM Transactions on Computing Education, 2025
Providing consistent summative assessment to students is important, as the grades they are awarded affect their progression through university and future career prospects. While small cohorts are typically assessed by a single assessor, such as the module/class leader, larger cohorts are often assessed by multiple assessors, typically teaching…
Descriptors: Foreign Countries, Grading, Interrater Reliability, Teaching Assistants
Peer reviewed Peer reviewed
Direct linkDirect link
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Sarah Stopforth; Roxanne Connelly; Vernon Gayle – Cambridge Journal of Education, 2025
Data on educational qualifications is essential in many research domains. The UK Millennium Cohort Study collected self-reported General Certificate of Secondary Education (GCSE) data in sweep 7 (cohort members aged 17). GCSE data from the National Pupil Database (NPD) has been linked to the MCS. This study investigates the consistency of these…
Descriptors: Foreign Countries, Adolescents, Case Studies, Secondary Education
Peer reviewed Peer reviewed
Direct linkDirect link
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed Peer reviewed
Direct linkDirect link
Chase Young; Benjamin Mitchell-Yellin; George Kevin Randall – Active Learning in Higher Education, 2025
The purpose of this study was to develop a valid, reliable, and brief measure of active learning in college classrooms that is cheap and easy to complete and yields results that faculty can easily use to inform their development as instructors. Initial construct and face validity was achieved by modifying existing instruments and creating a draft…
Descriptors: College Faculty, College Students, Active Learning, Classroom Observation Techniques
Peer reviewed Peer reviewed
Direct linkDirect link
Morten Pallisgaard Støve; Mathias Kringelholt Kristensen; Jonas Nielsen; Lea Dyhrberg Madsen – Measurement in Physical Education and Exercise Science, 2025
Between limb strength, asymmetry is a leading risk factor for hamstring strain re-injury. However, few accurate testing methodologies are available in clinical settings. This study examined the validity and reliability of eccentric knee flexor torque measured with a novel Nordic Hamstring Device. Twenty-seven healthy participants were assessed in…
Descriptors: Validity, Reliability, Human Body, Foreign Countries
Peer reviewed Peer reviewed
Direct linkDirect link
Kathryn J. Greenslade; Julia K. Bushell; Emily F. Dillon; Amy E. Ramage – International Journal of Language & Communication Disorders, 2025
Background: Pragmatic communication difficulties encompass many distinct behaviours, including the use of vague and/or insufficient language, a common characteristic following traumatic brain injury (TBI) that negatively impacts psychosocial outcomes. Existing assessments evaluate pragmatic communication broadly, often with only one or two items…
Descriptors: Neurological Impairments, Head Injuries, Language Impairments, Language Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Niamh Devane; Sofia Mazzoleni; Nicholas Behn; Jane Marshall; Stephanie Wilson; Katerina Hilari – International Journal of Language & Communication Disorders, 2025
Background and Aims: The reliability and validity of an intervention can be improved by checking treatment fidelity (TF). TF methods identify core components of an intervention, check their presence (or absence) and identify threats to fidelity. The Virtual Elaborated Semantic Feature Analysis (VESFA) intervention comprised individual sessions of…
Descriptors: Aphasia, Intervention, Fidelity, Feasibility Studies
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Alan Huebner; Gustaf B. Skar; Mengchen Huang – Practical Assessment, Research & Evaluation, 2025
Generalizability theory is a modern and powerful framework for conducting reliability analyses. It is flexible to accommodate both random and fixed facets. However, there has been a relative scarcity in the practical literature on how to handle the fixed facet case. This article aims to provide practitioners a conceptual understanding and…
Descriptors: Generalizability Theory, Multivariate Analysis, Statistical Analysis, Writing Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025
The students are urged to do something without expecting anything in return and only in the name of God. Every islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…
Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims
Peer reviewed Peer reviewed
Direct linkDirect link
Katharina Liegmann; Lisa Fischer; Kevin Dadaczynski; Reiner Hanewinkel; Frauke Nees; Matthis Morgenstern – International Journal of Behavioral Development, 2025
This study examined the new self-report version of the Strengths and Difficulties Questionnaire (SDQ-S), SDQ-Kids, in primary school children regarding internal consistency, teacher-child agreement, and validity. Data from 2,655 children in Grades 1 to 3 and their teachers were analyzed. Children completed SDQ-Kids, previously piloted (n = 896),…
Descriptors: Questionnaires, Behavior Problems, Screening Tests, Child Behavior
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  45