Peer reviewed
Lucy Chambers; Sylvia Vitello; Carmen Vidal Rodeiro – Assessment in Education: Principles, Policy & Practice, 2024
In England, some secondary-level qualifications comprise non-exam assessments which need to undergo moderation before grading. Currently, moderation is conducted at centre (school) level. This raises challenges for maintaining the standard across centres. Recent technological advances enable novel moderation methods that are no longer bound by…
Descriptors: Foreign Countries, Evaluation Methods, Comparative Analysis, Grading
Peer reviewed
Yuan Tian; Xi Yang; Suhail A. Doi; Luis Furuya-Kanamori; Lifeng Lin; Joey S. W. Kwong; Chang Xu – Research Synthesis Methods, 2024
RobotReviewer is a tool for automatically assessing the risk of bias in randomized controlled trials, but there is limited evidence of its reliability. We evaluated the agreement between RobotReviewer and humans regarding the risk of bias assessment based on 1955 randomized controlled trials. The risk of bias in these trials was assessed via two…
Descriptors: Risk, Randomized Controlled Trials, Classification, Robotics
Peer reviewed
Caroline F. Rowland; Amy Bidgood; Gary Jones; Andrew Jessop; Paula Stinson; Julian M. Pine; Samantha Durrant; Michelle S. Peter – Language Learning, 2025
A strong predictor of children's language is performance on non-word repetition (NWR) tasks. However, the basis of this relationship remains unknown. Some suggest that NWR tasks measure phonological working memory, which then affects language growth. Others argue that children's knowledge of language/language experience affects NWR performance. A…
Descriptors: Vocabulary Development, Comparative Analysis, Computational Linguistics, Language Skills
Peer reviewed
Marine Simon; Alexandra Budke – Journal of Geography in Higher Education, 2024
Comparison is an important geographic method and a common task in geography education. Mastering comparison is a complex competency and written comparisons are challenging tasks both for students and assessors. As yet, however, there is no set test for evaluating comparison competency nor tool for enhancing it. Moreover, little is known about…
Descriptors: Geography Instruction, Student Evaluation, Comparative Analysis, Reliability
Peer reviewed
Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Peer reviewed
Yun Long; Haifeng Luo; Yu Zhang – npj Science of Learning, 2024
This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue--a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using…
Descriptors: Classroom Communication, Computational Linguistics, Chinese, Mathematics Instruction
Peer reviewed
Bramley, Tom; Vitello, Sylvia – Assessment in Education: Principles, Policy & Practice, 2019
Comparative Judgement (CJ) is an increasingly widely investigated method in assessment for creating a scale, for example of the quality of essays. One area that has attracted attention in CJ studies is the optimisation of the selection of pairs of objects for judgement. One approach is known as adaptive comparative judgement (ACJ). It has been…
Descriptors: Reliability, Evaluation Methods, Comparative Analysis, Essay Tests
Walland, Emma – Research Matters, 2022
In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…
Descriptors: Essays, Grading, Writing Evaluation, Evaluators
Leech, Tony; Chambers, Lucy – Research Matters, 2022
Two of the central issues in comparative judgement (CJ), which are perhaps underexplored compared to questions of the method's reliability and technical quality, are "what processes do judges use to make their decisions" and "what features do they focus on when making their decisions?" This article discusses both, in the…
Descriptors: Comparative Analysis, Decision Making, Evaluators, Reliability
Peer reviewed
Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022
The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…
Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability
Peer reviewed
Azman Ong, Mohd Hanafi; Mohd Yasin, Norazlina; Ibrahim, Nur Syafikah – Asian Association of Open Universities Journal, 2022
Purpose: Measuring internal response of online learning is seen as fundamental to absorptive capacity which stimulates knowledge assimilation. However, the evaluation of practice and research of validated instruments that could effectively measure online learning response behavior is limited. Thus, in this study, a new instrument was designed…
Descriptors: Online Courses, Student Surveys, Student Attitudes, Factor Analysis
Peer reviewed
Dissabandara, Lakal O.; Nawaratna, Sujeevi; Nirthanan, Selvanayagam – Anatomical Sciences Education, 2023
The objective structured practical examination (OSPE) is a reliable assessment of practical skills in anatomy teaching. It is often administered as low-stake assessments to track progress at multiple time points in anatomy curricula. Standard-setting OSPEs to derive a pass mark and to ensure assessment quality and rigor is a complex task. This…
Descriptors: Standard Setting, Anatomy, Medical Education, Medical Schools
Vidal Rodeiro, Carmen; Chambers, Lucy – Research Matters, 2022
Many high-stakes qualifications include non-exam assessments that are marked by teachers. Awarding bodies then apply a moderation process to bring the marking of these assessments to an agreed standard. Comparative Judgement (CJ) is a technique where two (or more) pieces of work are compared at a time, allowing an overall rank order of work to be…
Descriptors: Evaluation Methods, Portfolios (Background Materials), Decision Making, Task Analysis
Peer reviewed
Printz, Trine; Pedersen, Ellen Raben; Juhl, Peter; Nielsen, Troels; Grøntved, Ågot Møller; Godballe, Christian – Journal of Speech, Language, and Hearing Research, 2017
Purpose: The aim of this study was to add further knowledge about the usefulness of the Voice Range Profile (VRP) assessment in clinical settings and research by analyzing VRP dual-microphone equipment precision, reliability, and room effect. Method: Test-retest studies were conducted in an anechoic chamber and an office: (a) comparing sound…
Descriptors: Audio Equipment, Reliability, Accuracy, Comparative Analysis