Showing 1 to 15 of 21 results
Peer reviewed
Sickinger, Rebecca; Brunfaut, Tineke; Pill, John – Language Testing, 2025
Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Soland, James; Kuhfeld, Megan; Register, Brennan – Educational Assessment, 2023
Much of what we know about how children develop is based on survey data. In order to estimate growth across time and, thereby, better understand that development, short survey scales are typically administered at repeated timepoints. Before estimating growth, those repeated measures must be put onto the same scale. Yet, little research examines…
Descriptors: Comparative Analysis, Social Emotional Learning, Scaling, Effect Size
Peer reviewed
Han, Chao; Xiao, Xiaoyan – Language Testing, 2022
The quality of sign language interpreting (SLI) is a gripping construct among practitioners, educators and researchers, calling for reliable and valid assessment. There has been a diverse array of methods in the extant literature to measure SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we want to…
Descriptors: Comparative Analysis, Sign Language, Deaf Interpreting, Evaluators
Peer reviewed
Gyamfi, George; Hanna, Barbara; Khosravi, Hassan – Assessment & Evaluation in Higher Education, 2022
Engaging students in the creation of learning resources is an effective way of developing a repository of revision items. However, a selection process is needed to separate high- from low-quality resources as some of the materials created by students can be ineffective, inappropriate or incorrect. In this study, we share our experiences and…
Descriptors: Peer Evaluation, Student Developed Materials, Educational Technology, Scoring
Peer reviewed
Coniam, David; Lee, Tony; Milanovic, Michael; Pike, Nigel; Zhao, Wen – Language Education & Assessment, 2022
The calibration of test materials generally involves the interaction between empirical analysis and expert judgement. This paper explores the extent to which scale familiarity might affect expert judgement as a component of test validation in the calibration process. It forms part of a larger study that investigates the alignment of the…
Descriptors: Specialists, Language Tests, Test Validity, College Faculty
Wagner, Kyle; Smith, Alex; Allen, Abigail; McMaster, Kristen; Poch, Apryl; Lembke, Erica – Assessment for Effective Intervention, 2019
Researchers and practitioners have questioned whether scoring procedures used with curriculum-based measures of writing (CBM-W) capture growth in complexity of writing. We analyzed data from six independent samples to examine two potential scoring metrics for picture word CBM-W (PW), a sentence-level CBM task. Correct word sequences per response…
Descriptors: Curriculum Based Assessment, Writing Evaluation, Comparative Analysis, Scoring
Peer reviewed
Haebig, Eileen; Leonard, Laurence B.; Deevy, Patricia; Schumaker, Jennifer; Karpicke, Jeffrey D.; Weber, Christine – Journal of Speech, Language, and Hearing Research, 2021
Purpose: Recent behavioral studies have demonstrated the effectiveness of implementing retrieval practice into learning tasks for children. Such approaches have revealed that repeated spaced retrieval (RSR) is particularly effective in promoting children's learning of word form and meaning information. This study further examines how retrieval…
Descriptors: Language Processing, Semantics, Teaching Methods, Learning Processes
Peer reviewed
Sahan, Özgür; Razi, Salim – Language Testing, 2020
This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…
Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators
Peer reviewed
Han, Qie – Working Papers in TESOL & Applied Linguistics, 2016
This literature review attempts to survey representative studies within the context of L2 speaking assessment that have contributed to the conceptualization of rater cognition. Two types of studies are looked at: 1) studies that examine "how" raters differ (and sometimes agree) in their cognitive processes and rating behaviors, in terms…
Descriptors: Second Language Learning, Student Evaluation, Evaluators, Speech Tests
Peer reviewed
Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia – Anatomical Sciences Education, 2015
In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…
Descriptors: Anatomy, Multiple Choice Tests, Scoring, Decision Making
Zeng, Songtian – ProQuest LLC, 2017
Over 30 states have adopted the Early Childhood Environmental Rating Scale-Revised (ECERS-R) as a component of their program quality assessment systems, but the use of ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers upon decisions users must make between two scoring…
Descriptors: Rating Scales, Scoring, Validity, Comparative Analysis
Peer reviewed
Leivada, Evelina; Kambanaros, Maria; Taxitari, Loukia; Grohmann, Kleanthes K. – International Journal of Bilingual Education and Bilingualism, 2020
The present study examines whether bilectal Greek Cypriot educators are able to identify dialectal (Cypriot Greek) elements superimposed on the standard language (Standard Modern Greek) in a written variety-judgment task. By doing so, (meta)linguistic skills of bilectal teachers from Cyprus were put to the test and later compared to the results of…
Descriptors: Metalinguistics, Greek, Dialects, Standard Spoken Usage
Peer reviewed
Schraw, Gregory – Educational Psychologist, 2010
I provide a summary of the four invited articles in this special issue and compare and contrast different methods for measuring self-regulation in computer-based learning environments (CBLEs). I present a taxonomy that distinguishes between offline and online measures and further distinguishes subcategories within each of these categories. I…
Descriptors: Scoring Rubrics, Scoring, Cognitive Processes, Self Control
Peer reviewed
Barkaoui, Khaled – Language Assessment Quarterly, 2010
Various factors contribute to variability in English as a second language (ESL) essay scores and rating processes. Most previous research, however, has focused on score variability in relation to task, rater, and essay characteristics. A few studies have examined variability in essay rating processes. The current study used think-aloud protocols…
Descriptors: Protocol Analysis, Holistic Evaluation, Evaluation Criteria, Rating Scales
Iseli, Markus R.; Koenig, Alan D.; Lee, John J.; Wainess, Richard – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010
Assessment of complex task performance is crucial to evaluating personnel in critical job functions such as Navy damage control operations aboard ships. Games and simulations can be instrumental in this process, as they can present a broad range of complex scenarios without involving harm to people or property. However, "automatic"…
Descriptors: Performance Tests, Performance Based Assessment, Decision Making Skills, Military Training