ERIC - Search Results

Showing 1 to 15 of 21 results

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Sickinger, Rebecca; Brunfaut, Tineke; Pill, John – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

A Comparison of Methodologies for Scaling Longitudinal Social-Emotional Survey Responses

Peer reviewed

Soland, James; Kuhfeld, Megan; Register, Brennan – Educational Assessment, 2023

Much of what we know about how children develop is based on survey data. In order to estimate growth across time and, thereby, better understand that development, short survey scales are typically administered at repeated timepoints. Before estimating growth, those repeated measures must be put onto the same scale. Yet, little research examines…

Descriptors: Comparative Analysis, Social Emotional Learning, Scaling, Effect Size

A Comparative Judgment Approach to Assessing Chinese Sign Language Interpreting

Peer reviewed

Han, Chao; Xiao, Xiaoyan – Language Testing, 2022

The quality of sign language interpreting (SLI) is a gripping construct among practitioners, educators and researchers, calling for reliable and valid assessment. There has been a diverse array of methods in the extant literature to measure SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we want to…

Descriptors: Comparative Analysis, Sign Language, Deaf Interpreting, Evaluators

Supporting Peer Evaluation of Student-Generated Content: A Study of Three Approaches

Peer reviewed

Gyamfi, George; Hanna, Barbara; Khosravi, Hassan – Assessment & Evaluation in Higher Education, 2022

Engaging students in the creation of learning resources is an effective way of developing a repository of revision items. However, a selection process is needed to separate high- from low-quality resources as some of the materials created by students can be ineffective, inappropriate or incorrect. In this study, we share our experiences and…

Descriptors: Peer Evaluation, Student Developed Materials, Educational Technology, Scoring

The Role of Expert Judgement in Language Test Validation

Peer reviewed

Coniam, David; Lee, Tony; Milanovic, Michael; Pike, Nigel; Zhao, Wen – Language Education & Assessment, 2022

The calibration of test materials generally involves the interaction between empirical analysis and expert judgement. This paper explores the extent to which scale familiarity might affect expert judgement as a component of test validation in the calibration process. It forms part of a larger study that investigates the alignment of the…

Descriptors: Specialists, Language Tests, Test Validity, College Faculty

Exploration of New Complexity Metrics for Curriculum-Based Measures of Writing

Peer reviewed

Wagner, Kyle; Smith, Alex; Allen, Abigail; McMaster, Kristen; Poch, Apryl; Lembke, Erica – Assessment for Effective Intervention, 2019

Researchers and practitioners have questioned whether scoring procedures used with curriculum-based measures of writing (CBM-W) capture growth in complexity of writing. We analyzed data from six independent samples to examine two potential scoring metrics for picture word CBM-W (PW), a sentence-level CBM task. Correct word sequences per response…

Descriptors: Curriculum Based Assessment, Writing Evaluation, Comparative Analysis, Scoring

The Neural Underpinnings of Processing Newly Taught Semantic Information: The Role of Retrieval Practice

Peer reviewed

Haebig, Eileen; Leonard, Laurence B.; Deevy, Patricia; Schumaker, Jennifer; Karpicke, Jeffrey D.; Weber, Christine – Journal of Speech, Language, and Hearing Research, 2021

Purpose: Recent behavioral studies have demonstrated the effectiveness of implementing retrieval practice into learning tasks for children. Such approaches have revealed that repeated spaced retrieval (RSR) is particularly effective in promoting children's learning of word form and meaning information. This study further examines how retrieval…

Descriptors: Language Processing, Semantics, Teaching Methods, Learning Processes

Do Experience and Text Quality Matter for Raters' Decision-Making Behaviors?

Peer reviewed

Sahan, Özgür; Razi, Salim – Language Testing, 2020

This study examines the decision-making behaviors of raters with varying levels of experience while assessing EFL essays of distinct qualities. The data were collected from 28 raters with varying levels of rating experience and working at the English language departments of different universities in Turkey. Using a 10-point analytic rubric, each…

Descriptors: Decision Making, Essays, Writing Evaluation, Evaluators

Rater Cognition in L2 Speaking Assessment: A Review of the Literature

Peer reviewed

Han, Qie – Working Papers in TESOL & Applied Linguistics, 2016

This literature review attempts to survey representative studies within the context of L2 speaking assessment that have contributed to the conceptualization of rater cognition. Two types of studies are looked at: 1) studies that examine "how" raters differ (and sometimes agree) in their cognitive processes and rating behaviors, in terms…

Descriptors: Second Language Learning, Student Evaluation, Evaluators, Speech Tests

Evidence-Based Decision about Test Scoring Rules in Clinical Anatomy Multiple-Choice Examinations

Peer reviewed

Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia – Anatomical Sciences Education, 2015

In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…

Descriptors: Anatomy, Multiple Choice Tests, Scoring, Decision Making

Comparing Validity Evidence of Two ECERS-R Scoring Systems

Zeng, Songtian – ProQuest LLC, 2017

Over 30 states have adopted the Early Childhood Environment Rating Scale-Revised (ECERS-R) as a component of their program quality assessment systems, but the use of ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers upon decisions users must make between two scoring…

Descriptors: Rating Scales, Scoring, Validity, Comparative Analysis

(Meta)Linguistic Abilities of Bilectal Educators: The Case of Cyprus

Peer reviewed

Leivada, Evelina; Kambanaros, Maria; Taxitari, Loukia; Grohmann, Kleanthes K. – International Journal of Bilingual Education and Bilingualism, 2020

The present study examines whether bilectal Greek Cypriot educators are able to identify dialectal (Cypriot Greek) elements superimposed on the standard language (Standard Modern Greek) in a written variety-judgment task. By doing so, (meta)linguistic skills of bilectal teachers from Cyprus were put to the test and later compared to the results of…

Descriptors: Metalinguistics, Greek, Dialects, Standard Spoken Usage

Measuring Self-Regulation in Computer-Based Learning Environments

Peer reviewed

Schraw, Gregory – Educational Psychologist, 2010

I provide a summary of the four invited articles in this special issue and compare and contrast different methods for measuring self-regulation in computer-based learning environments (CBLEs). I present a taxonomy that distinguishes between offline and online measures and further distinguishes subcategories within each of these categories. I…

Descriptors: Scoring Rubrics, Scoring, Cognitive Processes, Self Control

Variability in ESL Essay Rating Processes: The Role of the Rating Scale and Rater Experience

Peer reviewed

Barkaoui, Khaled – Language Assessment Quarterly, 2010

Various factors contribute to variability in English as a second language (ESL) essay scores and rating processes. Most previous research, however, has focused on score variability in relation to task, rater, and essay characteristics. A few studies have examined variability in essay rating processes. The current study used think-aloud protocols…

Descriptors: Protocol Analysis, Holistic Evaluation, Evaluation Criteria, Rating Scales

Automatic Assessment of Complex Task Performance in Games and Simulations. CRESST Report 775

Iseli, Markus R.; Koenig, Alan D.; Lee, John J.; Wainess, Richard – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2010

Assessment of complex task performance is crucial to evaluating personnel in critical job functions such as Navy damage control operations aboard ships. Games and simulations can be instrumental in this process, as they can present a broad range of complex scenarios without involving harm to people or property. However, "automatic"…

Descriptors: Performance Tests, Performance Based Assessment, Decision Making Skills, Military Training
