ERIC - Search Results

Publication Date

In 2025	0
Since 2024	3
Since 2021 (last 5 years)	22
Since 2016 (last 10 years)	35
Since 2006 (last 20 years)	46

Descriptor

Comparative Analysis	71
Evaluation Methods	71
Evaluators	71
Foreign Countries	15
Reliability	13
Computer Software	12
Interrater Reliability	12
Decision Making	11
Validity	10
Scores	8
Student Evaluation	8
Correlation	7
Higher Education	7
Performance Based Assessment	7
Program Evaluation	7
Standards	7
College Faculty	6
Ethics	6
Language Tests	6
Peer Evaluation	6
Scoring	6
Second Language Learning	6
Accuracy	5
Artificial Intelligence	5
Educational Assessment	5

Publication Type

Journal Articles	55
Reports - Research	49
Reports - Evaluative	18
Speeches/Meeting Papers	15
Tests/Questionnaires	7
Reports - Descriptive	4
Information Analyses	2
Collected Works - Proceedings	1
Opinion Papers	1

Education Level

Higher Education	16
Postsecondary Education	16
Early Childhood Education	3
Elementary Education	3
Elementary Secondary Education	2
Primary Education	2
Secondary Education	2
Grade 1	1
Grade 2	1
High Schools	1
Kindergarten	1

Audience

Researchers

Location

United States	4
United Kingdom	3
China	2
Denmark	2
United Kingdom (England)	2
California	1
Canada	1
District of Columbia	1
Germany	1
Iran	1
Italy	1
Maryland	1
Norway	1
Poland	1
Spain	1
Switzerland	1
Tennessee	1
Texas	1
Thailand	1

Assessments and Surveys

Flesch Kincaid Grade Level…	1
Fry Readability Formula	1
National Adult Literacy…	1

Showing 1 to 15 of 71 results

Towards the Automatic Risk of Bias Assessment on Randomized Controlled Trials: A Comparison of RobotReviewer and Humans

Peer reviewed

Yuan Tian; Xi Yang; Suhail A. Doi; Luis Furuya-Kanamori; Lifeng Lin; Joey S. W. Kwong; Chang Xu – Research Synthesis Methods, 2024

RobotReviewer is a tool for automatically assessing the risk of bias in randomized controlled trials, but there is limited evidence of its reliability. We evaluated the agreement between RobotReviewer and humans regarding the risk of bias assessment based on 1955 randomized controlled trials. The risk of bias in these trials was assessed via two…

Descriptors: Risk, Randomized Controlled Trials, Classification, Robotics

Critiquing the Rationales for Using Comparative Judgement: A Call for Clarity

Peer reviewed

Kelly, Kate Tremain; Richardson, Mary; Isaacs, Talia – Assessment in Education: Principles, Policy & Practice, 2022

Comparative judgment is gaining popularity as an assessment tool, including for high-stakes testing purposes, despite relatively little research on the use of the technique. Advocates claim two main rationales for its use: that comparative judgment is valid because humans are better at comparative than absolute judgment, and because it distils the…

Descriptors: Comparative Analysis, Evaluation Methods, Evaluative Thinking, High Stakes Tests

Professionalizing Evaluation: A Time-Bound Comparison of the American Evaluation Association's Foundational Documents

Peer reviewed

Tucker, Susan; Stevahn, Laurie; King, Jean A. – American Journal of Evaluation, 2023

This article compares the purposes and content of the four foundational documents of the American Evaluation Association (AEA): the Program Evaluation Standards, the AEA Public Statement on Cultural Competence in Evaluation, the AEA Evaluator Competencies, and the AEA Guiding Principles. This reflection on alignment is an early effort in the third…

Descriptors: Professionalism, Comparative Analysis, Professional Associations, Program Evaluation

Examining the Effect of Assessment Construct Characteristics on Machine Learning Scoring of Scientific Argumentation

Peer reviewed

Kevin C. Haudek; Xiaoming Zhai – International Journal of Artificial Intelligence in Education, 2024

Argumentation, a key scientific practice presented in the "Framework for K-12 Science Education," requires students to construct and critique arguments, but timely evaluation of arguments in large-scale classrooms is challenging. Recent work has shown the potential of automated scoring systems for open response assessments, leveraging…

Descriptors: Accuracy, Persuasive Discourse, Artificial Intelligence, Learning Management Systems

Do You Mean What I Mean? Comparing Teacher Performance Self-Scores and Evaluator-Generated Scores

Peer reviewed

Hunter, Seth B. – Journal of Education Human Resources, 2023

Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Combining Human and Automated Scoring Methods in Experimental Assessments of Writing: A Case Study Tutorial

Peer reviewed

Reagan Mozer; Luke Miratrix; Jackie Eunjung Relyea; James S. Kim – Journal of Educational and Behavioral Statistics, 2024

In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This…

Descriptors: Scoring, Evaluation Methods, Writing Evaluation, Comparative Analysis

Judges' Views on Pairwise Comparative Judgement and Rank Ordering as Alternatives to Analytical Essay Marking

Walland, Emma – Research Matters, 2022

In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…

Descriptors: Essays, Grading, Writing Evaluation, Evaluators

How Do Judges in Comparative Judgement Exercises Make Their Judgements?

Leech, Tony; Chambers, Lucy – Research Matters, 2022

Two of the central issues in comparative judgement (CJ), which are perhaps underexplored compared to questions of the method's reliability and technical quality, are "what processes do judges use to make their decisions" and "what features do they focus on when making their decisions?" This article discusses both, in the…

Descriptors: Comparative Analysis, Decision Making, Evaluators, Reliability

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

Comprehensible to Whom? Examining Rater, Speaker, and Interlocutor Perspectives on Comprehensibility in an Interactive Context

Peer reviewed

Nagle, Charlie L.; Trofimovich, Pavel; O'Brien, Mary Grantham; Kennedy, Sara – Modern Language Journal, 2022

Comprehensibility has emerged as a useful and intuitive means of globally evaluating second language (L2) speakers in many research and instructional contexts. In most cases, L2 speakers' comprehensibility is assessed by external listeners who do not engage in extensive communication with the speakers, even though the degree to which a speaker is…

Descriptors: Evaluators, Intelligibility, Pronunciation, Task Analysis

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Intelligent Tutoring for Surgical Decision Making: A Planning-Based Approach

Peer reviewed

Vannaprathip, Narumol; Haddawy, Peter; Schultheis, Holger; Suebnukarn, Siriwan – International Journal of Artificial Intelligence in Education, 2022

Virtual reality simulation has had a significant impact on training of psychomotor surgical skills, yet there is still a lack of work on its use to teach surgical decision making. This is particularly noteworthy given the recognized importance of decision making in achieving positive surgical outcomes. With the objective of filling this gap, we…

Descriptors: Intelligent Tutoring Systems, Decision Making, Surgery, Teaching Methods

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

Comparing Machine and Human Reviewers to Evaluate the Risk of Bias in Randomized Controlled Trials

Peer reviewed

Armijo-Olivo, Susan; Craig, Rodger; Campbell, Sandy – Research Synthesis Methods, 2020

Background: Evidence from new health technologies is growing, along with demands for evidence to inform policy decisions, creating challenges in completing health technology assessments (HTAs)/systematic reviews (SRs) in a timely manner. Software can decrease the time and burden by automating the process, but evidence validating such software is…

Descriptors: Comparative Analysis, Computer Software, Decision Making, Randomized Controlled Trials

Source

American Journal of Evaluation	7
Research Matters	4
Advances in Health Sciences…	2
Evaluation Review	2
International Journal of…	2
Interpreter and Translator…	2
Research Synthesis Methods	2
AERA Online Paper Repository	1
Advances in Language and…	1
Applied Psychological…	1
Assessment & Evaluation in…	1
Assessment in Education:…	1
Behaviour & Information…	1
British Journal of Teacher…	1
Bulgarian Comparative…	1
Canadian Modern Language…	1
Clinical Linguistics &…	1
Educational Leadership	1
Educational Measurement:…	1
Educational and Psychological…	1
English Language Teaching	1
Evaluation and Program…	1
International Education…	1
International Journal of…	1
International Journal of…	1

Author

Chambers, Lucy	2
Myford, Carol M.	2
Akbari, Alireza	1
Apple, Kristen	1
Armijo-Olivo, Susan	1
Azzam, Tarek	1
Bamberger, Michael	1
Bartholomew, Scott Ronald	1
Barwell, Fred	1
Blankenship, Mark H.	1
Bosch, Emma	1
Bozeman, Barry	1
Breidahl, Karen N.	1
Brooks, Ariana	1
Brown, Robert D.	1
Brydges, Ryan	1
Burset, Silvia	1
Campbell, Sandy	1
Cauble, Mary	1
Chafouleas, Sandra M.	1
Chang Xu	1
Chapel, Thomas J.	1
Coryn, Chris L. S.	1
Craig, Rodger	1