ERIC - Search Results

Showing 1 to 15 of 84 results

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Ensuring Data Quality in Large International Development Projects: Tools, Strategies, and Lessons Learned

Peer reviewed

Jennifer Sdunzik; Ann M. Bessenbacher; Wilella D. Burgess; Asia M. Mohamud; Abdirisak Dalmar – American Journal of Evaluation, 2025

The success of development projects and evaluations hinges on having access to research protocols and methodologies that consider the needs and characteristics of stakeholders, subjects, and context while remaining rigorous and culturally sound. These efforts are often complicated by a dearth of tools that have been tested for validity and…

Descriptors: Foreign Countries, Program Evaluation, International Programs, Data Collection

A Computationally Simple Method for Estimating Decision Consistency

Peer reviewed

Wolkowitz, Amanda A. – Journal of Educational Measurement, 2021

Decision consistency (DC) is the reliability of a classification decision based on a test score. In professional credentialing, the decision is often a high-stakes pass/fail decision. The current methods for estimating DC are computationally complex. The purpose of this research is to provide a computationally and conceptually simple method for…

Descriptors: Decision Making, Reliability, Classification, Scores

How to Evaluate Students' Decisions in a Data Comparison Problem: Correct Decision for the Wrong Reasons?

Peer reviewed

Karel Kok; Sophia Chroszczinsky; Burkhard Priemer – Physical Review Physics Education Research, 2024

Data comparison problems are used in teaching and science education research that focuses on students' ability to compare datasets and their conceptual understanding of measurement uncertainties. However, the evaluation of students' decisions in these problems can pose a problem: e.g., students making a correct decision for the wrong reasons.…

Descriptors: Secondary School Students, Undergraduate Students, Comparative Analysis, Evaluation Methods

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Core Considerations for Selecting a Screener. Improving Literacy Brief

National Center on Improving Literacy, 2022

There are many available screeners for reading and other education or social-emotional outcomes. This brief outlines important things to consider when choosing and using a screener.

Descriptors: Screening Tests, Literacy, Social Emotional Learning, Decision Making

The Five Key Types of "Marsilea crenata" as a Potential Identification Tool for Biology Students

Peer reviewed

W. Wisanti; Siti Zubaidah; Sri Rahayu Lestari; Novita Kartika Indah; Eva Kristinawati Putri – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2023

An identification key is one of the tools used to determine the identity of a plant specimen. This research aims to design an identification key for "M. crenata" and analyze its potential as an identification tool. The research uses an observational descriptive method. The identification keys were designed for populations growing in…

Descriptors: Biology, Science Instruction, Visual Aids, Plants (Botany)

How Do Judges in Comparative Judgement Exercises Make Their Judgements?

Leech, Tony; Chambers, Lucy – Research Matters, 2022

Two of the central issues in comparative judgement (CJ), which are perhaps underexplored compared to questions of the method's reliability and technical quality, are "what processes do judges use to make their decisions" and "what features do they focus on when making their decisions?" This article discusses both, in the…

Descriptors: Comparative Analysis, Decision Making, Evaluators, Reliability

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

A Model-Data-Fit-Informed Approach to Score Resolution in Performance Assessments

Peer reviewed

Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021

Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…

Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making

Development of Global Engineering Competency Scale: Exploratory and Confirmatory Factor Analysis

Peer reviewed

Mazzurco, Andrea; Jesiek, Brent K.; Godwin, Allison – Journal of Civil Engineering Education, 2020

Due to globalization trends, engineers are increasingly expected to work effectively across national and cultural boundaries. However, there remains a lack of valid and reliable measures of global engineering competency. To address this gap, the research team has undertaken a large-scale research project to develop a suite of instruments to…

Descriptors: Engineering Education, Decision Making, Measures (Individuals), Trend Analysis

Automated Assessment of Second Language Comprehensibility: Review, Training, Validation, and Generalization Studies

Peer reviewed

Saito, Kazuya; Macmillan, Konstantinos; Kachlicka, Magdalena; Kunihara, Takuya; Minematsu, Nobuaki – Studies in Second Language Acquisition, 2023

Whereas many scholars have emphasized the relative importance of "comprehensibility" as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners' judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward…

Descriptors: Second Language Learning, Second Language Instruction, Interrater Reliability, Speech Communication

Measurement and Credible Evidence in Extension Evaluations

Peer reviewed

Marc T. Braverman – Journal of Human Sciences & Extension, 2019

This article examines the concept of credible evidence in Extension evaluations with specific attention to the measures and measurement strategies used to collect and create data. Credibility depends on multiple factors, including data quality and methodological rigor, characteristics of the stakeholder audience, stakeholder beliefs about the…

Descriptors: Extension Education, Program Evaluation, Evaluation Methods, Planning

