ERIC - Search Results

Publication Date

In 2025	41
Since 2024	135
Since 2021 (last 5 years)	302
Since 2016 (last 10 years)	549

Descriptor

Evaluation Methods	549
Test Reliability	249
Reliability	228
Test Validity	184
Foreign Countries	177
Student Evaluation	125
Validity	110
Interrater Reliability	98
Test Construction	77
Scores	65
Correlation	58
Psychometrics	56
Student Attitudes	51
Scoring Rubrics	50
Factor Analysis	45
Teaching Methods	45
College Students	44
Statistical Analysis	43
Comparative Analysis	41
Teacher Attitudes	41
Evaluators	39
Measures (Individuals)	39
Higher Education	38
Elementary School Students	35
Evaluation Criteria	35

Education Level

Higher Education	167
Postsecondary Education	149
Elementary Education	84
Secondary Education	73
Middle Schools	41
Elementary Secondary Education	37
Early Childhood Education	36
High Schools	31
Junior High Schools	28
Primary Education	22
Intermediate Grades	19
Grade 3	12
Kindergarten	12
Grade 4	11
Grade 5	11
Grade 1	9
Grade 2	9
Grade 6	9
Preschool Education	9
Grade 7	7
Grade 8	7
Adult Education	4
Grade 10	2
Adult Basic Education	1
Grade 11	1

Audience

Teachers	8
Administrators	7
Support Staff	3
Practitioners	2
Counselors	1
Policymakers	1
Researchers	1
Students	1

Location

Turkey	19
China	16
Australia	13
United Kingdom (England)	12
Indonesia	11
United Kingdom	10
Germany	7
Israel	7
Iran	6
Malaysia	5
New York (New York)	5
Spain	5
United States	5
California	4
Canada	4
Europe	4
Illinois	4
Kansas	4
North Carolina	4
Texas	4
Utah	4
Vermont	4
Vietnam	4
Finland	3
Florida	3

Laws, Policies, & Programs

Every Student Succeeds Act…	11
Individuals with Disabilities…	5
Rehabilitation Act 1973…	3
No Child Left Behind Act 2001	2
Elementary and Secondary…	1

Showing 1 to 15 of 549 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Constructing a Roadmap to Measure the Quality of Business Assessments Aimed at Curriculum Management

Peer reviewed

Silva, Thanuci; Santos, Regiane dos; Mallet, Débora – Journal of Education for Business, 2023

Assuring the quality of education is a concern of learning institutions. To do so, it is necessary to have assertive learning management, with consistent data on students' outcomes. This research provides associate deans and researchers a roadmap with which to gather evidence to improve the quality of open-ended assessments. Based on statistical…

Descriptors: Student Evaluation, Evaluation Methods, Business Education, Higher Education

Validity and Reliability of Child-Friendly School Policy Evaluation Instruments in Primary Schools: Confirmatory Factor Analysis

Peer reviewed
PDF on ERIC

Riana Nurhayati; Suranto Aw; Siti Irene Astuti Dwiningrum; Mami Hajaroh; Herwin Herwin – International Journal of Educational Methodology, 2024

Evaluation of child-friendly school (CFS) policies is essential to determine the achievements of school efforts in reducing violence cases. This research aims to prove the reliability and validity of CFS policy evaluation instruments in elementary schools with different locations. This investigation uses the Context Input Process Product (CIPP)…

Descriptors: Validity, Reliability, School Policy, Program Evaluation

Design of a Simple Rubric to Peer-Evaluate the Teamwork Skills of Engineering Students

Peer reviewed

Swapneel Thite; Jayashri Ravishankar; Inmaculada Tomeo-Reyes; Araceli Martinez Ortiz – European Journal of Engineering Education, 2024

Effectively working in an engineering workplace requires strong teamwork skills, yet the existing literature within various disciplines reveals discrepancies in evaluating these skills. This complicates the design of a generic teamwork peer evaluation tool for engineering students. This study aims to address this gap by introducing the DRIVE…

Descriptors: Scoring Rubrics, Evaluation Methods, Peer Evaluation, Teamwork

The Value of Expanding Perspectives on Assessment

Peer reviewed

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

Interdisciplinary Thinking among Seventh-Grade Students in Lower-Secondary Science Education

Peer reviewed
PDF on ERIC

Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025

Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…

Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed
PDF on ERIC

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

Psychometric Properties of the Behavior Assessment System for Children Student Observation System (BASC-3 SOS) with Young Children in Special Education

Peer reviewed

Schmidt, Ellyn M.; Rothenberg, W. Andrew; Davidson, Bridget C.; Barnett, Miya; Jent, Jason; Cadenas, Heleny; Fernandez, Corina; Davis, Eileen – Journal of Behavioral Education, 2023

Measuring classroom behavior among young children is important to guide assessment and intervention decisions, yet there is limited literature on appropriate direct observation tools for this purpose. This article describes the psychometric properties of the Behavior Assessment System for Children, Student Observation System (BASC-3 SOS) with 135…

Descriptors: Young Children, Special Education, Child Behavior, Psychometrics

Quantifying Multimodality: The Validity and Reliability of the QEMT and QEMR

Paul Alexander Siegel – ProQuest LLC, 2024

While multimodality and multiliteracies have been concepts for 25 years (Kalantzis & Cope, 2023; The New London Group, 1996), research on and application of these concepts within text complexity measures has been limited. Attempts to assess multiliteracies and multimodality (Jacobs, 2013; Schmerbeck & Lucht, 2017; Wyatt-Smith & Kimber,…

Descriptors: Multiple Literacies, Learning Modalities, Test Validity, Test Reliability

"LFK" Index Does Not Reliably Detect Small-Study Effects in Meta-Analysis: A Simulation Study

Peer reviewed

Guido Schwarzer; Gerta Rücker; Cristina Semaca – Research Synthesis Methods, 2024

The "LFK" index has been promoted as an improved method to detect bias in meta-analysis. Putatively, its performance does not depend on the number of studies in the meta-analysis. We conducted a simulation study, comparing the "LFK" index test to three standard tests for funnel plot asymmetry in settings with smaller or larger…

Descriptors: Bias, Meta Analysis, Simulation, Evaluation Methods

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

An Exploration of "Real Time" Assessments as a Means to Better Understand Preceptors' Judgments of Student Performance

Peer reviewed

Luu, Kimberly; Sidhu, Ravi; Chadha, Neil K.; Eva, Kevin W. – Advances in Health Sciences Education, 2023

Clinical supervisors are known to assess trainee performance idiosyncratically, causing concern about the validity of their ratings. The literature on this issue relies heavily on retrospective collection of decisions, resulting in the risk of inaccurate information regarding what actually drives raters' perceptions. Capturing in-the-moment…

Descriptors: Clinical Experience, Practicum Supervision, Student Evaluation, Evaluation Methods

Source

ProQuest LLC	29
Grantee Submission	17
Journal of Educational…	9
Journal of Psychoeducational…	9
Advances in Health Sciences…	8
International Journal of…	7
Assessment in Education:…	6
Education and Information…	6
Educational Measurement:…	6
International Journal of…	6
Language Testing	6
Language Testing in Asia	6
Assessment & Evaluation in…	5
ETS Research Report Series	5
Journal of Speech, Language,…	5
Online Submission	5
Research Matters	5
Research Synthesis Methods	5
SAGE Open	5
Assessment for Effective…	4
Cogent Education	4
Educational Assessment	4
International Journal of…	4
Journal of Autism and…	4
Journal of Baltic Science…	4

Author

Amrein-Beardsley, Audrey	3
Kartowagiran, Badrun	3
Lembke, Erica S.	3
Wind, Stefanie A.	3
Al Otaiba, Stephanie	2
Algina, James	2
Bardhoshi, Gerta	2
Bottoms, Bryndle L.	2
Brownstein, Erica M.	2
Chambers, Lucy	2
Child, Simon	2
Crawford, Angela R.	2
Darling-Hammond, Linda	2
Dart, Evan H.	2
DeMartino, Sara	2
Erford, Bradley T.	2
Erica S. Lembke	2
Eva, Kevin W.	2
Gatlin, Brandy	2
Godley, Amanda	2
Gresham, Frank M.	2
Heldsinger, Sandra	2
Holcomb, T. Scott	2
Horvath, Larry	2
Johnson, Evelyn S.	2

Publication Type

Journal Articles	454
Reports - Research	420
Tests/Questionnaires	41
Reports - Evaluative	37
Reports - Descriptive	35
Dissertations/Theses -…	30
Information Analyses	30
Speeches/Meeting Papers	9
Books	7
Guides - Classroom - Teacher	4
Numerical/Quantitative Data	4
Opinion Papers	4
Guides - Non-Classroom	3
Collected Works - General	2
Guides - General	2
Non-Print Media	2
Collected Works - Proceedings	1
Collected Works - Serial	1
Reports -…	1

Assessments and Surveys

Wechsler Intelligence Scale…	5
National Assessment of…	4
Praxis Series	4
Program for International…	4
Woodcock Johnson Tests of…	4
Bayley Scales of Infant…	3
ACT Assessment	2
Autism Diagnostic Observation…	2
MacArthur Communicative…	2
New York State Regents…	2
Social Skills Improvement…	2
Aberrant Behavior Checklist	1
Ages and Stages Questionnaires	1
Bayley Mental Development…	1
Behavior Assessment System…	1
British Ability Scales	1
Child Behavior Checklist	1
Classroom Assessment Scoring…	1
Clinical Evaluation of…	1
Conners Rating Scales	1
Conners Teacher Rating Scale	1
Diagnostic Interview Schedule…	1
Draw a Person Test	1
Early Childhood Longitudinal…	1
Eyberg Child Behavior…	1