ERIC - Search Results

Showing 1 to 15 of 135 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Validity and Reliability of Child-Friendly School Policy Evaluation Instruments in Primary Schools: Confirmatory Factor Analysis

Peer reviewed

Riana Nurhayati; Suranto Aw; Siti Irene Astuti Dwiningrum; Mami Hajaroh; Herwin Herwin – International Journal of Educational Methodology, 2024

Evaluation of child-friendly school (CFS) policies is essential to determine the achievements of school efforts in reducing violence cases. This research aims to prove the reliability and validity of CFS policy evaluation instruments in elementary schools in different locations. This investigation uses the Context Input Process Product (CIPP)…

Descriptors: Validity, Reliability, School Policy, Program Evaluation

Design of a Simple Rubric to Peer-Evaluate the Teamwork Skills of Engineering Students

Peer reviewed

Swapneel Thite; Jayashri Ravishankar; Inmaculada Tomeo-Reyes; Araceli Martinez Ortiz – European Journal of Engineering Education, 2024

Effectively working in an engineering workplace requires strong teamwork skills, yet the existing literature within various disciplines reveals discrepancies in evaluating these skills. This complicates the design of a generic teamwork peer evaluation tool for engineering students. This study aims to address this gap by introducing the DRIVE…

Descriptors: Scoring Rubrics, Evaluation Methods, Peer Evaluation, Teamwork

The Value of Expanding Perspectives on Assessment

Peer reviewed

Janice Kinghorn; Katherine McGuire; Bethany L. Miller; Aaron Zimmerman – Assessment Update, 2024

In this article, the authors share their reflections on how different experiences and paradigms have broadened their understanding of the work of assessment in higher education. As they collaborated to create a panel for the 2024 International Conference on Assessing Quality in Higher Education, they recognized that they, as assessment…

Descriptors: Higher Education, Assessment Literacy, Evaluation Criteria, Evaluation Methods

Interdisciplinary Thinking among Seventh-Grade Students in Lower-Secondary Science Education

Peer reviewed

Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025

Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…

Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7

Quantifying Multimodality: The Validity and Reliability of the QEMT and QEMR

Paul Alexander Siegel – ProQuest LLC, 2024

While multimodality and multiliteracies have been concepts for 25 years (Kalantzis & Cope, 2023; The New London Group, 1996), research on and application of these concepts within text complexity measures has been limited. Attempts to assess multiliteracies and multimodality (Jacobs, 2013; Schmerbeck & Lucht, 2017; Wyatt-Smith & Kimber,…

Descriptors: Multiple Literacies, Learning Modalities, Test Validity, Test Reliability

"LFK" Index Does Not Reliably Detect Small-Study Effects in Meta-Analysis: A Simulation Study

Peer reviewed

Guido Schwarzer; Gerta Rücker; Cristina Semaca – Research Synthesis Methods, 2024

The "LFK" index has been promoted as an improved method to detect bias in meta-analysis. Putatively, its performance does not depend on the number of studies in the meta-analysis. We conducted a simulation study, comparing the "LFK" index test to three standard tests for funnel plot asymmetry in settings with smaller or larger…

Descriptors: Bias, Meta Analysis, Simulation, Evaluation Methods

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

Evaluating the Performance of the LI3P in Latent Profile Analysis Models

Peer reviewed

Russell P. Houpt; Kevin J. Grimm; Aaron T. McLaughlin; Daryl R. Van Tongeren – Structural Equation Modeling: A Multidisciplinary Journal, 2024

Numerous methods exist to determine the optimal number of classes when using latent profile analysis (LPA), but none are consistently correct. Recently, the likelihood incremental percentage per parameter (LI3P) was proposed as a model effect-size measure. To evaluate the LI3P more thoroughly, we simulated 50,000 datasets, manipulating factors…

Descriptors: Structural Equation Models, Profiles, Sample Size, Evaluation Methods

Evaluation of Maximal Reliability for Multidimensional Measuring Instruments Using Structural Equation Modeling

Peer reviewed

Tenko Raykov; Bingsheng Zhang – Structural Equation Modeling: A Multidisciplinary Journal, 2024

Multidimensional measuring instruments are often used in behavioral, social, educational, marketing, and biomedical research. For these scales, the paper discusses how to find the optimal score based on their components that is associated with the highest possible reliability. Within the framework of structural equation modeling, an approach to…

Descriptors: Multidimensional Scaling, Measurement Equipment, Measurement Techniques, Test Reliability

Psychometric Assessment of the Rett Syndrome Caregiver Assessment of Symptom Severity (RCASS)

Peer reviewed

Melissa Raspa; Angela Gwaltney; Carla Bann; Jana von Hehn; Timothy A. Benke; Eric D. Marsh; Sarika U. Peters; Amitha Ananth; Alan K. Percy; Jeffrey L. Neul – Journal of Autism and Developmental Disorders, 2025

Rett syndrome is a severe neurodevelopmental disorder that affects about 1 in 10,000 females. Clinical trials of disease modifying therapies are on the rise, but there are few psychometrically sound caregiver-reported outcome measures available to assess treatment benefit. We report on a new caregiver-reported outcome measure, the Rett Caregiver…

Descriptors: Neurodevelopmental Disorders, Genetic Disorders, Females, Test Validity

Comparing Single- and Multiple-Question Designs of Measuring Family Income in China Family Panel Studies

Peer reviewed

Qiong Wu; Liping Gu – Sociological Methods & Research, 2024

Family income questions in general purpose surveys are usually collected with either a single-question summary design or a multiple-question disaggregation design. It is unclear how estimates from the two approaches agree with each other. The current paper takes advantage of a large-scale survey that has collected family income with both methods.…

Descriptors: Foreign Countries, Family Income, Questionnaires, Research Design
