ERIC - Search Results

Showing 1 to 15 of 41 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

Interdisciplinary Thinking among Seventh-Grade Students in Lower-Secondary Science Education

Peer reviewed

Shasha Chen; Shaohui Chi; Zuhao Wang – Journal of Baltic Science Education, 2025

Interdisciplinary thinking is critical for equipping students to apply scientific knowledge and tackle societal challenges across various disciplines, which has been recognized as a key objective of twenty-first century science education. However, research on effective interdisciplinary assessment in secondary school science education is still…

Descriptors: Thinking Skills, Interdisciplinary Approach, Science Instruction, Grade 7

Examining the Psychometric Impact of Targeted and Random Double-Scoring in Mixed-Format Assessments

Peer reviewed

Yangmeng Xu; Stefanie A. Wind – Educational Measurement: Issues and Practice, 2025

Double-scoring constructed-response items is a common but costly practice in mixed-format assessments. This study explored the impacts of Targeted Double-Scoring (TDS) and random double-scoring procedures on the quality of psychometric outcomes, including student achievement estimates, person fit, and student classifications under various…

Descriptors: Academic Achievement, Psychometrics, Scoring, Evaluation Methods

Psychometric Assessment of the Rett Syndrome Caregiver Assessment of Symptom Severity (RCASS)

Peer reviewed

Melissa Raspa; Angela Gwaltney; Carla Bann; Jana von Hehn; Timothy A. Benke; Eric D. Marsh; Sarika U. Peters; Amitha Ananth; Alan K. Percy; Jeffrey L. Neul – Journal of Autism and Developmental Disorders, 2025

Rett syndrome is a severe neurodevelopmental disorder that affects about 1 in 10,000 females. Clinical trials of disease modifying therapies are on the rise, but there are few psychometrically sound caregiver-reported outcome measures available to assess treatment benefit. We report on a new caregiver-reported outcome measure, the Rett Caregiver…

Descriptors: Neurodevelopmental Disorders, Genetic Disorders, Females, Test Validity

The Development of Knowledge of Content and Teaching Task Instruments for Pre-Service Mathematics Teacher

Peer reviewed

Siti Suprihatiningsih; Masriyah; Rooselyna Ekawati – Journal of Education and Learning (EduLearn), 2025

The knowledge of the materials to be taught to the students is the basic knowledge that preservice mathematics teachers should possess, as they need to prepare themselves for teaching. In order to research preservice teachers' understanding of the subject matter and teaching skills, valid and reliable test instruments are required. Knowledge of…

Descriptors: Preservice Teachers, Pedagogical Content Knowledge, Preservice Teacher Education, Mathematics Teachers

The Proposed Specifiers for Conduct Disorder (PSCD): External Correlates and Incremental Validity over Alternate Psychopathy Measures

Peer reviewed

Mojtaba Elhami Athar; Randall T. Salekin; Mahdi Hassanabadi; Parnian Rezaei; Golnoush Fakhr; Elham Zamani – Child & Youth Care Forum, 2025

The Proposed Specifiers for Conduct Disorder (PSCD) assesses psychopathy components of grandiose-manipulative (GM), callous-unemotional (CU), daring-impulsive (DI), and conduct disorder (CD). Research on PSCD is still in its infancy, and further research is necessary to examine its psychometric properties. We investigated the correlations between…

Descriptors: Preadolescents, Adolescents, Psychopathology, Behavior Disorders

Construction of a Sustainable Design Competency Assessment System for Fashion Designers in China

Peer reviewed

Hua Yuan; Yunmei Wu; Hui Tao; Jun Yin; Ying Fang; Junjie Zhang; Yun Zhang – International Journal of Technology and Design Education, 2025

This paper introduces a framework aimed at assessing the sustainability of fashion designers, intending to evaluate their proficiency in sustainability and enhance higher education in design. To establish a system for assessing and evaluating sustainable design competence, we initiated interviews with both designers and fashion design students.…

Descriptors: Clothing, Design, Sustainability, Reliability

Assessment as Pedagogy: Inviting Authenticity through Relationality, Vulnerability and Wonder

Peer reviewed

Claire Timperley; Kate Schick – Teaching in Higher Education, 2025

Traditional authentic assessment tasks are frequently tied to future work and enmeshed in neoliberal and capitalist visions of education. We advocate an alternative approach where authenticity signifies meaningful learning outside the confines of the classroom to promote deep learning that 'sticks'. We proffer an understanding of "assessment…

Descriptors: Performance Based Assessment, Philosophy, World Views, Instruction

Between Two Worlds: Locating Climate Literacy between Modern Educational Frameworks and Assessment Needs

Peer reviewed

Dirk Gellermann; Hanno Michel; Ute Harms – Mind, Brain, and Education, 2025

In order for climate literacy assessments to be applicable in large-scale studies, it is essential that they comply with the standards of test administration while maintaining consistency with a comprehensive definition of the concept. In alignment with the different educational frameworks and the Climate Literacy Principles of the U.S. Global…

Descriptors: Climate, Environmental Education, Literacy, Measures (Individuals)

PBL Student Assessment: Consistency of Different Evaluation Methods in a Computing Faculty

Peer reviewed

Henrique Mohallem Paiva; Flávia Maria Santoro; Victor Takashi Hayashi; Bianca Cassemiro Lima – IEEE Transactions on Education, 2025

Contribution: This article analyzes student assessment within a computing faculty employing a full project-based learning (PBL) approach. Examining 2078 final grades across 60 classes and periods, the study reveals a significant correlation between graded self-studies, exams, and projects. This result contributes to understanding the reliability…

Descriptors: Student Evaluation, Computer Science Education, College Faculty, Correlation

How Valid and Reliable Are Teachers' Assessments of Gifted Students?

Peer reviewed

Sümeyye Arkan; Sema Tan – International Journal of Assessment Tools in Education, 2025

Teachers' perceptions, attitudes, and opinions about students, curricula, or evaluation methods contribute to the development of students' talents. Thus, researchers often collect data from teachers to identify gifted students, determine educational practices to meet the students' needs and assess gifted education programs. Researchers often…

Descriptors: Talent Identification, Academically Gifted, Evaluation Methods, Measurement Techniques

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring
