NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20260
Since 202517
Since 2022 (last 5 years)96
Since 2017 (last 10 years)232
What Works Clearinghouse Rating
Showing 1 to 15 of 232 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Renáta Kiss; Beno Csapó – International Journal of Early Childhood, 2025
Previous research has shown that phonological awareness is one of the most important prerequisites for early reading. Monitoring its development requires reliable, easy-to-use instruments especially in the last years of kindergarten. The present study aims to explore the potential for assessing phonological awareness and some of its subskills…
Descriptors: Phonological Awareness, Kindergarten, Reading Skills, Student Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Xiong, Yao; Schunn, Christian D.; Wu, Yong – Journal of Computer Assisted Learning, 2023
Background: For peer assessment, reliability (i.e., consistency in ratings across peers) and validity (i.e., consistency of peer ratings with instructors or experts) are frequently examined in the research literature to address a central concern of instructors and students. Although the average levels are generally promising, both reliability and…
Descriptors: Peer Evaluation, Computer Assisted Testing, Test Reliability, Test Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Pauline Frizelle; Ana Buckley; Tricia Biancone; Anna Ceroni; Darren Dahly; Paul Fletcher; Dorothy V. M. Bishop; Cristina McKean – Journal of Child Language, 2024
This study reports on the feasibility of using the Test of Complex Syntax- Electronic (TECS-E), as a self-directed app, to measure sentence comprehension in children aged 4 to 5 ½ years old; how testing apps might be adapted for effective independent use; and agreement levels between face-to-face supported computerized and independent computerized…
Descriptors: Language Processing, Computer Software, Language Tests, Syntax
Peer reviewed Peer reviewed
Direct linkDirect link
Angela Chamberlain; Emily D'Arcy; Andrew J. O. Whitehouse; Kerry Wallace; Maya Hayden-Evans; Sonya Girdler; Benjamin Milbourn; Sven Bölte; Kiah Evans – Journal of Autism and Developmental Disorders, 2025
Purpose: The PEDI-CAT (ASD) is used to assess functioning of children and youth on the autism spectrum; however, current psychometric evidence is limited. This study aimed to explore the reliability, validity and acceptability of the PEDI-CAT (ASD) using a large Australian sample. Methods: Caregivers of 134 children and youth on the spectrum…
Descriptors: Autism Spectrum Disorders, Children, Youth, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Peer reviewed Peer reviewed
Direct linkDirect link
Jyun-Hong Chen; Hsiu-Yi Chao – Journal of Educational and Behavioral Statistics, 2024
To solve the attenuation paradox in computerized adaptive testing (CAT), this study proposes an item selection method, the integer programming approach based on real-time test data (IPRD), to improve test efficiency. The IPRD method turns information regarding the ability distribution of the population from real-time test data into feasible test…
Descriptors: Data Use, Computer Assisted Testing, Adaptive Testing, Design
Peer reviewed Peer reviewed
Direct linkDirect link
Ágnes Hódi; Edit Tóth – International Journal of Early Childhood, 2024
Phonological awareness plays a key role in learning to read; therefore, its assessment has received a lot of attention. Research in the domain of phonological awareness has been characterized by attempts to develop reliable and valid assessment tools for diverse populations. Over the past few decades, phonological awareness assessment has gone…
Descriptors: Phonological Awareness, Computer Assisted Testing, Hungarian, Native Language
Peer reviewed Peer reviewed
Direct linkDirect link
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Peer reviewed Peer reviewed
Direct linkDirect link
Yi-Jui I. Chen; Yi-Jhen Wu; Yi-Hsin Chen; Robin Irey – Journal of Psychoeducational Assessment, 2025
A short form of the 60-item computer-based orthographic processing assessment (long-form COPA or COPA-LF) was developed. The COPA-LF consists of five skills, including rapid perception, access, differentiation, correction, and arrangement. Thirty items from the COPA-LF were selected for the short-form COPA (COPA-SF) based on cognitive diagnostic…
Descriptors: Computer Assisted Testing, Test Length, Test Validity, Orthographic Symbols
Peer reviewed Peer reviewed
Direct linkDirect link
Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025
In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…
Descriptors: Automation, Grading, Computer Assisted Testing, Scoring
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Osman Tat; Abdullah Faruk Kilic – Turkish Online Journal of Distance Education, 2024
The widespread availability of internet access in daily life has resulted in a greater acceptance of online assessment methods. E-assessment platforms offer various features such as randomizing questions and answers, utilizing extensive question banks, setting time limits, and managing access during online exams. Electronic assessment enables…
Descriptors: Test Construction, Test Validity, Test Reliability, Anxiety
Peer reviewed Peer reviewed
Direct linkDirect link
Schoch, Kerstin; Ostermann, Thomas – Creativity Research Journal, 2022
Although art has been subject to psychological research for some time, the artwork itself received little attention in quantitative research. The rating instrument for two-dimensional pictorial works ("RizbA") fills this gap by providing a tool for formal picture analysis. This study validates the questionnaire on 294 images created by…
Descriptors: Psychometrics, Art, Measures (Individuals), Visual Arts
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  16