ERIC - Search Results

Publication Date

In 2026	0
Since 2025	4
Since 2022 (last 5 years)	21
Since 2017 (last 10 years)	58
Since 2007 (last 20 years)	163

Descriptor

Reliability	257
Scoring	257
Validity	90
Foreign Countries	50
Comparative Analysis	46
Scores	45
Evaluation Methods	43
Student Evaluation	37
Correlation	36
Writing Evaluation	30
Measures (Individuals)	27
Statistical Analysis	27
Test Construction	27
Computer Assisted Testing	25
Essays	25
Psychometrics	25
Elementary Secondary Education	24
Elementary School Students	23
Evaluators	23
Test Items	22
Models	21
English (Second Language)	20
Second Language Learning	20
Automation	19
Essay Tests	19
More ▼

Publication Type

Journal Articles	168
Reports - Research	151
Reports - Evaluative	50
Reports - Descriptive	24
Speeches/Meeting Papers	23
Numerical/Quantitative Data	8
Dissertations/Theses -…	7
Guides - Non-Classroom	7
Tests/Questionnaires	7
Collected Works - General	6
Books	4
Opinion Papers	3
Information Analyses	2
Collected Works - Serials	1
ERIC Digests in Full Text	1
ERIC Publications	1
Guides - Classroom - Teacher	1
Reports - General	1
More ▼

Education Level

Higher Education	38
Postsecondary Education	33
Secondary Education	25
Elementary Education	24
Elementary Secondary Education	12
Middle Schools	11
Early Childhood Education	9
High Schools	8
Grade 5	7
Grade 8	7
Junior High Schools	7
Grade 3	6
Primary Education	6
Grade 4	5
Intermediate Grades	5
Grade 2	4
Grade 6	4
Grade 7	4
Preschool Education	4
Grade 10	3
Grade 12	3
Grade 11	2
Kindergarten	2
Two Year Colleges	2
Grade 1	1
More ▼

Audience

Practitioners	2
Parents	1
Policymakers	1
Researchers	1
Students	1
Teachers	1

Location

Australia	8
California	5
Canada	5
Hong Kong	4
Japan	4
New York	4
Pennsylvania	4
United Kingdom	4
United Kingdom (England)	4
United States	4
China	3
Germany	3
Massachusetts	3
North Carolina	3
Singapore	3
Turkey	3
Vermont	3
Austria	2
Connecticut	2
Florida	2
India	2
Jordan	2
Netherlands	2
New Hampshire	2
Nigeria	2
More ▼

Laws, Policies, & Programs

Every Student Succeeds Act…	3
Elementary and Secondary…	1
Kentucky Education Reform Act…	1
No Child Left Behind Act 2001	1
Race to the Top	1

What Works Clearinghouse Rating

Showing 1 to 15 of 257 results Save | Export

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Direct link

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

New Tests of Rater Drift in Trend Scoring

Peer reviewed

Direct link

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Direct link

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

ChatGPT as an Automated Essay Scoring Tool in the Writing Classrooms: How It Compares with Human Scoring

Peer reviewed

Direct link

Ngoc My Bui; Jessie S. Barrot – Education and Information Technologies, 2025

With the generative artificial intelligence (AI) tool's remarkable capabilities in understanding and generating meaningful content, intriguing questions have been raised about its potential as an automated essay scoring (AES) system. One such tool is ChatGPT, which is capable of scoring any written work based on predefined criteria. However,…

Descriptors: Artificial Intelligence, Natural Language Processing, Technology Uses in Education, Automation

Operationalizing a Weighted Performance Scoring Model for Sustainable e-Learning in Medical Education: Insights from Expert Judgement

Peer reviewed
PDF on ERIC

Download full text

Deborah Oluwadele; Yashik Singh; Timothy Adeliyi – Electronic Journal of e-Learning, 2024

Validation is needed for any newly developed model or framework because it requires several real-life applications. The investment made into e-learning in medical education is daunting, as is the expectation for a positive return on investment. The medical education domain requires data-wise implementation of e-learning as the debate continues…

Descriptors: Electronic Learning, Evaluation Methods, Medical Education, Sustainability

Coherence-Based Automatic Short Answer Scoring Using Sentence Embedding

Peer reviewed

Direct link

Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024

Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…

Descriptors: Scoring, Essays, Writing Evaluation, Memory

Utilizing Deep Learning AI to Analyze Scientific Models: Overcoming Challenges

Peer reviewed

Direct link

Tingting Li; Kevin Haudek; Joseph Krajcik – Journal of Science Education and Technology, 2025

Scientific modeling is a vital educational practice that helps students apply scientific knowledge to real-world phenomena. Despite advances in AI, challenges in accurately assessing such models persist, primarily due to the complexity of cognitive constructs and data imbalances in educational settings. This study addresses these challenges by…

Descriptors: Artificial Intelligence, Scientific Concepts, Models, Automation

Development and Validation of a Short-Form Inventory to Identify Personality Types: The Personality Identity Estimator (PIE)

Peer reviewed
PDF on ERIC

Download full text

Conti, Gary J. – Journal of Education and Learning, 2023

The use of personality inventories has been limited because of their cost and the length. To overcome these limitations, this study created the Personality Identity Estimator (PIE), an easy-to-use inventory to estimate personality types that can be used at no cost. PIE is a categorical inventory containing 12 items with 3 items for each of the 4…

Descriptors: Personality Measures, Personality Traits, Validity, Reliability

Strengthening the Pennsylvania School Climate Survey to Inform School Decisionmaking. REL 2024-006

Peer reviewed
PDF on ERIC

Download full text

Alyson Burnett; Katlyn Lee Milless; Michelle Bennett; Whitney Kozakowski; Sonia Alves; Christine Ross – Regional Educational Laboratory Mid-Atlantic, 2024

This study analyzed Pennsylvania School Climate Survey data from students and staff in the 2021/22 school year to assess the validity and reliability of the elementary school student version of the survey; approaches to scoring the survey in individual schools at all grade levels; and perceptions of school climate across student, staff, and school…

Descriptors: Educational Environment, Decision Making, Surveys, Validity

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Direct link

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

On the Limitations of Human-Computer Agreement in Automated Essay Scoring

Peer reviewed
PDF on ERIC

Download full text

Doewes, Afrizal; Pechenizkiy, Mykola – International Educational Data Mining Society, 2021

Scoring essays is generally an exhausting and time-consuming task for teachers. Automated Essay Scoring (AES) facilitates the scoring process to be faster and more consistent. The most logical way to assess the performance of an automated scorer is by measuring the score agreement with the human raters. However, we provide empirical evidence that…

Descriptors: Man Machine Systems, Automation, Computer Assisted Testing, Scoring

Comparative Judgement, Proof Summaries and Proof Comprehension

Peer reviewed

Direct link

Davies, Ben; Alcock, Lara; Jones, Ian – Educational Studies in Mathematics, 2020

Proof is central to mathematics and has drawn substantial attention from the mathematics education community. Yet, valid and reliable measures of proof comprehension remain rare. In this article, we present a study investigating proof comprehension via students' summaries of a given proof. These summaries were evaluated by expert judges making…

Descriptors: Mathematical Logic, Mathematics Skills, Comprehension, Reliability

Modeling Writing Traits in a Formative Essay Corpus. Research Report. ETS RR-24-02

Peer reviewed
PDF on ERIC

Download full text

Paul Deane; Duanli Yan; Katherine Castellano; Yigal Attali; Michelle Lamar; Mo Zhang; Ian Blood; James V. Bruno; Chen Li; Wenju Cui; Chunyi Ruan; Colleen Appel; Kofi James; Rodolfo Long; Farah Qureshi – ETS Research Report Series, 2024

This paper presents a multidimensional model of variation in writing quality, register, and genre in student essays, trained and tested via confirmatory factor analysis of 1.37 million essay submissions to ETS' digital writing service, Criterion®. The model was also validated with several other corpora, which indicated that it provides a…

Descriptors: Writing (Composition), Essays, Models, Elementary School Students

Psychometric Models for Scoring Multiple Reporter Assessments: Applications to Integrative Data Analysis in Prevention Science and Beyond

Peer reviewed

Direct link

Curran, Patrick J.; Georgeson, A. R.; Bauer, Daniel J.; Hussong, Andrea M. – International Journal of Behavioral Development, 2021

Conducting valid and reliable empirical research in the prevention sciences is an inherently difficult and challenging task. Chief among these is the need to obtain numerical scores of underlying theoretical constructs for use in subsequent analysis. This challenge is further exacerbated by the increasingly common need to consider multiple…

Descriptors: Psychometrics, Scoring, Prevention, Scores

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Direct link

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ... | 18

Educational and Psychological…	11
ETS Research Report Series	7
ProQuest LLC	7
Language Testing	6
Applied Measurement in…	5
Applied Psychological…	5
Grantee Submission	5
Journal of Psychoeducational…	5
Online Submission	5
Educational Assessment	4
International Journal of…	4
Australian Journal of…	3
Journal of Technology,…	3
Reading & Writing Quarterly	3
Applied Linguistics	2
Asia Pacific Education Review	2
Assessment	2
Assessment & Evaluation in…	2
Assessment for Effective…	2
Assessment in Higher Education	2
Education	2
Educational Research and…	2
Educational Testing Service	2
Eurasian Journal of…	2
Journal of Educational…	2
More ▼

Attali, Yigal	5
Nicewander, W. Alan	3
Algina, James	2
Bell, Courtney A.	2
Bowles, Ryan P.	2
Braun, Henry I.	2
Burstein, Jill	2
Carifio, James	2
Childs, Ruth A.	2
Coniam, David	2
Conroy, Maureen A.	2
Cox, Julia R.	2
Crehan, Kevin D.	2
Darling-Hammond, Linda	2
Ebuoh, Casmir N.	2
Foster, Tricia D.	2
Gierl, Mark J.	2
Greene, John F.	2
Haberman, Shelby J.	2
Hebert, Michael	2
Jaciw, Andrew P.	2
Jones, Ian	2
Justice, Laura M.	2
Kantor, Robert	2
More ▼

National Assessment of…	7
Test of English as a Foreign…	4
Graduate Record Examinations	3
Trends in International…	3
Advanced Placement…	2
Massachusetts Comprehensive…	2
Minnesota Multiphasic…	2
New York State Regents…	2
Torrance Tests of Creative…	2
Clinical Evaluation of…	1
Dynamic Indicators of Basic…	1
Flesch Kincaid Grade Level…	1
Graduate Management Admission…	1
Kaufman Assessment Battery…	1
Myers Briggs Type Indicator	1
Neale Analysis of Reading…	1
North Carolina End of Course…	1
Patterns of Adaptive Learning…	1
Peabody Picture Vocabulary…	1
Praxis Series	1
SAT (College Admission Test)	1
Stanford Achievement Tests	1
Test of English for…	1
Test of Nonverbal Intelligence	1
United States Medical…	1
More ▼