ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	28
Since 2017 (last 10 years)	84
Since 2007 (last 20 years)	288

Descriptor

Comparative Analysis	350
Correlation	350
Reliability	176
Foreign Countries	125
Test Reliability	119
Scores	88
Validity	83
Statistical Analysis	80
Test Validity	70
Measures (Individuals)	68
Interrater Reliability	65
Psychometrics	51
Factor Analysis	50
Questionnaires	46
Rating Scales	37
Academic Achievement	36
Scoring	32
Student Attitudes	31
College Students	29
Test Items	28
Undergraduate Students	28
Elementary School Students	27
Evaluation Methods	27
Second Language Learning	27
English (Second Language)	26

Publication Type

Journal Articles	284
Reports - Research	273
Reports - Evaluative	36
Tests/Questionnaires	21
Dissertations/Theses -…	19
Speeches/Meeting Papers	10
Information Analyses	6
Reports - Descriptive	3
Book/Product Reviews	1
Collected Works - Proceedings	1
Collected Works - Serial	1
Dissertations/Theses -…	1
Historical Materials	1
Non-Print Media	1
Numerical/Quantitative Data	1
Reference Materials - General	1

Education Level

Higher Education	96
Postsecondary Education	87
Secondary Education	37
Elementary Education	32
High Schools	18
Middle Schools	15
Early Childhood Education	13
Elementary Secondary Education	13
Preschool Education	9
Grade 5	8
Junior High Schools	8
Grade 8	7
Primary Education	7
Grade 11	6
Grade 4	6
Grade 6	5
Grade 7	5
Kindergarten	5
Grade 10	4
Grade 3	4
Intermediate Grades	4
Adult Education	2
Grade 12	2
Grade 2	2
Grade 9	2

Audience

Practitioners	3
Researchers	1
Teachers	1

Location

Turkey	15
Canada	11
China	10
Netherlands	9
Hong Kong	8
Japan	7
Florida	5
Taiwan	5
Texas	5
United Kingdom	5
Australia	4
California	4
Germany	4
Greece	4
Singapore	4
Iran	3
New York	3
Norway	3
Ohio	3
Pakistan	3
Portugal	3
Saudi Arabia	3
Spain	3
Sweden	3
United Kingdom (England)	3

Laws, Policies, & Programs

No Child Left Behind Act 2001

What Works Clearinghouse Rating

Showing 1 to 15 of 350 results

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

Comparative Analysis of LLMs Performance in Medical Embryology: A Cross-Platform Study of ChatGPT, Claude, Gemini, and Copilot

Peer reviewed

Olena Bolgova; Paul Ganguly; Volodymyr Mavrych – Anatomical Sciences Education, 2025

Integrating artificial intelligence, particularly large language models (LLMs), into medical education represents a significant new step in how medical knowledge is accessed, processed, and evaluated. The objective of this study was to conduct a comprehensive analysis comparing the performance of advanced LLM chatbots in different topics of…

Descriptors: Comparative Analysis, Artificial Intelligence, Technology Uses in Education, Natural Language Processing

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

A Comparison of Two Learning Approach Inventories and Their Utility in Predicting Examination Performance and Study Habits

Peer reviewed

Andrew R. Thompson – Advances in Physiology Education, 2024

The revised two-factor Study Process Questionnaire and the Approaches and Study Skills Inventory for Students are two instruments commonly used to measure student learning approach. Although they are designed to measure similar constructs, it is unclear whether the metrics they provide differ in terms of their real-world classification of learning…

Descriptors: Comparative Analysis, Anatomy, Classification, Cognitive Style

Graders of the Future: Comparing the Consistency and Accuracy of GPT4 and Pre-Service Teachers in Physics Essay Question Assessments

Peer reviewed

Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025

As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…

Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy

Are the Verbal TTCT Forms Actually Interchangeable?

Peer reviewed

Grajzel, Katalin; Dumas, Denis; Acar, Selcuk – Journal of Creative Behavior, 2022

One of the best-known and most frequently used measures of creative idea generation is the Torrance Test of Creative Thinking (TTCT). The TTCT Verbal, assessing verbal ideation, contains two forms created to be used interchangeably by researchers and practitioners. However, the parallel forms reliability of the two versions of the TTCT Verbal has…

Descriptors: Test Reliability, Creative Thinking, Creativity Tests, Verbal Ability

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool scores each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

Estimating the Impact of Local Item Dependency in a Test of Second Language Reading Comprehension

Peer reviewed

Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024

Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage--comprehension…

Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning

Estimating Hazard Ratios from Published Kaplan-Meier Survival Curves: A Methods Validation Study

Peer reviewed

Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019

Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…

Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Multi-Informant Assessment of Organizational Skills: Psychometric Characteristics of the Children's Organizational Skills Scale (COSS)

Peer reviewed

Shannon Ryan; Thomas J. Power; Laura Pendergast; Bridget Poznanski; Jenelle Nissley-Tsiopinis; Howard Abikoff; Richard Gallagher; Katie Tremont; Jaclyn Cacia; Jennifer A. Mautone – Grantee Submission, 2024

Organization, time management, and planning (OTMP) skills are behavioral manifestations of executive functioning linked to academic outcomes. Interventions to improve OTMP skills have shown favorable outcomes. The Children's Organizational Skills Scale parent and teacher forms (COSS-P, COSS-T) are widely used for assessing OTMP skills, but there…

Descriptors: Psychometrics, Rating Scales, Executive Function, Time Management

Multi-Informant Assessment of Organizational Skills: Psychometric Characteristics of the Children's Organizational Skills Scale (COSS)

Peer reviewed

Descriptors: Psychometrics, Rating Scales, Executive Function, Time Management

Short-Term Test-Retest Reliability of Contralateral Suppression of Click-Evoked Otoacoustic Emissions in Normal-Hearing Subjects

Peer reviewed

Keppler, Hannah; Degeest, Sofie; Vinck, Bart – Journal of Speech, Language, and Hearing Research, 2021

Purpose: The objective of the current study was to investigate the short-term test-retest reliability of contralateral suppression (CS) of click-evoked otoacoustic emissions (CEOAEs) using commercially available otoacoustic emission equipment. Method: Twenty-three young normal-hearing subjects were tested. An otoscopic evaluation, admittance…

Descriptors: Test Reliability, Hearing (Physiology), Acoustics, Auditory Tests

The Spatial Requirements of the Left-Hand Rule: A Novel Instrument for Assessing the Coordination of Egocentric and Allocentric Frames of Reference

Peer reviewed

Ramful, Ajay; Maesuri Patahuddin, Sitti; Moheeput, Khemanand; Johar, Rahmah – International Journal of Science Education, 2023

This paper explores the spatial dimension of Fleming's Left Hand Rule (LHR), commonly-used in Physics instruction for determining the direction of force using the left hand's thumb, forefinger and middle finger. A new instrument was developed to gauge students' ability to coordinate their fingers in 3D space (egocentric frame of reference) based…

Descriptors: Spatial Ability, Handedness, Science Education, Comparative Analysis

Source

ProQuest LLC	19
Educational and Psychological…	17
Journal of Speech, Language,…	9
Creativity Research Journal	7
ETS Research Report Series	7
Online Submission	7
Journal of Psychoeducational…	6
Measurement in Physical…	6
Social Indicators Research	5
Applied Measurement in…	4
Educational Research and…	4
Journal of Education and…	4
Language Testing	4
Research Quarterly for…	4
Research in Developmental…	4
Advances in Health Sciences…	3
Applied Psychological…	3
Educational Sciences: Theory…	3
International Education…	3
International Journal of…	3
Journal of Autism and…	3
Journal of College Student…	3
Journal of Further and Higher…	3
Research Synthesis Methods	3
Advances in Physiology…	2

Author

Attali, Yigal	3
Coniam, David	3
Acar, Selcuk	2
Bridget Poznanski	2
Cheung, Ping Chung	2
Howard Abikoff	2
Hung Tan Ha	2
Jenelle Nissley-Tsiopinis	2
Kunnan, Antony John	2
Lau, Sing	2
Laura Pendergast	2
McNeil, Malcolm R.	2
Peyton, Vicki	2
Riley-Tillman, T. Chris	2
Shannon Ryan	2
Steedle, Jeffrey T.	2
Thomas J. Power	2
Tim Stoeckel	2
Tsai, Chin-Chung	2
Abdul Gafoor, K.	1
Abrami, Philip C.	1
Abramo, Annarita	1
Acquah, Sakina	1
Adams, Melanie M.	1

Assessments and Surveys

SAT (College Admission Test)	5
Torrance Tests of Creative…	3
Wechsler Intelligence Scale…	3
Autism Diagnostic Observation…	2
McCarthy Scales of Childrens…	2
Minnesota Multiphasic…	2
Motivated Strategies for…	2
National Survey of Student…	2
Peabody Picture Vocabulary…	2
Praxis Series	2
Rosenberg Self Esteem Scale	2
Social Skills Rating System	2
State Trait Anxiety Inventory	2
Test of English as a Foreign…	2
Wechsler Adult Intelligence…	2
Wechsler Memory Scale	2
Woodcock Johnson Tests of…	2
ACT Assessment	1
Bayley Scales of Infant…	1
Beck Depression Inventory	1
Bruininks Oseretsky Test of…	1
Child Behavior Checklist	1
Childhood Autism Rating Scale	1
Computer Attitude Scale	1
Coping Inventory	1