ERIC - Search Results

Publication Date

In 2025	11
Since 2024	39
Since 2021 (last 5 years)	129
Since 2016 (last 10 years)	422
Since 2006 (last 20 years)	909

Descriptor

Comparative Analysis	1333
Reliability	622
Test Reliability	507
Foreign Countries	462
Test Validity	277
Correlation	273
Statistical Analysis	259
Interrater Reliability	248
Validity	234
Scores	209
Measures (Individuals)	180
Higher Education	133
Questionnaires	131
Psychometrics	128
Factor Analysis	125
College Students	121
Evaluation Methods	115
English (Second Language)	112
Student Attitudes	112
Teaching Methods	106
Second Language Learning	104
Elementary School Students	103
Test Items	98
Undergraduate Students	93
Test Construction	92

Publication Type

Reports - Research	1333
Journal Articles	1075
Speeches/Meeting Papers	102
Tests/Questionnaires	73
Information Analyses	24
Numerical/Quantitative Data	7
Reports - Evaluative	5
Dissertations/Theses -…	3
Opinion Papers	3
Collected Works - Serials	2
Book/Product Reviews	1
Collected Works - General	1
Collected Works - Serial	1
Guides - Non-Classroom	1
Historical Materials	1

Education Level

Higher Education	285
Postsecondary Education	237
Secondary Education	129
Elementary Education	110
Middle Schools	55
High Schools	53
Elementary Secondary Education	40
Early Childhood Education	34
Junior High Schools	31
Grade 8	25
Intermediate Grades	22
Grade 7	21
Grade 6	19
Preschool Education	19
Grade 4	16
Grade 5	16
Grade 10	15
Primary Education	15
Kindergarten	12
Adult Education	11
Grade 11	10
Grade 9	9
Grade 12	8
Grade 3	8
Grade 1	7

Audience

Researchers	28
Practitioners	17
Teachers	9
Policymakers	5
Administrators	4
Counselors	1
Parents	1

Location

Turkey	57
United States	32
China	27
Australia	25
United Kingdom (England)	23
Germany	22
United Kingdom	22
Netherlands	21
Canada	20
Iran	20
Hong Kong	16
Spain	15
Taiwan	15
Belgium	12
Indonesia	11
Jordan	11
Greece	10
Malaysia	10
Sweden	10
Texas	10
California	9
Finland	9
Japan	8
Nigeria	8
Turkey (Istanbul)	8

Laws, Policies, & Programs

No Child Left Behind Act 2001	4
Individuals with Disabilities…	2
Americans with Disabilities…	1
Comprehensive Employment and…	1
Individuals with Disabilities…	1
Temporary Assistance for…	1

What Works Clearinghouse Rating

Meets WWC Standards with or without Reservations	1
Does not meet standards	1

Showing 1 to 15 of 1,333 results

Reality or Illusion: Comparing Google Scholar and Scopus Data for Predatory Journals

Peer reviewed

Manjula Wijewickrema – portal: Libraries and the Academy, 2024

This research compares the performance measures reported by two bibliographic databases relevant to a set of authors who have published in predatory journals. The reliability of decision-making based on the information provided by uncontrolled bibliographic databases is examined to support rational decisions. A sample of authors who published in…

Descriptors: Periodicals, Ethics, Deception, Authors

Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles

Peer reviewed

Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025

Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…

Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes

Accuracy and Reliability of Large Language Models in Assessing Learning Outcomes Achievement across Cognitive Domains

Peer reviewed

Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…

Descriptors: Accuracy, Reliability, Computational Linguistics, Standards

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

Comparing Music Recordings Using Pairwise Comparative Judgement: Exploring the Judge Experience

Lucy Chambers; Emma Walland; Jo Ireland – Research Matters, 2024

Comparative Judgement (CJ) is traditionally and primarily used to compare written texts. In this study we explored whether we could extend its use to comparing audio files. We used GCSE Music portfolios which contained a mix of audio recordings, musical scores and text documents. Fifteen judges completed two exercises: one comparing musical…

Descriptors: Evaluative Thinking, Judges, Comparative Analysis, Reliability

Moderation of Non-Exam Assessments: A Novel Approach Using Comparative Judgement

Peer reviewed

Lucy Chambers; Sylvia Vitello; Carmen Vidal Rodeiro – Assessment in Education: Principles, Policy & Practice, 2024

In England, some secondary-level qualifications comprise non-exam assessments which need to undergo moderation before grading. Currently, moderation is conducted at centre (school) level. This raises challenges for maintaining the standard across centres. Recent technological advances enable novel moderation methods that are no longer bound by…

Descriptors: Foreign Countries, Evaluation Methods, Comparative Analysis, Grading

Psychometric Properties of the Metacognitive Awareness Inventory (MAI): Standardization to an International Spanish with 12 Countries

Peer reviewed

Antonio P. Gutierrez de Blume; Diana Marcela Montoya Londoño; Virginia Jiménez Rodríguez; Olivia Morán Núñez; Ariel Cuadro; Lilián Daset; Mauricio Molina Delgado; Claudia García de la Cadena; María Beatríz Beltrán Navarro; Aníbal Puente Ferreras; Sebastián Urquijo; Walter Lizandro Arias – Metacognition and Learning, 2024

Metacognition is defined as a higher-order thinking skill that enables individuals to monitor, control, and regulate their thinking and behavior. In education, this skill is important, as learners need to self-regulate their learning behaviors for successful lifelong learning. Thus, it is essential for educators and learners alike to know their…

Descriptors: Metacognition, Measures (Individuals), Psychometrics, Standards

Can Large Language Models Replace Humans in Systematic Reviews? Evaluating GPT-4's Efficacy in Screening and Extracting Data from Peer-Reviewed and Grey Literature in Multiple Languages

Peer reviewed

Qusai Khraisha; Sophie Put; Johanna Kappenberg; Azza Warraitch; Kristin Hadfield – Research Synthesis Methods, 2024

Systematic reviews are vital for guiding practice, research and policy, although they are often slow and labour-intensive. Large language models (LLMs) could speed up and automate systematic reviews, but their performance in such tasks has yet to be comprehensively evaluated against humans, and no study has tested Generative Pre-Trained…

Descriptors: Peer Evaluation, Research Reports, Artificial Intelligence, Computer Software

Coherence-Based Automatic Short Answer Scoring Using Sentence Embedding

Peer reviewed

Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024

Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…

Descriptors: Scoring, Essays, Writing Evaluation, Memory

Comparative Analysis of LLMs Performance in Medical Embryology: A Cross-Platform Study of ChatGPT, Claude, Gemini, and Copilot

Peer reviewed

Olena Bolgova; Paul Ganguly; Volodymyr Mavrych – Anatomical Sciences Education, 2025

Integrating artificial intelligence, particularly large language models (LLMs), into medical education represents a significant new step in how medical knowledge is accessed, processed, and evaluated. The objective of this study was to conduct a comprehensive analysis comparing the performance of advanced LLM chatbots in different topics of…

Descriptors: Comparative Analysis, Artificial Intelligence, Technology Uses in Education, Natural Language Processing

Reliable Application of the MATH Taxonomy Sheds Light on Assessment Practices

Peer reviewed

Kinnear, George; Bennett, Max; Binnie, Rachel; Bolt, Róisín; Zheng, Yinglan – Teaching Mathematics and Its Applications, 2020

The MATH taxonomy classifies questions according to the mathematical skills required to answer them. It was created to aid the development of more balanced assessments in undergraduate mathematics and has since been used to compare different assessment regimes across school and university. To date, there has been no systematic investigation of the…

Descriptors: Taxonomy, Mathematics Instruction, Teaching Methods, Reliability

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed
PDF on ERIC

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Accuracy Assessment of Two Electromagnetic Articulographs: Northern Digital Inc. WAVE and Northern Digital Inc. VOX

Peer reviewed

Rebernik, Teja; Jacobi, Jidde; Tiede, Mark; Wieling, Martijn – Journal of Speech, Language, and Hearing Research, 2021

Purpose: This study compares two electromagnetic articulographs manufactured by Northern Digital, Inc.: the NDI Wave System (from 2008) and the NDI Vox-EMA System (from 2020). Method: Four experiments were completed: (1) comparison of statically positioned sensors; (2) tracking dynamic movements of sensors manipulated using a motor-driven LEGO…

Descriptors: Measurement Equipment, Articulation (Speech), Accuracy, Reliability

Assessing the Impact of Predictive Thinking-Based Learning Activities on Enhancing Creative Writing in Language Learning Classrooms

Peer reviewed
PDF on ERIC

Ali Al-Barakat; Rommel AlAli; Omayya Al-Hassan; Khaled Al-Saud – Educational Process: International Journal, 2025

Background/purpose: The study tries to discover how predictive thinking can be incorporated into writing activities to assist students in developing their creative skills in writing learning environments. Through this study, teachers will be able to adopt a new teaching method that helps transform the way creative writing is taught in language…

Descriptors: Thinking Skills, Creative Writing, Writing Instruction, Validity

Benefits and Costs of Matching Prior to a Difference in Differences Analysis When Parallel Trends Does Not Hold

Peer reviewed

Dae Woong Ham; Luke Miratrix – Grantee Submission, 2024

The consequence of a change in school leadership (e.g., principal turnover) on student achievement has important implications for education policy. The impact of such an event can be estimated via the popular Difference in Difference (DiD) estimator, where those schools with a turnover event are compared to a selected set of schools that did not…

Descriptors: Trend Analysis, Faculty Mobility, Academic Achievement, Principals

Source

Educational and Psychological…	35
Journal of Speech, Language,…	27
Measurement in Physical…	20
Online Submission	20
Language Testing	19
ETS Research Report Series	16
Educational Research and…	14
Assessment & Evaluation in…	12
Journal of Autism and…	12
Journal of Education and…	11
Journal of Educational…	11
Educational Sciences: Theory…	10
International Education…	10
Journal of Psychoeducational…	10
Advances in Health Sciences…	9
Psychology in the Schools	9
Applied Measurement in…	8
Assessment in Education:…	8
English Language Teaching	8
International Journal of…	8
Psychological Assessment	8
Child Development	7
Creativity Research Journal	7
EURASIA Journal of…	7
Grantee Submission	7

Author

Reckase, Mark D.	5
Attali, Yigal	4
Jones, Ian	4
Benson, Jeri	3
Fletcher, Jack M.	3
Haladyna, Tom	3
Iwata, Brian A.	3
Kim, Sooyeon	3
Kunnan, Antony John	3
Miciak, Jeremy	3
Tsai, Chin-Chung	3
Vaughn, Sharon	3
Acar, Selcuk	2
Algozzine, Bob	2
August, Diane	2
Baron-Cohen, Simon	2
Bashaw, W. L.	2
Bauer, Daniel	2
Beach, Kristen D.	2
Benton, Tom	2
Bocian, Kathleen M.	2
Bothe, Anne K.	2
Bridget Poznanski	2
Byrne, Brian	2

Assessments and Surveys

Peabody Picture Vocabulary…	9
Woodcock Johnson Tests of…	9
Wechsler Intelligence Scale…	8
Test of English as a Foreign…	7
Autism Diagnostic Observation…	6
Program for International…	6
Wechsler Adult Intelligence…	6
Wide Range Achievement Test	6
Torrance Tests of Creative…	5
Center for Epidemiologic…	4
Dynamic Indicators of Basic…	4
Early Childhood Environment…	4
Iowa Tests of Basic Skills	4
Minnesota Multiphasic…	4
Rosenberg Self Esteem Scale	4
SAT (College Admission Test)	4
Social Skills Rating System	4
Computer Attitude Scale	3
General Educational…	3
International English…	3
MacArthur Communicative…	3
Motivated Strategies for…	3
Self Description Questionnaire	3
ACT Assessment	2
Attitude Scale	2