ERIC - Search Results

Publication Date

In 2026	0
Since 2025	12
Since 2022 (last 5 years)	114
Since 2017 (last 10 years)	375
Since 2007 (last 20 years)	1130

Descriptor

Comparative Analysis	1943
Reliability	880
Test Reliability	792
Foreign Countries	554
Test Validity	443
Correlation	350
Validity	332
Interrater Reliability	327
Statistical Analysis	321
Scores	280
Measures (Individuals)	236
Evaluation Methods	212
Higher Education	201
Psychometrics	180
Questionnaires	165
Factor Analysis	161
Test Construction	160
College Students	159
English (Second Language)	149
Student Attitudes	141
Test Items	136
Second Language Learning	133
Scoring	130
Rating Scales	127
Student Evaluation	125

Education Level

Higher Education	360
Postsecondary Education	285
Secondary Education	150
Elementary Education	135
Elementary Secondary Education	73
High Schools	68
Middle Schools	61
Early Childhood Education	41
Junior High Schools	34
Grade 8	29
Preschool Education	25
Grade 7	24
Intermediate Grades	24
Grade 4	22
Grade 5	20
Grade 6	20
Kindergarten	20
Primary Education	20
Adult Education	19
Grade 10	16
Grade 11	12
Grade 12	10
Grade 2	10
Grade 3	10
Grade 9	10

Audience

Researchers	35
Practitioners	29
Teachers	15
Administrators	9
Policymakers	6
Counselors	2
Media Staff	2
Parents	1
Support Staff	1

Location

Turkey	59
United States	47
Australia	36
China	33
Canada	32
United Kingdom (England)	32
United Kingdom	28
Germany	25
Netherlands	24
Taiwan	22
Hong Kong	20
Iran	20
Spain	17
Belgium	15
California	15
Florida	13
Finland	12
Greece	12
Sweden	12
Texas	12
Indonesia	11
Japan	11
Jordan	11
Malaysia	11
Portugal	11

Laws, Policies, & Programs

No Child Left Behind Act 2001	6
Every Student Succeeds Act…	2
Individuals with Disabilities…	2
Americans with Disabilities…	1
Comprehensive Employment and…	1
Improving Americas Schools…	1
Individuals with Disabilities…	1
Race to the Top	1
Temporary Assistance for…	1

What Works Clearinghouse Rating

Meets WWC Standards with or without Reservations	1
Does not meet standards	1

Filter applied: Comparative Analysis

Showing 46 to 60 of 1,943 results

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Transparency and Replication in Cross-National Survey Research: Identification of Problems and Possible Solutions

Peer reviewed

Damian, Elena; Meuleman, Bart; van Oorschot, Wim – Sociological Methods & Research, 2022

In this article, we examine whether cross-national studies disclose enough information for independent researchers to evaluate the validity and reliability of the findings (evaluation transparency) or to perform a direct replication (replicability transparency). The first contribution is theoretical. We develop a heuristic theoretical model…

Descriptors: National Surveys, Cross Cultural Studies, Social Science Research, Periodicals

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool scores each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

Efficient Localization of the Cortical Language Network and Its Functional Neuroanatomy in Dyslexia

Jayden J. Lee – ProQuest LLC, 2022

The functional neuroanatomy of language localization in dyslexia has primarily been studied in the context of reading. However, dyslexia is sometimes referred to as a "language-based learning disability," yet the functional signature of the core language comprehension network in dyslexia is far less understood. This thesis presents a…

Descriptors: Dyslexia, Brain Hemisphere Functions, Comparative Analysis, Speech Communication

Effective Vocabulary Interventions for Young Emergent Bilinguals: A Best-Evidence Synthesis

Peer reviewed

Alain Bengochea; Sabrina F. Sembiante – Review of Education, 2024

This best-evidence synthesis appraises the design and outcome characteristics of vocabulary intervention studies conducted with preschool through 6th grade emergent bilingual (EB) children and spotlights rigorously designed studies for which effects could be better attributed to instructional features. Twenty-nine selected studies were analysed…

Descriptors: Bilingualism, Vocabulary Development, Intervention, Comparative Analysis

Do You Mean What I Mean? Comparing Teacher Performance Self-Scores and Evaluator-Generated Scores

Peer reviewed

Hunter, Seth B. – Journal of Education Human Resources, 2023

Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Evaluating Large Language Models in Analysing Classroom Dialogue

Peer reviewed

Yun Long; Haifeng Luo; Yu Zhang – npj Science of Learning, 2024

This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue--a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using…

Descriptors: Classroom Communication, Computational Linguistics, Chinese, Mathematics Instruction

Estimating the Impact of Local Item Dependency in a Test of Second Language Reading Comprehension

Peer reviewed

Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024

Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage--comprehension…

Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Continuous Improvement of Inter-Rater Reliability in Transition Compliance at a State Agency

Heather Raithel – ProQuest LLC, 2023

A mixed methods action research study was designed to answer three research questions based on inter-rater reliability (IRR) in compliance calls for transition at a state education agency, perceived confidence levels in making and discussing compliance calls, and perceived confidence in sharing transition resources. An innovation based on…

Descriptors: Public Agencies, Interrater Reliability, Compliance (Legal), Comparative Analysis

Estimating Hazard Ratios from Published Kaplan-Meier Survival Curves: A Methods Validation Study

Peer reviewed

Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019

Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…

Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials

Individual Differences in Cognitive Offloading: A Comparison of Intention Offloading, Pattern Copy, and Short-Term Memory Capacity

Peer reviewed

Meyerhoff, Hauke S.; Grinschgl, Sandra; Papenmeier, Frank; Gilbert, Sam J. – Cognitive Research: Principles and Implications, 2021

The cognitive load of many everyday life tasks exceeds known limitations of short-term memory. One strategy to compensate for information overload is cognitive offloading which refers to the externalization of cognitive processes such as reminder setting instead of memorizing. There appears to be remarkable variance in offloading behavior between…

Descriptors: Individual Differences, Task Analysis, Reliability, Short Term Memory

Reliability of the Reflective Learning Framework for Assessing Higher-Order Thinking in Geography and Sustainability Courses

Peer reviewed

Whalen, Kate; Paez, Antonio – Journal of Geography, 2022

Experiential education partnered with guided reflection is thought to support students with higher-order thinking skills. In this study, 44 reflections from two university-level sustainability courses were compared. In both courses students were asked to write a reflection, but only one course used the Reflective Learning Framework (RLF). Tests of…

Descriptors: Geography Instruction, Thinking Skills, Experiential Learning, Sustainability

Source

Educational and Psychological…	64
ProQuest LLC	59
Journal of Speech, Language,…	31
Online Submission	27
Journal of Educational…	22
Language Testing	21
Measurement in Physical…	21
ETS Research Report Series	17
Journal of Autism and…	16
Journal of Psychoeducational…	16
Educational Research and…	15
Assessment & Evaluation in…	14
Measurement and Evaluation in…	14
Psychology in the Schools	14
Journal of Consulting and…	12
International Education…	11
Journal of Education and…	11
Psychological Assessment	11
Research in Developmental…	11
Applied Measurement in…	10
Applied Psychological…	10
Educational Sciences: Theory…	10
Advances in Health Sciences…	9
Assessment in Education:…	9
Psychometrika	9

Author

Reckase, Mark D.	6
Attali, Yigal	5
Coniam, David	5
Brennan, Robert L.	4
Crehan, Kevin D.	4
Feldt, Leonard S.	4
Hakstian, A. Ralph	4
Jones, Ian	4
Kolen, Michael J.	4
Lunz, Mary E.	4
August, Diane	3
Bashaw, W. L.	3
Bennett, Randy Elliot	3
Benson, Jeri	3
Betz, Nancy E.	3
Ebel, Robert L.	3
Fletcher, Jack M.	3
Francis, David J.	3
Frisbie, David A.	3
Haberman, Shelby	3
Haladyna, Tom	3
Hambleton, Ronald K.	3
Henk, William A.	3
Iwata, Brian A.	3

Publication Type

Journal Articles	1365
Reports - Research	1333
Reports - Evaluative	286
Speeches/Meeting Papers	165
Tests/Questionnaires	81
Reports - Descriptive	63
Dissertations/Theses -…	61
Information Analyses	55
Opinion Papers	30
Numerical/Quantitative Data	19
Collected Works - General	8
Books	7
Collected Works - Proceedings	5
Guides - Non-Classroom	5
Book/Product Reviews	4
Dissertations/Theses -…	4
Collected Works - Serials	3
Guides - General	2
Collected Works - Serial	1
Dissertations/Theses	1
Guides - Classroom - Teacher	1
Historical Materials	1
Non-Print Media	1
Reference Materials -…	1
Reference Materials - General	1

Assessments and Surveys

Wechsler Intelligence Scale…	16
Peabody Picture Vocabulary…	13
Woodcock Johnson Tests of…	11
SAT (College Admission Test)	10
Test of English as a Foreign…	10
Wechsler Adult Intelligence…	10
Program for International…	9
Minnesota Multiphasic…	8
National Assessment of…	8
Torrance Tests of Creative…	7
Trends in International…	7
Wide Range Achievement Test	7
Autism Diagnostic Observation…	6
ACT Assessment	5
Raven Progressive Matrices	5
Self Directed Search	5
Center for Epidemiologic…	4
Dynamic Indicators of Basic…	4
Early Childhood Environment…	4
General Educational…	4
Graduate Record Examinations	4
Iowa Tests of Basic Skills	4
Metropolitan Achievement Tests	4
Rosenberg Self Esteem Scale	4
Social Skills Rating System	4