ERIC - Search Results

Publication Date

In 2026	0
Since 2025	2
Since 2022 (last 5 years)	7
Since 2017 (last 10 years)	27
Since 2007 (last 20 years)	79

Descriptor

Comparative Analysis	119
Models	119
Reliability	64
Test Reliability	43
Foreign Countries	26
Scores	20
Validity	20
Correlation	18
Evaluation Methods	18
Statistical Analysis	17
Interrater Reliability	16
Test Validity	16
Data Analysis	13
Item Response Theory	13
Scoring	11
Teaching Methods	11
Accuracy	10
Factor Analysis	10
Predictor Variables	10
Student Attitudes	10
Computer Assisted Testing	9
Measures (Individuals)	9
Test Items	9
Academic Achievement	8
Computation	8

Publication Type

Reports - Research	81
Journal Articles	78
Reports - Evaluative	14
Speeches/Meeting Papers	13
Dissertations/Theses -…	8
Reports - Descriptive	7
Information Analyses	4
Tests/Questionnaires	3
Collected Works - Proceedings	2
Opinion Papers	2
Collected Works - General	1
Dissertations/Theses -…	1
Reference Materials -…	1

Education Level

Higher Education	16
Postsecondary Education	15
Secondary Education	11
High Schools	8
Elementary Education	5
Elementary Secondary Education	5
Early Childhood Education	4
Grade 12	4
Middle Schools	4
Grade 8	3
Grade 6	2
Grade 7	2
Junior High Schools	2
Preschool Education	2
Adult Basic Education	1
Adult Education	1
Grade 10	1
Grade 11	1
Grade 5	1
Grade 9	1
Kindergarten	1
Primary Education	1

Location

Australia	4
United Kingdom (England)	4
China	3
Connecticut	3
New York	3
Philippines	3
Turkey	3
Canada	2
Egypt	2
Germany	2
Malaysia	2
Netherlands	2
New Hampshire	2
Rhode Island	2
Singapore	2
Texas	2
United States	2
Vermont	2
Asia	1
Brazil	1
California	1
Denmark	1
Estonia	1
Florida	1
France	1

Laws, Policies, & Programs

Every Student Succeeds Act…	2
No Child Left Behind Act 2001	1

Assessments and Surveys

National Assessment of…	2
New York State Regents…	2
ACT Assessment	1
Multigroup Ethnic Identity…	1
Parental Authority…	1
Praxis Series	1
Program for International…	1
Self Description Questionnaire	1
Self Perception Profile for…	1
Teacher Efficacy Scale	1
Test of English as a Foreign…	1
Wechsler Intelligence Scale…	1
Wide Range Achievement Test	1
Woodcock Johnson Tests of…	1

Showing 1 to 15 of 119 results

Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles

Peer reviewed

Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025

Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…

Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes

Coherence-Based Automatic Short Answer Scoring Using Sentence Embedding

Peer reviewed

Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024

Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…

Descriptors: Scoring, Essays, Writing Evaluation, Memory

The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues

Peer reviewed
PDF on ERIC

Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022

How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…

Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making

Simulating the Relationship between Nonword Repetition Performance and Vocabulary Growth in 2-Year-Olds: Evidence from the Language 0-5 Project

Peer reviewed

Caroline F. Rowland; Amy Bidgood; Gary Jones; Andrew Jessop; Paula Stinson; Julian M. Pine; Samantha Durrant; Michelle S. Peter – Language Learning, 2025

A strong predictor of children's language is performance on non-word repetition (NWR) tasks. However, the basis of this relationship remains unknown. Some suggest that NWR tasks measure phonological working memory, which then affects language growth. Others argue that children's knowledge of language/language experience affects NWR performance. A…

Descriptors: Vocabulary Development, Comparative Analysis, Computational Linguistics, Language Skills

Analytic or Holistic: A Study of Agreement between Different Grading Models

Peer reviewed
PDF on ERIC

Jönsson, Anders; Balan, Andreia – Practical Assessment, Research & Evaluation, 2018

Research on teachers' grading has shown that there is great variability among teachers regarding both the process and product of grading, resulting in low comparability and issues of inequality when using grades for selection purposes. Despite this situation, not much is known about the merits or disadvantages of different models for grading. In…

Descriptors: Grading, Models, Reliability, Validity

Evaluating Large Language Models in Analysing Classroom Dialogue

Peer reviewed

Yun Long; Haifeng Luo; Yu Zhang – npj Science of Learning, 2024

This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue--a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using…

Descriptors: Classroom Communication, Computational Linguistics, Chinese, Mathematics Instruction

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

A Meta-Analysis on the Reliability of Comparative Judgement

Peer reviewed

Verhavert, San; Bouwer, Renske; Donche, Vincent; De Maeyer, Sven – Assessment in Education: Principles, Policy & Practice, 2019

Comparative Judgement (CJ) aims to improve the quality of performance-based assessments by letting multiple assessors judge pairs of performances. CJ is generally associated with high levels of reliability, but there is also a large variation in reliability between assessments. This study investigates which assessment characteristics influence the…

Descriptors: Meta Analysis, Reliability, Comparative Analysis, Value Judgment

Performance of Model-Based Network Meta-Analysis (MBNMA) of Time-Course Relationships: A Simulation Study

Peer reviewed

Pedder, Hugo; Boucher, Martin; Dias, Sofia; Bennetts, Margherita; Welton, Nicky J. – Research Synthesis Methods, 2020

Time-course model-based network meta-analysis (MBNMA) has been proposed as a framework to combine treatment comparisons from a network of randomized controlled trials reporting outcomes at multiple time-points. This can explain heterogeneity/inconsistency that arises by pooling studies with different follow-up times and allow inclusion of studies…

Descriptors: Simulation, Randomized Controlled Trials, Meta Analysis, Comparative Analysis

Reliably Assessing Growth with Longitudinal Diagnostic Classification Models

Peer reviewed

Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019

Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…

Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests

Metrics for Discrete Student Models: Chance Levels, Comparisons, and Use Cases

Peer reviewed
PDF on ERIC

Bosch, Nigel; Paquette, Luc – Journal of Learning Analytics, 2018

Metrics including Cohen's kappa, precision, recall, and F1 are common measures of performance for models of discrete student states, such as a student's affect or behaviour. This study examined discrete model metrics for previously published student model examples to identify situations where metrics provided differing perspectives on…

Descriptors: Models, Comparative Analysis, Prediction, Probability

Disentangling Objective Characteristics of Learning Situations from Subjective Perceptions Thereof, Using an Experience Sampling Method Design

Peer reviewed
PDF on ERIC

Moeller, Julia; Viljaranta, Jaana; Kracke, Bärbel; Dietrich, Julia – Frontline Learning Research, 2020

This article proposes a study design developed to disentangle the objective characteristics of a learning situation from individuals' subjective perceptions of that situation. The term objective characteristics refers to the agreement across students, whereas subjective perceptions refers to inter-individual heterogeneity. We describe a novel…

Descriptors: Student Attitudes, College Students, Lecture Method, Student Interests

Automated Generation of Node-splitting Models for Assessment of Inconsistency in Network Meta-analysis

Peer reviewed

van Valkenhoef, Gert; Dias, Sofia; Ades, A. E.; Welton, Nicky J. – Research Synthesis Methods, 2016

Network meta-analysis enables the simultaneous synthesis of a network of clinical trials comparing any number of treatments. Potential inconsistencies between estimates of relative treatment effects are an important concern, and several methods to detect inconsistency have been proposed. This paper is concerned with the node-splitting approach,…

Descriptors: Networks, Meta Analysis, Automation, Models

Age of Exposure 2.0: Estimating Word Complexity Using Iterative Models of Word Embeddings

Peer reviewed
PDF on ERIC

Botarleanu, Robert-Mihai; Dascalu, Mihai; Watanabe, Micah; Crossley, Scott Andrew; McNamara, Danielle S. – Grantee Submission, 2022

Age of acquisition (AoA) is a measure of word complexity which refers to the age at which a word is typically learned. AoA measures have shown strong correlations with reading comprehension, lexical decision times, and writing quality. AoA scores based on both adult and child data have limitations that allow for error in measurement, and increase…

Descriptors: Age Differences, Vocabulary Development, Correlation, Reading Comprehension

Same Test, Better Scores: Boosting the Reliability of Short Online Intelligence Recruitment Tests with Nested Logit Item Response Theory Models

Peer reviewed
PDF on ERIC

Storme, Martin; Myszkowski, Nils; Baron, Simon; Bernard, David – Journal of Intelligence, 2019

Assessing job applicants' general mental ability online poses psychometric challenges due to the necessity of having brief but accurate tests. Recent research (Myszkowski & Storme, 2018) suggests that recovering distractor information through Nested Logit Models (NLM; Suh & Bolt, 2010) increases the reliability of ability estimates in…

Descriptors: Intelligence Tests, Item Response Theory, Comparative Analysis, Test Reliability

Source

ProQuest LLC	8
ETS Research Report Series	7
Journal of Educational…	4
Asia Pacific Education Review	2
Assessment & Evaluation in…	2
Assessment in Education:…	2
Grantee Submission	2
International Educational…	2
Journal of Baltic Science…	2
Language Testing	2
Measurement in Physical…	2
Online Submission	2
Research Quarterly for…	2
Research Synthesis Methods	2
Advances in Health Sciences…	1
American Journal of Distance…	1
Annals of Dyslexia	1
Applied Psychological…	1
Behavioral Disorders	1
Child Welfare	1
Clinical Linguistics &…	1
Council of Chief State School…	1
Counseling Psychologist	1
Developmental Psychology	1
Developmental Review	1

Author

Darling-Hammond, Linda	2
Dias, Sofia	2
Haberman, Shelby J.	2
Hansen, Duncan N.	2
Lubiano, Michael Leonard D.	2
Magpantay, Marife S.	2
Mandeville, Garrett K.	2
Stallings, Jane A.	2
Welton, Nicky J.	2
Abdel-Haq, Eman Muhammad	1
Adams, R. J.	1
Ades, A. E.	1
Al-Sayed, Rania Kamal Muhammad	1
Alamprese, Judith A.	1
Ali, Mahsoub Abdel-Sadeq	1
Amy Bidgood	1
Andrew Jessop	1
Armstrong, Helen D.	1
Arslan, Fethi	1
Aspiranti, Kathleen B.	1
Attali, Yigal	1
August, Diane	1
Babette Moeller	1
Bain, Sherry K.	1