ERIC - Search Results

Publication Date

In 2026	3
Since 2025	666
Since 2022 (last 5 years)	666
Since 2017 (last 10 years)	666
Since 2007 (last 20 years)	666

Descriptor

Test Reliability	462
Test Validity	382
Foreign Countries	376
Test Construction	204
Reliability	166
Psychometrics	125
Measures (Individuals)	124
Factor Analysis	104
Artificial Intelligence	84
Student Attitudes	72
Factor Structure	69
College Students	67
Teacher Attitudes	62
Undergraduate Students	62
Scores	59
Evaluation Methods	58
Interrater Reliability	58
Test Items	58
Questionnaires	57
Validity	57
Elementary School Students	53
Technology Uses in Education	49
Gender Differences	46
English (Second Language)	43
High School Students	42

Publication Type

Journal Articles	645
Reports - Research	618
Tests/Questionnaires	69
Information Analyses	30
Reports - Descriptive	13
Reports - Evaluative	13
Speeches/Meeting Papers	5
Books	2
Non-Print Media	2
Guides - Non-Classroom	1
Numerical/Quantitative Data	1

Education Level

Higher Education	222
Postsecondary Education	222
Secondary Education	125
Elementary Education	94
High Schools	58
Middle Schools	42
Early Childhood Education	38
Junior High Schools	29
Intermediate Grades	22
Elementary Secondary Education	21
Primary Education	20
Grade 4	11
Grade 5	11
Preschool Education	11
Grade 2	8
Grade 11	7
Kindergarten	7
Adult Education	6
Grade 3	6
Grade 6	6
Grade 1	5
Grade 7	5
Grade 10	3
Grade 12	3
Grade 8	2

Audience

Researchers	5
Policymakers	4
Practitioners	4
Teachers	3
Administrators	1
Counselors	1
Support Staff	1

Location

Turkey	82
China	43
Indonesia	35
Spain	14
Taiwan	14
Germany	11
Iran	10
South Korea	10
United Kingdom	10
Canada	9
India	9
Saudi Arabia	9
Thailand	9
United States	9
Australia	7
Jordan	7
Malaysia	7
Philippines	7
Austria	5
Brazil	5
Japan	5
Netherlands	5
Portugal	5
Switzerland	5
Belgium	4

Laws, Policies, & Programs

Elementary and Secondary…	1
Elementary and Secondary…	1
Every Student Succeeds Act…	1
Higher Education Act Title IV	1

What Works Clearinghouse Rating

Active filter: In 2025

Showing 1 to 15 of 666 results

Technical Adequacy-Reliability

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2025

The author provides information about reliability and areas that educators should examine in determining if an assessment is consistent and trustworthy for use, and how it should be interpreted in making decisions about students. Reliability areas that are discussed in the column include internal consistency, test-retest or stability, inter-scorer…

Descriptors: Test Reliability, Academically Gifted, Student Evaluation, Error of Measurement

Test-Retest and Inter-Rater Reliability for Selected Outcomes from a Wearable 3D Inertial Sensor over Different Stable and Unstable Postural Conditions: A Validation Study

Peer reviewed

Samuel D'Emanuele; Francesca Nardello; Fabrizio Garau; Diego Campaci; Federico Schena; Cantor Tarperi – Measurement in Physical Education and Exercise Science, 2025

The agreement between a wearable inertial sensor (GYKO, G) and the force platform (P) was assessed by evaluating "test-retest" and "inter-rater reliability." Thirty-eight subjects were enrolled; the selected indices of balance were investigated over foot positions and (un)stable conditions. Intraclass correlation coefficient…

Descriptors: Human Posture, Measurement Equipment, Interrater Reliability, Measurement Techniques

Validity and Reliability of the Stuttering Severity Instrument--Fourth Edition for School-Aged Children and Adult Arabic-Speaking People Who Stutter

Peer reviewed

Mazin T. Alqhazo; Tha’er Al-Kadi; Firas S. Alfwaress – Language, Speech, and Hearing Services in Schools, 2025

Purpose: The Stuttering Severity Instrument--Fourth Edition (SSI-4) is unavailable in the Arabic language. The purpose of the current research is to translate the SSI-4 (Riley, 2009) into Arabic and to discuss its validity, as well as its intrajudge and interjudge reliability. Method: Archived videos of 28 school-aged children who stutter ranged in…

Descriptors: Arabic, Translation, Test Validity, Test Reliability

How Consistent Are Humans When Grading Programming Assignments?

Peer reviewed

Marcus Messer; Neil C. C. Brown; Michael Kölling; Miaojing Shi – ACM Transactions on Computing Education, 2025

Providing consistent summative assessment to students is important, as the grades they are awarded affect their progression through university and future career prospects. While small cohorts are typically assessed by a single assessor, such as the module/class leader, larger cohorts are often assessed by multiple assessors, typically teaching…

Descriptors: Foreign Countries, Grading, Interrater Reliability, Teaching Assistants

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Reporting and Measuring English School Qualifications: A Case Study of General Certificate of Secondary Education Results in Survey and Linked Administrative Data in the UK Millennium Cohort Study

Peer reviewed

Sarah Stopforth; Roxanne Connelly; Vernon Gayle – Cambridge Journal of Education, 2025

Data on educational qualifications is essential in many research domains. The UK Millennium Cohort Study collected self-reported General Certificate of Secondary Education (GCSE) data in sweep 7 (cohort members aged 17). GCSE data from the National Pupil Database (NPD) has been linked to the MCS. This study investigates the consistency of these…

Descriptors: Foreign Countries, Adolescents, Case Studies, Secondary Education

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Engaging Classroom Observation: A Brief Measure of Active Learning in the College Classroom

Peer reviewed

Chase Young; Benjamin Mitchell-Yellin; George Kevin Randall – Active Learning in Higher Education, 2025

The purpose of this study was to develop a valid, reliable, and brief measure of active learning in college classrooms that is cheap and easy to complete and yields results that faculty can easily use to inform their development as instructors. Initial construct and face validity was achieved by modifying existing instruments and creating a draft…

Descriptors: College Faculty, College Students, Active Learning, Classroom Observation Techniques

Validity and Intrarater Reliability of the Fysiometer--Measuring Eccentric Knee Flexor Force during the Nordic Hamstring Exercise

Peer reviewed

Morten Pallisgaard Støve; Mathias Kringelholt Kristensen; Jonas Nielsen; Lea Dyhrberg Madsen – Measurement in Physical Education and Exercise Science, 2025

Between-limb strength asymmetry is a leading risk factor for hamstring strain re-injury. However, few accurate testing methodologies are available in clinical settings. This study examined the validity and reliability of eccentric knee flexor torque measured with a novel Nordic Hamstring Device. Twenty-seven healthy participants were assessed in…

Descriptors: Validity, Reliability, Human Body, Foreign Countries

The Vague Language Use Scale: Clinical Utility and Psychometrics from Adults with Traumatic Brain Injury

Peer reviewed

Kathryn J. Greenslade; Julia K. Bushell; Emily F. Dillon; Amy E. Ramage – International Journal of Language & Communication Disorders, 2025

Background: Pragmatic communication difficulties encompass many distinct behaviours, including the use of vague and/or insufficient language, a common characteristic following traumatic brain injury (TBI) that negatively impacts psychosocial outcomes. Existing assessments evaluate pragmatic communication broadly, often with only one or two items…

Descriptors: Neurological Impairments, Head Injuries, Language Impairments, Language Tests

Treatment Fidelity in a Feasibility Trial of the Aphasia Intervention, Virtual Elaborated Semantic Feature Analysis

Peer reviewed

Niamh Devane; Sofia Mazzoleni; Nicholas Behn; Jane Marshall; Stephanie Wilson; Katerina Hilari – International Journal of Language & Communication Disorders, 2025

Background and Aims: The reliability and validity of an intervention can be improved by checking treatment fidelity (TF). TF methods identify core components of an intervention, check their presence (or absence) and identify threats to fidelity. The Virtual Elaborated Semantic Feature Analysis (VESFA) intervention comprised individual sessions of…

Descriptors: Aphasia, Intervention, Fidelity, Feasibility Studies

Mixed Model Generalizability Theory: A Case Study and Tutorial

Peer reviewed
PDF on ERIC

Alan Huebner; Gustaf B. Skar; Mengchen Huang – Practical Assessment, Research & Evaluation, 2025

Generalizability theory is a modern and powerful framework for conducting reliability analyses. It is flexible to accommodate both random and fixed facets. However, there has been a relative scarcity in the practical literature on how to handle the fixed facet case. This article aims to provide practitioners a conceptual understanding and…

Descriptors: Generalizability Theory, Multivariate Analysis, Statistical Analysis, Writing Evaluation

Evidence-Based Evaluation of Student and Marker Performances in Assessment and Examination

Peer reviewed

Ole J. Kemi – Advances in Physiology Education, 2025

Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…

Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards

The Scale of Sincerity Based on Kyai Haji Ahmad Dahlan's Version for Islamic Students: The Rasch Analysis

Peer reviewed
PDF on ERIC

Wahyu Nanda Eka Saputra; Trikinasih Handayani; Prima Suci Rohmadheny; Rohmatus Naini; Dody Hartanto; Hardi Santosa; Dewi Afra Khairunnisa; Risma Risansyah; Hanan Riati; Faturrahman – Journal of Education and Learning (EduLearn), 2025

The students are urged to do something without expecting anything in return and only in the name of God. Every Islamic student becomes something ideal if they can internalize and implement sincerity. Many people are willing to do something because of an ulterior motive. The importance of sincerity in humans is the background for developing a…

Descriptors: Islam, Interrater Reliability, Prosocial Behavior, Muslims

Reliability and Validity of the Self-Report Version of the Strengths and Difficulties Questionnaire (SDQ) in Primary School Children

Peer reviewed

Katharina Liegmann; Lisa Fischer; Kevin Dadaczynski; Reiner Hanewinkel; Frauke Nees; Matthis Morgenstern – International Journal of Behavioral Development, 2025

This study examined the new self-report version of the Strengths and Difficulties Questionnaire (SDQ-S), SDQ-Kids, in primary school children regarding internal consistency, teacher-child agreement, and validity. Data from 2,655 children in Grades 1 to 3 and their teachers were analyzed. Children completed SDQ-Kids, previously piloted (n = 896),…

Descriptors: Questionnaires, Behavior Problems, Screening Tests, Child Behavior

Source

Education and Information…	28
Psychology in the Schools	25
International Journal of…	22
Journal of Psychoeducational…	21
International Journal of…	18
SAGE Open	18
European Journal of Education	14
Journal of Autism and…	14
Educational Process:…	13
Measurement in Physical…	13
Journal of Education and…	10
Journal of Educational…	10
Journal of Baltic Science…	9
Journal of Biological…	9
Journal of Computer Assisted…	7
Assessment & Evaluation in…	6
Autism: The International…	6
Discover Education	6
International Journal of…	6
International Journal of…	6
Language Testing	6
Physical Review Physics…	6
Anatomical Sciences Education	5
Journal of Applied Research…	5
Journal of Education and…	5

Author

Mohamad Ahmad Saleem Khasawneh	3
Mohammad Nayef Ayasrah	3
Adam B. Wilson	2
Ayoub Hamdan Al-Rousan	2
Benjamin W. Domingue	2
Bomna Ko	2
Cathy Creswell	2
Deanne K. Unruh	2
Eli Rohaeti	2
Elizabeth Pellicano	2
Filiz Arzu Yalin	2
Gaofeng Li	2
Hamdollah Ravand	2
Hillman Wirawan	2
Hongwei Yang	2
Insook Kim	2
Joshua B. Gilbert	2
Juan Cruz	2
Kyle Reardon	2
Li Wang	2
Limin Wang	2
Mariola Moeyaert	2
Mark J. Gierl	2
Mei-ki Chan	2
Muhammad Saefi	2

Assessments and Surveys

Strengths and Difficulties…	5
Autism Diagnostic Observation…	4
Program for International…	4
Classroom Assessment Scoring…	3
Test of English as a Foreign…	3
Vineland Adaptive Behavior…	3
ACT Assessment	2
Beck Depression Inventory	2
Child Behavior Checklist	2
Childhood Autism Rating Scale	2
Depression Anxiety and Stress…	2
International English…	2
Maslach Burnout Inventory	2
Mullen Scales of Early…	2
Satisfaction With Life Scale	2
Social Skills Improvement…	2
Teachers Sense of Efficacy…	2
UCLA Loneliness Scale	2
Wechsler Intelligence Scale…	2
ACTFL Oral Proficiency…	1
Aberrant Behavior Checklist	1
Academic Motivation Scale	1
Adaptive Behavior Scale	1
Ages and Stages Questionnaires	1
Armed Forces Qualification…	1