ERIC - Search Results

Publication Date

In 2025	5
Since 2024	10
Since 2021 (last 5 years)	32
Since 2016 (last 10 years)	60
Since 2006 (last 20 years)	93

Descriptor

Decision Making	132
Scoring	132
Evaluation Methods	28
Foreign Countries	27
Scores	26
Student Evaluation	25
Test Validity	23
Comparative Analysis	20
Evaluators	20
Test Construction	19
Second Language Learning	17
English (Second Language)	15
Test Reliability	14
Interrater Reliability	13
Models	13
Correlation	12
Performance Based Assessment	12
Teaching Methods	12
Computer Assisted Testing	11
Data Collection	11
Elementary Secondary Education	11
Higher Education	11
Item Analysis	11
Knowledge Level	11
Language Tests	11

Publication Type

Journal Articles	81
Reports - Research	66
Reports - Evaluative	21
Guides - Non-Classroom	14
Reports - Descriptive	13
Speeches/Meeting Papers	13
Tests/Questionnaires	11
Dissertations/Theses -…	7
Books	4
Collected Works - General	3
Information Analyses	3
Collected Works - Proceedings	2
Book/Product Reviews	1
Dissertations/Theses -…	1
Guides - Classroom - Teacher	1
Numerical/Quantitative Data	1
Opinion Papers	1

Education Level

Higher Education	20
Postsecondary Education	17
Elementary Education	12
Secondary Education	10
Elementary Secondary Education	6
Middle Schools	6
High Schools	5
Junior High Schools	5
Early Childhood Education	3
Intermediate Grades	3
Grade 4	2
Grade 6	2
Grade 7	2
Grade 8	2
Grade 9	2
Preschool Education	2
Adult Education	1
Grade 10	1
Grade 11	1
Grade 12	1
Grade 5	1

Audience

Practitioners	4
Teachers	4
Administrators	1
Counselors	1
Support Staff	1

Location

Rhode Island	7
Pennsylvania	4
Canada	3
China	3
United Kingdom	3
Australia	2
Europe	2
Japan	2
Turkey	2
Afghanistan	1
Austria	1
Cyprus	1
Finland	1
Germany	1
Hong Kong	1
Illinois	1
Illinois (Chicago)	1
India	1
Indonesia	1
Lithuania	1
Louisiana	1
Malaysia	1
New Zealand	1
North Carolina (Greensboro)	1
Norway	1

Laws, Policies, & Programs

Individuals with Disabilities…	2
Family Educational Rights and…	1
Health Insurance Portability…	1
Individuals with Disabilities…	1
Individuals with Disabilities…	1

Assessments and Surveys

International English…	3
National Assessment of…	3
Test of English as a Foreign…	3
Dynamic Indicators of Basic…	1
Early Childhood Environment…	1
Graduate Management Admission…	1
Kaufman Assessment Battery…	1
Peabody Picture Vocabulary…	1
SAT (College Admission Test)	1
Strengths and Difficulties…	1
Systematic Screening for…	1
United States Medical…	1
Wechsler Individual…	1
Woodcock Johnson Tests of…	1

Showing 1 to 15 of 132 results

Evaluating Targeted Double Scoring for the Performance Assessment for School Leaders Using Imputation and Decision Theory. Research Report. ETS RR-23-01

Peer reviewed

Jing Miao; Sandip Sinharay; Chris Kelbaugh; Yi Cao; Wei Wang – ETS Research Report Series, 2023

In a targeted double-scoring procedure for performance assessments that are used for licensure and certification purposes, a subset of responses receives an independent second rating if the first rating falls into a preidentified critical score range (CSR) where an additional rating would lead to considerably more reliable pass-fail decisions.…

Descriptors: Scoring, Performance Based Assessment, Licensing Examinations (Professions), Certification

The Sensitivity of Value-Added Estimates to Test Scoring Decisions. EdWorkingPaper No. 25-1226

Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…

Descriptors: Value Added Models, Tests, Testing, Scoring

Evaluating the Consistency and Reliability of Attribution Methods in Automated Short Answer Grading (ASAG) Systems: Toward an Explainable Scoring System

Peer reviewed

Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025

In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…

Descriptors: Automation, Grading, Computer Assisted Testing, Scoring

Reconceptualization of Test Fairness Model: A Grounded Theory Approach

Peer reviewed

Beheshti, Shima; Safa, Mohammad Ahmadi – Iranian Journal of Language Teaching Research, 2023

The indefinite nature of test fairness and different interpretations and definitions of the concept have stirred a lot of controversy over the years, necessitating the reconceptualization of the concept. On this basis, this study aimed to explore the empirical validity of Kunnan's (2008) Test Fairness Framework (TFF) and revisit the established…

Descriptors: Test Bias, Equal Education, Grounded Theory, Test Construction

Assessment in Practice: Achieving Joint Decisions in Oral Examination Grading Conversations

Peer reviewed

Marit Skarbø Solem; Anne Marie Dalby Landmark; Elizabeth Stokoe; Karianne Skovholt – Scandinavian Journal of Educational Research, 2024

How do examiners reach joint decisions when they grade oral examinations? While government and policymakers provide general frameworks about grading decisions, we know little about how they are actually accomplished in interaction, particularly when examiners initially disagree. We scrutinized 29 video-recorded grading conversations between…

Descriptors: Foreign Countries, Secondary School Teachers, Secondary Education, Speech Tests

Strengthening the Pennsylvania School Climate Survey to Inform School Decisionmaking. REL 2024-006

Peer reviewed

Alyson Burnett; Katlyn Lee Milless; Michelle Bennett; Whitney Kozakowski; Sonia Alves; Christine Ross – Regional Educational Laboratory Mid-Atlantic, 2024

This study analyzed Pennsylvania School Climate Survey data from students and staff in the 2021/22 school year to assess the validity and reliability of the elementary school student version of the survey; approaches to scoring the survey in individual schools at all grade levels; and perceptions of school climate across student, staff, and school…

Descriptors: Educational Environment, Decision Making, Surveys, Validity

Scoring Difficulty in Summary Writing Assessment: Toward the Reconstruction of Analytic Rubric

Peer reviewed

Makiko Kato – Journal of Education and Learning, 2025

This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…

Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)

Assessing the Ethical Capabilities of Chat GPT in Healthcare: A Study on Its Proficiency in Situational Judgement Test

Peer reviewed

Kunal Sareen – Innovations in Education and Teaching International, 2024

This study examines the proficiency of Chat GPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…

Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Strengthening the Pennsylvania School Climate Survey to Inform School Decisionmaking. Appendixes. REL 2024-006

Peer reviewed

Regional Educational Laboratory Mid-Atlantic, 2024

These are the appendixes for the report, "Strengthening the Pennsylvania School Climate Survey to Inform School Decisionmaking." This study analyzed Pennsylvania School Climate Survey data from students and staff in the 2021/22 school year to assess the validity and reliability of the elementary school student version of the survey;…

Descriptors: Educational Environment, Surveys, Decision Making, School Personnel

Effects of Self-Scoring Their Math Problem Solutions on Primary School Students' Monitoring and Regulation

Peer reviewed

Oudman, Sophie; van de Pol, Janneke; van Gog, Tamara – Metacognition and Learning, 2022

Preparing students to become self-regulated learners has become an important goal of primary education. Therefore, it is important to investigate how we can improve self-monitoring and self-regulation accuracy in primary school students. Focusing on mathematics problems, we investigated whether and how (1) high- and low-performing students…

Descriptors: Metacognition, Elementary School Students, Mathematics Instruction, Problem Solving

Assessing Information Synthesis within and across Multiple Texts with Verification Tasks: A Signal Detection Theory Approach

Peer reviewed

Yukhymenko-Lescroart, Mariya A.; Goldman, Susan R.; Lawless, Kimberly A.; Pellegrino, James W.; Shanahan, Cynthia R. – Educational Psychology, 2022

To extend the existing research examining multiple text comprehension and its assessment, we developed a verification task approach to assessing the synthesis of information that was "explicitly" and "implicitly" presented "within" and across nine texts. A nonparametric form of signal detection theory was used to analyse the…

Descriptors: Task Analysis, Reading Comprehension, Middle School Students, Nonparametric Statistics

Decoding Student Insights: Analyzing Response Change in NAEP Mathematics Constructed Response Items

Peer reviewed

Congning Ni; Bhashithe Abeysinghe; Juanita Hicks – International Electronic Journal of Elementary Education, 2025

The National Assessment of Educational Progress (NAEP), often referred to as The Nation's Report Card, offers a window into the state of the U.S. K-12 education system. Since 2017, NAEP has transitioned to digital assessments, opening new research opportunities that were previously impossible. Process data tracks students' interactions with the…

Descriptors: Reaction Time, Multiple Choice Tests, Behavior Change, National Competency Tests

A Comparison of Methodologies for Scaling Longitudinal Social-Emotional Survey Responses

Peer reviewed

Soland, James; Kuhfeld, Megan; Register, Brennan – Educational Assessment, 2023

Much of what we know about how children develop is based on survey data. In order to estimate growth across time and, thereby, better understand that development, short survey scales are typically administered at repeated timepoints. Before estimating growth, those repeated measures must be put onto the same scale. Yet, little research examines…

Descriptors: Comparative Analysis, Social Emotional Learning, Scaling, Effect Size

Hanyu Shuiping Kaoshi (HSK): A Multi-Level, Multi-Purpose Proficiency Test

Peer reviewed

Peng, Yue; Yan, Wei; Cheng, Liying – Language Testing, 2021

This test review focuses on the current version (2009) of [Chinese characters omitted] (Hanyu Shuiping Kaoshi), literally translated as the Chinese Language Proficiency Test and abbreviated as HSK. Tailored to non-native speakers of the Chinese language, this test consists of six proficiency levels (Levels 1 and 2 as beginners, Levels 3 and 4 as…

Descriptors: Language Proficiency, Language Tests, Chinese, Decision Making

Source

ProQuest LLC	7
Rhode Island Department of…	7
Applied Measurement in…	5
Language Testing	4
Educational Measurement:…	3
Language Assessment Quarterly	3
Language Education &…	3
ETS Research Report Series	2
Educational Assessment	2
Educational and Psychological…	2
International Journal of…	2
Journal of Educational…	2
Journal of Personnel…	2
Language Testing in Asia	2
Metacognition and Learning	2
Reading and Writing: An…	2
Regional Educational…	2
Scandinavian Journal of…	2
AERA Online Paper Repository	1
Academic Medicine	1
Academy for Educational…	1
Accounting Education	1
Action in Teacher Education	1
Advances in Health Sciences…	1
Anatomical Sciences Education	1

Author

Cheng, Liying	2
Herman, Joan L.	2
Lunz, Mary E.	2
Oliveri, María Elena	2
Abbasi, Abbas	1
Allen, Abigail	1
Allwood, Carl Martin	1
Aloisi, Cesare	1
Alyson Burnett	1
Anne Marie Dalby Landmark	1
Aray, Henry	1
Aschbacher, Pamela R.	1
Attali, Yigal	1
Baker, Beverly Anne	1
Bakla, Arif	1
Barbot, Baptiste	1
Barkaoui, Khaled	1
Barnes, Tiffany, Ed.	1
Beard, Jonathan	1
Beaty, Roger E.	1
Beheshti, Shima	1
Benjamin W. Domingue	1
Bergstrom, Betty A.	1
Berson, Nancy	1