Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to assure a fair assessment. Different approaches exist for estimating this relationship, which rely either on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Gökhan Iskifoglu – Turkish Online Journal of Educational Technology - TOJET, 2024
This research paper investigated the importance of conducting measurement invariance analysis in developing measurement tools for assessing differences between and among study variables. Most of the studies that aimed to develop an inventory to assess the existence of an attitude, behavior, belief, IQ, or intuition in a person's…
Descriptors: Testing, Testing Problems, Error of Measurement, Attitude Measures
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Lockwood, Adam B.; Klatka, Kelsey; Parker, Brandon; Benson, Nicholas – Journal of Psychoeducational Assessment, 2023
Eighty Woodcock-Johnson IV Tests of Achievement protocols from 40 test administrators were examined to determine the types and frequencies of administration and scoring errors made. Non-critical errors (e.g., failure to record verbatim) were found on every protocol (M = 37.2). Critical (e.g., standard score, start point) errors were found on 98.8%…
Descriptors: Achievement Tests, Testing, Scoring, Error of Measurement
Jeff Allen; Ty Cruce – ACT Education Corp., 2025
This report summarizes some of the evidence supporting interpretations of scores from the enhanced ACT, focusing on reliability, concurrent validity, predictive validity, and score comparability. The authors argue that the evidence presented in this report supports the interpretation of scores from the enhanced ACT as measures of high school…
Descriptors: College Entrance Examinations, Testing, Change, Scores
Lotfi Simon Kerzabi – ProQuest LLC, 2021
Monte Carlo methods are an accepted methodology for generating critical values for a maximum test. The same methods are also applicable to evaluating the robustness of the newly created test. A table of critical values was created, and the robustness of the new maximum test was evaluated for five different distributions. Robustness…
Descriptors: Data, Monte Carlo Methods, Testing, Evaluation Research
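The Monte Carlo approach described above — simulate the test statistic under the null hypothesis many times and take a quantile of the simulated distribution as the critical value — can be sketched briefly. This is an illustrative example only: the specific "maximum test" statistic here (the largest pairwise standardized mean difference among groups) is our assumption, not necessarily the statistic studied in the dissertation.

```python
import numpy as np

def mc_critical_value(n_per_group=30, n_groups=3, alpha=0.05,
                      n_sim=5000, seed=0):
    """Estimate the critical value of a maximum-type test statistic
    (here: largest pairwise standardized mean difference) by Monte
    Carlo simulation under the null of equal group means."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_sim)
    for i in range(n_sim):
        # draw all groups from the same null distribution
        groups = rng.standard_normal((n_groups, n_per_group))
        means = groups.mean(axis=1)
        se = groups.std(axis=1, ddof=1) / np.sqrt(n_per_group)
        # maximum over all pairwise standardized mean differences
        diffs = [abs(means[j] - means[k]) / np.sqrt(se[j]**2 + se[k]**2)
                 for j in range(n_groups) for k in range(j + 1, n_groups)]
        stats[i] = max(diffs)
    # the (1 - alpha) quantile of the null distribution is the critical value
    return np.quantile(stats, 1 - alpha)
```

Replacing `standard_normal` with draws from other distributions is exactly how robustness to non-normality would be probed.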
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In the NEAT design, Kernel post-stratification and chain equating methods using optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use served as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
Yousuf, Mustafa S.; Miles, Katherine; Harvey, Heather; Al-Tamimi, Mohammad; Badran, Darwish – Journal of University Teaching and Learning Practice, 2022
Exams should be valid, reliable, and discriminative. Multiple informative methods are used for exam analysis; however, purely numerical displays of the results may not be easily comprehended, and graphical analysis tools can make them easier to interpret. Two such methods were employed: standardized x-bar control charts with…
Descriptors: Multiple Choice Tests, Testing, Test Reliability, Test Validity
Li, Minzi; Zhang, Xian – Language Testing, 2021
This meta-analysis explores the correlation between self-assessment (SA) and language performance. Sixty-seven studies with 97 independent samples involving more than 68,500 participants were included in our analysis. It was found that the overall correlation between SA and language performance was 0.466 (p < 0.01). Moderator analysis was…
Descriptors: Meta Analysis, Self Evaluation (Individuals), Likert Scales, Research Reports
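Pooling a correlation such as the 0.466 reported above across many samples is typically done on Fisher's z scale, weighting each study by the inverse variance of its transformed correlation. The sketch below shows that standard meta-analytic step; it is illustrative and not the authors' exact model (which also included moderator analysis).

```python
import math

def pooled_correlation(correlations, sample_sizes):
    """Sample-size-weighted mean correlation via Fisher's z transform.
    Each study's weight is n - 3, the inverse variance of its z value."""
    zs = [math.atanh(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)  # back-transform from z to r
```

Working on the z scale matters because correlations are bounded and their sampling distribution is skewed; averaging raw r values would bias the pooled estimate.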
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
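The Mantel-Haenszel procedure named above screens an item for DIF by forming a 2x2 table (group x correct/incorrect) at each matched score level and combining the tables into a common odds ratio. A minimal sketch of that combining step, assuming each stratum is given as a tuple of counts (our layout, not a standard API):

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across score-level strata.
    Each stratum is (ref_correct, ref_wrong, focal_correct, focal_wrong).
    A value near 1 suggests no DIF; illustrative sketch only."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n  # concordant term, weighted by stratum size
        den += b * c / n  # discordant term
    return num / den
```

In practice the ratio is often re-expressed on the ETS delta scale as -2.35 times its natural log, so that 0 indicates no DIF and the sign shows which group is favored.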