ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	12

Source

Educational Measurement:…

Publication Type

Journal Articles	12
Reports - Research	11
Information Analyses	1
Reports - Evaluative	1

Education Level

Middle Schools	5
Secondary Education	5
Junior High Schools	4
Elementary Education	3
Elementary Secondary Education	2
Grade 8	2
Early Childhood Education	1
Grade 3	1
Grade 4	1
Grade 5	1
Grade 6	1
Grade 7	1
Grade 9	1
High Schools	1
Higher Education	1
Intermediate Grades	1
Primary Education	1

Assessments and Surveys

Program for International…	1
Trends in International…	1

Showing all 12 results

Measurement Efficiency for Technology-Enhanced and Multiple-Choice Items in a K-12 Mathematics Accountability Assessment

Peer reviewed

Ersan, Ozge; Berry, Yufeng – Educational Measurement: Issues and Practice, 2023

The increasing use of computerization in the testing industry and the need for items potentially measuring higher-order skills have led educational measurement communities to develop technology-enhanced (TE) items and conduct validity studies on the use of TE items. Parallel to this goal, the purpose of this study was to collect validity evidence…

Descriptors: Computer Assisted Testing, Multiple Choice Tests, Elementary Secondary Education, Accountability

Examining Gender Differences in TIMSS 2019 Using a Multiple-Group Hierarchical Speed-Accuracy-Revisits Model

Peer reviewed

Dihao Leng; Ummugul Bezirhan; Lale Khorramdel; Bethany Fishbein; Matthias von Davier – Educational Measurement: Issues and Practice, 2024

This study capitalizes on response and process data from the computer-based TIMSS 2019 Problem Solving and Inquiry tasks to investigate gender differences in test-taking behaviors and their association with mathematics achievement at the eighth grade. Specifically, a recently proposed hierarchical speed-accuracy-revisits (SAR) model was adapted to…

Descriptors: Gender Differences, Test Wiseness, Achievement Tests, Mathematics Tests

A Cost-Benefit Analysis of Automatic Item Generation

Peer reviewed

Kosh, Audra E.; Simpson, Mary Ann; Bickel, Lisa; Kellogg, Mark; Sanford-Moore, Ellie – Educational Measurement: Issues and Practice, 2019

Automatic item generation (AIG)--a means of leveraging technology to create large quantities of items--requires a minimum number of items to offset the sizable upfront investment (i.e., model development and technology deployment) in order to achieve cost savings. In this cost-benefit analysis, we estimated the cost of each step of AIG and manual…

Descriptors: Cost Effectiveness, Automation, Test Items, Mathematics Tests

The Effect of Drag-and-Drop Item Features on Test-Taker Performance and Response Strategies

Peer reviewed

Arslan, Burcu; Jiang, Yang; Keehner, Madeleine; Gong, Tao; Katz, Irvin R.; Yan, Fred – Educational Measurement: Issues and Practice, 2020

Computer-based educational assessments often include items that involve drag-and-drop responses. There are different ways that drag-and-drop items can be laid out and different choices that test developers can make when designing these items. Currently, these decisions are based on experts' professional judgments and design constraints, rather…

Descriptors: Test Items, Computer Assisted Testing, Test Format, Decision Making

The Relationship between Item Developer Alignment of Items to Range Achievement-Level Descriptors and Item Difficulty: Implications for Validating Intended Score Interpretations

Peer reviewed

Schneider, M. Christina; Agrimson, Jared; Veazey, Mary – Educational Measurement: Issues and Practice, 2022

This paper presents results of a score interpretation study for a computer adaptive mathematics assessment. The study purpose was to test the efficacy of item developers' alignment of items to Range Achievement-Level Descriptors (RALDs; Egan et al.) against the empirical achievement-level alignment of items to investigate the use of RALDs as the…

Descriptors: Computer Assisted Testing, Mathematics Tests, Scores, Grade 3

Affordances of Item Formats and Their Effects on Test-Taker Cognition under Uncertainty

Peer reviewed

Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019

The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…

Descriptors: Affordances, Test Items, Test Format, Test Wiseness

Predicting Freshman Grade-Point Average from Test Scores: Effects of Variation within and between High Schools

Peer reviewed

Koretz, D.; Langi, M. – Educational Measurement: Issues and Practice, 2018

Most studies predicting college performance from high-school grade point average (HSGPA) and college admissions test scores use single-level regression models that conflate relationships within and between high schools. Because grading standards vary among high schools, these relationships are likely to differ within and between schools. We used…

Descriptors: Prediction, High School Students, Grade Point Average, Scores

Measuring Widening Proficiency Differences in International Assessments: Are Current Approaches Enough?

Peer reviewed

Rutkowski, David; Rutkowski, Leslie; Liaw, Yuan-Ling – Educational Measurement: Issues and Practice, 2018

Participation in international large-scale assessments has grown over time with the largest, the Programme for International Student Assessment (PISA), including more than 70 education systems that are economically and educationally diverse. To help accommodate for large achievement differences among participants, in 2009 PISA offered…

Descriptors: Educational Assessment, Foreign Countries, Achievement Tests, Secondary School Students

Are Accommodations for English Learners on State Accountability Assessments Evidence-Based? A Multistudy Systematic Review and Meta-Analysis

Peer reviewed

Rios, Joseph A.; Ihlenfeldt, Samuel D.; Chavez, Carlos – Educational Measurement: Issues and Practice, 2020

The objectives of this two-part study were to: (a) investigate English learner (EL) accommodation practices on state accountability assessments of reading/English language arts and mathematics in grades 3-8, and (b) conduct a meta-analysis of EL accommodation effectiveness on improving test performance. Across all distinct testing programs, we…

Descriptors: Testing Accommodations, English Language Learners, Program Effectiveness, Evidence Based Practice

Can Item Response Times Provide Insight into Students' Motivation and Self-Efficacy in Math? An Initial Application of Test Metadata to Understand Students' Social-Emotional Needs

Peer reviewed

Soland, James – Educational Measurement: Issues and Practice, 2019

As computer-based tests become more common, there is a growing wealth of metadata related to examinees' response processes, which include solution strategies, concentration, and operating speed. One common type of metadata is item response time. While response times have been used extensively to improve estimates of achievement, little work…

Descriptors: Test Items, Item Response Theory, Metadata, Self Efficacy

Examining Effectiveness and Validity of Accommodations for English Language Learners in Mathematics: An Evidence-Based Computer Accommodation Decision System

Peer reviewed

Abedi, Jamal; Zhang, Yu; Rowe, Susan E.; Lee, Hansol – Educational Measurement: Issues and Practice, 2020

Research indicates that the performance-gap between English Language Learners (ELLs) and their non-ELL peers is partly due to ELLs' difficulty in understanding assessment language. Accommodations have been shown to narrow this performance-gap, but many accommodations studies have not used a randomized design and are based on relatively small…

Descriptors: English Language Learners, Achievement Gap, Mathematics Tests, Standards

Examining Estimates of Intervention Effectiveness Using Sensitivity Analysis

Peer reviewed

An, Chen; Braun, Henry; Walsh, Mary E. – Educational Measurement: Issues and Practice, 2018

Making causal inferences from a quasi-experiment is difficult. Sensitivity analysis approaches to address hidden selection bias thus have gained popularity. This study serves as an introduction to a simple but practical form of sensitivity analysis using Monte Carlo simulation procedures. We examine estimated treatment effects for a school-based…

Descriptors: Statistical Inference, Intervention, Program Effectiveness, Quasiexperimental Design

Descriptor

Mathematics Tests	12
Computer Assisted Testing	5
Test Items	5
Scores	4
Test Construction	3
Test Validity	3
Achievement Tests	2
Correlation	2
Decision Making	2
Difficulty Level	2
Elementary Secondary Education	2
English Language Learners	2
Foreign Countries	2
Grade 8	2
Language Arts	2
Mathematics Achievement	2
Middle School Students	2
Multiple Choice Tests	2
Program Effectiveness	2
Reaction Time	2
Standards	2
Student Needs	2
Test Format	2
Test Wiseness	2
Testing Accommodations	2

Author

Katz, Irvin R.	2
Keehner, Madeleine	2
Abedi, Jamal	1
Agrimson, Jared	1
An, Chen	1
Arslan, Burcu	1
Berry, Yufeng	1
Bethany Fishbein	1
Bickel, Lisa	1
Braun, Henry	1
Chavez, Carlos	1
Dihao Leng	1
Ersan, Ozge	1
Gong, Tao	1
Ihlenfeldt, Samuel D.	1
Jiang, Yang	1
Kellogg, Mark	1
Koretz, D.	1
Kosh, Audra E.	1
Lale Khorramdel	1
Langi, M.	1
Lee, Hansol	1
Liaw, Yuan-Ling	1
Matthias von Davier	1
Moon, Jung Aa	1