NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 12 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Ersan, Ozge; Berry, Yufeng – Educational Measurement: Issues and Practice, 2023
The increasing use of computerization in the testing industry and the need for items potentially measuring higher-order skills have led educational measurement communities to develop technology-enhanced (TE) items and conduct validity studies on the use of TE items. Parallel to this goal, the purpose of this study was to collect validity evidence…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Elementary Secondary Education, Accountability
Peer reviewed Peer reviewed
Direct linkDirect link
Dihao Leng; Ummugul Bezirhan; Lale Khorramdel; Bethany Fishbein; Matthias von Davier – Educational Measurement: Issues and Practice, 2024
This study capitalizes on response and process data from the computer-based TIMSS 2019 Problem Solving and Inquiry tasks to investigate gender differences in test-taking behaviors and their association with mathematics achievement at the eighth grade. Specifically, a recently proposed hierarchical speed-accuracy-revisits (SAR) model was adapted to…
Descriptors: Gender Differences, Test Wiseness, Achievement Tests, Mathematics Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Kosh, Audra E.; Simpson, Mary Ann; Bickel, Lisa; Kellogg, Mark; Sanford-Moore, Ellie – Educational Measurement: Issues and Practice, 2019
Automatic item generation (AIG)--a means of leveraging technology to create large quantities of items--requires a minimum number of items to offset the sizable upfront investment (i.e., model development and technology deployment) in order to achieve cost savings. In this cost-benefit analysis, we estimated the cost of each step of AIG and manual…
Descriptors: Cost Effectiveness, Automation, Test Items, Mathematics Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Arslan, Burcu; Jiang, Yang; Keehner, Madeleine; Gong, Tao; Katz, Irvin R.; Yan, Fred – Educational Measurement: Issues and Practice, 2020
Computer-based educational assessments often include items that involve drag-and-drop responses. There are different ways that drag-and-drop items can be laid out and different choices that test developers can make when designing these items. Currently, these decisions are based on experts' professional judgments and design constraints, rather…
Descriptors: Test Items, Computer Assisted Testing, Test Format, Decision Making
Peer reviewed Peer reviewed
Direct linkDirect link
Schneider, M. Christina; Agrimson, Jared; Veazey, Mary – Educational Measurement: Issues and Practice, 2022
This paper presents results of a score interpretation study for a computer adaptive mathematics assessment. The study purpose was to test the efficacy of item developers' alignment of items to Range Achievement-Level Descriptors (RALDs; Egan et al.) against the empirical achievement-level alignment of items to investigate the use of RALDs as the…
Descriptors: Computer Assisted Testing, Mathematics Tests, Scores, Grade 3
Peer reviewed Peer reviewed
Direct linkDirect link
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Measurement: Issues and Practice, 2019
The current study investigated how item formats and their inherent affordances influence test-takers' cognition under uncertainty. Adult participants solved content-equivalent math items in multiple-selection multiple-choice and four alternative grid formats. The results indicated that participants' affirmative response tendency (i.e., judge the…
Descriptors: Affordances, Test Items, Test Format, Test Wiseness
Peer reviewed Peer reviewed
Direct linkDirect link
Koretz, D.; Langi, M. – Educational Measurement: Issues and Practice, 2018
Most studies predicting college performance from high-school grade point average (HSGPA) and college admissions test scores use single-level regression models that conflate relationships within and between high schools. Because grading standards vary among high schools, these relationships are likely to differ within and between schools. We used…
Descriptors: Prediction, High School Students, Grade Point Average, Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Rutkowski, David; Rutkowski, Leslie; Liaw, Yuan-Ling – Educational Measurement: Issues and Practice, 2018
Participation in international large-scale assessments has grown over time with the largest, the Programme for International Student Assessment (PISA), including more than 70 education systems that are economically and educationally diverse. To help accommodate for large achievement differences among participants, in 2009 PISA offered…
Descriptors: Educational Assessment, Foreign Countries, Achievement Tests, Secondary School Students
Peer reviewed Peer reviewed
Direct linkDirect link
Rios, Joseph A.; Ihlenfeldt, Samuel D.; Chavez, Carlos – Educational Measurement: Issues and Practice, 2020
The objectives of this two-part study were to: (a) investigate English learner (EL) accommodation practices on state accountability assessments of reading/English language arts and mathematics in grades 3-8, and (b) conduct a meta-analysis of EL accommodation effectiveness on improving test performance. Across all distinct testing programs, we…
Descriptors: Testing Accommodations, English Language Learners, Program Effectiveness, Evidence Based Practice
Peer reviewed Peer reviewed
Direct linkDirect link
Soland, James – Educational Measurement: Issues and Practice, 2019
As computer-based tests become more common, there is a growing wealth of metadata related to examinees' response processes, which include solution strategies, concentration, and operating speed. One common type of metadata is item response time. While response times have been used extensively to improve estimates of achievement, little work…
Descriptors: Test Items, Item Response Theory, Metadata, Self Efficacy
Peer reviewed Peer reviewed
Direct linkDirect link
Abedi, Jamal; Zhang, Yu; Rowe, Susan E.; Lee, Hansol – Educational Measurement: Issues and Practice, 2020
Research indicates that the performance-gap between English Language Learners (ELLs) and their non-ELL peers is partly due to ELLs' difficulty in understanding assessment language. Accommodations have been shown to narrow this performance-gap, but many accommodations studies have not used a randomized design and are based on relatively small…
Descriptors: English Language Learners, Achievement Gap, Mathematics Tests, Standards
Peer reviewed Peer reviewed
Direct linkDirect link
An, Chen; Braun, Henry; Walsh, Mary E. – Educational Measurement: Issues and Practice, 2018
Making causal inferences from a quasi-experiment is difficult. Sensitivity analysis approaches to address hidden selection bias thus have gained popularity. This study serves as an introduction to a simple but practical form of sensitivity analysis using Monte Carlo simulation procedures. We examine estimated treatment effects for a school-based…
Descriptors: Statistical Inference, Intervention, Program Effectiveness, Quasiexperimental Design