Publication Date
In 2025 | 0 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 18 |
Since 2006 (last 20 years) | 38 |
Descriptor
Computer Assisted Testing | 69 |
Test Items | 24 |
Adaptive Testing | 23 |
Comparative Analysis | 15 |
Item Response Theory | 15 |
Test Construction | 14 |
Psychometrics | 10 |
Scores | 9 |
Ability | 8 |
College Students | 8 |
Difficulty Level | 8 |
More ▼ |
Source
Applied Measurement in… | 69 |
Author
Publication Type
Journal Articles | 69 |
Reports - Research | 41 |
Reports - Evaluative | 25 |
Information Analyses | 7 |
Speeches/Meeting Papers | 3 |
Reports - Descriptive | 2 |
Collected Works - General | 1 |
Opinion Papers | 1 |
Education Level
Secondary Education | 10 |
Higher Education | 9 |
Elementary Education | 5 |
Postsecondary Education | 5 |
Elementary Secondary Education | 4 |
Grade 8 | 4 |
Middle Schools | 4 |
Grade 3 | 3 |
Grade 5 | 3 |
High Schools | 3 |
Junior High Schools | 3 |
More ▼ |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 2 |
Armed Services Vocational… | 1 |
California Achievement Tests | 1 |
Graduate Record Examinations | 1 |
Massachusetts Comprehensive… | 1 |
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Ben Backes; James Cowan – Applied Measurement in Education, 2024
We investigate two research questions using a recent statewide transition from paper to computer-based testing: first, the extent to which test mode effects found in prior studies can be eliminated; and second, the degree to which online and paper assessments offer different information about underlying student ability. We first find very small…
Descriptors: Computer Assisted Testing, Test Format, Differences, Academic Achievement
Stefanie A. Wind; Beyza Aksu-Dunya – Applied Measurement in Education, 2024
Careless responding is a pervasive concern in research using affective surveys. Although researchers have considered various methods for identifying careless responses, studies are limited that consider the utility of these methods in the context of computer adaptive testing (CAT) for affective scales. Using a simulation study informed by recent…
Descriptors: Response Style (Tests), Computer Assisted Testing, Adaptive Testing, Affective Measures
Takahiro Terao – Applied Measurement in Education, 2024
This study aimed to compare item characteristics and response time between stimulus conditions in computer-delivered listening tests. Listening materials had three variants: regular videos, frame-by-frame videos, and only audios without visuals. Participants were 228 Japanese high school students who were requested to complete one of nine…
Descriptors: Computer Assisted Testing, Audiovisual Aids, Reaction Time, High School Students
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
Cohen, Dale J.; Ballman, Alesha; Rijmen, Frank; Cohen, Jon – Applied Measurement in Education, 2020
Computer-based, pop-up glossaries are perhaps the most promising accommodation aimed at mitigating the influence of linguistic structure and cultural bias on the performance of English Learner (EL) students on statewide assessments. To date, there is no established procedure for identifying the words that require a glossary for EL students that is…
Descriptors: Glossaries, Testing Accommodations, English Language Learners, Computer Assisted Testing
Steven L. Wise; Megan R. Kuhfeld; Marlit Annalena Lindner – Applied Measurement in Education, 2024
When student achievement is assessed, we seek to elicit a student's maximum performance -- a goal requiring the assumption that the student is fully engaged. Otherwise, to the extent that disengagement occurs, test performance is likely to suffer. Effectively managing test-taking disengagement requires an understanding of the testing conditions…
Descriptors: Testing, Attention Span, Learner Engagement, Time Factors (Learning)
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Wise, Steven L.; Gao, Lingyun – Applied Measurement in Education, 2017
There has been an increased interest in the impact of unmotivated test taking on test performance and score validity. This has led to the development of new ways of measuring test-taking effort based on item response time. In particular, Response Time Effort (RTE) has been shown to provide an assessment of effort down to the level of individual…
Descriptors: Test Bias, Computer Assisted Testing, Item Response Theory, Achievement Tests
Soland, James – Applied Measurement in Education, 2018
This study estimated male-female and Black-White achievement gaps without accounting for low test motivation, then compared those estimates to ones that used several approaches to addressing rapid guessing. Researchers investigated two issues: (1) The differences in rates of rapid guessing across subgroups and (2) How much achievement gap…
Descriptors: Guessing (Tests), Achievement Gap, Student Motivation, Learner Engagement
Lottridge, Susan; Wood, Scott; Shaw, Dan – Applied Measurement in Education, 2018
This study sought to provide a framework for evaluating machine score-ability of items using a new score-ability rating scale, and to determine the extent to which ratings were predictive of observed automated scoring performance. The study listed and described a set of factors that are thought to influence machine score-ability; these factors…
Descriptors: Program Effectiveness, Computer Assisted Testing, Test Scoring Machines, Scoring
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Cohen, Dale; Tracy, Ryan; Cohen, Jon – Applied Measurement in Education, 2017
This study examined the effectiveness and influence on validity of a computer-based pop-up English glossary accommodation for English learners (ELs) in grades 3 and 7. In a randomized controlled trial, we administered pop-up English glossaries with audio to students taking a statewide accountability English language arts (ELA) and mathematics…
Descriptors: English (Second Language), Glossaries, Testing Accommodations, Measurement
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Davis, Laurie Laughlin; Kong, Xiaojing; McBride, Yuanyuan; Morrison, Kristin M. – Applied Measurement in Education, 2017
The definition of what it means to take a test online continues to evolve with the inclusion of a broader range of item types and a wide array of devices used by students to access test content. To assure the validity and reliability of test scores for all students, device comparability research should be conducted to evaluate the impact of…
Descriptors: Educational Technology, Technology Uses in Education, High School Students, Tests
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring