Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 13 |
Since 2006 (last 20 years) | 19 |
Source
Applied Measurement in… | 34 |
Author
Wise, Steven L. | 3 |
Davis, Laurie Laughlin | 2 |
Engelhard, George, Jr. | 2 |
Haberman, Shelby | 2 |
Kong, Xiaojing | 2 |
Rios, Joseph A. | 2 |
Soland, James | 2 |
Barbara Schneider | 1 |
Bridgeman, Brent | 1 |
Buzick, Heather | 1 |
Canbolat, Yusuf | 1 |
Publication Type
Journal Articles | 34 |
Reports - Research | 22 |
Reports - Evaluative | 11 |
Information Analyses | 3 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Secondary Education | 5 |
Grade 3 | 3 |
Grade 8 | 3 |
High Schools | 3 |
Elementary Education | 2 |
Elementary Secondary Education | 2 |
Grade 11 | 2 |
Grade 7 | 2 |
Higher Education | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Assessments and Surveys
SAT (College Admission Test) | 3 |
Georgia Criterion Referenced… | 1 |
National Assessment of… | 1 |
Sarah Alahmadi; Christine E. DeMars – Applied Measurement in Education, 2024
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote testing conditions. To remove the effects of low effort levels in responses observed in remote low-stakes testing, several motivation filtering methods…
Descriptors: Multiple Choice Tests, Item Response Theory, College Students, Scores
Rutkowski, David; Rutkowski, Leslie; Valdivia, Dubravka Svetina; Canbolat, Yusuf; Underhill, Stephanie – Applied Measurement in Education, 2023
Several states in the US have removed time limits on their state assessments. In Indiana, where this study takes place, the state assessment is both untimed during the testing window and allows unlimited breaks during the testing session. Using grade 3 and 8 math and English state assessment data, in this paper we focus on time used for testing…
Descriptors: Testing, Time, Intervals, Academic Achievement
Rios, Joseph A. – Applied Measurement in Education, 2022
Testing programs are confronted with the decision of whether to report individual scores for examinees that have engaged in rapid guessing (RG). As noted by the "Standards for Educational and Psychological Testing," this decision should be based on a documented criterion that determines score exclusion. To this end, a number of heuristic…
Descriptors: Testing, Guessing (Tests), Academic Ability, Scores
Yiling Cheng; I-Chien Chen; Barbara Schneider; Mark Reckase; Joseph Krajcik – Applied Measurement in Education, 2024
The current study expands on previous research on gender differences and similarities in science test scores. Using three different approaches -- differential item functioning, differential distractor functioning, and decision tree analysis -- we examine a high school science assessment administered to 3,849 10th-12th graders, of whom 2,021 are…
Descriptors: Gender Differences, Science Achievement, Responses, Testing
Soland, James – Applied Measurement in Education, 2018
This study estimated male-female and Black-White achievement gaps without accounting for low test motivation, then compared those estimates to ones that used several approaches to addressing rapid guessing. Researchers investigated two issues: (1) The differences in rates of rapid guessing across subgroups and (2) How much achievement gap…
Descriptors: Guessing (Tests), Achievement Gap, Student Motivation, Learner Engagement
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
Davis, Laurie Laughlin; Kong, Xiaojing; McBride, Yuanyuan; Morrison, Kristin M. – Applied Measurement in Education, 2017
The definition of what it means to take a test online continues to evolve with the inclusion of a broader range of item types and a wide array of devices used by students to access test content. To assure the validity and reliability of test scores for all students, device comparability research should be conducted to evaluate the impact of…
Descriptors: Educational Technology, Technology Uses in Education, High School Students, Tests
Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019
This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…
Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory
Pan, Tianshu; Yin, Yue – Applied Measurement in Education, 2017
In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…
Descriptors: Bayesian Statistics, Goodness of Fit, Item Response Theory, Monte Carlo Methods
Buzick, Heather; Weeks, Jonathan – Applied Measurement in Education, 2018
Indicators of student academic growth are desired in state accountability systems in order to approximate student learning over time and attribute observed growth to schooling inputs. Through an extant analysis of five states' assessment data, this study offers evidence about whether longitudinal match rates and measures of growth differ at the…
Descriptors: Disabilities, Summative Evaluation, Academic Achievement, Achievement Gains
Lottridge, Susan; Wood, Scott; Shaw, Dan – Applied Measurement in Education, 2018
This study sought to provide a framework for evaluating machine score-ability of items using a new score-ability rating scale, and to determine the extent to which ratings were predictive of observed automated scoring performance. The study listed and described a set of factors that are thought to influence machine score-ability; these factors…
Descriptors: Program Effectiveness, Computer Assisted Testing, Test Scoring Machines, Scoring
Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016
Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…
Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide information beyond what the total score provides? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Wolf, Mikyung Kim; Kim, Jinok; Kao, Jenny – Applied Measurement in Education, 2012
Glossary and reading aloud test items are commonly allowed in many states' accommodation policies for English language learner (ELL) students for large-scale mathematics assessments. However, little research is available regarding the effects of these accommodations on ELL students' performance. Further, no research exists that examines how…
Descriptors: Testing Accommodations, Glossaries, Reading Aloud to Others, Validity