Showing 1 to 15 of 34 results
Peer reviewed
Sarah Alahmadi; Christine E. DeMars – Applied Measurement in Education, 2024
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote testing conditions. To remove the effects of low effort levels in responses observed in remote low-stakes testing, several motivation filtering methods…
Descriptors: Multiple Choice Tests, Item Response Theory, College Students, Scores
Peer reviewed
Rutkowski, David; Rutkowski, Leslie; Valdivia, Dubravka Svetina; Canbolat, Yusuf; Underhill, Stephanie – Applied Measurement in Education, 2023
Several states in the US have removed time limits on their state assessments. In Indiana, where this study takes place, the state assessment is both untimed during the testing window and allows unlimited breaks during the testing session. Using grade 3 and 8 math and English state assessment data, in this paper we focus on time used for testing…
Descriptors: Testing, Time, Intervals, Academic Achievement
Peer reviewed
Rios, Joseph A. – Applied Measurement in Education, 2022
Testing programs are confronted with the decision of whether to report individual scores for examinees that have engaged in rapid guessing (RG). As noted by the "Standards for Educational and Psychological Testing," this decision should be based on a documented criterion that determines score exclusion. To this end, a number of heuristic…
Descriptors: Testing, Guessing (Tests), Academic Ability, Scores
Peer reviewed
Yiling Cheng; I-Chien Chen; Barbara Schneider; Mark Reckase; Joseph Krajcik – Applied Measurement in Education, 2024
The current study expands on previous research on gender differences and similarities in science test scores. Using three different approaches -- differential item functioning, differential distractor functioning, and decision tree analysis -- we examine a high school science assessment administered to 3,849 10th-12th graders, of whom 2,021 are…
Descriptors: Gender Differences, Science Achievement, Responses, Testing
Peer reviewed
Soland, James – Applied Measurement in Education, 2018
This study estimated male-female and Black-White achievement gaps without accounting for low test motivation, then compared those estimates to ones that used several approaches to addressing rapid guessing. Researchers investigated two issues: (1) The differences in rates of rapid guessing across subgroups and (2) How much achievement gap…
Descriptors: Guessing (Tests), Achievement Gap, Student Motivation, Learner Engagement
Peer reviewed
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
Peer reviewed
Davis, Laurie Laughlin; Kong, Xiaojing; McBride, Yuanyuan; Morrison, Kristin M. – Applied Measurement in Education, 2017
The definition of what it means to take a test online continues to evolve with the inclusion of a broader range of item types and a wide array of devices used by students to access test content. To assure the validity and reliability of test scores for all students, device comparability research should be conducted to evaluate the impact of…
Descriptors: Educational Technology, Technology Uses in Education, High School Students, Tests
Peer reviewed
Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019
This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…
Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory
Peer reviewed
Pan, Tianshu; Yin, Yue – Applied Measurement in Education, 2017
In this article, we propose using the Bayes factors (BF) to evaluate person fit in item response theory models under the framework of Bayesian evaluation of an informative diagnostic hypothesis. We first discuss the theoretical foundation for this application and how to analyze person fit using BF. To demonstrate the feasibility of this approach,…
Descriptors: Bayesian Statistics, Goodness of Fit, Item Response Theory, Monte Carlo Methods
Peer reviewed
Buzick, Heather; Weeks, Jonathan – Applied Measurement in Education, 2018
Indicators of student academic growth are desired in state accountability systems in order to approximate student learning over time and attribute observed growth to schooling inputs. Through an extant analysis of five states' assessment data, this study offers evidence about whether longitudinal match rates and measures of growth differ at the…
Descriptors: Disabilities, Summative Evaluation, Academic Achievement, Achievement Gains
Peer reviewed
Lottridge, Susan; Wood, Scott; Shaw, Dan – Applied Measurement in Education, 2018
This study sought to provide a framework for evaluating machine score-ability of items using a new score-ability rating scale, and to determine the extent to which ratings were predictive of observed automated scoring performance. The study listed and described a set of factors that are thought to influence machine score-ability; these factors…
Descriptors: Program Effectiveness, Computer Assisted Testing, Test Scoring Machines, Scoring
Peer reviewed
Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016
Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…
Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models
Peer reviewed
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Peer reviewed
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Peer reviewed
Wolf, Mikyung Kim; Kim, Jinok; Kao, Jenny – Applied Measurement in Education, 2012
Glossary and reading aloud test items are commonly allowed in many states' accommodation policies for English language learner (ELL) students for large-scale mathematics assessments. However, little research is available regarding the effects of these accommodations on ELL students' performance. Further, no research exists that examines how…
Descriptors: Testing Accommodations, Glossaries, Reading Aloud to Others, Validity