Showing 1 to 15 of 63 results
Peer reviewed
Lederman, Josh – Applied Measurement in Education, 2023
Given its centrality to assessment, until the concept of validity includes concern for racial justice, such matters will be seen as residing outside the "real" work of validation, rendering them powerless to count against the apparent scientific merit of the test. As the definition of validity has evolved, however, it holds great…
Descriptors: Educational Assessment, Validity, Social Justice, Race
Peer reviewed
Marcelo Andrade da Silva; A. Corinne Huggins-Manley; Jorge Luis Bazán; Amber Benedict – Applied Measurement in Education, 2024
A Q-matrix is a binary matrix that defines the relationship between items and latent variables; it is widely used in diagnostic classification models (DCMs) and can also be adopted in multidimensional item response theory (MIRT) models. The construction of the Q-matrix is typically carried out by experts in the subject area of the items…
Descriptors: Q Methodology, Matrices, Item Response Theory, Educational Assessment
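As a rough illustration of the Q-matrix structure described in the entry above, the sketch below encodes a small item-by-attribute binary matrix in Python; the attribute labels, items, and loadings are entirely hypothetical and are not drawn from the article.

```python
# A minimal, hypothetical Q-matrix: rows are items, columns are latent
# attributes, and a 1 means the item is assumed to measure that attribute.
import numpy as np

attributes = ["fractions", "decimals", "ratios"]   # illustrative skill labels
q_matrix = np.array([
    [1, 0, 0],   # item 1 measures fractions only
    [1, 1, 0],   # item 2 measures fractions and decimals
    [0, 0, 1],   # item 3 measures ratios only
    [0, 1, 1],   # item 4 measures decimals and ratios
])

# Basic sanity check when experts draft a Q-matrix: every item should load on
# at least one attribute, and every attribute should appear in at least one item.
assert (q_matrix.sum(axis=1) > 0).all()
assert (q_matrix.sum(axis=0) > 0).all()
print(q_matrix.shape)   # (4 items, 3 attributes)
```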
Peer reviewed
Wise, Steven; Kuhfeld, Megan – Applied Measurement in Education, 2021
Effort-moderated (E-M) scoring is intended to estimate how well a disengaged test taker would have performed had they been fully engaged. It accomplishes this adjustment by excluding disengaged responses from scoring and estimating performance from the remaining responses. The scoring method, however, assumes that the remaining responses are not…
Descriptors: Scoring, Achievement Tests, Identification, Validity
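To make the effort-moderated (E-M) scoring idea above concrete, here is a minimal sketch, assuming disengaged responses are flagged with per-item response-time thresholds; the threshold values, data, and the simple proportion-correct score are illustrative and are not the authors' actual procedure.

```python
# Illustrative effort-moderated scoring: responses flagged as disengaged
# (here, response time below a per-item rapid-guess threshold) are excluded,
# and the score is estimated from the remaining responses only.

def effort_moderated_score(correct, response_times, thresholds):
    """Proportion correct over responses judged to reflect engaged effort."""
    engaged = [c for c, rt, th in zip(correct, response_times, thresholds)
               if rt >= th]
    if not engaged:
        return None   # no engaged responses remain; the score cannot be estimated
    return sum(engaged) / len(engaged)

# Hypothetical example: item 3 is answered in 1.2 s against a 3 s threshold,
# so it is treated as a rapid guess and dropped from scoring.
correct = [1, 0, 1, 1]
response_times = [12.0, 8.5, 1.2, 20.0]   # seconds
thresholds = [3.0, 3.0, 3.0, 5.0]         # per-item rapid-guess thresholds
print(effort_moderated_score(correct, response_times, thresholds))   # 0.666...
```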
Peer reviewed
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
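As a loose illustration of the step-level idea behind DSF described above: each step of a polytomous item can be dichotomized, and the reference/focal group difference at that step summarized across strata of a matching score. The Mantel-Haenszel-style summary, the 0-3 scoring, and the data below are assumptions for illustration only, not the detection methods compared in the article.

```python
# Illustrative step-level DSF screen for one polytomous item scored 0-3:
# step k dichotomizes the response (score >= k vs. < k), and group differences
# are summarized with a Mantel-Haenszel common odds ratio across strata of the
# matching (total) score.
from collections import defaultdict

def mh_step_odds_ratio(records, step):
    """records: (group, stratum, item_score) tuples with group 'ref' or 'focal'."""
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, stratum, score in records:
        strata[stratum][group][int(score >= step)] += 1   # dichotomize at this step
    num = den = 0.0
    for cells in strata.values():
        a, b = cells["ref"][1], cells["ref"][0]       # ref: reached / not reached
        c, d = cells["focal"][1], cells["focal"][0]   # focal: reached / not reached
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den if den else float("inf")

# Tiny fabricated data set; an odds ratio near 1 suggests no DSF at this step.
records = [
    ("ref", 1, 2), ("ref", 1, 1), ("focal", 1, 2), ("focal", 1, 0),
    ("ref", 2, 3), ("ref", 2, 1), ("focal", 2, 2), ("focal", 2, 1),
]
print(round(mh_step_odds_ratio(records, step=2), 2))   # 1.0
```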
Peer reviewed
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to efficiently assess students' cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
Peer reviewed
Bostic, Jonathan David – Applied Measurement in Education, 2021
Think alouds are valuable tools for academicians, test developers, and practitioners as they provide a unique window into a respondent's thinking during an assessment. The purpose of this special issue is to highlight novel ways to use think alouds as a means to gather evidence about respondents' thinking. An intended outcome from this special…
Descriptors: Protocol Analysis, Cognitive Processes, Data Collection, STEM Education
Peer reviewed
Wise, Steven L. – Applied Measurement in Education, 2019
The identification of rapid guessing is important to promote the validity of achievement test scores, particularly with low-stakes tests. Effective methods for identifying rapid guesses require reliable threshold methods that are also aligned with test taker behavior. Although several common threshold methods are based on rapid guessing response…
Descriptors: Guessing (Tests), Identification, Reaction Time, Reliability
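A hedged sketch of two generic ways an item's rapid-guess threshold might be set from its response-time distribution (a fixed constant and a normative fraction of the item's mean time); these are illustrative stand-ins, not the specific threshold methods the article evaluates.

```python
# Two generic, illustrative threshold rules for flagging rapid guesses on an
# item from its response-time distribution.
import statistics

def fixed_threshold(_times, seconds=3.0):
    """Flat rule: any response faster than `seconds` is flagged as a rapid guess."""
    return seconds

def normative_threshold(times, fraction=0.10):
    """Normative rule: a fraction of the item's mean response time."""
    return fraction * statistics.mean(times)

item_times = [45.2, 38.7, 52.1, 2.4, 41.0, 1.8, 47.5]   # hypothetical seconds
threshold = normative_threshold(item_times)
rapid_guesses = [t for t in item_times if t < threshold]
print(round(threshold, 2), rapid_guesses)   # 3.27 [2.4, 1.8]
```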
Peer reviewed
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Peer reviewed
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Peer reviewed
Myers, Aaron J.; Ames, Allison J.; Leventhal, Brian C.; Holzman, Madison A. – Applied Measurement in Education, 2020
When rating performance assessments, raters may ascribe different scores to the same performance when rubric application does not align with the intended application of the scoring criteria. Given that performance assessment score interpretation assumes raters apply rubrics as rubric developers intended, misalignment between raters' scoring processes…
Descriptors: Scoring Rubrics, Validity, Item Response Theory, Interrater Reliability
Peer reviewed
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
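Because computer-based tests record item response times, one common screen for disengaged test events is the proportion of a test taker's responses meeting the items' rapid-guess thresholds (often called response-time effort). The sketch below is illustrative only; the uniform 3-second thresholds and the 0.90 flagging cutoff are assumptions, not the authors' values.

```python
# Illustrative "response-time effort" screen for a computer-based test event:
# the proportion of responses whose time meets or exceeds the item's
# rapid-guess threshold; low values suggest a disengaged test event.

def response_time_effort(times, thresholds):
    engaged = sum(t >= th for t, th in zip(times, thresholds))
    return engaged / len(times)

times = [22.0, 1.5, 30.2, 2.1, 18.9]    # hypothetical item response times (s)
thresholds = [3.0] * len(times)          # assumed uniform rapid-guess threshold
rte = response_time_effort(times, thresholds)
print(round(rte, 2), "flagged as disengaged" if rte < 0.90 else "retained")
```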
Peer reviewed
Ketterlin-Geller, Leanne R.; Perry, Lindsey; Adams, Elizabeth – Applied Measurement in Education, 2019
Despite the call for an argument-based approach to validity over 25 years ago, few examples exist in the published literature. One possible explanation for this outcome is that the complexity of the argument-based approach makes implementation difficult. To counter this claim, we propose that the Assessment Triangle can serve as the overarching…
Descriptors: Validity, Educational Assessment, Models, Screening Tests
Peer reviewed
Wise, Steven L. – Applied Measurement in Education, 2015
Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…
Descriptors: Achievement Tests, Scores, Validity, Test Items
Peer reviewed
Oliveri, Maria; McCaffrey, Daniel; Ezzo, Chelsea; Holtzman, Steven – Applied Measurement in Education, 2017
The assessment of noncognitive traits is challenging due to possible response biases, "subjectivity" and "faking." Standardized third-party evaluations where an external evaluator rates an applicant on their strengths and weaknesses on various noncognitive traits are a promising alternative. However, accurate score-based…
Descriptors: Factor Analysis, Decision Making, College Admission, Likert Scales
Peer reviewed
De Lisle, Jerome – Applied Measurement in Education, 2015
This article explores the challenge of setting performance standards in a non-Western context. The study is centered on standard-setting practice in the national learning assessments of Trinidad and Tobago. Quantitative and qualitative data from annual evaluations between 2005 and 2009 were compiled, analyzed, and deconstructed. In the mixed…
Descriptors: Foreign Countries, National Standards, Educational Assessment, Standard Setting