Showing 1 to 15 of 120 results
Peer reviewed
Lions, Séverin; Blanco, María Paz; Dartnell, Pablo; Monsalve, Carlos; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2024
Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…
Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items
Peer reviewed
Traynor, Anne; Christopherson, Sara C. – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Peer reviewed
O'Dwyer, Eowyn P.; Sparks, Jesse R.; Nabors Oláh, Leslie – Applied Measurement in Education, 2023
A critical aspect of the development of culturally relevant classroom assessments is the design of tasks that affirm students' racial and ethnic identities and community cultural practices. This paper describes the process we followed to build a shared understanding of what culturally relevant assessments are, to pursue ways of bringing more…
Descriptors: Evaluation Methods, Culturally Relevant Education, Test Construction, Educational Research
Peer reviewed
Leighton, Jacqueline P. – Applied Measurement in Education, 2021
The objective of this paper is to comment on the think-aloud methods presented in the three papers included in this special issue. The commentary offered stems from the author's own psychological investigations of unobservable information processes and the conditions under which the most defensible claims can be advanced. The structure of this…
Descriptors: Protocol Analysis, Data Collection, Test Construction, Test Validity
Peer reviewed
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Peer reviewed
Lions, Séverin; Monsalve, Carlos; Dartnell, Pablo; Blanco, María Paz; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2022
Multiple-choice tests are widely used in education, often for high-stakes assessment purposes. Consequently, these tests should be constructed following the highest standards. Many efforts have been undertaken to advance item-writing guidelines intended to improve tests. One important issue is the unwanted effects of the options' position on test…
Descriptors: Multiple Choice Tests, High Stakes Tests, Test Construction, Guidelines
Peer reviewed
Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021
Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…
Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes
Peer reviewed
Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023
Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…
Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models
Peer reviewed
Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…
Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests
Peer reviewed
Wyse, Adam E. – Applied Measurement in Education, 2018
An important consideration in standard setting is recruiting a group of panelists with different experiences and backgrounds to serve on the standard-setting panel. This study uses data from 14 different Angoff standard settings from a variety of medical imaging credentialing programs to examine whether people with different professional roles and…
Descriptors: Standard Setting (Scoring), Test Construction, Cutting Scores, Accuracy
Peer reviewed
Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive and argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation
Peer reviewed
Bonner, Sarah; Chen, Peggy; Jones, Kristi; Milonovich, Brandon – Applied Measurement in Education, 2021
We describe the use of think alouds to examine substantive processes involved in performance on a formative assessment of computational thinking (CT) designed to support self-regulated learning (SRL). Our task design model included three phases of work on a computational thinking problem: forethought, performance, and reflection. The cognitive…
Descriptors: Formative Evaluation, Thinking Skills, Metacognition, Computer Science Education
Peer reviewed
Yannakoudakis, Helen; Andersen, Øistein E.; Geranpayeh, Ardeshir; Briscoe, Ted; Nicholls, Diane – Applied Measurement in Education, 2018
There are quite a few challenges in the development of an automated writing placement model for non-native English learners, among them the fact that exams that encompass the full range of language proficiency exhibited at different stages of learning are hard to design. However, acquisition of appropriate training data that are relevant to the…
Descriptors: Automation, Data Processing, Student Placement, English Language Learners
Peer reviewed
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Peer reviewed
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory