Publication Date
In 2025: 0
Since 2024: 1
Since 2021 (last 5 years): 5
Since 2016 (last 10 years): 8
Since 2006 (last 20 years): 17
Descriptor
Test Construction: 35
Test Validity: 20
Validity: 13
Test Items: 11
Scores: 9
Evidence: 8
Test Use: 6
Achievement Tests: 5
Item Response Theory: 5
Reliability: 5
Test Interpretation: 5
Source
Applied Measurement in…: 35
Publication Type
Journal Articles: 35
Reports - Evaluative: 13
Reports - Research: 13
Reports - Descriptive: 9
Speeches/Meeting Papers: 2
Information Analyses: 1
Opinion Papers: 1
Education Level
High Schools: 5
Secondary Education: 5
Elementary Secondary Education: 3
Elementary Education: 1
Grade 11: 1
Grade 3: 1
Grade 6: 1
Grade 8: 1
Grade 9: 1
Higher Education: 1
Junior High Schools: 1
Location
Canada: 2
New York (New York): 1
Laws, Policies, & Programs
Race to the Top: 1
Assessments and Surveys
Texas Assessment of Academic…: 3
Traynor, Anne; Christopherson, Sara C. – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Leighton, Jacqueline P. – Applied Measurement in Education, 2021
The objective of this paper is to comment on the think-aloud methods presented in the three papers included in this special issue. The commentary offered stems from the author's own psychological investigations of unobservable information processes and the conditions under which the most defensible claims can be advanced. The structure of this…
Descriptors: Protocol Analysis, Data Collection, Test Construction, Test Validity
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021
Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…
Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes
Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation
Bonner, Sarah; Chen, Peggy; Jones, Kristi; Milonovich, Brandon – Applied Measurement in Education, 2021
We describe the use of think alouds to examine substantive processes involved in performance on a formative assessment of computational thinking (CT) designed to support self-regulated learning (SRL). Our task design model included three phases of work on a computational thinking problem: forethought, performance, and reflection. The cognitive…
Descriptors: Formative Evaluation, Thinking Skills, Metacognition, Computer Science Education
Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016
Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…
Descriptors: Evaluation Methods, Test Construction, Design, Scaling
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Chia, Magda Y. – Applied Measurement in Education, 2014
The Smarter Balanced Assessment Consortium (Smarter Balanced) serves over 19 million primary, middle, and high school students across 26 states and affiliates (Smarter Balanced, n.d.). As one of the two Race to the Top (RTT)-funded assessment consortia, Smarter Balanced is responsible for developing formative, interim, and summative…
Descriptors: State Standards, Academic Standards, Educational Assessment, English Language Learners
Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011
The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates…
Descriptors: Test Construction, Validity, Evidence, Science Tests
Brennan, Robert L. – Applied Measurement in Education, 2010
This paper provides an overview of evidence-centered assessment design (ECD) and some general information about the Advanced Placement (AP[R]) Program. The papers in this special issue are then discussed as they relate to the use of ECD in the revision of various AP tests. The paper concludes with some observations about the need to validate…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction
Huff, Kristen; Steinberg, Linda; Matts, Thomas – Applied Measurement in Education, 2010
The cornerstone of evidence-centered assessment design (ECD) is an evidentiary argument that requires that each target of measurement (e.g., learning goal) for an assessment be expressed as a "claim" to be made about an examinee that is relevant to the specific purpose and audience(s) for the assessment. The "observable evidence" required to…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction
Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010
Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…
Descriptors: Item Response Theory, Case Studies, Reliability, Scores
Hendrickson, Amy; Huff, Kristen; Luecht, Richard – Applied Measurement in Education, 2010
Evidence-centered assessment design (ECD) explicates a transparent evidentiary argument to warrant the inferences we make from student test performance. This article describes how the vehicles for gathering student evidence--task models and test specifications--are developed. Task models, which are the basis for item development, flow directly…
Descriptors: Evidence, Test Construction, Measurement, Classification
Ewing, Maureen; Packman, Sheryl; Hamen, Cynthia; Thurber, Allison Clark – Applied Measurement in Education, 2010
In the last few years, the Advanced Placement (AP) Program[R] has used evidence-centered assessment design (ECD) to articulate the knowledge, skills, and abilities to be taught in the course and measured on the summative exam for four science courses, three history courses, and six world language courses; its application to calculus and English…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction