ERIC - Search Results

Showing 1 to 15 of 20 results

Using Content Relevance and Representativeness Indices in Instrument Revision

Peer reviewed

Traynor, Anne; Christopherson, Sara C. – Applied Measurement in Education, 2024

Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…

Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction

Rethinking Think-Alouds: The Often-Problematic Collection of Response Process Data

Peer reviewed

Leighton, Jacqueline P. – Applied Measurement in Education, 2021

The objective of this paper is to comment on the think-aloud methods presented in the three papers included in this special issue. The commentary offered stems from the author's own psychological investigations of unobservable information processes and the conditions under which the most defensible claims can be advanced. The structure of this…

Descriptors: Protocol Analysis, Data Collection, Test Construction, Test Validity

Gathering Response Process Data for a Problem-Solving Measure through Whole-Class Think Alouds

Peer reviewed

Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021

Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…

Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes

Prescribing Structure for Validation Arguments: Elemental, Structural, and Ecological Validity

Peer reviewed

Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019

Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…

Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation

Designing, Evaluating, and Deploying Automated Scoring Systems with Validity in Mind: Methodological Design Decisions

Peer reviewed

Rupp, André A. – Applied Measurement in Education, 2018

This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…

Descriptors: Design, Automation, Scoring, Test Scoring Machines

Content Assessment Aligned to the Common Core State Standards: Improving Validity and Fairness for English Language Learners

Peer reviewed

Chia, Magda Y. – Applied Measurement in Education, 2014

The Smarter Balanced Assessment Consortium (Smarter Balanced) serves over 19 million primary, middle, and high school students from across 26 states and affiliates (Smarter Balanced, n.d.). As one of the two Race to the Top (RTT)-funded assessment consortia, Smarter Balanced is responsible for developing formative, interim, and summative…

Descriptors: State Standards, Academic Standards, Educational Assessment, English Language Learners

In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills

Peer reviewed

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016

Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…

Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Evidence-Centered Assessment Design and the Advanced Placement Program[R]: A Psychometrician's Perspective

Peer reviewed

Brennan, Robert L. – Applied Measurement in Education, 2010

This paper provides an overview of evidence-centered assessment design (ECD) and some general information about the Advanced Placement (AP[R]) Program. Then the papers in this special issue are discussed, as they relate to the use of ECD in the revision of various AP tests. This paper concludes with some observations about the need to validate…

Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

The Promises and Challenges of Implementing Evidence-Centered Design in Large-Scale Assessment

Peer reviewed

Huff, Kristen; Steinberg, Linda; Matts, Thomas – Applied Measurement in Education, 2010

The cornerstone of evidence-centered assessment design (ECD) is an evidentiary argument that requires that each target of measurement (e.g., learning goal) for an assessment be expressed as a "claim" to be made about an examinee that is relevant to the specific purpose and audience(s) for the assessment. The "observable evidence" required to…

Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

Claims, Evidence, and Achievement-Level Descriptors as a Foundation for Item Design and Test Specifications

Peer reviewed

Hendrickson, Amy; Huff, Kristen; Luecht, Richard – Applied Measurement in Education, 2010

Evidence-centered assessment design (ECD) explicates a transparent evidentiary argument to warrant the inferences we make from student test performance. This article describes how the vehicles for gathering student evidence--task models and test specifications--are developed. Task models, which are the basis for item development, flow directly…

Descriptors: Evidence, Test Construction, Measurement, Classification

Representing Targets of Measurement within Evidence-Centered Design

Peer reviewed

Ewing, Maureen; Packman, Sheryl; Hamen, Cynthia; Thurber, Allison Clark – Applied Measurement in Education, 2010

In the last few years, the Advanced Placement (AP) Program[R] has used evidence-centered assessment design (ECD) to articulate the knowledge, skills, and abilities to be taught in the course and measured on the summative exam for four science courses, three history courses, and six world language courses; its application to calculus and English…

Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

Validity of the Simultaneous Approach to the Development of Equivalent Achievement Tests in English and French

Peer reviewed

Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011

The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…

Descriptors: Test Items, Item Analysis, Achievement Tests, French

Has Item Response Theory Increased the Validity of Achievement Test Scores?

Peer reviewed

Linn, Robert L. – Applied Measurement in Education, 1990

The contribution of item response theory to the validity of interpretations of achievement test results is reviewed in the context of four applications. The applications include construction of scales for achievement tests, test construction, development of customized tests, and investigation of the influence of instruction on achievement tests.…

Descriptors: Achievement Tests, Elementary Secondary Education, Instructional Effectiveness, Item Response Theory

Item Type and Cognitive Ability Measured: The Validity Evidence for Multiple True-False Items in Medical Specialty Certification

Peer reviewed

Downing, Steven M.; And Others – Applied Measurement in Education, 1995

The criterion-related validity evidence and other psychometric characteristics of multiple-choice and multiple true-false (MTF) items in medical specialty certification examinations were compared using results from 21,346 candidates. Advantages of MTF items and implications for test construction are discussed. (SLD)

Descriptors: Cognitive Ability, Licensing Examinations (Professions), Medical Education, Objective Tests

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory
