Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019
The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…
Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests

Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation

Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines

Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary – Applied Measurement in Education, 2010
In many large-scale assessment programs, achievement level descriptors (ALDs) play a critical role in communicating what scores on the assessment mean and in interpreting what examinees know and are able to do. Based on their test performance, examinees are often classified into performance categories. The…
Descriptors: Evidence, Test Construction, Measurement, Standard Setting

Downing, Steven M.; Haladyna, Thomas M. – Applied Measurement in Education, 1997
An ideal process is outlined for test item development and the study of item responses to ensure that tests are sound. Qualitative and quantitative methods are used to assess the item-level validity evidence for high-stakes examinations. A checklist for assessment is provided. (SLD)
Descriptors: High Stakes Tests, Item Response Theory, Qualitative Research, Quality Control

Linn, Robert L.; Hambleton, Ronald K. – Applied Measurement in Education, 1991
Four main approaches to customized testing are described, and their resulting scores' valid uses and interpretations are discussed. Customized testing can yield valid normative and curriculum-specific information, although cautious application is needed to avoid misleading inferences about student achievement. (SLD)
Descriptors: Academic Achievement, Accountability, Criterion Referenced Tests, Curriculum

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques