ERIC - Search Results

Showing all 14 results

Comparison of Two Approaches to Interpretive Use Arguments

Peer reviewed

Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019

The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…

Descriptors: Inquiry, Test Interpretation, Validity, Scores

Challenges to the Cattell-Horn-Carroll Theory: Empirical, Clinical, and Policy Implications

Peer reviewed

Canivez, Gary L.; Youngstrom, Eric A. – Applied Measurement in Education, 2019

The Cattell-Horn-Carroll (CHC) taxonomy of cognitive abilities married John Horn and Raymond Cattell's Extended Gf-Gc theory with John Carroll's Three-Stratum Theory. While there are some similarities in arrangements or classifications of tasks (observed variables) within similar broad or narrow dimensions, other salient theoretical features and…

Descriptors: Taxonomy, Cognitive Ability, Intelligence, Cognitive Tests

Prescribing Structure for Validation Arguments: Elemental, Structural, and Ecological Validity

Peer reviewed

Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019

Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…

Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation

Evidence-Centered Assessment Design as a Foundation for Achievement-Level Descriptor Development and for Standard Setting

Peer reviewed

Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary – Applied Measurement in Education, 2010

In many large-scale assessment programs, achievement level descriptors (ALDs) provide a critical role in communicating what scores on the assessment mean and in interpreting what examinees know and are able to do based on their test performance. Based on their test performance, examinees are often classified into performance categories. The…

Descriptors: Evidence, Test Construction, Measurement, Standard Setting

Using a Taxonomy of Differential Step Functioning to Improve the Interpretation of DIF in Polytomous Items: An Illustration

Peer reviewed

Penfield, Randall D.; Alvarez, Karina; Lee, Okhee – Applied Measurement in Education, 2009

The assessment of differential item functioning (DIF) in polytomous items addresses between-group differences in measurement properties at the item level, but typically does not inform which score levels may be involved in the DIF effect. The framework of differential step functioning (DSF) addresses this issue by examining between-group…

Descriptors: Test Bias, Classification, Test Items, Criteria

Validating Licensing and Certification Test Score Interpretations and Decisions: A Response.

Peer reviewed

Mehrens, William A. – Applied Measurement in Education, 1997

This commentary on articles in this special issue generally agrees with the viewpoints expressed, although it argues that in some cases the authors of these articles should have expanded on certain issues. Many comments relate to the legal defensibility of the positions taken. (SLD)

Descriptors: Certification, Decision Making, Licensing Examinations (Professions), Performance Based Assessment

The Precision of Measurements.

Peer reviewed

Kane, Michael – Applied Measurement in Education, 1996

This overview of the role of error and tolerance for error in measurement asserts that the generic precision associated with a measurement procedure is defined as the root mean square error, or standard error, in some relevant population. This view of precision is explored in several applications of measurement. (SLD)

Descriptors: Error of Measurement, Error Patterns, Generalizability Theory, Measurement Techniques

Measuring the Impact of Judge Severity on Examination Scores.

Peer reviewed

Lunz, Mary E.; And Others – Applied Measurement in Education, 1990

An extension of the Rasch model is used to obtain objective measurements for examinations graded by judges. The model calibrates elements of each facet of the examination on a common log-linear scale. Real examination data illustrate the way correcting for judge severity improves fairness of examinee measures. (SLD)

Descriptors: Certification, Difficulty Level, Interrater Reliability, Judges

Validating Inferences from National Assessment of Educational Progress Achievement-Level Reporting.

Peer reviewed

Linn, Robert L. – Applied Measurement in Education, 1998

The validity of interpretations of National Assessment of Educational Progress (NAEP) achievement levels is evaluated by focusing on evidence regarding three types of discrepancies: (1) between standards; (2) among descriptions of achievement levels; and (3) between assessments and content standards. All of these discrepancies raise serious…

Descriptors: Academic Achievement, Achievement Tests, Elementary Secondary Education, National Surveys

Using Multidimensional Item Response Theory to Understand What Items and Tests Are Measuring.

Peer reviewed

Ackerman, Terry A. – Applied Measurement in Education, 1994

When item response data do not satisfy the unidimensionality assumption, multidimensional item response theory (MIRT) should be used to model the item-examinee interaction. This article presents and discusses MIRT analyses designed to give better insight into what individual items are measuring. (SLD)

Descriptors: Evaluation Methods, Item Response Theory, Measurement Techniques, Models

Test Item Development: Validity Evidence from Quality Assurance Procedures.

Peer reviewed

Downing, Steven M.; Haladyna, Thomas M. – Applied Measurement in Education, 1997

An ideal process is outlined for test item development and the study of item responses to ensure that tests are sound. Qualitative and quantitative methods are used to assess the item-level validity evidence for high-stakes examinations. A checklist for assessment is provided. (SLD)

Descriptors: High Stakes Tests, Item Response Theory, Qualitative Research, Quality Control

Psychometric Issues in Testing Students with Disabilities.

Peer reviewed

Geisinger, Kurt F. – Applied Measurement in Education, 1994

Federal law requires that individuals with handicapping conditions be administered assessments in ways that accommodate their disabilities without penalizing them. Validation studies are needed to evaluate the meaning of scores resulting from nonstandard test administrations. The limited number of these studies to date is reviewed. (SLD)

Descriptors: Disabilities, Educational Assessment, Elementary School Students, Elementary Secondary Education

Customized Tests and Customized Norms.

Peer reviewed

Linn, Robert L.; Hambleton, Ronald K. – Applied Measurement in Education, 1991

Four main approaches to customized testing are described, and their resulting scores' valid uses and interpretations are discussed. Customized testing can yield valid normative and curriculum-specific information, although cautious application is needed to avoid misleading inferences about student achievement. (SLD)

Descriptors: Academic Achievement, Accountability, Criterion Referenced Tests, Curriculum

Quality Control in the Development and Use of Performance Assessments.

Peer reviewed

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991

Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)

Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques
