Penfield, Randall D.; Alvarez, Karina; Lee, Okhee – Applied Measurement in Education, 2009
The assessment of differential item functioning (DIF) in polytomous items addresses between-group differences in measurement properties at the item level, but typically does not inform which score levels may be involved in the DIF effect. The framework of differential step functioning (DSF) addresses this issue by examining between-group…
Descriptors: Test Bias, Classification, Test Items, Criteria

Feldt, Leonard S. – Applied Measurement in Education, 1997
It has often been asserted that the reliability of a measure places an upper limit on its validity. This article demonstrates in theory that validity can rise when reliability declines, even when validity evidence is a correlation with an acceptable criterion. Whether empirical examples can actually be found is an open question. (SLD)
Descriptors: Correlation, Criteria, Reliability, Test Construction

Holland, Paul W.; Wainer, Howard – Applied Measurement in Education, 1990
The attempt by D. Edwards and C. B. Cummings to adjust state mean Scholastic Aptitude Test scores for differential participation rates with a "fuzzy truncation model" satisfies three criteria the authors previously defined but falls short on two. Omission of sensitivity studies mars the otherwise exemplary study. (SLD)
Descriptors: College Entrance Examinations, Criteria, Higher Education, Participation

Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability

Berk, Ronald A. – Applied Measurement in Education, 1995
A brief summary of standard setting knowledge is presented, derived from about 20 methods that utilize a judgmental review process, the approach most relevant to the standard-setting strategies proposed in this special issue. Criteria for judging effectiveness and critiques of the methods discussed in the issue are offered. (SLD)
Descriptors: Criteria, Decision Making, Educational History, Evaluation Methods

Williams, Valerie S. L. – Applied Measurement in Education, 1997
Using item response theory to investigate differential item functioning (DIF), students' expected course grades were examined and found to function similarly across sex and race. These grades were incorporated into the matching criterion, enhancing the validity of subgroup comparisons for the third-grade mathematics test taken by 1,050 students.…
Descriptors: Comparative Analysis, Criteria, Elementary School Students, Grade 3

Plake, Barbara S. – Applied Measurement in Education, 1995
This article provides a framework for the rest of the articles in this special issue, which compare the utility of three standard-setting methods with complex performance assessments. The context of the standard-setting study is described, and the methods are outlined. (SLD)
Descriptors: Comparative Analysis, Criteria, Decision Making, Educational Assessment