Showing all 7 results
Peer reviewed
Penfield, Randall D.; Alvarez, Karina; Lee, Okhee – Applied Measurement in Education, 2009
The assessment of differential item functioning (DIF) in polytomous items addresses between-group differences in measurement properties at the item level, but typically does not inform which score levels may be involved in the DIF effect. The framework of differential step functioning (DSF) addresses this issue by examining between-group…
Descriptors: Test Bias, Classification, Test Items, Criteria
Peer reviewed
Feldt, Leonard S. – Applied Measurement in Education, 1997
It has often been asserted that the reliability of a measure places an upper limit on its validity. This article demonstrates in theory that validity can rise when reliability declines, even when validity evidence is a correlation with an acceptable criterion. Whether empirical examples can actually be found is an open question. (SLD)
Descriptors: Correlation, Criteria, Reliability, Test Construction
Peer reviewed
Holland, Paul W.; Wainer, Howard – Applied Measurement in Education, 1990
The attempt by D. Edwards and C. B. Cummings to adjust state mean Scholastic Aptitude Test scores for differential participation rates with a "fuzzy truncation model" satisfies three criteria the authors previously defined but falls short on two. Omission of sensitivity studies mars the otherwise exemplary study. (SLD)
Descriptors: College Entrance Examinations, Criteria, Higher Education, Participation
Peer reviewed
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability
Peer reviewed
Berk, Ronald A. – Applied Measurement in Education, 1995
A brief summary of standard setting knowledge is presented, derived from about 20 methods that utilize a judgmental review process, the approach most relevant to the standard-setting strategies proposed in this special issue. Criteria for judging effectiveness and critiques of the methods discussed in the issue are offered. (SLD)
Descriptors: Criteria, Decision Making, Educational History, Evaluation Methods
Peer reviewed
Williams, Valerie S. L. – Applied Measurement in Education, 1997
Using item response theory to investigate differential item functioning (DIF), students' expected course grades were examined and found to function similarly across sex and race. These grades were incorporated into the matching criterion, enhancing the validity of subgroup comparisons for the third-grade mathematics test taken by 1,050 students.…
Descriptors: Comparative Analysis, Criteria, Elementary School Students, Grade 3
Peer reviewed
Plake, Barbara S. – Applied Measurement in Education, 1995
This article provides a framework for the rest of the articles in this special issue comparing the utility of three standard-setting methods with complex performance assessments. The context of the standard setting study is described, and the methods are outlined. (SLD)
Descriptors: Comparative Analysis, Criteria, Decision Making, Educational Assessment