Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques