ERIC - Search Results

Source

Applied Measurement in…

Author

Bejar, Isaac I.	1
Downing, Steven M.	1
Dunbar, Stephen B.	1
Haladyna, Thomas M.	1
Sax, Anne	1
Williamson, David M.	1

Publication Type

Journal Articles	3
Reports - Evaluative	3
Speeches/Meeting Papers	1

Showing all 3 results

Automated Tools for Subject Matter Expert Evaluation of Automated Scoring

Peer reviewed

Williamson, David M.; Bejar, Isaac I.; Sax, Anne – Applied Measurement in Education, 2004

As automated scoring of complex constructed-response examinations reaches operational status, the process of evaluating the quality of resultant scores, particularly in contrast to scores of expert human graders, becomes as complex as the data itself. Using a vignette from the Architectural Registration Examination (ARE), this article explores the…

Descriptors: Validity, Scoring, Scores, Evaluation Methods

Test Item Development: Validity Evidence from Quality Assurance Procedures.

Peer reviewed

Downing, Steven M.; Haladyna, Thomas M. – Applied Measurement in Education, 1997

An ideal process is outlined for test item development and the study of item responses to ensure that tests are sound. Qualitative and quantitative methods are used to assess the item-level validity evidence for high-stakes examinations. A checklist for assessment is provided. (SLD)

Descriptors: High Stakes Tests, Item Response Theory, Qualitative Research, Quality Control

Quality Control in the Development and Use of Performance Assessments.

Peer reviewed

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991

Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)

Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques

Descriptor

Quality Control	3
Scores	2
Test Construction	2
Test Interpretation	2
Validity	2
Automation	1
Educational Assessment	1
Evaluation Methods	1
Generalization	1
High Stakes Tests	1
Interrater Reliability	1
Item Response Theory	1
Measurement Techniques	1
Performance Based Assessment	1
Performance Tests	1
Public Policy	1
Qualitative Research	1
Responses	1
Scoring	1
Statistical Analysis	1
Test Items	1
Test Reliability	1
Test Scoring Machines	1
Test Use	1
Test Validity	1