Showing all 6 results
Peer reviewed
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Peer reviewed
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Peer reviewed
Suto, Irenka; Nadas, Rita; Bell, John – Research Papers in Education, 2011
Accurate marking is crucial to the reliability and validity of public examinations, in England and internationally. Factors contributing to accuracy have been conceptualised as affecting either marking task demands or markers' personal expertise. The aim of this empirical study was to develop this conceptualisation through investigating the…
Descriptors: Academic Achievement, Examiners, Biology, Foreign Countries
Peer reviewed
Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners
Oldefendt, Susan J. – 1976
The first National Assessment of Music, conducted in 1971-72, measured the knowledge, skills, and attitudes of 9-year-olds, 13-year-olds, 17-year-olds, and young adults, resulting in estimates of proportions of people in the population who have certain attitudes toward music, knowledge about music terminology, notation and history, and musical…
Descriptors: Criterion Referenced Tests, Educational Assessment, Elementary Secondary Education, Evaluation Criteria
PDF pending restoration
North Carolina State Dept. of Public Instruction, Raleigh. Div. of Research. – 1986
This report describes the North Carolina Annual Testing Program's writing task, which was administered in 1985-86. Grade 6 students were tested on their ability to write a clarification composition, while grade 8 students were evaluated on their skills in writing a persuasive composition. The timed composition (50 minutes) was scored by two…
Descriptors: Basic Skills, Coherence, Cohesion (Written Composition), Elementary Education