Showing all 9 results
Linacre, John M. – 1989
An accepted criterion for gauging the fairness of examinees' scores, derived from judge-awarded ratings, has been the size of the correlation between the judges and the inter-rater reliability. Various means of achieving inter-rater reliability are reviewed, and a model to measure inter-rater reliability is put forward. Both theoretical and…
Descriptors: Evaluators, Interrater Reliability, Latent Trait Theory, Licensing Examinations (Professions)
Peer reviewed
Geisinger, Kurt F. – Educational Measurement: Issues and Practice, 1991
Ways to use standard-setting data to adjust cutoff scores on examinations are reviewed. Ten sources of information to be used in determining standards are listed. The decision to modify passing scores should be based on these types of information and consideration of adverse impact or rating process irregularities. (SLD)
Descriptors: Cutting Scores, Evaluation Utilization, Evaluators, Interrater Reliability
Peer reviewed
Plake, Barbara S.; And Others – Educational Measurement: Issues and Practice, 1991
Possible sources of intrajudge inconsistency in standard setting are reviewed, and approaches are presented to improve the accuracy of rating. Procedures for providing judges with feedback through discussion or computerized communication are discussed. Monitoring and maintaining judges' consistency throughout the rating process are essential. (SLD)
Descriptors: Computer Assisted Instruction, Evaluators, Examiners, Feedback
Raymond, Mark R.; Viswesvaran, Chockalingam – 1991
This study illustrates the use of three least-squares models to control for rater effects in performance evaluation: (1) ordinary least squares (OLS); (2) weighted least squares (WLS); and (3) OLS subsequent to applying a logistic transformation to observed ratings (LOG-OLS). The three models were applied to ratings obtained from four…
Descriptors: Evaluators, Higher Education, Interrater Reliability, Least Squares Statistics
Peer reviewed
Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners
Goldberg, Gail Lynn; Kapinus, Barbara – 1992
The Maryland School Performance Assessment Program (MSPAP) is a relatively new, statewide performance assessment of students in grades 3, 5, and 8. When first administered in May of 1991, the MSPAP included a battery of performance assessment tasks designed to generate written or drawn responses to reading texts. This study evaluated selected…
Descriptors: Comparative Testing, Elementary Education, Elementary School Teachers, Evaluators
Cramer, Stephen E. – 1990
A standard-setting procedure was developed for the Georgia Teacher Certification Testing Program as tests in 30 teaching fields were revised. A list of important characteristics of a standard-setting procedure was derived, drawing on the work of R. A. Berk (1986). The best method was found to be a highly formalized judgmental, empirical Angoff…
Descriptors: Computer Assisted Testing, Cutting Scores, Data Collection, Elementary Secondary Education
Auchter, Joan Chikos; Patience, Wayne – 1989
The methods used by the General Educational Development Testing Service (GEDTS) to establish and maintain score stability and reading reliability on its direct assessment of writing are described. Using the 1988 site certification and monitoring results of several scoring sites, the focus is on describing how the score scale was established and…
Descriptors: Decentralization, Equivalency Tests, Essay Tests, Evaluators
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment, the case has been made for alternatives to the multiple-choice item type. Most of the alternative item types require a subjective evaluation by a rater. The reliability of this subjective rating is a key component of these alternative items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators