Showing all 7 results
Peer reviewed
Quereshi, M. Y.; Fisher, Thomas L. – Educational and Psychological Measurement, 1977
Logical estimates of item difficulty made by judges were compared to empirical estimates derived from a test administration. Results indicated substantial correspondence between logical and empirical estimates, and substantial variation among judges. Further, the more elaborate the system used by judges to make estimates, the more accurate the…
Descriptors: Court Judges, Difficulty Level, Evaluation Methods, Item Analysis
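As a hedged illustration of the kind of comparison the abstract describes (not the authors' procedure or data), the sketch below correlates hypothetical judge-estimated item difficulties with empirical proportion-correct values from an administration:

```python
# Minimal sketch with invented data: comparing judges' logical difficulty
# estimates to empirical p-values (proportion correct) per item.
import numpy as np

judged = np.array([0.30, 0.55, 0.70, 0.45, 0.80])     # hypothetical judge estimates
empirical = np.array([0.25, 0.60, 0.65, 0.50, 0.85])  # hypothetical observed p-values

r = np.corrcoef(judged, empirical)[0, 1]
print(f"Correspondence (Pearson r) between logical and empirical estimates: {r:.2f}")
```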
Peer reviewed
Douglass, Jacqueline A. – Educational and Psychological Measurement, 1979
The validity of two subjective approaches to judging in synchronized swimming was examined through a multitrait-multimethod matrix. Results indicated that judging panels tended not to differentiate between execution and content scores. (Author/JKS)
Descriptors: Behavior Rating Scales, Court Judges, Evaluation Criteria, Evaluation Methods
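A minimal sketch of a multitrait-multimethod layout, using invented scores in which a shared "halo" component drives both execution and content ratings, so the resulting matrix shows the kind of poor trait differentiation the abstract reports; none of this is the study's data:

```python
# Hypothetical MTMM illustration: two traits (execution, content) scored by two
# panels (A, B); the 4x4 correlation matrix among trait-method combinations is
# inspected for convergent vs. discriminant validity.
import numpy as np

rng = np.random.default_rng(0)
n_swimmers = 40
halo = rng.normal(size=(n_swimmers, 1))               # shared component across all ratings
noise = rng.normal(scale=0.5, size=(n_swimmers, 4))
# columns: execution_A, content_A, execution_B, content_B
scores = halo + noise

mtmm = np.corrcoef(scores, rowvar=False)
print(np.round(mtmm, 2))  # uniformly high correlations = traits not differentiated
```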
Peer reviewed
Janson, Harald; Olsson, Ulf – Educational and Psychological Measurement, 2001
Proposes a generalization of Cohen's kappa coefficient (J. Cohen, 1960) to address the problem of accounting for overall chance-corrected interobserver agreement among the multivariate ratings of several judges. The statistic's metric is conventional and in the univariate case it is equivalent to existing extensions of the kappa coefficient to…
Descriptors: Interrater Reliability, Judges, Multivariate Analysis
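For context, a short sketch of the univariate, two-rater Cohen's (1960) kappa that the article generalizes; the multivariate, multi-judge statistic itself is not implemented here:

```python
# Chance-corrected agreement between two raters on nominal categories.
import numpy as np

def cohens_kappa(rater1, rater2):
    categories = sorted(set(rater1) | set(rater2))
    index = {c: i for i, c in enumerate(categories)}
    table = np.zeros((len(categories), len(categories)))
    for a, b in zip(rater1, rater2):
        table[index[a], index[b]] += 1
    table /= table.sum()
    p_observed = np.trace(table)                             # observed agreement
    p_expected = float(table.sum(axis=1) @ table.sum(axis=0))  # chance agreement
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes from two raters on five subjects
print(cohens_kappa(["a", "a", "b", "b", "c"], ["a", "b", "b", "b", "c"]))  # ~0.69
```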
Peer reviewed
Hurtz, Gregory M.; Auerbach, Meredith A. – Educational and Psychological Measurement, 2003
Conducted a meta-analysis of studies of procedural modifications of the Angoff method of setting cutoff scores. Findings for 38 studies (113 judges) show that common modifications have produced systematic effects on cutoff scores and the degree of consensus among judges. (SLD)
Descriptors: Cutting Scores, Judges, Meta Analysis, Standard Setting
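For readers unfamiliar with the baseline procedure, a hedged sketch of an unmodified Angoff calculation on hypothetical ratings (the meta-analysis concerns modifications of this procedure): each judge estimates, per item, the probability that a minimally competent examinee answers correctly; the cutoff is the sum of those probabilities, averaged over judges.

```python
# Hypothetical Angoff ratings: rows = judges, columns = items
# (each entry is an estimated probability of a correct response).
import numpy as np

ratings = np.array([
    [0.6, 0.7, 0.5, 0.8],
    [0.5, 0.8, 0.4, 0.7],
    [0.7, 0.9, 0.6, 0.8],
])

cutoff_per_judge = ratings.sum(axis=1)    # each judge's implied raw cutoff
cutoff = cutoff_per_judge.mean()          # panel cutoff score
spread = cutoff_per_judge.std(ddof=1)     # smaller SD = more consensus among judges
print(f"Cutoff = {cutoff:.2f} raw points; SD across judges = {spread:.2f}")
```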
Peer reviewed
Fabbris, Luigi; Gallo, Francesca – Educational and Psychological Measurement, 1993
New coefficients of agreement are suggested for the measure of intraclass consistency between observations on two variables. The coefficients are derived from a general coefficient for measuring intraclass dependence in a bivariate analysis context. Various coefficients for the univariate agreement analysis are shown to be cases of the suggested…
Descriptors: Correlation, Equations (Mathematics), Interrater Reliability, Judges
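As a point of reference, a sketch of a conventional one-way intraclass correlation, the kind of univariate agreement coefficient the suggested bivariate coefficients subsume as special cases; the authors' coefficients themselves are not shown.

```python
# One-way random-effects ICC(1) from the usual ANOVA mean squares.
import numpy as np

def icc1(data):
    """rows = targets, columns = ratings of each target."""
    n, k = data.shape
    grand_mean = data.mean()
    row_means = data.mean(axis=1)
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((data - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

ratings = np.array([[4, 5], [2, 2], [5, 4], [1, 2], [3, 3]])  # hypothetical
print(f"ICC(1) = {icc1(ratings):.2f}")
```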
Peer reviewed
Lunz, Mary E.; And Others – Educational and Psychological Measurement, 1994
In a study involving eight judges, analysis with the FACETS model provides evidence that judges grade differently, whether or not scores correlate well. This outcome suggests that adjustments for differences among judges should be made before student measures are estimated, in order to produce reproducible decisions. (SLD)
Descriptors: Correlation, Decision Making, Evaluation Methods, Evaluators
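A minimal sketch of the dichotomous core of a FACETS-type (many-facet Rasch) model, in which judge severity enters the log-odds alongside examinee ability and item difficulty; the values are hypothetical, and FACETS itself estimates such parameters from the full rating data before examinee measures are reported.

```python
# Log-odds of a positive rating = ability - item difficulty - judge severity,
# so severity differences among judges can be estimated and removed.
import math

def p_success(ability, item_difficulty, judge_severity):
    logit = ability - item_difficulty - judge_severity
    return 1.0 / (1.0 + math.exp(-logit))

lenient, severe = -0.5, 0.8  # hypothetical judge severities (logits)
print(p_success(ability=1.0, item_difficulty=0.0, judge_severity=lenient))  # higher
print(p_success(ability=1.0, item_difficulty=0.0, judge_severity=severe))   # lower
```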
Peer reviewed
Engelhard, George, Jr.; Stone, Gregory E. – Educational and Psychological Measurement, 1998
A new approach based on Rasch measurement theory is described for examining the quality of ratings from standard-setting judges. Ratings of nine judges for 213 items on a nursing examination show that judges vary in their views of the essential items for nursing certification, with statistically significant variability in the judged essentiality…
Descriptors: Certification, Evaluation Methods, Item Response Theory, Judges
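This is not the authors' Rasch-based analysis, but a simple hedged stand-in for the question it addresses: do judges differ in how many of the 213 items they rate as essential? A chi-square test of homogeneity on hypothetical counts illustrates the idea.

```python
# Hypothetical counts for three judges rating 213 items each:
# rows = judges, columns = (# rated essential, # rated not essential).
import numpy as np
from scipy.stats import chi2_contingency

counts = np.array([
    [150, 63],
    [120, 93],
    [175, 38],
])
chi2, p, dof, _ = chi2_contingency(counts)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.4f}")  # small p = judges differ
```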