Showing all 7 results
Peer reviewed
Quereshi, M. Y.; Fisher, Thomas L. – Educational and Psychological Measurement, 1977
Logical estimates of item difficulty made by judges were compared to empirical estimates derived from a test administration. Results indicated substantial correspondence between logical and empirical estimates, and substantial variation among judges. Further, the more elaborate the system used by judges to make estimates, the more accurate the…
Descriptors: Court Judges, Difficulty Level, Evaluation Methods, Item Analysis
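As a hedged illustration of the kind of comparison the abstract describes (not the authors' procedure or data), the sketch below correlates hypothetical judge-estimated item difficulties with empirical proportion-correct values from an administration:

```python
# Minimal sketch with invented data: comparing judges' logical difficulty
# estimates to empirical p-values (proportion correct) per item.
import numpy as np

judged = np.array([0.30, 0.55, 0.70, 0.45, 0.80])     # hypothetical judge estimates
empirical = np.array([0.25, 0.60, 0.65, 0.50, 0.85])  # hypothetical observed p-values

r = np.corrcoef(judged, empirical)[0, 1]
print(f"Correspondence (Pearson r) between logical and empirical estimates: {r:.2f}")
```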
Peer reviewed
Douglass, Jacqueline A. – Educational and Psychological Measurement, 1979
The validity of two subjective approaches to judging in synchronized swimming was examined through a multitrait-multimethod matrix. Results indicated that judging panels tended not to differentiate between execution and content scores. (Author/JKS)
Descriptors: Behavior Rating Scales, Court Judges, Evaluation Criteria, Evaluation Methods
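A minimal sketch of a multitrait-multimethod layout, using invented scores in which a shared "halo" component drives both execution and content ratings, so the resulting matrix shows the kind of poor trait differentiation the abstract reports; none of this is the study's data:

```python
# Hypothetical MTMM illustration: two traits (execution, content) scored by two
# panels (A, B); the 4x4 correlation matrix among trait-method combinations is
# inspected for convergent vs. discriminant validity.
import numpy as np

rng = np.random.default_rng(0)
n_swimmers = 40
halo = rng.normal(size=(n_swimmers, 1))               # shared component across all ratings
noise = rng.normal(scale=0.5, size=(n_swimmers, 4))
# columns: execution_A, content_A, execution_B, content_B
scores = halo + noise

mtmm = np.corrcoef(scores, rowvar=False)
print(np.round(mtmm, 2))  # uniformly high correlations = traits not differentiated
```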
Peer reviewed
Janson, Harald; Olsson, Ulf – Educational and Psychological Measurement, 2001
Proposes a generalization of Cohen's kappa coefficient (J. Cohen, 1960) to address the problem of accounting for overall chance-corrected interobserver agreement among the multivariate ratings of several judges. The statistic's metric is conventional and in the univariate case it is equivalent to existing extensions of the kappa coefficient to…
Descriptors: Interrater Reliability, Judges, Multivariate Analysis
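For context, a short sketch of the univariate, two-rater Cohen's (1960) kappa that the article generalizes; the multivariate, multi-judge statistic itself is not implemented here:

```python
# Chance-corrected agreement between two raters on nominal categories.
import numpy as np

def cohens_kappa(rater1, rater2):
    categories = sorted(set(rater1) | set(rater2))
    index = {c: i for i, c in enumerate(categories)}
    table = np.zeros((len(categories), len(categories)))
    for a, b in zip(rater1, rater2):
        table[index[a], index[b]] += 1
    table /= table.sum()
    p_observed = np.trace(table)                             # observed agreement
    p_expected = float(table.sum(axis=1) @ table.sum(axis=0))  # chance agreement
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes from two raters on five subjects
print(cohens_kappa(["a", "a", "b", "b", "c"], ["a", "b", "b", "b", "c"]))  # ~0.69
```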
Peer reviewed
Hurtz, Gregory M.; Auerbach, Meredith A. – Educational and Psychological Measurement, 2003
Conducted a meta-analysis of studies of procedural modifications of the Angoff method of setting cutoff scores. Findings for 38 studies (113 judges) show that common modifications have produced systematic effects on cutoff scores and the degree of consensus among judges. (SLD)
Descriptors: Cutting Scores, Judges, Meta Analysis, Standard Setting
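For readers unfamiliar with the baseline procedure, a hedged sketch of an unmodified Angoff calculation on hypothetical ratings (the meta-analysis concerns modifications of this procedure): each judge estimates, per item, the probability that a minimally competent examinee answers correctly; the cutoff is the sum of those probabilities, averaged over judges.

```python
# Hypothetical Angoff ratings: rows = judges, columns = items
# (each entry is an estimated probability of a correct response).
import numpy as np

ratings = np.array([
    [0.6, 0.7, 0.5, 0.8],
    [0.5, 0.8, 0.4, 0.7],
    [0.7, 0.9, 0.6, 0.8],
])

cutoff_per_judge = ratings.sum(axis=1)    # each judge's implied raw cutoff
cutoff = cutoff_per_judge.mean()          # panel cutoff score
spread = cutoff_per_judge.std(ddof=1)     # smaller SD = more consensus among judges
print(f"Cutoff = {cutoff:.2f} raw points; SD across judges = {spread:.2f}")
```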
Peer reviewed
Fabbris, Luigi; Gallo, Francesca – Educational and Psychological Measurement, 1993
New coefficients of agreement are suggested for the measure of intraclass consistency between observations on two variables. The coefficients are derived from a general coefficient for measuring intraclass dependence in a bivariate analysis context. Various coefficients for the univariate agreement analysis are shown to be cases of the suggested…
Descriptors: Correlation, Equations (Mathematics), Interrater Reliability, Judges
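As a point of reference, a sketch of a conventional one-way intraclass correlation, the kind of univariate agreement coefficient the suggested bivariate coefficients subsume as special cases; the authors' coefficients themselves are not shown.

```python
# One-way random-effects ICC(1) from the usual ANOVA mean squares.
import numpy as np

def icc1(data):
    """rows = targets, columns = ratings of each target."""
    n, k = data.shape
    grand_mean = data.mean()
    row_means = data.mean(axis=1)
    ms_between = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((data - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

ratings = np.array([[4, 5], [2, 2], [5, 4], [1, 2], [3, 3]])  # hypothetical
print(f"ICC(1) = {icc1(ratings):.2f}")
```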
Peer reviewed
Lunz, Mary E.; And Others – Educational and Psychological Measurement, 1994
In a study involving eight judges, analysis with the FACETS model provides evidence that judges grade differently, whether or not scores correlate well. This outcome suggests that adjustments for differences among judges should be made before student measures are estimated, in order to produce reproducible decisions. (SLD)
Descriptors: Correlation, Decision Making, Evaluation Methods, Evaluators
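A minimal sketch of the dichotomous core of a FACETS-type (many-facet Rasch) model, in which judge severity enters the log-odds alongside examinee ability and item difficulty; the values are hypothetical, and FACETS itself estimates such parameters from the full rating data before examinee measures are reported.

```python
# Log-odds of a positive rating = ability - item difficulty - judge severity,
# so severity differences among judges can be estimated and removed.
import math

def p_success(ability, item_difficulty, judge_severity):
    logit = ability - item_difficulty - judge_severity
    return 1.0 / (1.0 + math.exp(-logit))

lenient, severe = -0.5, 0.8  # hypothetical judge severities (logits)
print(p_success(ability=1.0, item_difficulty=0.0, judge_severity=lenient))  # higher
print(p_success(ability=1.0, item_difficulty=0.0, judge_severity=severe))   # lower
```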
Peer reviewed
Engelhard, George, Jr.; Stone, Gregory E. – Educational and Psychological Measurement, 1998
A new approach based on Rasch measurement theory is described for examining the quality of ratings from standard-setting judges. Ratings of nine judges for 213 items on a nursing examination show that judges vary in their views of the essential items for nursing certification, with statistically significant variability in the judged essentiality…
Descriptors: Certification, Evaluation Methods, Item Response Theory, Judges
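This is not the authors' Rasch-based analysis, but a simple hedged stand-in for the question it addresses: do judges differ in how many of the 213 items they rate as essential? A chi-square test of homogeneity on hypothetical counts illustrates the idea.

```python
# Hypothetical counts for three judges rating 213 items each:
# rows = judges, columns = (# rated essential, # rated not essential).
import numpy as np
from scipy.stats import chi2_contingency

counts = np.array([
    [150, 63],
    [120, 93],
    [175, 38],
])
chi2, p, dof, _ = chi2_contingency(counts)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.4f}")  # small p = judges differ
```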