Wolfe, Edward W.; McVay, Aaron – Educational Measurement: Issues and Practice, 2012
Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of…
Descriptors: Researchers, Research, Correlation, Test Bias
Myers, Nicholas D.; Feltz, Deborah L.; Wolfe, Edward W. – Research Quarterly for Exercise and Sport, 2008
This study extended validity evidence for measures of coaching efficacy derived from the Coaching Efficacy Scale (CES) by testing the rating scale categorizations suggested in previous research. Previous research provided evidence for the effectiveness of a four-category (4-CAT) structure for high school and collegiate sports coaches; it also…
Descriptors: Rating Scales, Validity, Self Efficacy, Athletic Coaches
Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W. – Educational and Psychological Measurement, 2008
Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…
Descriptors: Measurement Techniques, Mathematics Instruction, Probability, Rating Scales
Wolfe, Edward W.; Moulder, Bradley C.; Myford, Carol M. – 1999
This paper describes a class of rater effects that depict rater-by-time interactions. This class of rater effects is referred to as differential rater functioning over time (DRIFT). This article describes several types of DRIFT (primacy/recency, differential centrality/extremism, and practice/fatigue) and Rasch measurement procedures designed to…
Descriptors: Classification, Effect Size, Evaluators, Item Response Theory
Myers, Nicholas D.; Wolfe, Edward W.; Maier, Kimberly S.; Feltz, Deborah L.; Reckase, Mark D. – Research Quarterly for Exercise and Sport, 2006
This study extended validity evidence for multidimensional measures of coaching competency derived from the Coaching Competency Scale (CCS; Myers, Feltz, Maier, Wolfe, & Reckase, 2006) by examining use of the original rating scale structure and testing how measures related to satisfaction with the head coach within teams and between teams.…
Descriptors: Rating Scales, Competence, Athletic Coaches, Validity

Wolfe, Edward W.; Chiu, Chris W. T. – Journal of Outcome Measurement, 1999
Describes a method for disentangling changes in persons from changes in the interpretation of Likert-type questionnaire items and the use of rating scales. The procedure relies on anchoring strategies to create a common frame of reference for interpreting measures taken at different times. Illustrates the use of these procedures using the FACETS…
Descriptors: Change, Item Response Theory, Likert Scales, Models

Wolfe, Edward W.; Chiu, Chris W. T. – Journal of Outcome Measurement, 1999
Describes a Rasch rating scale analysis of a multi-occasion evaluation that produces confusing results when subjected to separate calibrations. Applies a correction algorithm developed by B. Wright (1996) to show how the Wright algorithm can reduce misfit to the Rasch rating scale model as well as change the interpretation of change. (SLD)
Descriptors: Algorithms, Change, Evaluation Methods, Item Response Theory
Wolfe, Edward W.; Chiu, Chris W. T. – 1997
How common patterns of rater errors may be detected in a large-scale performance assessment setting is discussed. Common rater effects are identified, and a scaling method that can be used to detect them in operational data sets is presented. Simulated data sets are generated to exhibit each of these rater effects. The three continua that depict…
Descriptors: Item Response Theory, Mathematical Models, Norms, Performance Based Assessment
Wolfe, Edward W.; Chiu, Chris W. T. – 1997
When measures are taken on the same individual over time, it is difficult to determine whether observed differences are the result of changes in the person or changes in other facets of the measurement situation (e.g. interpretation of items or use of rating scale). This paper describes a method for disentangling changes in persons from changes in…
Descriptors: Change, Item Response Theory, Measurement Techniques, Portfolio Assessment

Wolfe, Edward W.; Miller, Timothy R. – Applied Measurement in Education, 1997
Barriers to large-scale portfolio assessment were studied by surveying 206 secondary teachers interested in adopting these forms of assessment. A rating scale model based on the Rasch model was used to analyze results. Suggestions are presented for facilitating the efforts of secondary school teachers. (SLD)
Descriptors: Instructional Improvement, Item Response Theory, Performance Based Assessment, Portfolio Assessment
Myers, Nicholas D.; Wolfe, Edward W.; Feltz, Deborah L.; Penfield, Randall D. – Measurement in Physical Education and Exercise Science, 2006
This study (a) provided a conceptual introduction to differential item functioning (DIF), (b) introduced the multifaceted Rasch rating scale model (MRSM) and an associated statistical procedure for identifying DIF in rating scale items, and (c) applied this procedure to previously collected data from American coaches who responded to the coaching…
Descriptors: Test Bias, Rating Scales, Personality, Item Response Theory
Wolfe, Edward W.; Ray, Lisa M.; Harris, Debbi C. – Educational and Psychological Measurement, 2004
The National Center for Educational Statistics' 1999-2000 Schools and Staffing Survey data are used extensively by researchers conducting secondary analysis on a variety of issues including teacher quality, teacher preparation, and the use of technology in the classroom. Researchers frequently combine the data from several related survey questions…
Descriptors: Researchers, Psychometrics, Educational Technology, Data Analysis
Wolfe, Edward W.; Myford, Carol M.; Engelhard, George, Jr.; Manalo, Jonathan R. – College Board, 2007
In this study, we investigated a variety of Reader effects that may influence the validity of ratings assigned to AP® English Literature and Composition essays. Specifically, we investigated whether Readers exhibit changes in their levels of severity and accuracy, and their use of individual scale categories over time. We refer to changes in these…
Descriptors: Advanced Placement Programs, Essays, English Literature, Writing (Composition)
Myers, Nicholas D.; Wolfe, Edward W.; Feltz, Deborah L. – Measurement in Physical Education and Exercise Science, 2005
This study extends validity evidence for the Coaching Efficacy Scale (CES; Feltz, Chase, Moritz, & Sullivan, 1999) by providing an evaluation of the psychometric properties of the instrument from previously collected data on high school and college coaches from the United States. Data were fitted to a multidimensional item response theory model.…
Descriptors: Self Efficacy, Test Validity, Rating Scales, Psychometrics