Wolfe, Edward W.; McVay, Aaron – Educational Measurement: Issues and Practice, 2012
Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of…
Descriptors: Researchers, Research, Correlation, Test Bias
Myers, Nicholas D.; Feltz, Deborah L.; Wolfe, Edward W. – Research Quarterly for Exercise and Sport, 2008
This study extended validity evidence for measures of coaching efficacy derived from the Coaching Efficacy Scale (CES) by testing the rating scale categorizations suggested in previous research. Previous research provided evidence for the effectiveness of a four-category (4-CAT) structure for high school and collegiate sports coaches; it also…
Descriptors: Rating Scales, Validity, Self Efficacy, Athletic Coaches
Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W. – Educational and Psychological Measurement, 2008
Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…
Descriptors: Measurement Techniques, Mathematics Instruction, Probability, Rating Scales
Wolfe, Edward W.; Moulder, Bradley C.; Myford, Carol M. – 1999
This paper describes a class of rater effects that depict rater-by-time interactions. This class of rater effects is referred to as differential rater functioning over time (DRIFT). This article describes several types of DRIFT (primacy/recency, differential centrality/extremism, and practice/fatigue) and Rasch measurement procedures designed to…
Descriptors: Classification, Effect Size, Evaluators, Item Response Theory
Myers, Nicholas D.; Wolfe, Edward W.; Maier, Kimberly S.; Feltz, Deborah L.; Reckase, Mark D. – Research Quarterly for Exercise and Sport, 2006
This study extended validity evidence for multidimensional measures of coaching competency derived from the Coaching Competency Scale (CCS; Myers, Feltz, Maier, Wolfe, & Reckase, 2006) by examining use of the original rating scale structure and testing how measures related to satisfaction with the head coach within teams and between teams.…
Descriptors: Rating Scales, Competence, Athletic Coaches, Validity

Wolfe, Edward W.; Chiu, Chris W. T. – Journal of Outcome Measurement, 1999
Describes a method for disentangling changes in persons from changes in the interpretation of Likert-type questionnaire items and the use of rating scales. The procedure relies on anchoring strategies to create a common frame of reference for interpreting measures taken at different times. Illustrates the use of these procedures using the FACETS…
Descriptors: Change, Item Response Theory, Likert Scales, Models

Wolfe, Edward W.; Chiu, Chris W. T. – Journal of Outcome Measurement, 1999
Describes a Rasch rating scale analysis of a multi-occasion evaluation that produces confusing results when subjected to separate calibrations. Applies a correction algorithm developed by B. Wright (1996) to show how the Wright algorithm can reduce misfit to the Rasch rating scale model as well as change the interpretation of change. (SLD)
Descriptors: Algorithms, Change, Evaluation Methods, Item Response Theory
Wolfe, Edward W.; Chiu, Chris W. T. – 1997
How common patterns of rater errors may be detected in a large-scale performance assessment setting is discussed. Common rater effects are identified, and a scaling method that can be used to detect them in operational data sets is presented. Simulated data sets are generated to exhibit each of these rater effects. The three continua that depict…
Descriptors: Item Response Theory, Mathematical Models, Norms, Performance Based Assessment
Wolfe, Edward W.; Chiu, Chris W. T. – 1997
When measures are taken on the same individual over time, it is difficult to determine whether observed differences are the result of changes in the person or changes in other facets of the measurement situation (e.g. interpretation of items or use of rating scale). This paper describes a method for disentangling changes in persons from changes in…
Descriptors: Change, Item Response Theory, Measurement Techniques, Portfolio Assessment

Wolfe, Edward W.; Miller, Timothy R. – Applied Measurement in Education, 1997
Barriers to large-scale portfolio assessment were studied by surveying 206 secondary teachers interested in adopting these forms of assessment. A rating scale model based on the Rasch model was used to analyze results. Suggestions are presented for facilitating the efforts of secondary school teachers. (SLD)
Descriptors: Instructional Improvement, Item Response Theory, Performance Based Assessment, Portfolio Assessment
Myers, Nicholas D.; Wolfe, Edward W.; Feltz, Deborah L.; Penfield, Randall D. – Measurement in Physical Education and Exercise Science, 2006
This study (a) provided a conceptual introduction to differential item functioning (DIF), (b) introduced the multifaceted Rasch rating scale model (MRSM) and an associated statistical procedure for identifying DIF in rating scale items, and (c) applied this procedure to previously collected data from American coaches who responded to the coaching…
Descriptors: Test Bias, Rating Scales, Personality, Item Response Theory
Wolfe, Edward W.; Ray, Lisa M.; Harris, Debbi C. – Educational and Psychological Measurement, 2004
The National Center for Educational Statistics' 1999-2000 Schools and Staffing Survey data are used extensively by researchers conducting secondary analysis on a variety of issues including teacher quality, teacher preparation, and the use of technology in the classroom. Researchers frequently combine the data from several related survey questions…
Descriptors: Researchers, Psychometrics, Educational Technology, Data Analysis
Wolfe, Edward W.; Myford, Carol M.; Engelhard, George, Jr.; Manalo, Jonathan R. – College Board, 2007
In this study, we investigated a variety of Reader effects that may influence the validity of ratings assigned to AP® English Literature and Composition essays. Specifically, we investigated whether Readers exhibit changes in their levels of severity and accuracy, and their use of individual scale categories over time. We refer to changes in these…
Descriptors: Advanced Placement Programs, Essays, English Literature, Writing (Composition)
Myers, Nicholas D.; Wolfe, Edward W.; Feltz, Deborah L. – Measurement in Physical Education and Exercise Science, 2005
This study extends validity evidence for the Coaching Efficacy Scale (CES; Feltz, Chase, Moritz, & Sullivan, 1999) by providing an evaluation of the psychometric properties of the instrument from previously collected data on high school and college coaches from the United States. Data were fitted to a multidimensional item response theory model.…
Descriptors: Self Efficacy, Test Validity, Rating Scales, Psychometrics