Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 7 |
Descriptor
Rating Scales | 37 |
Measurement Techniques | 10 |
Test Validity | 10 |
Higher Education | 9 |
Test Construction | 8 |
Item Analysis | 7 |
Test Items | 6 |
Correlation | 5 |
Models | 5 |
Comparative Analysis | 4 |
Factor Analysis | 4 |
Source
Journal of Educational Measurement | 37 |
Author
Jones, Eli | 2 |
Klockars, Alan J. | 2 |
Wang, Wen-Chung | 2 |
Wind, Stefanie A. | 2 |
Andrich, David | 1 |
Bausell, R. Barker | 1 |
Bejar, Isaac I. | 1 |
Benson, Jeri | 1 |
Bentler, Peter M. | 1 |
Bergan, John R. | 1 |
Bolea, Angelo S. | 1 |
Publication Type
Journal Articles | 22 |
Reports - Research | 19 |
Book/Product Reviews | 1 |
Opinion Papers | 1 |
Reports - Descriptive | 1 |
Assessments and Surveys
College and University… | 1 |
Differential Aptitude Test | 1 |
Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2019
Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater-mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of…
Descriptors: Rating Scales, Models, Evaluators, Data Collection
Wind, Stefanie A.; Jones, Eli – Journal of Educational Measurement, 2018
Range restrictions, or raters' tendency to limit their ratings to a subset of available rating scale categories, are well documented in large-scale teacher evaluation systems based on principal observations. When these restrictions occur, the ratings observed during operational teacher evaluations are limited to a subset of the available…
Descriptors: Measurement, Classroom Environment, Observation, Rating Scales
Andrich, David; Marais, Ida – Journal of Educational Measurement, 2018
Even though guessing biases difficulty estimates as a function of item difficulty in the dichotomous Rasch model, assessment programs with tests which include multiple-choice items often construct scales using this model. Research has shown that when all items are multiple-choice, this bias can largely be eliminated. However, many assessments have…
Descriptors: Multiple Choice Tests, Test Items, Guessing (Tests), Test Bias
Wiberg, Marie; González, Jorge – Journal of Educational Measurement, 2016
Equating methods make use of an appropriate transformation function to map the scores of one test form into the scale of another so that scores are comparable and can be used interchangeably. The equating literature shows that the ways of judging the success of an equating (i.e., the score transformation) might differ depending on the adopted…
Descriptors: Statistical Analysis, Equated Scores, Scores, Models
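
A minimal sketch of one common transformation function of the kind discussed above, linear equating, in which form-X scores are mapped onto the form-Y scale by matching means and standard deviations; the function and data below are illustrative and are not taken from the article:

```python
# Hedged illustration of linear equating (not the specific methods compared
# in the article): form-X scores are placed on the form-Y scale by matching
# the two forms' means and standard deviations.
import numpy as np

def linear_equate(x_scores, y_scores):
    """Return a function mapping form-X scores onto the form-Y scale."""
    mu_x, sd_x = np.mean(x_scores), np.std(x_scores, ddof=1)
    mu_y, sd_y = np.mean(y_scores), np.std(y_scores, ddof=1)
    return lambda x: mu_y + (sd_y / sd_x) * (np.asarray(x) - mu_x)

rng = np.random.default_rng(0)
form_x = rng.normal(25, 5, size=500).round()   # illustrative score samples
form_y = rng.normal(27, 6, size=500).round()
equate = linear_equate(form_x, form_y)
print(equate([20, 25, 30]))                    # form-X scores on the Y scale
```
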
Wang, Wen-Chung; Wu, Shiu-Lien – Journal of Educational Measurement, 2011
Rating scale items have been widely used in educational and psychological tests. These items require people to make subjective judgments, and these subjective judgments usually involve randomness. To account for this randomness, Wang, Wilson, and Shih proposed the random-effect rating scale model in which the threshold parameters are treated as…
Descriptors: Rating Scales, Models, Statistical Analysis, Computation
Kang, Taehoon; Chen, Troy T. – Journal of Educational Measurement, 2008
Orlando and Thissen's S-X² item fit index has performed better than traditional item fit statistics such as Yen's Q₁ and McKinley and Mills' G² for dichotomous item response theory (IRT) models. This study extends the utility of S-X² to polytomous IRT models, including the generalized partial…
Descriptors: Item Response Theory, Models, Rating Scales, Generalization
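
For reference, the dichotomous form of Orlando and Thissen's statistic, which the study above extends to polytomous models, is usually written as follows (my transcription, not quoted from the article):

\[
S\text{-}X^2_i \;=\; \sum_{k=1}^{n-1} N_k \,\frac{(O_{ik} - E_{ik})^2}{E_{ik}\,(1 - E_{ik})}
\]

where \(N_k\) is the number of examinees with summed score \(k\), \(O_{ik}\) is the observed proportion of that group answering item \(i\) correctly, and \(E_{ik}\) is the proportion predicted by the fitted IRT model.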

Klockars, Alan J.; Yamagishi, Midori – Journal of Educational Measurement, 1988
The influence of a verbal label and of its scalar position in defining the meaning of a labeled position on a rating scale was studied using three forms of the scale in which the labels FAIR and GOOD were systematically moved. When label and position differed in meaning, college students rated the labeled position as a compromise between the two. (SLD)
Descriptors: College Students, Rating Scales, Scaling

Bergan, John R. – Journal of Educational Measurement, 1980
A coefficient of inter-rater agreement is presented which describes the magnitude of observer agreement as the probability estimated under a quasi-independence model that responses from different observers will be in agreement. (Author/JKS)
Descriptors: Measurement Techniques, Observation, Rating Scales, Reliability
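
Bergan's coefficient is estimated under a quasi-independence model, which is not reproduced here; the sketch below instead computes two simpler agreement indices for the same kind of two-observer categorical data, raw agreement probability and Cohen's kappa, using hypothetical ratings:

```python
# Illustrative only: does not implement the article's quasi-independence model.
# Computes raw observed agreement and Cohen's kappa for two raters.
import numpy as np

def agreement_indices(rater_a, rater_b, categories):
    a, b = np.asarray(rater_a), np.asarray(rater_b)
    p_obs = np.mean(a == b)  # observed agreement probability
    # chance agreement from the raters' marginal category proportions
    p_chance = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    kappa = (p_obs - p_chance) / (1 - p_chance)
    return p_obs, kappa

r1 = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]   # hypothetical ratings, observer 1
r2 = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]   # hypothetical ratings, observer 2
print(agreement_indices(r1, r2, categories=[1, 2, 3]))
```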

Shaw, Dale G.; And Others – Journal of Educational Measurement, 1987
Information loss occurs when continuous data are grouped in discrete intervals. After calculating the squared correlation coefficients between continuous data and corresponding grouped data for four population distributions, the effects of population distribution, number of intervals, and interval width on information loss and recovery were…
Descriptors: Intervals, Rating Scales, Sampling, Scaling
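
A small illustration of the quantity described above, the squared correlation between a continuous variable and its interval-grouped version, under one assumed population shape and equal-width intervals (the article's actual simulation conditions are not reproduced here):

```python
# Hedged sketch: group a continuous variable into equal-width intervals,
# score each case at its interval midpoint, and use r^2 between the original
# and grouped values as an index of information retained.
import numpy as np

def grouped_r2(x, n_intervals):
    edges = np.linspace(x.min(), x.max(), n_intervals + 1)
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_intervals - 1)
    midpoints = (edges[:-1] + edges[1:]) / 2
    grouped = midpoints[idx]
    return np.corrcoef(x, grouped)[0, 1] ** 2

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)            # one illustrative population shape
for k in (3, 5, 7, 15):
    print(k, round(grouped_r2(x, k), 4))
```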

Lavoie, Allan L.; Bentler, Peter M. – Journal of Educational Measurement, 1974
Descriptors: Measurement Techniques, Rating Scales, Semantic Differential, Test Construction

Olson, Margot A. – Journal of Educational Measurement, 1978
The use of matrix sampling to overcome the impracticality of the pair comparison method for obtaining scale values is empirically tested. Results indicate that matrix sampling is useful in such applications. (Author/JKS)
Descriptors: Item Sampling, Matrices, Measurement Techniques, Rating Scales

The Relationship Between Number of Response Categories and Reliability of Likert-Type Questionnaires
Masters, James R. – Journal of Educational Measurement, 1974
Descriptors: Attitudes, Questionnaires, Rating Scales, Response Style (Tests)

Secolsky, Charles – Journal of Educational Measurement, 1987
For measuring the face validity of a test, Nevo suggested that test takers and nonprofessional users rate items on a five point scale. This article questions the ability of those raters and the credibility of the aggregated judgment as evidence of the validity of the test. (JAZ)
Descriptors: Content Validity, Measurement Techniques, Rating Scales, Test Items

Bejar, Isaac I.; Doyle, Kenneth O. – Journal of Educational Measurement, 1976
The relationship between naturally occurring student expectations about the instructor and later student evaluations of that instructor was studied. It was found that students were capable of rating their instructors independently of expectations held prior to the course. (Author/BW)
Descriptors: Expectation, Higher Education, Rating Scales, Student Evaluation of Teacher Performance

Lam, Tony C. M.; Klockars, Alan J. – Journal of Educational Measurement, 1982
Ratings given to questionnaire items on four types of rating scales were compared. This study shows that the differences between scales are contingent upon the particular anchors used for the intermediate options. The results suggest that the mean score is predictably influenced by changes in the intermediate anchors. (Author/PN)
Descriptors: Higher Education, Measurement Techniques, Measures (Individuals), Psychometrics