Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Min, Shangchao; He, Lianzhen; Zhang, Jie – Language Teaching, 2020
This article reviews a selected sample of 70 empirical studies in journal articles and doctoral dissertations on language assessment in China between 2011 and 2018. Following a brief introduction to the history and current state of language assessment in China, the article presents a critical review of language assessment research on six themes…
Descriptors: Language Tests, Test Reliability, Test Validity, Journal Articles
Alberola Colomar, María Pilar – Language Learning in Higher Education, 2014
This article presents and analyses a classroom-based assessment method to test students' speaking skills in a variety of professional settings in tourism. The assessment system has been implemented in the Communication in English for Tourism course, as part of the Tourism Management degree programme, at Florida Universitaria (affiliated to the…
Descriptors: English for Special Purposes, Tourism, Oral Language, Language Tests
Baker, Beverly A. – Assessing Writing, 2010
In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing
Gordon, Michael E.; Gross, Ronald H. – Educational and Psychological Measurement, 1978 (Peer reviewed)
Past practice of operationalizing the concept of fakeability of psychological tests is reviewed. The strengths and weaknesses of these indices are discussed in the light of a proposed new definition of fakeability based upon Naylor's model of measurement accuracy. (Author/JKS)
Descriptors: Psychological Testing, Rating Scales, Response Style (Tests), Test Reliability
Follman, John; And Others – 1974
Three substudies of effects of different formats on student ratings of faculty teaching effectiveness were conducted. One substudy investigated Kinds of Keys, Agreement, Evaluation, and Needs Improvement. The second, NO TUP, (New Observation of Teaching of University Professor Rating Scale), investigated numbers of positive rating categories. The…
Descriptors: College Faculty, College Students, Measurement Techniques, Rating Scales
Lustig, Myron W. – Small Group Behavior, 1987 (Peer reviewed)
Investigated reliability and dimensionality of Bales's Interpersonal Rating Forms (IRF) using volunteer subjects (N=266) enrolled in undergraduate communications course. Results documented shortcomings of IRF as a measuring instrument, finding the subscales neither reliable nor dimensionally structured; only 2 of 18 items in each subscale are…
Descriptors: College Students, Group Behavior, Groups, Higher Education
Larsson, Bernt – Didakometry, 1974
Subjects are asked to answer six questions, partly with a frequency and partly by marking a verbally anchored scale with five categories. Some univariate and multivariate analyses are performed to elucidate the relations between variables with the two different modes of response. Although there are similarities in results for the two types of…
Descriptors: Measurement Techniques, Measures (Individuals), Rating Scales, Responses
Horner, Walter R.; And Others – 1970
These rating scales are intended for evaluation of student pilot performance. Each student is evaluated individually on the basis of video recordings of the student in flight. Ten point rating lines are used for the ten criterion performance elements of each of three maneuvers, (1) Final Turn to Landing, (2) Lazy Eight, and (3) Vertical S "A".…
Descriptors: Aircraft Pilots, Audiovisual Instruction, Behavioral Objectives, Criterion Referenced Tests
Stewart, Krista J. – 1985
The Wechsler Intelligence Scale for Children-Revised (WISC-R), one of the most commonly used tests of cognitive ability, is difficult to administer accurately. The purpose of this study was primarily to assess interrater agreement on the WISC-R Administration Observational Checklist (WAOC), a new observational instrument that can be used by an…
Descriptors: Educational Psychology, Elementary Secondary Education, Examiners, Higher Education
Suhor, Charles – 1977
In attempting to meet school-board mandates for competency-based testing in composition, educators must devise the most acceptable testing programs they can. This paper describes a design (the Paul Diederich system) for testing students' writing skills, which yields statistically reliable data on individual students, and reports on a New Orleans…
Descriptors: Competency Based Education, English Instruction, Evaluation Methods, Middle Schools
Wright, E.N.; Wyman, W.C. – 1974
This paper is part 3 of Research Department Report Number 123, "Exploring the T.R.Q.: An Assessment of the Effectiveness of the Teachers' Rating Questionnaire," which gave questionnaire users information necessary to tailor the instrument to their own needs. Retention of TRQ questions which generated good distributions of teacher ratings…
Descriptors: Affective Behavior, Communication Skills, Creativity, Elementary Education
Wyman, W.C.; Wright, E.N. – 1974
This report assesses the validity, reliability, and efficiency of the Teachers' Rating Questionnaire (TRQ), a pupils' school success measure developed in connection with a 1961 longitudinal Study of Achievement. TRQ ratings on nearly 14,000 pupils, gathered in the Study of Achievement and a New Canadian Report, constituted the data source.…
Descriptors: Academic Achievement, Correlation, Elementary Education, Predictive Measurement
Maurer, Todd J.; And Others – Educational and Psychological Measurement, 1990 (Peer reviewed)
One implication of attitude self-ratings, that scales may become recalibrated transiently, was investigated conservatively using hypothesized shifts of data from 368 college students rating the desirability of taking senior comprehensive examinations. Results support the validity and reliability of simple graphic rating scales as measures of…
Descriptors: Attitude Measures, College Students, Graphs, Higher Education
Raymond, Mark R.; Viswesvaran, Chockalingam – 1991
This study illustrates the use of three least-squares models to control for rater effects in performance evaluation: (1) ordinary least squares (OLS); (2) weighted least squares (WLS); and (3) OLS subsequent to applying a logistic transformation to observed ratings (LOG-OLS). The three models were applied to ratings obtained from four…
Descriptors: Evaluators, Higher Education, Interrater Reliability, Least Squares Statistics
