Publication Date
| In 2026 | 3 |
| Since 2025 | 190 |
| Since 2022 (last 5 years) | 1069 |
| Since 2017 (last 10 years) | 2891 |
| Since 2007 (last 20 years) | 6176 |
Audience
| Teachers | 481 |
| Practitioners | 358 |
| Researchers | 153 |
| Administrators | 122 |
| Policymakers | 51 |
| Students | 44 |
| Parents | 32 |
| Counselors | 25 |
| Community | 15 |
| Media Staff | 5 |
| Support Staff | 3 |
Location
| Australia | 183 |
| Turkey | 157 |
| California | 134 |
| Canada | 124 |
| New York | 118 |
| United States | 112 |
| Florida | 107 |
| China | 103 |
| Texas | 72 |
| United Kingdom | 72 |
| Japan | 70 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 5 |
| Meets WWC Standards with or without Reservations | 11 |
| Does not meet standards | 8 |
Golub-Smith, Marna; And Others – 1993
The Test of Written English (TWE), administered with certain designated examinations of the Test of English as a Foreign Language (TOEFL), consists of a single essay prompt to which examinees have 30 minutes to respond. Questions have been raised about the comparability of different TWE prompts. This study was designed to elicit essays for prompts…
Descriptors: Charts, Comparative Analysis, English (Second Language), Essay Tests
Kump, Ann – 1992
Directions are given for scoring typing tests taken on a typewriter or on a computer using special software. The speed score (gross words per minute) is obtained by determining the total number of strokes typed, and dividing by 25. The accuracy score is obtained by comparing the examinee's test paper to the appropriate scoring key and counting the…
Descriptors: Computer Assisted Testing, Employment Qualifications, Guidelines, Job Applicants
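The speed score described in this record reduces to a single division, sketched below. The function name is hypothetical, and the reading of the divisor (5 strokes per word times a 5-minute test) is an assumption; the abstract states only that total strokes are divided by 25.

```python
def gross_words_per_minute(total_strokes: int, divisor: int = 25) -> float:
    """Speed score as described in the abstract: total strokes / 25.

    Assumption (not stated in the abstract): 25 = 5 strokes per word
    multiplied by a 5-minute test. Pass another divisor if that does
    not hold for the test being scored.
    """
    if total_strokes < 0:
        raise ValueError("total_strokes must be non-negative")
    return total_strokes / divisor


print(gross_words_per_minute(1500))  # 1500 / 25 = 60.0
```

The accuracy score is not sketched here because the abstract is truncated before the counting rule is given.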
Ferrara, F. Felicia – 1995
Cut scores, quartile ranking, sample size, and overall classification scheme were studied as personnel selection procedures in two samples. The first was 120 simulated observations of employee scores based on actual selection procedures for applicants for administrative assistant positions. The other sample was composed of test results for 73…
Descriptors: Classification, Cutting Scores, Job Applicants, Personnel Selection
Lunz, Mary E.; O'Neill, Thomas R. – 1997
This retrospective longitudinal study was designed to show grading leniency patterns of judges within and across clinical examination administrations. Data from 17 different administrations of the histology examination of the American Society of Clinical Pathologists over 10 years were studied. Over the 10 years there were 4,683 candidates and 57…
Descriptors: Higher Education, Interrater Reliability, Item Response Theory, Judges
Krippendorff, Klaus – 1992
When one wants to set data reliability standards for a class of scientific inquiries or when one needs to compare and select among many different kinds of data with reliabilities that are crucial to a particular research undertaking, then one needs a single reliability coefficient that is adaptable to all or most situations. Work toward this goal…
Descriptors: Definitions, Equations (Mathematics), Mathematical Models, Reliability
Bontempo, Brian D.; Marks, Casimer M.; Karabatsos, George – 1998
Using meta-analysis, this research takes a look at studies included in a meta-analysis by R. Jaeger (1989) that compared the cut score set by one standard setting method with that set by another. This meta-analysis looks beyond Jaeger's studies to select 10 from the research literature. Each compared at least two types of standard setting method.…
Descriptors: Comparative Analysis, Cutting Scores, Effect Size, Meta Analysis
Horgan, Dianne D.; Barnett, Loretta – 1991
Seventy-four college students participated in a peer review assignment. Subjects were asked to write a draft of a three-page paper, distribute copies to three peer reviewers, revise their papers using the resulting feedback from each of the three peer reviewers, and then prepare and submit a final paper. Reviews were scored for the quality and…
Descriptors: College Students, Feedback, Higher Education, Instructional Effectiveness
Thayer, Jerome D. – 1991
Combining student scores to form subtotals and finally a total score to determine a grade is discussed. The composite score reached by combining measures or subtotals is only valid when the scores are combined so that the actual weight of each measure or subtotal in the total score is the same as the intended weight. Three types of variability…
Descriptors: Academic Achievement, Elementary Secondary Education, Grading, Mathematical Models
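The point about intended versus actual weights can be illustrated with a small sketch. It uses z-score standardization, a common remedy that may or may not be the one the author recommends (the abstract is truncated); the function names and the sample scores are invented for illustration.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        raise ValueError("scores have no variability; cannot standardize")
    return [(x - m) / s for x in scores]


def weighted_composite(measures, weights):
    """Combine standardized measures so each carries its intended weight.

    measures: dict mapping measure name -> list of raw scores (same order
              of students in every list)
    weights:  dict mapping measure name -> intended weight
    """
    z = {name: standardize(vals) for name, vals in measures.items()}
    n = len(next(iter(z.values())))
    return [sum(weights[name] * z[name][i] for name in z) for i in range(n)]


# Hypothetical data: homework has a narrow spread, the exam a wide one.
measures = {"homework": [94, 92, 90], "exam": [40, 70, 100]}
weights = {"homework": 0.5, "exam": 0.5}

# Averaging raw scores lets the high-variance exam dominate the ranking.
raw = [0.5 * h + 0.5 * e for h, e in zip(measures["homework"], measures["exam"])]
print(raw)

# After standardizing, the two measures contribute equally, as intended.
print(weighted_composite(measures, weights))
```

In this invented data set the homework and exam rankings are exact opposites. The raw average simply follows the exam, while the standardized composite treats the three students as tied, which is what an equal 50/50 weighting implies.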
Peer reviewed. Reilly, Richard R. – Educational and Psychological Measurement, 1975
Because previous reports have suggested that the lowered validity of tests scored with empirical option weights might be explained by a capitalization of the keying procedures on omitting tendencies, a procedure was devised to key options empirically with a "correction-for-guessing" constraint. (Author)
Descriptors: Achievement Tests, Graduate Study, Guessing (Tests), Scoring Formulas
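For readers unfamiliar with the term, the conventional correction-for-guessing (formula) score is sketched below. This is the standard textbook formula, not the empirical option-keying procedure devised in the study, and the function name is hypothetical.

```python
def formula_score(num_right: int, num_wrong: int, options_per_item: int) -> float:
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items count as neither right nor wrong, so they are not
    penalized. This is the textbook formula only; the record above
    describes a different, empirically keyed procedure.
    """
    if options_per_item < 2:
        raise ValueError("items need at least two options")
    return num_right - num_wrong / (options_per_item - 1)


# 30 right, 8 wrong, remaining items omitted, 5 options per item.
print(formula_score(30, 8, 5))  # 30 - 8/4 = 28.0
```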
Peer reviewed. Watkins, Julia M.; Watkins, Dennis A. – Journal of Clinical Psychology, 1975
This study researched the Plenk scoring system more thoroughly to see whether it could be used with older children and whether it could differentiate normal from emotionally disturbed Ss. (Author)
Descriptors: Data Collection, Emotional Disturbances, Handicapped Children, Research Methodology
Follman, John; Panther, Edward – Child Study Journal Monographs, 1974
Examines empirically the efficacy of utilizing Olympic diving and gymnastic scoring systems for grading graduate students' English compositions. Results indicated that such scoring rules do not produce ratings different in reliability or in level from conventional letter grades. (ED)
Descriptors: English Curriculum, Evaluation Methods, Grading, Graduate Students
Peer reviewed. Chapman, Loren J.; Chapman, Jean P. – Journal of Abnormal Psychology, 1975
Schizophrenic and normal subjects' responses to the Stanford-Binet Vocabulary items, the Wechsler Adult Intelligence Scale (WAIS) Vocabulary items, and the WAIS Similarities items were scored by two methods, one relatively strict and the other relatively lenient. (Editor)
Descriptors: Cognitive Measurement, Psychological Studies, Psychopathology, Research Methodology
Increasing Score Reliability with Item-Pattern Scoring: An Empirical Study in Several Score Metrics.
Yen, Wendy M.; Candell, Gregory L. – 1990
Reliabilities are compared for two types of test score data: number correct, and item response patterns. Item-pattern scoring using three-parameter item response theory takes into account how many and which items a student answers correctly. This procedure theoretically results in greater reliability than does number-correct scoring. Empirical…
Descriptors: Elementary Education, Elementary School Students, Item Response Theory, Scores
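The difference between number-correct and item-pattern scoring can be seen in a minimal sketch of three-parameter logistic (3PL) scoring. The item parameters are invented, the optional scaling constant D = 1.7 is omitted, and the grid-search maximum-likelihood estimate is a simplification chosen for brevity, not the estimation procedure used in the study.

```python
import math


def p_correct(theta, a, b, c):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    ll = 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p_correct(theta, a, b, c)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll


def ml_theta(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood ability estimate (illustrative only)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))


# Hypothetical items as (a, b, c): easy, medium, and a hard, highly
# discriminating item.
items = [(1.0, -1.0, 0.2), (1.0, 0.0, 0.2), (2.0, 1.0, 0.2)]

# Both patterns have a number-correct score of 2.
easy_two = ml_theta([1, 1, 0], items)  # missed the hard item
hard_two = ml_theta([0, 1, 1], items)  # missed the easy item
print(easy_two, hard_two)
```

Under these assumed parameters the two examinees receive the same number-correct score but different ability estimates, with the pattern that includes the hard, discriminating item scoring higher. That is the extra information item-pattern scoring draws on.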
Hampton Univ., VA. – 1985
The booklet describes the Vulpé Performance Analysis System (VPAS), a measure of a child's progress in developmental activities which provides a link to instructional programming. In the assessment stage the child's performance is scored according to how much and what type of assistance is required to perform the task. The scale ranges from no…
Descriptors: Behavior Rating Scales, Diagnostic Teaching, Disabilities, Elementary Secondary Education
Fox, Kathleen V. – 1988
A comparison was made between scores and grades of college students taking a development and learning course, using either a modified mastery grading system (MMGS) or a modified norm-referenced grading system (MNRGS). Under the MMGS, students could take each unit exam up to three times to meet minimum or higher criteria levels. Under the MNRGS,…
Descriptors: Comparative Analysis, Grading, Higher Education, Mastery Tests


