Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients
Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022
The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…
Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory
Rainey, Katherine D.; Vignal, Michael; Wilcox, Bethany R. – Physical Review Physics Education Research, 2022
Currently there are no assessment instruments available for upper-division thermal physics, though several introductory assessments are available. Notably missing from these introductory assessments are items targeting statistical mechanics. This leaves a gap in the content that can be assessed by upper-division thermal physics faculty.…
Descriptors: Physics, Science Instruction, Thermodynamics, College Science
Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014
Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…
Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy
Mislevy, Robert J. – 1988
Large-scale educational assessments differ from familiar educational measurements by attempting to provide information about the levels and natures of skills in populations rather than in individuals. That the distinct purposes of assessment require different methodologies than individual measurement was recognized by the development of…
Descriptors: Educational Assessment, Evaluation Methods, Item Analysis, Latent Trait Theory
Fenna, Doug S. – European Journal of Engineering Education, 2004
Multiple-choice testing (MCT) has several advantages which are becoming more relevant in the current financial climate. In particular, multiple-choice tests (MCTs) can be machine marked. As an objective testing method, MCT is particularly relevant to engineering and other factual courses, but MCTs are not widely used in engineering because students can benefit from…
Descriptors: Guessing (Tests), Testing, Multiple Choice Tests, Engineering Education

Feldt, Leonard S.; Spray, Judith A. – Research Quarterly for Exercise and Sport, 1983
The reliabilities of two types of measurement plans were compared across six hypothetical distributions of true scores or abilities. The measurement plans were: (1) fixed-length, where the number of trials for all examinees is set in advance; and (2) trials-to-criterion, where examinees must keep trying until they complete a given number of trials…
Descriptors: Criterion Referenced Tests, Evaluation Methods, Higher Education, Measurement Techniques

Bhaskar, R.; Dillard, Jesse F. – Instructional Science, 1983
Description of an objective method for assigning weights to questions on examinations includes discussions of classical test theory, knowledge organization, and how task analysis can be used to identify knowledge elements required to solve specific problems, rank them, and assign objective weights to exam questions using a Pareto distribution (7…
Descriptors: Accounting, Epistemology, Evaluation Methods, Item Analysis

Hills, John R.; And Others – Journal of Educational Measurement, 1988
Five methods of equating minimum-competency tests were compared using the Florida Statewide Student Assessment Test, Part II, for 1984 and 1986. Four of five methods yielded essentially comparable results for the highest scoring 84% of the students. Different lengths of anchor items were compared, using the concurrent item response theory equating…
Descriptors: Comparative Analysis, Equated Scores, Evaluation Methods, Graduation Requirements
Cohen, Allan S., Comp. – 1979
This partially annotated bibliography of journal articles, dissertations, convention papers, research reports, and a few books and unpublished manuscripts provides comprehensive coverage of work on latent trait theory and practice. Documents are arranged alphabetically by author. The period covered ranges from the early 1950s to the present.…
Descriptors: Attitude Measures, Career Development, Computer Assisted Testing, Computer Programs