Showing all 11 results
Peer reviewed
Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
Peer reviewed
Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017
Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment that remains in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education focused, specifically, on the development, analysis, and use of the incorrect options, which are also…
Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns
Brinzer, Raymond J. – 1979
The problem engendered by the Matching Familiar Figures (MFF) Test is one of instrument integrity (II). II is delimited by validity, reliability, and utility of MFF as a measure of the reflective-impulsive construct. Validity, reliability and utility of construct assessment may be improved by utilizing: (1) a prototypic scoring model that will…
Descriptors: Conceptual Tempo, Difficulty Level, Item Analysis, Research Methodology
Peer reviewed
Dorans, Neil J. – Journal of Educational Measurement, 1986
The analytical decomposition demonstrates how the effects of item characteristics, test properties, individual examinee responses, and rounding rules combine to produce the item deletion effect on the equating/scaling function and candidate scores. The empirical portion of the report illustrates the effects of item deletion on reported score…
Descriptors: Difficulty Level, Equated Scores, Item Analysis, Latent Trait Theory
Jaeger, Richard M. – 1980
Five statistical indices are developed and described which may be used for determining (1) when linear equating of two approximately parallel tests is adequate, and (2) when a more complex method such as equipercentile equating must be used. The indices were based on: (1) similarity of cumulative score distributions; (2) shape of the raw-score to…
Descriptors: College Entrance Examinations, Difficulty Level, Equated Scores, Higher Education
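For readers unfamiliar with the term, linear equating as mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard mean-and-standard-deviation form of linear equating, not the indices developed in the report; the function name `linear_equate` and the sample scores are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(x, form_x_scores, form_y_scores):
    """Map a raw score x on form X onto the scale of form Y.

    Standard linear equating: scores with the same z-score on the
    two forms are treated as equivalent.
    """
    mu_x, mu_y = mean(form_x_scores), mean(form_y_scores)
    sd_x, sd_y = pstdev(form_x_scores), pstdev(form_y_scores)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical score samples for two approximately parallel forms.
form_x = [10, 20, 30]
form_y = [15, 25, 35]
equated = linear_equate(30, form_x, form_y)
```

Equipercentile equating, the more complex alternative named in the abstract, instead matches percentile ranks across the two score distributions and is not shown here.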
Peer reviewed
Lord, Frederic M. – Educational and Psychological Measurement, 1971
Descriptors: Ability, Adaptive Testing, Computer Oriented Programs, Difficulty Level
Donlon, Thomas F.; Fitzpatrick, Anne R. – 1978
On the basis of past research efforts to improve multiple-choice test information through differential weighting of responses to wrong answers (distractors), two statistical indices are developed. Each describes the properties of response distributions across the options of an item. Jaspen's polyserial generalization of the biserial correlation…
Descriptors: Confidence Testing, Difficulty Level, Guessing (Tests), High Schools
Legg, Sue M. – 1982
A case study of the Florida Teacher Certification Examination (FTCE) program was described to assist others launching the development of large scale item banks. FTCE has four subtests: Mathematics, Reading, Writing, and Professional Education. Rasch calibrated item banks have been developed for all subtests except Writing. The methods used to…
Descriptors: Cutting Scores, Difficulty Level, Field Tests, Item Analysis
Rippey, Robert M. – 1971
Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)
Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level
Kingston, Neal M. – 1985
Birnbaum's three-parameter logistic item response model was used to study guessing behavior of low ability examinees on the Graduate Record Examinations (GRE) General Test, Verbal Measure. GRE scoring procedures had recently changed, from a scoring formula which corrected for guessing, to number-right scoring. The three-parameter theory was used…
Descriptors: Academic Aptitude, Analysis of Variance, College Entrance Examinations, Difficulty Level
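As background for the abstract above, the sketch below shows the usual textbook form of the three-parameter logistic item response function alongside the two scoring rules it contrasts. It is an illustration under stated assumptions, not the study's procedure: the abstract does not give the exact GRE scoring formula, so the conventional correction for guessing (rights minus wrongs divided by options minus one) is assumed, the optional 1.7 scaling constant is omitted, and all function names are hypothetical.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing parameter).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def formula_score(right, wrong, n_options):
    """Conventional correction for guessing (assumed form): R - W / (k - 1)."""
    return right - wrong / (n_options - 1)


def number_right_score(right):
    """Number-right scoring: wrong and omitted answers carry no penalty."""
    return right


# An examinee whose ability equals the item difficulty, with c = 0.2.
p = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)

# 30 right and 8 wrong on five-option items.
corrected = formula_score(right=30, wrong=8, n_options=5)
```

Under number-right scoring a wrong answer costs nothing, so guessing is never penalized, whereas formula scoring makes blind guessing neutral in expectation. That difference is why a change in scoring rule could plausibly alter the guessing behavior of low-ability examinees.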
Smith, Richard M.; Mitchell, Virginia P. – 1979
To improve the accuracy of college placement, Rasch scoring and person-fit statistics on the Comparative Guidance and Placement test (CGP) were compared to the traditional right-only scoring. Correlations were calculated between English and mathematics course grades and scores of 1,448 entering freshmen on the reading, writing, and mathematics…
Descriptors: Academic Ability, Computer Programs, Difficulty Level, Goodness of Fit