Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 1 |
| Since 2017 (last 10 years) | 11 |
| Since 2007 (last 20 years) | 35 |
Descriptor
| Comparative Analysis | 72 |
| Multiple Choice Tests | 72 |
| Test Reliability | 52 |
| Test Validity | 25 |
| Test Items | 21 |
| Foreign Countries | 18 |
| Test Construction | 18 |
| Test Format | 16 |
| Higher Education | 15 |
| Statistical Analysis | 15 |
| Reliability | 13 |
| More ▼ | |
Source
Author
| Ebel, Robert L. | 3 |
| Frisbie, David A. | 3 |
| Bauer, Daniel | 2 |
| Fischer, Martin R. | 2 |
| Hakstian, A. Ralph | 2 |
| Hambleton, Ronald K. | 2 |
| Kansup, Wanlop | 2 |
| Oosterhof, Albert C. | 2 |
| Alemi, Minoo | 1 |
| Alsma, Jelmer | 1 |
| Alvaro, Rosaria | 1 |
| More ▼ | |
Publication Type
| Reports - Research | 49 |
| Journal Articles | 42 |
| Speeches/Meeting Papers | 10 |
| Reports - Evaluative | 7 |
| Reports - Descriptive | 3 |
| Tests/Questionnaires | 3 |
| Dissertations/Theses -… | 1 |
| Non-Print Media | 1 |
| Reference Materials - General | 1 |
Education Level
| Higher Education | 15 |
| Postsecondary Education | 12 |
| Secondary Education | 8 |
| High Schools | 5 |
| Elementary Education | 3 |
| Elementary Secondary Education | 2 |
| Grade 8 | 2 |
| Junior High Schools | 2 |
| Middle Schools | 2 |
| Grade 6 | 1 |
| Grade 9 | 1 |
| More ▼ | |
Audience
| Researchers | 2 |
Laws, Policies, & Programs
Assessments and Surveys
| Embedded Figures Test | 1 |
| National Assessment of… | 1 |
| Personality Research Form | 1 |
| SAT (College Admission Test) | 1 |
| Work Keys (ACT) | 1 |
What Works Clearinghouse Rating
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Deribo, Tobias; Goldhammer, Frank; Kroehne, Ulf – Educational and Psychological Measurement, 2023
As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a…
Descriptors: Reaction Time, Guessing (Tests), Behavior Patterns, Bias
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Galeoto, Giovanni; D'Elpidio, Giuliana; Alvaro, Rosaria; Zicari, Anna Maria; Valente, Donatella; Riccio, Marianna – International Association for Development of the Information Society, 2021
The Italian Disciplinary section of Test of Competences (TECO-D) project is an important longitudinal study used to analyze learning outcomes of ungraded students and to measure quality of the educational process. The aim of the present study was to evaluate the psychometric properties of the TECO-D in students enrolled in the Bachelor's Degree in…
Descriptors: Case Studies, Nursing Education, Psychometrics, Longitudinal Studies
Türkoguz, Suat – Anatolian Journal of Education, 2020
This study aimed to investigate the item "Response Time Fidelity scores" ("RTFs"), "KuderRichardson Reliability" ("KR[subscript 20]") and "Cronbach's Alpha Reliability" ("alpha") coefficients, calculate "KR[subscript 20]" coefficients with "RTFs" for 30 threshold…
Descriptors: Comparative Analysis, Reaction Time, Multiple Choice Tests, Scores
Haberman, Shelby J.; Liu, Yang; Lee, Yi-Hsuan – ETS Research Report Series, 2019
Distractor analyses are routinely conducted in educational assessments with multiple-choice items. In this research report, we focus on three item response models for distractors: (a) the traditional nominal response (NR) model, (b) a combination of a two-parameter logistic model for item scores and a NR model for selections of incorrect…
Descriptors: Multiple Choice Tests, Scores, Test Reliability, High Stakes Tests
Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
Fauville, Géraldine; Strang, Craig; Cannady, Matthew A.; Chen, Ying-Fang – Environmental Education Research, 2019
The Ocean Literacy movement began in the U.S. in the early 2000s, and has recently become an international effort. The focus on marine environmental issues and marine education is increasing, and yet it has been difficult to show progress of the ocean literacy movement, in part, because no widely adopted measurement tool exists. The International…
Descriptors: Marine Education, Environmental Education, Comparative Analysis, Factor Structure
Bush, Martin – Assessment & Evaluation in Higher Education, 2015
The humble multiple-choice test is very widely used within education at all levels, but its susceptibility to guesswork makes it a suboptimal assessment tool. The reliability of a multiple-choice test is partly governed by the number of items it contains; however, longer tests are more time consuming to take, and for some subject areas, it can be…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Format, Test Reliability
Severo, Milton; Gaio, A. Rita; Povo, Ana; Silva-Pereira, Fernanda; Ferreira, Maria Amélia – Anatomical Sciences Education, 2015
In theory the formula scoring methods increase the reliability of multiple-choice tests in comparison with number-right scoring. This study aimed to evaluate the impact of the formula scoring method in clinical anatomy multiple-choice examinations, and to compare it with that from the number-right scoring method, hoping to achieve an…
Descriptors: Anatomy, Multiple Choice Tests, Scoring, Decision Making
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Kural, Faruk – Journal of Language and Linguistic Studies, 2018
The present paper, which is a study based on midterm exam results of 53 University English prep-school students, examines correlation between a direct writing test, measured holistically by multiple-trait scoring, and two indirect writing tests used in a competence exam, one of which is a multiple-choice cloze test and the other a rewrite test…
Descriptors: Writing Evaluation, Cloze Procedure, Comparative Analysis, Essays
Davis, Doris Bitler – Teaching of Psychology, 2017
Providing two or more versions of multiple-choice exams has long been a popular strategy for reducing the opportunity for students to engage in academic dishonesty. While the results of studies comparing exam scores under different question-order conditions have been inconclusive, the potential importance of contextual cues to aid student recall…
Descriptors: Test Construction, Multiple Choice Tests, Sequential Approach, Cues
Hauser, Peter C.; Paludneviciene, Raylene; Riddle, Wanda; Kurz, Kim B.; Emmorey, Karen; Contreras, Jessica – Journal of Deaf Studies and Deaf Education, 2016
The American Sign Language Comprehension Test (ASL-CT) is a 30-item multiple-choice test that measures ASL receptive skills and is administered through a website. This article describes the development and psychometric properties of the test based on a sample of 80 college students including deaf native signers, hearing native signers, deaf…
Descriptors: American Sign Language, Comprehension, Multiple Choice Tests, Receptive Language
Wilcox, Bethany R.; Pollock, Steven J. – Physical Review Special Topics - Physics Education Research, 2014
Free-response research-based assessments, like the Colorado Upper-division Electrostatics Diagnostic (CUE), provide rich, fine-grained information about students' reasoning. However, because of the difficulties inherent in scoring these assessments, the majority of the large-scale conceptual assessments in physics are multiple choice. To increase…
Descriptors: Science Tests, Science Process Skills, Student Evaluation, Physics

Peer reviewed
Direct link
