Showing all 10 results
Peer reviewed
Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
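As a point of reference for what competing MTF scoring algorithms look like, here is a minimal Python sketch contrasting a generic all-or-nothing rule with per-statement partial credit. The rule names and the four-statement item are illustrative assumptions, not the specific algorithms evaluated in the study.

```python
def score_all_or_nothing(responses, key):
    """Award 1 point only if every true/false judgment matches the key."""
    return 1.0 if responses == key else 0.0

def score_partial_credit(responses, key):
    """Award the fraction of true/false judgments that match the key."""
    return sum(r == k for r, k in zip(responses, key)) / len(key)

# A four-statement MTF item keyed True, False, False, True
key = [True, False, False, True]
answer = [True, False, True, True]            # one judgment wrong
print(score_all_or_nothing(answer, key))      # 0.0
print(score_partial_credit(answer, key))      # 0.75
```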
Peer reviewed
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
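For readers unfamiliar with nonequivalent groups equating, the sketch below chains two mean-sigma linear transformations through a common anchor test. The scores are invented and the method is a generic simplification; the study's designs differ in details such as how the anchor mixes MC and CR items.

```python
import statistics as st

def linear_map(mean_from, sd_from, mean_to, sd_to):
    """Mean-sigma linear transformation from one score scale to another."""
    return lambda x: mean_to + (sd_to / sd_from) * (x - mean_from)

# Group 1 took form X plus anchor V; group 2 took form Y plus the same anchor.
x1, v1 = [35, 42, 50, 28, 44], [18, 22, 26, 15, 23]   # invented scores
y2, v2 = [38, 45, 52, 30, 47], [17, 21, 27, 14, 24]

x_to_v = linear_map(st.mean(x1), st.stdev(x1), st.mean(v1), st.stdev(v1))
v_to_y = linear_map(st.mean(v2), st.stdev(v2), st.mean(y2), st.stdev(y2))

print(round(v_to_y(x_to_v(40)), 1))   # a form-X score of 40 on the form-Y scale
```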
Peer reviewed
Crino, Michael D.; And Others – Educational and Psychological Measurement, 1985
The randomized response technique was compared to a direct questionnaire, administered to college students, to investigate whether responses predicted the social desirability of the item. Results suggest support for the hypothesis. The 33-item version of the Marlowe-Crowne Social Desirability Scale that was used is included. (GDC)
Descriptors: Comparative Testing, Confidentiality, Higher Education, Item Analysis
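The randomized response technique lets respondents answer a sensitive item without revealing their individual answer. Below is a minimal sketch of Warner's classic estimator, assuming each respondent answers the sensitive statement with probability p and its negation otherwise; the parameter values are illustrative, and the study's exact design may differ.

```python
def warner_estimate(prop_yes, p):
    """Estimate prevalence pi of a sensitive trait under Warner's design.

    P(yes) = p*pi + (1 - p)*(1 - pi), so pi = (P(yes) + p - 1) / (2p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 leaves pi unidentifiable")
    return (prop_yes + p - 1) / (2 * p - 1)

# 46% "yes" responses with a 0.7 chance of getting the sensitive statement
print(warner_estimate(prop_yes=0.46, p=0.7))   # ~0.40
```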
Peer reviewed
Gu, Lixiong; Drake, Samuel; Wolfe, Edward W. – Journal of Technology, Learning, and Assessment, 2006
This study seeks to determine whether item features are related to differential item functioning (DIF) between computer- and paper-based test delivery media. Examinees responded to 60 quantitative items similar to those found on the GRE general test in either a computer-based or paper-based medium. Thirty-eight percent of the items were…
Descriptors: Test Bias, Test Items, Educational Testing, Student Evaluation
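The abstract does not show which DIF statistic the authors used, so the sketch below illustrates one standard option: the Mantel-Haenszel common odds ratio computed across matched score strata, with the two delivery media playing the roles of reference and focal groups. The counts are invented.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across score strata.

    Each stratum is (a, b, c, d):
      a = paper group correct,     b = paper group incorrect,
      c = computer group correct,  d = computer group incorrect.
    Values near 1 suggest no DIF between delivery media.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [(30, 20, 25, 25), (45, 10, 40, 15), (50, 5, 48, 7)]  # invented counts
print(round(mh_odds_ratio(strata), 2))
```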
Peer reviewed
Kinicki, Angelo J.; And Others – Educational and Psychological Measurement, 1985
Using both the Behaviorally Anchored Rating Scales (BARS) and the Purdue University Scales, 727 undergraduates rated 32 instructors. The BARS had less halo effect, more leniency error, and lower interrater reliability. Both formats were valid, and the two did not differ in ratee discrimination or susceptibility to rating bias. (Author/GDC)
Descriptors: Behavior Rating Scales, College Faculty, Comparative Testing, Higher Education
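The abstract reports leniency error and interrater reliability without defining them; two simple operationalizations are sketched below, assuming ratings on a numeric scale. These are generic indices, not necessarily the ones computed in the study (halo, which typically requires per-dimension ratings, is omitted).

```python
import statistics as st

def leniency_error(ratings, scale_midpoint):
    """Leniency: mean elevation of ratings above the scale midpoint."""
    return st.mean(ratings) - scale_midpoint

def interrater_reliability(rater_a, rater_b):
    """A simple index: Pearson correlation between two raters' ratings."""
    return st.correlation(rater_a, rater_b)

a = [5, 6, 4, 7, 6]   # invented ratings of five instructors by rater A
b = [4, 6, 5, 7, 5]   # the same instructors rated by rater B
print(leniency_error(a, scale_midpoint=4))         # 1.6
print(round(interrater_reliability(a, b), 2))
```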
Peer reviewed
Johnson, Martin; Green, Sylvia – Journal of Technology, Learning, and Assessment, 2006
The transition from paper-based to computer-based assessment raises a number of important issues about how mode might affect children's performance and question answering strategies. In this project 104 eleven-year-olds were given two sets of matched mathematics questions, one set on-line and the other on paper. Facility values were analyzed to…
Descriptors: Student Attitudes, Computer Assisted Testing, Program Effectiveness, Elementary School Students
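Facility values, the statistic analyzed here, are simply the proportion of examinees answering an item correctly; comparing them across modes gives a quick per-item difficulty check. A minimal sketch with invented 0/1 response vectors:

```python
def facility_value(responses):
    """Proportion of examinees answering the item correctly (0/1 scored)."""
    return sum(responses) / len(responses)

paper  = [1, 1, 0, 1, 0, 1, 1, 0]   # invented scores for one item, paper mode
online = [1, 0, 0, 1, 0, 1, 0, 0]   # the same item on-line

print(facility_value(paper) - facility_value(online))   # mode difference: 0.25
```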
Peer reviewed
Bolger, Niall; Kellaghan, Thomas – Journal of Educational Measurement, 1990
Gender differences in scholastic achievement as a function of measurement method were examined by comparing performance of 739 15-year-old boys and 758 15-year-old girls in Irish high schools on multiple-choice and free-response tests of mathematics, Irish, and English achievement. Method-based gender differences are discussed. (SLD)
Descriptors: Academic Achievement, Adolescents, Comparative Testing, English
Owen, K. – 1989
Sources of item bias located in characteristics of the test item were studied in a reasoning test developed in South Africa. Subjects were 1,056 White, 1,063 Indian, and 1,093 Black students from standard 7 in Afrikaans and English schools. Format and content of the 85-item Reasoning Test were manipulated to obtain information about bias or…
Descriptors: Afrikaans, Black Students, Cognitive Tests, Comparative Testing
Macpherson, Colin R.; Rowley, Glenn L. – 1986
Teacher-made mastery tests were administered to a classroom-sized sample to study their decision consistency. Decision consistency of criterion-referenced tests is usually defined as the proportion of examinees who are classified in the same way after two test administrations. Single-administration estimates of decision consistency were…
Descriptors: Classroom Research, Comparative Testing, Criterion Referenced Tests, Cutting Scores
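Decision consistency, as defined in the abstract, compares mastery classifications across two administrations; the study itself evaluates single-administration approximations of this quantity. A sketch of the two-administration definition, with invented scores and an arbitrary cut score:

```python
def decision_consistency(first, second, cut):
    """Two-administration decision consistency for a mastery cut score.

    Returns (p0, kappa): p0 is the proportion classified the same way on
    both administrations; kappa corrects p0 for chance agreement.
    """
    m1 = [s >= cut for s in first]
    m2 = [s >= cut for s in second]
    n = len(m1)
    p0 = sum(a == b for a, b in zip(m1, m2)) / n
    q1, q2 = sum(m1) / n, sum(m2) / n
    pc = q1 * q2 + (1 - q1) * (1 - q2)
    return p0, (p0 - pc) / (1 - pc)

scores1 = [12, 18, 15, 9, 20, 14]    # invented scores, administration 1
scores2 = [13, 17, 13, 10, 19, 16]   # administration 2
print(decision_consistency(scores1, scores2, cut=14))   # (0.833..., 0.667...)
```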
Peer reviewed
Horkay, Nancy; Bennett, Randy Elliott; Allen, Nancy; Kaplan, Bruce; Yan, Fred – Journal of Technology, Learning, and Assessment, 2006
This study investigated the comparability of scores for paper and computer versions of a writing test administered to eighth grade students. Two essay prompts were given on paper to a nationally representative sample as part of the 2002 main NAEP writing assessment. The same two essay prompts were subsequently administered on computer to a second…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Program Effectiveness