Hakelind, Camilla; Sundström, Anna E. – Psychology Learning and Teaching, 2022
Finding valid and reliable ways to assess complex clinical skills within psychology is a challenge. Recently, there have been some examples of applying Objective Structured Clinical Examinations (OSCEs) in psychology for making such assessments. The aim of this study was to examine students' and examiners' perceptions of a digital OSCE in…
Descriptors: Graduate Students, Masters Programs, Clinical Psychology, Student Evaluation
Kelly, William E.; Daughtry, Don – College Student Journal, 2018
This study developed an abbreviated form of Barron's (1953) Ego Strength Scale for use in research among college student samples. A version of Barron's scale was administered to 100 undergraduate college students. Using item-total score correlations and internal consistency, the scale was reduced to 18 items (Es18). The Es18 possessed adequate…
Descriptors: Undergraduate Students, Self Concept Measures, Test Length, Scores
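The scale-shortening procedure this abstract describes (retaining items by item-total correlation while checking internal consistency) can be sketched with standard psychometric formulas. The simulated data, 0.2 cutoff, and item counts below are illustrative assumptions, not values from the study.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def item_total_correlations(items: np.ndarray) -> np.ndarray:
    """Corrected item-total correlation: each item vs. the sum of the rest."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated responses: 100 examinees, 30 dichotomous items driven by one latent trait.
rng = np.random.default_rng(0)
theta = rng.normal(size=(100, 1))
data = (theta + rng.normal(size=(100, 30)) > 0).astype(float)

r_it = item_total_correlations(data)
keep = r_it > 0.2                      # illustrative retention cutoff
short_form = data[:, keep]
print(round(cronbach_alpha(data), 3), int(keep.sum()))
```

The same two statistics drive most short-form construction: drop the items with the weakest item-total correlations, then confirm alpha has not fallen below an acceptable level.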
McKenna, Peter – Interactive Technology and Smart Education, 2019
Purpose: This paper aims to examine whether multiple choice questions (MCQs) can be answered correctly without knowing the answer and whether constructed response questions (CRQs) offer more reliable assessment. Design/methodology/approach: The paper presents a critical review of existing research on MCQs, then reports on an experimental study…
Descriptors: Multiple Choice Tests, Accuracy, Test Wiseness, Objective Tests
Pugh, Debra; Hamstra, Stanley J.; Wood, Timothy J.; Humphrey-Murto, Susan; Touchie, Claire; Yudkowsky, Rachel; Bordage, Georges – Advances in Health Sciences Education, 2015
Internists are required to perform a number of procedures that demand mastery of technical and non-technical skills; however, formal assessment of these skills is often lacking. The purpose of this study was to develop, implement, and gather validity evidence for a procedural skills objective structured clinical examination (PS-OSCE) for internal…
Descriptors: Graduate Students, Medical Students, Internal Medicine, Skills
Daniels, Vijay J.; Bordage, Georges; Gierl, Mark J.; Yudkowsky, Rachel – Advances in Health Sciences Education, 2014
Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving…
Descriptors: Graduate Medical Education, Check Lists, Scores, Internal Medicine
Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
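The abstract's contrast is between the standard error of measurement, which yields confidence limits for individual scores, and reliability coefficients, which depend on the spread of examinee attainment. A minimal sketch of the standard relationship SEM = SD × √(1 − reliability); the SD, reliability, and observed score below are illustrative numbers, not values from the paper.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement from score SD and a reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

sd, rel = 10.0, 0.84
s = sem(sd, rel)                          # 10 * sqrt(0.16) ≈ 4.0
observed = 65.0
lo, hi = observed - 1.96 * s, observed + 1.96 * s   # ~95% band for the true score
print(round(s, 2), round(lo, 2), round(hi, 2))
```

Note how the band's width depends only on SD and reliability, not on where the examinee sits in the distribution, which is exactly why the SEM is useful for score interpretation even when reliability coefficients are hard to compare across tests.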

Allison, Donald E. – Alberta Journal of Educational Research, 1984
Reports that no significant difference in reliability appeared between a heterogeneous and a homogeneous form of the same general science matching-item test administered to 316 sixth-grade students, but that scores on the heterogeneous form of the test were higher, independent of the examinee's sex or intelligence. (SB)
Descriptors: Comparative Analysis, Comparative Testing, Elementary Education, Grade 6

Anderson, Paul S.; And Others – Illinois School Research and Development, 1985
Concludes that the Multi-Digit Test stimulates better retention than multiple choice tests while offering the advantage of computerized scoring and analysis. (FL)
Descriptors: Comparative Analysis, Computer Assisted Testing, Educational Research, Higher Education

Harasym, P. H.; And Others – Evaluation and the Health Professions, 1980
Coded items, as opposed to free-response items, in a multiple-choice physiology test had a cueing effect that raised students' scores, especially for lower achievers. Reliability of the coded items was also lower. Item format and scoring method had an effect on test results. (GDC)
Descriptors: Achievement Tests, Comparative Testing, Cues, Higher Education
Roudabush, Glenn E. – 1974
In this paper, several models for the psychometric nature of criterion-referenced tests are presented and results derived with implications for test construction, reliability and validity measures, and educational decision making. Both dichotomous and continuous underlying abilities to perform are considered. Illustrative data fitting both cases…
Descriptors: Criterion Referenced Tests, Decision Making, Evaluation Methods, Measurement Techniques
Hendrickson, Gerry F. – 1971
The purpose of this study was to determine whether option weighting improved the internal consistency and intercorrelation of the subtests. The differential option-weighting scheme employed in this study is based on one devised by Guttman. The tests were first scored with Guttman-type weights and then with conventional correction-for-guessing…
Descriptors: Answer Keys, Correlation, Factor Structure, Guessing (Tests)
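The conventional correction-for-guessing scoring this abstract contrasts with Guttman-type option weights can be sketched as follows; the item counts and option number are illustrative, not from the study.

```python
def corrected_score(rights: int, wrongs: int, options: int) -> float:
    """Correction for guessing: on a k-option item, blind guessing yields one
    right per (k - 1) wrongs in expectation, so that many wrongs cancel a right."""
    return rights - wrongs / (options - 1)

# Illustrative: 30 right, 8 wrong on 5-option items.
print(corrected_score(30, 8, 5))
```

Differential option weighting goes further than this all-or-nothing adjustment by assigning partial credit to distractors according to how much they signal about examinee ability.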