Publication Date
| Period | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 45 |
| Since 2017 (last 10 years) | 91 |
| Since 2007 (last 20 years) | 144 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Test Format | 418 |
| Test Reliability | 418 |
| Test Validity | 243 |
| Test Construction | 135 |
| Test Items | 119 |
| Higher Education | 88 |
| Multiple Choice Tests | 68 |
| Foreign Countries | 67 |
| Testing | 65 |
| Test Interpretation | 61 |
| Comparative Analysis | 57 |
Audience
| Audience | Records |
| --- | --- |
| Practitioners | 33 |
| Teachers | 23 |
| Administrators | 18 |
| Researchers | 12 |
| Community | 1 |
| Counselors | 1 |
| Policymakers | 1 |
| Students | 1 |
| Support Staff | 1 |
Location
| Location | Records |
| --- | --- |
| New York | 9 |
| Turkey | 8 |
| California | 7 |
| Canada | 6 |
| Japan | 6 |
| Germany | 4 |
| United Kingdom | 4 |
| Georgia | 3 |
| Israel | 3 |
| France | 2 |
| Indonesia | 2 |
Laws, Policies, & Programs
| Law, Policy, or Program | Records |
| --- | --- |
| Individuals with Disabilities… | 1 |
| Job Training Partnership Act… | 1 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Peer reviewed: Putnam, Lillian R. – Journal of Reading, 1986
Criticizes the Detroit Tests of Learning Aptitude 2 (DTLA-2): (1) scoring criteria for the Story Construction Test are questionable; (2) the Word Fragment Test may not be practically significant; (3) the Picture Book is inconvenient to use without an index or table of contents. One major strength is the provision for combining subtest scores. (SRT)
Descriptors: Aptitude Tests, Intelligence Tests, Learning Processes, Scores
Peer reviewed: Hodson, D. – Research in Science and Technological Education, 1984
Investigated the effect on student performance of changes in question structure and sequence on a GCE O-level multiple-choice chemistry test. One finding noted is that there was virtually no change in test reliability when the number of options per test item was reduced (from five). (JN)
Descriptors: Academic Achievement, Chemistry, Multiple Choice Tests, Science Education
Peer reviewed: Kumar, V. K.; And Others – Measurement and Evaluation in Counseling and Development, 1986
Disguising scale purpose by using an innocuous scale title and filler items had no effect on the reliability and validity of Rotter's Interpersonal Trust Scale. (Author)
Descriptors: College Students, Higher Education, Response Style (Tests), Student Attitudes
Peer reviewed: Weiten, Wayne – Journal of Experimental Education, 1982
A comparison of double as opposed to single multiple-choice questions yielded significant differences in regard to item difficulty, item discrimination, and internal reliability, but not concurrent validity. (Author/PN)
Descriptors: Difficulty Level, Educational Testing, Higher Education, Multiple Choice Tests
Peer reviewed: Kolstad, Rosemarie; And Others – Journal of Dental Education, 1982
Nonrestricted-answer, multiple-choice test items are recommended as a way of including more facts and fewer incorrect answers in test items; they also do not cue successful guessing as restricted multiple-choice items can. Examination construction, scoring, and reliability are discussed. (MSE)
Descriptors: Guessing (Tests), Higher Education, Item Analysis, Multiple Choice Tests
Peer reviewed: Green, Kathy; And Others – Educational and Psychological Measurement, 1982
Achievement test reliability and validity as a function of ability were determined for multiple sections of a large undergraduate French class. Results did not support previous arguments that decreasing the number of options results in a more efficient test for high-level examinees but a less efficient test for low-level examinees. (Author/GK)
Descriptors: Academic Ability, Comparative Analysis, Higher Education, Multiple Choice Tests
Peer reviewed: Hancock, Gregory R.; And Others – Educational and Psychological Measurement, 1993
Two-option multiple-choice vocabulary test items are compared with comparably written true-false test items. Results from a study with 111 high school students suggest that multiple-choice items provide a significantly more reliable measure than the true-false format. (SLD)
Descriptors: Ability, High School Students, High Schools, Objective Tests
Peer reviewed: Greenberg, Karen L. – WPA: Writing Program Administration, 1992
Elaborates on and responds to challenges of direct writing assessment. Speculates on future directions in writing assessment. Suggests that, if writing instructors accept that writing is a multidimensional, situational construct that fluctuates across a wide variety of contexts, then they must also respect the complexity of teaching and testing…
Descriptors: Essay Tests, Higher Education, Multiple Choice Tests, Test Format
Peer reviewed: Dozois, David J. A.; Ahnberg, Jamie L.; Dobson, Keith S. – Psychological Assessment, 1998
Provides psychometric information on the second edition of the Beck Depression Inventory (BDI-II) (A. Beck, R. Steer, and G. Brown, 1996) for internal consistency, factorial validity, and gender differences. Results indicate that the BDI-II is a stronger instrument than its predecessor in terms of factor structure. (SLD)
Descriptors: Depression (Psychology), Factor Analysis, Factor Structure, Psychometrics
Hinton-Bayre, Anton; Geffen, Gina – Psychological Assessment, 2005
The present study examined the comparability of 4 alternate forms of the Digit Symbol Substitution test and the Symbol Digit Modalities (written) test, including the original versions. Male contact-sport athletes (N=112) were assessed on 1 of the 4 forms of each test. Reasonable alternate form comparability was demonstrated through establishing…
Descriptors: Intervals, Test Format, Orthographic Symbols, Drills (Practice)
Crehan, Kevin D.; And Others – 1989
Two issues in the writing of multiple-choice test items were investigated: a comparison of three versus four options; and the use of the inclusive "none of these" option versus a content option. Subjects were 220 introductory psychology students, who were enrolled at a large southwestern university, responding to a final examination in psychology…
Descriptors: College Students, Higher Education, Item Analysis, Multiple Choice Tests
Haladyna, Thomas M.; Downing, Steven M. – 1988
The proposition that the optimal number of options in a multiple choice test item is three was examined. The concept of functional distractor, a plausible wrong answer that is negatively discriminating when total test performance is the criterion, is discussed. Three distinct groups of achievers (high, middle, and low) on a national standardized…
Descriptors: Achievement Tests, Item Analysis, Multiple Choice Tests, Physicians
Karchmer, Michael A.; Allen, Thomas E. – 1984
The final report describes the accomplishments of an 18-month study designed to adapt and standardize the 7th Edition of the Stanford Achievement Test with a national, randomly drawn sample of hearing-impaired students. The following objectives were accomplished: (1) test material and special procedures were developed and disseminated; (2) the…
Descriptors: Achievement Tests, Elementary Secondary Education, Hearing Impairments, Test Construction
Rosso, Martin A.; Reckase, Mark D. – 1981
The overall purpose of this research was to compare a maximum likelihood based tailored testing procedure to a Bayesian tailored testing procedure. The results indicated that both tailored testing procedures produced equally reliable ability estimates. Also, an analysis of test length indicated that reasonable ability estimates could be obtained…
Descriptors: Adaptive Testing, Bayesian Statistics, Comparative Analysis, Computer Assisted Testing
Peer reviewed: Wergin, Jon F. – New Directions for Teaching and Learning, 1988
Each of the many methods or approaches to assessing student learning is based on a clear understanding of validity, reliability, course objectives, the advantages and disadvantages of different evaluation formats, and the ways assessment data will be used to improve instruction. (Author/MSE)
Descriptors: College Instruction, Educational Objectives, Evaluation Methods, Higher Education