Publication Date
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 45 |
| Since 2017 (last 10 years) | 91 |
| Since 2007 (last 20 years) | 144 |
Descriptor
| Test Format | 418 |
| Test Reliability | 418 |
| Test Validity | 243 |
| Test Construction | 135 |
| Test Items | 119 |
| Higher Education | 88 |
| Multiple Choice Tests | 68 |
| Foreign Countries | 67 |
| Testing | 65 |
| Test Interpretation | 61 |
| Comparative Analysis | 57 |
Audience
| Practitioners | 33 |
| Teachers | 23 |
| Administrators | 18 |
| Researchers | 12 |
| Community | 1 |
| Counselors | 1 |
| Policymakers | 1 |
| Students | 1 |
| Support Staff | 1 |
Location
| New York | 9 |
| Turkey | 8 |
| California | 7 |
| Canada | 6 |
| Japan | 6 |
| Germany | 4 |
| United Kingdom | 4 |
| Georgia | 3 |
| Israel | 3 |
| France | 2 |
| Indonesia | 2 |
Laws, Policies, & Programs
| Individuals with Disabilities… | 1 |
| Job Training Partnership Act… | 1 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Peer reviewed: Simon, Alan J.; Joiner, Lee M. – Journal of Educational Measurement, 1976
The purpose of this study was to determine whether a Mexican version of the Peabody Picture Vocabulary Test could be improved by directly translating both forms of the American test, then using decision procedures to select the better item of each pair. The reliability of the simple translations suffered. (Author/BW)
Descriptors: Early Childhood Education, Spanish, Test Construction, Test Format
Peer reviewed: Weber, Ronald L. – Journal of Learning Disabilities, 1982
Three measures often used with handicapped children (the Berry-Talbott Comprehension of Grammar, the Grammatic Closure subtest of the Illinois Test of Psycholinguistic Abilities, and the Grammatic Completion subtest of the Test of Language Development) are discussed in terms of test reliability, scoring procedures, format, and types of scores.…
Descriptors: Disabilities, Language Tests, Morphology (Languages), Nonstandard Dialects
Peer reviewed: Sandoval, Jonathan – Journal of Abnormal Child Psychology, 1981
The object of the study was to investigate the effect of differences in format on the precision of teacher ratings and thus on the reliability and validity of two teacher rating scales of children's hyperactive behavior. Attributes assessed were motor restlessness, inattentiveness, impulsivity, and aggressiveness/emotional stability. (Author/DB)
Descriptors: Behavior Rating Scales, Elementary Secondary Education, Hyperactivity, Test Format
Peer reviewed: Schretlen, David; And Others – Psychological Assessment, 1994
Composite reliability and standard errors of measurement were computed for prorated Verbal, Performance, and Full-Scale intelligence quotient (IQ) scores from a seven-subtest short form of the Wechsler Adult Intelligence Scale-Revised. Results with 1,880 adults (standardization sample) indicate that this form is as reliable as the complete test.…
Descriptors: Adults, Error of Measurement, Intelligence, Intelligence Quotient
Peer reviewed: Austin, Joe Dan – Psychometrika, 1981
On distractor-identification tests, students mark as many distractors as possible on each test item. A grading scale is developed for this type of testing. The score is optimal in that it yields an unbiased estimate of the student's score as if no guessing had occurred. (Author/JKS)
Descriptors: Guessing (Tests), Item Analysis, Measurement Techniques, Scoring Formulas
Murphy, Meg – School Shop, 1981
Suggests three techniques for assuring the content validity of classroom/shop tests: build a bank of content-valid test items; develop valid tests based on a carefully prepared table of specifications; and check the validity of tests already developed. A self-test is included for the reader. (CT)
Descriptors: Item Banks, Test Construction, Test Format, Test Reliability
Peer reviewed: Gilley, William F.; And Others – Psychology: A Journal of Human Behavior, 1988
Administered Peabody Mathematics Readiness Test to 325 students in kindergarten through second grade to investigate selected psychometric characteristics of the test. Found low item-to-item correlations; results did not support factor structure suggested by test's authors or proposed hierarchical structure. (Author/NB)
Descriptors: Factor Structure, Learning Readiness, Mathematics, Primary Education
Peer reviewed: Qualls, Audrey L. – Applied Measurement in Education, 1995
Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)
Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format
Peer reviewed: Jamison, Christine; Scogin, Forrest – International Journal of Aging and Human Development, 1992
Developed interview-based Geriatric Depression Rating Scale (GDRS) and administered 35-item GDRS to 68 older adults with range of affective disturbance. Found scale to have internal consistency and split-half reliability comparable to those of Hamilton Rating Scale for Depression and Geriatric Depression Scale. Concurrent validity, construct…
Descriptors: Depression (Psychology), Geriatrics, Interviews, Older Adults
Peer reviewed: Kapes, Jerome T.; Vansickle, Timothy R. – Measurement and Evaluation in Counseling and Development, 1992
Examined equivalence of mode of administration of the Career Decision-Making System, comparing paper-and-pencil version and computer-based version. Findings from 61 undergraduate students indicated that the computer-based version was significantly more reliable than paper-and-pencil version and was generally equivalent in other respects.…
Descriptors: Comparative Testing, Computer Assisted Testing, Higher Education, Test Format
Singelis, Theodore M.; Yamada, Ann Marie; Barrio, Concepcion; Laney, Joshua Harrison; Her, Pa; Ruiz-Anaya, Alejandrina; Lennertz, Sara Terwilliger – Hispanic Journal of Behavioral Sciences, 2006
The metric equivalence of translated scales is often in question but seldom examined. This study presents test-retest data that support the metric equivalence of the Spanish and English language versions of three measures: the Bidimensional Acculturation Scale, the Satisfaction with Life Scale, and the Self-Construal Scale. Participants were…
Descriptors: Acculturation, Life Satisfaction, English, Test Format
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Peer reviewed: Tollefson, Nona – Educational and Psychological Measurement, 1987
This study compared the item difficulty, item discrimination, and test reliability of three forms of multiple-choice items: (1) one correct answer; (2) "none of the above" as a foil; and (3) "none of the above" as the correct answer. Twelve items in the three formats were administered in a college statistics examination. (BS)
Descriptors: Difficulty Level, Higher Education, Item Analysis, Multiple Choice Tests
Peer reviewed: Henk, William A. – Journal of Reading Behavior, 1981
Analyzes alternative cloze forms derived from selected deletion strategies, scoring procedures, and blank conditions for respective effects on the cloze test performance of college-level readers. (HOD)
Descriptors: Cloze Procedure, College Students, Higher Education, Reading Research
Peer reviewed: Beitel, Patricia A.; Mead, Barbara J. – Perceptual and Motor Skills, 1980
Examined the short form and eight subtests of the Bruininks-Oseretsky Test of Motor Proficiency with a sample of preschoolers to assess its potential for discriminating among ages and between sexes and to see whether the short form accounted for a major portion of the variability of the complete battery. (Author/SJL)
Descriptors: Age Differences, Perceptual Motor Coordination, Performance Tests, Sex Differences

