Peer reviewed: Allison, Donald E. – Alberta Journal of Educational Research, 1984
Reports that no significant difference in reliability appeared between a heterogeneous and a homogeneous form of the same general science matching-item test administered to 316 sixth-grade students but that scores on the heterogeneous form of the test were higher, independent of the examinee's sex or intelligence. (SB)
Descriptors: Comparative Analysis, Comparative Testing, Elementary Education, Grade 6
Peer reviewed: McCallum, R. Steve; Bracken, Bruce A. – Psychology in the Schools, 1981
Compared alternate forms of the Peabody Picture Vocabulary Test-Revised administered to preschool children (N=72). Differences between Form L and Form M mean scores were nonsignificant for Whites, males, females, and the total group. For Black preschoolers, Form L was apparently more difficult to complete successfully than Form M. (Author)
Descriptors: Black Youth, Comparative Testing, Intelligence Tests, Preschool Children
Peer reviewed: Tirre, William C.; Pena, Carmen M. – Journal of Educational Psychology, 1992
Two experiments with approximately 377 newly enlisted Air Force personnel and 182 college students investigated the validity of a reading span test combining a knowledge verification task with a word memorization task. Results support the hypothesis that word recall reflects the amount of working memory functional in reading. (SLD)
Descriptors: College Students, Comparative Testing, Higher Education, Knowledge Level
Rodriguez-Aragon, Graciela; And Others – 1993
The predictive power of the Split-Half version of the Wechsler Intelligence Scale for Children--Revised (WISC-R) Object Assembly (OA) subtest was compared to that of the full administration of the OA subtest. A cohort of 218 male and 49 female adolescent offenders detained in a Texas juvenile detention facility between 1990 and 1992 was used. The…
Descriptors: Adolescents, Cohort Analysis, Comparative Testing, Correlation
Anderson, Paul S.; Hyers, Albert D. – 1991
Three descriptive statistics (difficulty, discrimination, and reliability) of multiple-choice (MC) test items were compared to those of a new (1980s) format of machine-scored questions. The new method, answer-bank multi-digit testing (MDT), uses alphabetized lists of up to 1,000 alternatives and approximates the completion style of assessment…
Descriptors: College Students, Comparative Testing, Computer Assisted Testing, Correlation
Green, Kathy – 1978
Forty three-option multiple choice (MC) statements on a midterm examination were converted to 120 true-false (TF) statements, identical in content. Test forms (MC and TF) were randomly administered to 50 undergraduates, to investigate the validity and internal consistency reliability of the two forms. A Kuder-Richardson formula 20 reliability was…
Descriptors: Achievement Tests, Comparative Testing, Higher Education, Multiple Choice Tests
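The Kuder-Richardson formula 20 (KR-20) named in the abstract above is an internal-consistency estimate for tests whose items are scored right or wrong. A minimal sketch of the standard formula follows; the function name `kr20` and the sample response matrix are hypothetical illustrations, not data or code from the study.

```python
from statistics import pvariance


def kr20(responses):
    """Estimate KR-20 reliability for dichotomously scored items.

    responses: list of examinee rows, each a list of 0/1 item scores.
    (Hypothetical helper for illustration; not taken from the study.)
    """
    n_people = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Variance of examinees' total scores (population form, to match p*q).
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p*q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Made-up scores: 4 examinees by 3 items.
sample = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(kr20(sample))
```

With the same content expressed as 40 multiple-choice or 120 true-false items, the number of items changes in the formula, which is one reason reliability estimates for the two formats can differ.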
Feitler, Fred C.; Graf, Stephen A. – 1978
Two forms of a teacher rating questionnaire, Student Reaction to Instruction, were administered to college students. The regular format used category scaling; the 631 responding students selected a number between one and five. Experimental "ratio production (multiply-divide)" evaluations were also completed by 26 subjects along with the…
Descriptors: College Faculty, Comparative Testing, Higher Education, Rating Scales
Peer reviewed: Harrington, Robert G.; Jennings, Valerie – Contemporary Educational Psychology, 1986
Three short forms of the McCarthy Scales of Children's Abilities (MSCA) have been developed to screen the cognitive skills of young children suspected of learning disorders and developmental delays. Correlations were obtained between scores on the full form of the MSCA and the Kaufman, Taylor, and McCarthy Screening Test short forms. (Author/LMO)
Descriptors: Cognitive Tests, Comparative Testing, Correlation, Early Childhood Education
DeMars, Christine E. – Online Submission, 2005
Several methods for estimating item response theory scores for multiple subtests were compared. These methods included two multidimensional item response theory models: a bi-factor model where each subtest was a composite score based on the primary trait measured by the set of tests and a secondary trait measured by the individual subtest, and a…
Descriptors: Item Response Theory, Multidimensional Scaling, Correlation, Scoring Rubrics
Peer reviewed: Barnes, Janet L.; Landy, Frank J. – Applied Psychological Measurement, 1979
Although behaviorally anchored rating scales have both intuitive and empirical appeal, they have not always yielded superior results in contrast with graphic rating scales. Results indicate that the choice of an anchoring procedure will depend on the nature of the actual rating process. (Author/JKS)
Descriptors: Behavior Rating Scales, Comparative Testing, Higher Education, Rating Scales
Peer reviewed: Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1991
Effects of item wording on questionnaire reliability and validity were studied, using 280 undergraduate business students who completed a questionnaire comprising 4 item types: (1) regular; (2) polar opposite; (3) negated polar opposite; and (4) negated regular. Implications of results favoring regular and negated regular items are discussed. (SLD)
Descriptors: Business Education, Comparative Testing, Higher Education, Negative Forms (Language)
Peer reviewed: Friedrich, William N.; And Others – Psychological Assessment, 1992
Comparison of scores from a normative sample of 880 children and 276 sexually abused children on the Child Sexual Behavior Inventory (CSBI), a checklist for 2- through 12-year-old children, supports the reliability and validity of the instrument. The CSBI is directly related to specific features of sexual abuse. (SLD)
Descriptors: Behavior Patterns, Check Lists, Child Abuse, Children
Bezruczko, Nikolaus; Schroeder, David H. – 1989
An experimental test battery consisting of several tests that measure aspects of artistic judgment was administered to over 1,600 clients of the Johnson O'Connor Research Foundation. The battery consisted of the Visual Aesthetic Sensitivity Test (VAST) of K. O. Gotz (1981); the Design Judgment Test (DJT) of M. Graves (1948); and two tests…
Descriptors: Adults, Aesthetic Values, Aptitude Tests, Art Appreciation
Costantino, Giuseppe; And Others – 1989
Attention deficits and attention deficit-hyperactivity disorder (AD-HD) are regarded as relatively common disorders among school-age children, but the literature reveals several confounding factors with the standard assessment techniques for the disorder. Using a structured thematic apperception technique (the TEMAS Apperception Test of G.…
Descriptors: Adolescents, Attention Deficit Disorders, Children, Comparative Testing
Peer reviewed: Carver, Ronald P. – Educational and Psychological Measurement, 1992
Reliability and validity of a new measure of cognitive speed, the Speed of Thinking Test (SST), were investigated with 129 college students, who also completed a vocabulary test, a test of reading speed, and a test of reading comprehension. The SST appears to be a reliable and valid measure. (SLD)
Descriptors: Cognitive Ability, Cognitive Tests, College Students, Comparative Testing


