Mazor, Kathleen M.; And Others – 1993
The Mantel-Haenszel (MH) procedure has become one of the most popular procedures for detecting differential item functioning (DIF). One of the most troublesome criticisms of this procedure is that while detection rates for uniform DIF are very good, the procedure is not sensitive to non-uniform DIF. In this study, examinee responses were generated…
Descriptors: Comparative Testing, Computer Simulation, Item Bias, Item Response Theory
Peer reviewed
Stocking, Martha L.; And Others – Applied Psychological Measurement, 1993
A method of automatically selecting items for inclusion in a test with constraints on item content and statistical properties was applied to real data. Tests constructed manually from the same data and constraints were compared to tests constructed automatically. Results show areas in which automated assembly can improve test construction. (SLD)
Descriptors: Algorithms, Automation, Comparative Testing, Computer Assisted Testing
Threlfall, John; Pool, Peter; Homer, Matthew; Swinnerton, Bronwen – Educational Studies in Mathematics, 2007
This article explores the effect on assessment of "translating" paper and pencil test items into their computer equivalents. Computer versions of a set of mathematics questions derived from the paper-based end of key stage 2 and 3 assessments in England were administered to age appropriate pupil samples, and the outcomes compared.…
Descriptors: Test Items, Student Evaluation, Foreign Countries, Test Validity
Peer reviewed
Gressard, Risa P.; Loyd, Brenda H. – Journal of Educational Measurement, 1991
A Monte Carlo study, which simulated 10,000 examinees' responses to four tests, investigated the effect of item stratification on parameter estimation in multiple matrix sampling of achievement data. Practical multiple matrix sampling is based on item stratification by item discrimination and a sampling plan with a moderate number of subtests. (SLD)
Descriptors: Achievement Tests, Comparative Testing, Computer Simulation, Estimation (Mathematics)
Ebel, Robert L. – 1981
An alternate-choice test item is a simple declarative sentence, one portion of which is given with two different wordings. For example, "Foundations like Ford and Carnegie tend to be (1) eager (2) hesitant to support innovative solutions to educational problems." The examinee's task is to choose the alternative that makes the sentence…
Descriptors: Comparative Testing, Difficulty Level, Guessing (Tests), Multiple Choice Tests
Brigham, Donald; Sullivan, Edward A. – 1980
The goals of the visual arts program of the Attleboro (MA) public schools, its relationship with the rest of the curriculum, and a study of the effectiveness of the program in seventh grade are described. It is suggested that the visual conceptual skills that are developed through the visual arts program are essential to cognitive processes and…
Descriptors: Cognitive Development, Comparative Testing, Concept Formation, Elementary Secondary Education
Peer reviewed
Rocklin, Thomas; O'Donnell, Angela M. – Journal of Educational Psychology, 1987
An experiment was conducted that contrasted a variant of computerized adaptive testing, self-adapted testing, with two traditional tests. Participants completed a self-report of test anxiety and were randomly assigned to take one of the three tests of verbal ability. Subjects generally chose more difficult items as the test progressed. (Author/LMO)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Difficulty Level
Peer reviewed
Dash, Udaya; Maguire, Thomas – Alberta Journal of Educational Research, 1984
Compares scores of 3,443 third graders in 1956 and 4,378 third graders in 1977 on the California Short Form Test of Mental Maturity. Examines differences in factorial structure and differences in ability level between groups for factors (64 items related to 7 components) apparently measuring consistent abilities. (SB)
Descriptors: Academic Ability, Comparative Analysis, Comparative Testing, Elementary Education
Peer reviewed
Ilai, Doron; Willerman, Lee – Intelligence, 1989
Items showing sex differences on the revised Wechsler Adult Intelligence Scale (WAIS-R) were studied. In a sample of 206 young adults (110 males and 96 females), 15 items demonstrated significant sex differences, but there was no relationship of item-specific gender content to sex differences in item performance. (SLD)
Descriptors: Comparative Testing, Females, Intelligence Tests, Item Analysis
Wallach, P. M.; Crespo, L. M.; Holtzman, K. Z.; Galbraith, R. M.; Swanson, D. B. – Advances in Health Sciences Education, 2006
Purpose: In conjunction with curricular changes, a process to develop integrated examinations was implemented. Pre-established guidelines were provided favoring vignettes, clinically relevant material, and application of knowledge rather than simple recall. Questions were read aloud in a committee including all course directors, and a reviewer…
Descriptors: Test Items, Rating Scales, Examiners, Guidelines
Kong, Xiaojing J.; Wise, Steven L.; Bhola, Dennison S. – Educational and Psychological Measurement, 2007
This study compared four methods for setting item response time thresholds to differentiate rapid-guessing behavior from solution behavior. Thresholds were either (a) common for all test items, (b) based on item surface features such as the amount of reading required, (c) based on visually inspecting response time frequency distributions, or (d)…
Descriptors: Test Items, Reaction Time, Timed Tests, Item Response Theory
Sykes, Robert C. – 1989
An analysis-of-covariance methodology was used to investigate whether there were population differences between tryout and operational Rasch item b-values relative to differences between pairs of item response theory (IRT) b-values from consecutive operational item administrations. This methodology allowed the evaluation of whether any such…
Descriptors: Analysis of Covariance, Certification, Comparative Testing, Item Response Theory
Peer reviewed
Prasse, David P.; Bracken, Bruce A. – Psychology in the Schools, 1981
Significant differences were found between the Peabody Picture Vocabulary Test-Revised mean standard scores and Verbal, Performance, and Full Scale IQs. The PPVT-R did not correlate significantly with the WISC-R scales or subtests, suggesting the tests are measuring different abilities. (Author)
Descriptors: Ability Identification, Children, Comparative Testing, Intelligence Tests
Peer reviewed
Green, Kathy – Journal of Experimental Education, 1979
Reliabilities and concurrent validities of teacher-made multiple-choice and true-false tests were compared. No significant differences were found even when multiple-choice reliability was adjusted to equate testing time. (Author/MH)
Descriptors: Comparative Testing, Higher Education, Multiple Choice Tests, Test Format
Borich, Gary D.; Paver, Sydney W. – 1974
Eighty undergraduates were administered four self-report locus of control inventories, in order to evaluate the convergent and discriminant validity of four categories common to these inventories: chance, fate, personal control, and powerful others. The four inventories were: (1) Internal, Powerful Others and Chance scales; (2) James Internal…
Descriptors: Comparative Testing, Higher Education, Individual Differences, Locus of Control