Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 5 |
Descriptor
Comparative Testing | 35 |
Test Construction | 35 |
Test Format | 35 |
Test Items | 17 |
Higher Education | 16 |
Multiple Choice Tests | 12 |
Test Reliability | 11 |
Computer Assisted Testing | 8 |
Difficulty Level | 7 |
Test Validity | 7 |
Scores | 6 |
Publication Type
Reports - Research | 26 |
Journal Articles | 15 |
Speeches/Meeting Papers | 14 |
Reports - Evaluative | 7 |
Tests/Questionnaires | 4 |
Dissertations/Theses -… | 1 |
Numerical/Quantitative Data | 1 |
Opinion Papers | 1 |
Reports - Descriptive | 1 |
Education Level
Elementary Secondary Education | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
High Schools | 1 |
Audience
Researchers | 6 |
Practitioners | 1 |
Teachers | 1 |
Location
United Kingdom | 3 |
California | 1 |
Pennsylvania | 1 |
Assessments and Surveys
ACT Assessment | 1 |
College Level Examination… | 1 |
Graduate Record Examinations | 1 |
Wim J. van der Linden; Luping Niu; Seung W. Choi – Journal of Educational and Behavioral Statistics, 2024
A test battery with two different levels of adaptation is presented: a within-subtest level for the selection of the items in the subtests and a between-subtest level to move from one subtest to the next. The battery runs on a two-level model consisting of a regular response model for each of the subtests extended with a second level for the joint…
Descriptors: Adaptive Testing, Test Construction, Test Format, Test Reliability
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Crisp, Victoria – Research Papers in Education, 2008
This research set out to compare the quality, length and nature of (1) exam responses in combined question and answer booklets, with (2) responses in separate answer booklets in order to inform choices about response format. Combined booklets are thought to support candidates by giving more information on what is expected of them. Anecdotal…
Descriptors: Geography Instruction, High School Students, Test Format, Test Construction
Wallach, P. M.; Crespo, L. M.; Holtzman, K. Z.; Galbraith, R. M.; Swanson, D. B. – Advances in Health Sciences Education, 2006
Purpose: In conjunction with curricular changes, a process to develop integrated examinations was implemented. Pre-established guidelines were provided favoring vignettes, clinically relevant material, and application of knowledge rather than simple recall. Questions were read aloud in a committee including all course directors, and a reviewer…
Descriptors: Test Items, Rating Scales, Examiners, Guidelines

Kobak, Kenneth A.; And Others – Psychological Assessment, 1993
A computer-administered form of the Hamilton Anxiety Scale was developed and, together with the clinician form of the instrument, administered to 214 psychiatric outpatients and 78 community adults. Results support the reliability and validity of the computer-administered version as an alternative to the clinician-administered version. (SLD)
Descriptors: Adults, Anxiety, Clinical Diagnosis, Comparative Testing
Collins, Allan; And Others – 1991
The use of paper and pencil, videotape recordings, and microcomputers in student testing provides three very different views of student achievement. Paper and pencil tests can record how students compose texts and documents, and how they critique documents or performances. Video recordings can record how students explain ideas, answer questions,…
Descriptors: Comparative Testing, Computer Assisted Testing, Computer Simulation, Elementary Secondary Education

Crehan, Kevin D.; And Others – Educational and Psychological Measurement, 1993
Studies with 220 college students found that multiple-choice test items with 3 options are more difficult than those with 4 options, and items with the none-of-these option are more difficult than those without this option. Neither format manipulation affected item discrimination. Implications for test construction are discussed. (SLD)
Descriptors: College Students, Comparative Testing, Difficulty Level, Distractors (Tests)
Cizek, Gregory J. – 1991
A commonly accepted rule for developing equated examinations using the common-items non-equivalent groups (CINEG) design is that items common to the two examinations being equated should be identical. The CINEG design calls for two groups of examinees to respond to a set of common items that is included in two examinations. In practice, this rule…
Descriptors: Certification, Comparative Testing, Difficulty Level, Higher Education

Barnes, Janet L.; Landy, Frank J. – Applied Psychological Measurement, 1979
Although behaviorally anchored rating scales have both intuitive and empirical appeal, they have not always yielded superior results in contrast with graphic rating scales. Results indicate that the choice of an anchoring procedure will depend on the nature of the actual rating process. (Author/JKS)
Descriptors: Behavior Rating Scales, Comparative Testing, Higher Education, Rating Scales

Schriesheim, Chester A.; And Others – Educational and Psychological Measurement, 1991
Effects of item wording on questionnaire reliability and validity were studied, using 280 undergraduate business students who completed a questionnaire comprising 4 item types: (1) regular; (2) polar opposite; (3) negated polar opposite; and (4) negated regular. Implications of results favoring regular and negated regular items are discussed. (SLD)
Descriptors: Business Education, Comparative Testing, Higher Education, Negative Forms (Language)
Assessing the Effects of Computer Administration on Scores and Parameter Estimates Using IRT Models.
Sykes, Robert C.; And Others – 1991
To investigate the psychometric feasibility of replacing a paper-and-pencil licensing examination with a computer-administered test, a validity study was conducted. The computer-administered test (Cadm) was a common set of items for all test takers, distinct from computerized adaptive testing, in which test takers receive items appropriate to…
Descriptors: Adults, Certification, Comparative Testing, Computer Assisted Testing
Heller, Eric S.; Rife, Frank N. – 1987
The goal of this study was to assess the relative merit of various ranges and types of response scales in terms of respondent satisfaction and comfort and the nature of the elicited information in a population of seventh grade students. Three versions of an attitudinal questionnaire, each containing the same items but employing a different…
Descriptors: Attitude Measures, Comparative Testing, Grade 7, Junior High Schools

Friedman, Stephen J.; Ansley, Timothy N. – Journal of Experimental Education, 1990
To investigate the relationship between reading and listening test scores, 3 different sets of listening items accompanied by answer sheets requiring varying amounts of reading were administered to 1,200 students in grades 3 through 8. Listening scores increased as more printed information was added to the answer sheet. (SLD)
Descriptors: Answer Sheets, Comparative Testing, Elementary Education, Elementary School Students
Mazzeo, John; And Others – 1991
Two studies investigated the comparability of scores from paper-and-pencil and computer-administered versions of the College-Level Examination Program (CLEP) General Examinations in mathematics and English composition. The first study used a prototype computer-administered version of each examination for 94 students for mathematics and 116 for…
Descriptors: College Entrance Examinations, College Students, Comparative Testing, Computer Assisted Testing
Shavelson, Richard J.; And Others – 1988
This study investigated the relationships among the symbolic representation of problems given to students to solve, the mental representations they use to solve the problems, and the accuracy of their solutions. Twenty eleventh-grade science students were asked to think aloud as they solved problems on the ideal gas laws. The problems were…
Descriptors: Chemistry, Comparative Testing, Problem Solving, Response Style (Tests)