Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format

Guo, Wenjing; Wind, Stefanie A. – Journal of Educational Measurement, 2021
The use of mixed-format tests made up of multiple-choice (MC) items and constructed response (CR) items is popular in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district- and state-level assessments in the United States. Rater effects, or raters' scoring tendencies that result in…
Descriptors: Test Format, Multiple Choice Tests, Scoring, Test Items
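
Rater severity is one such effect: a simple way to see it is each rater's average deviation from the across-rater mean score of the responses they rated. A minimal sketch in Python, using invented ratings (the rater labels and scores are hypothetical, not from the study):

```python
# Minimal sketch of a rater-severity index: a rater's mean deviation from
# the per-response average score. All ratings below are invented.
from collections import defaultdict

# (response_id, rater, score) triples -- hypothetical data
ratings = [
    (1, "A", 3), (1, "B", 4),
    (2, "A", 2), (2, "B", 3),
    (3, "A", 4), (3, "B", 4),
]

# Average score each response received across raters
by_response = defaultdict(list)
for resp, rater, score in ratings:
    by_response[resp].append(score)
resp_mean = {r: sum(s) / len(s) for r, s in by_response.items()}

# Severity index = mean (own score - response mean); negative => harsher
deviations = defaultdict(list)
for resp, rater, score in ratings:
    deviations[rater].append(score - resp_mean[resp])

for rater, devs in sorted(deviations.items()):
    print(f"rater {rater}: severity index {sum(devs) / len(devs):+.2f}")
```
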
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
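
The nonequivalent groups design links the two forms through a common anchor test taken by both groups. A minimal sketch of one variant, chained linear equating, with invented score vectors (the study compares several design variants beyond this basic chain):

```python
# Minimal sketch of chained linear equating under the nonequivalent groups
# with anchor test (NEAT) design. All score data below are invented.
import statistics

def linear_link(x_mean, x_sd, y_mean, y_sd):
    """Return f(x) mapping X's scale to Y's scale by matching mean and SD."""
    slope = y_sd / x_sd
    return lambda x: y_mean + slope * (x - x_mean)

# Group 1 takes new form X plus anchor A; group 2 takes reference form Y
# plus the same anchor A.
group1_X = [52, 61, 58, 70, 66]
group1_A = [20, 24, 22, 28, 26]
group2_Y = [55, 63, 60, 72, 68]
group2_A = [19, 25, 21, 29, 27]

mean, sd = statistics.mean, statistics.stdev

# Chain: X -> A (through group 1), then A -> Y (through group 2)
x_to_a = linear_link(mean(group1_X), sd(group1_X), mean(group1_A), sd(group1_A))
a_to_y = linear_link(mean(group2_A), sd(group2_A), mean(group2_Y), sd(group2_Y))

for x in (55, 60, 65):
    print(f"X score {x} -> equated Y score {a_to_y(x_to_a(x)):.1f}")
```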

Ward, William C.; And Others – Journal of Educational Measurement, 1980
Free-response and machine-scorable versions of a test called Formulating Hypotheses were compared with respect to construct validity. Results indicate that the different forms involve different cognitive processes and measure different qualities. (Author/JKS)
Descriptors: Cognitive Processes, Cognitive Tests, Higher Education, Personality Traits

Mills, Craig N. – Journal of Educational Measurement, 1983
This study compares the results obtained using the Angoff, borderline group, and contrasting groups methods of determining performance standards. Congruent results were obtained from the Angoff and contrasting groups methods for several test forms. Borderline group standards were not similar to standards obtained with other methods. (Author/PN)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Cutting Scores, Standard Setting (Scoring)
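
The three methods can be illustrated with toy data. A minimal sketch, assuming invented judge ratings and examinee scores (none of the values come from the study):

```python
import statistics

# --- Angoff: judges estimate, per item, the probability that a minimally
# competent examinee answers correctly; the cut score is each judge's sum,
# averaged over judges. Probabilities below are invented.
judge_ratings = [
    [0.6, 0.7, 0.5, 0.8],   # judge 1, items 1-4
    [0.5, 0.8, 0.6, 0.7],   # judge 2
]
angoff_cut = statistics.mean(sum(j) for j in judge_ratings)

# --- Borderline group: median test score of examinees judged "borderline".
borderline_scores = [2, 3, 2, 3, 3]
borderline_cut = statistics.median(borderline_scores)

# --- Contrasting groups: pick the score that best separates examinees
# judged competent from those judged not competent.
competent = [3, 4, 3, 4]
not_competent = [1, 2, 2, 3]

def misclassified(cut):
    # competent examinees below the cut + not-competent at/above it
    return (sum(s < cut for s in competent)
            + sum(s >= cut for s in not_competent))

contrasting_cut = min(range(0, 5), key=misclassified)

print(f"Angoff cut:             {angoff_cut:.2f}")
print(f"Borderline-group cut:   {borderline_cut}")
print(f"Contrasting-groups cut: {contrasting_cut}")
```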

Hughes, David C.; And Others – Journal of Educational Measurement, 1980
The effect of context on the scoring of essays was examined by arranging for the scoring of the criterion essay to be preceded by either five superior or five inferior essays. The contrast in essay quality had the hypothesized effect. Other effects were not significant. (CTM)
Descriptors: Essay Tests, High Schools, Holistic Evaluation, Scoring

Wilcox, Rand R.; Wilcox, Karen Thompson – Journal of Educational Measurement, 1988
The use of latent class models to examine the strategies that examinees (92 college students) use for a specific task is illustrated via a multiple-choice test of spatial ability. Under an answer-until-correct scoring procedure, models representing an improvement over simplistic random guessing are proposed. (SLD)
Descriptors: College Students, Decision Making, Guessing (Tests), Multiple Choice Tests
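
Under answer-until-correct scoring, an examinee keeps responding to an item until the keyed option is found, and the item score decreases with the number of attempts. A minimal sketch using one common linear partial-credit rule (the rule and the data are illustrative assumptions, not the paper's model):

```python
# Minimal sketch of answer-until-correct (AUC) scoring: full credit on the
# first attempt, linearly less credit for each additional attempt.

def auc_item_score(attempts: int, n_options: int = 4) -> float:
    """Score an item answered correctly on the given attempt (1-based)."""
    if not 1 <= attempts <= n_options:
        raise ValueError("attempts must be between 1 and n_options")
    # Linear partial credit: 1 on first attempt, 0 after exhausting options
    return (n_options - attempts) / (n_options - 1)

# Attempts-to-correct for each item of a hypothetical 5-item test
attempts_per_item = [1, 2, 1, 3, 4]
total = sum(auc_item_score(a) for a in attempts_per_item)
print(f"per-item scores: {[round(auc_item_score(a), 2) for a in attempts_per_item]}")
print(f"total AUC score: {total:.2f}")
```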

Norcini, John J. – Journal of Educational Measurement, 1987
Answer keys for physician and teacher licensing examinations were studied. The impact of variability on total errors of measurement was examined for answer keys constructed using the aggregate method. Results indicated that, in some cases, scorers contributed to a sizable reduction in measurement error. (Author/GDC)
Descriptors: Adults, Answer Keys, Error of Measurement, Evaluators
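
One reading of the aggregate method is that the scoring key is assembled from several experts' individual keys. A minimal sketch, assuming a simple per-item plurality vote (the expert keys are invented, and plurality voting is a simplification of the method studied):

```python
# Minimal sketch of building an answer key by aggregating several experts'
# keys with a per-item plurality vote. All keys below are invented.
from collections import Counter

expert_keys = [
    "ABCAD",   # expert 1's key for a 5-item test
    "ABCBD",   # expert 2
    "ABDAD",   # expert 3
]

aggregate_key = "".join(
    Counter(keys).most_common(1)[0][0]
    for keys in zip(*expert_keys)
)
print(f"aggregate key: {aggregate_key}")   # -> ABCAD

# Score a hypothetical response string against the aggregate key
response = "ABCAD"
score = sum(r == k for r, k in zip(response, aggregate_key))
print(f"score: {score}/{len(aggregate_key)}")
```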

Braun, Henry I.; And Others – Journal of Educational Measurement, 1990
The accuracy with which expert systems (ESs) score a new nonmultiple-choice free-response test item was investigated, using 734 high school students who were administered an advanced-placement computer science examination. ESs produced scores for 82 percent to 95 percent of the responses and displayed high agreement with a human reader on the…
Descriptors: Advanced Placement, Computer Assisted Testing, Computer Science, Constructed Response
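
Machine-human agreement of this kind is typically summarized with an exact agreement rate and a chance-corrected index such as Cohen's kappa. A minimal sketch over invented paired scores (the data and the choice of kappa are assumptions, not taken from the study):

```python
# Minimal sketch of machine-human scoring agreement: exact agreement and
# Cohen's kappa. The paired scores below are invented.
from collections import Counter

human   = [3, 2, 4, 3, 1, 2, 4, 3, 2, 3]
machine = [3, 2, 4, 2, 1, 2, 4, 3, 3, 3]

n = len(human)
observed = sum(h == m for h, m in zip(human, machine)) / n

# Chance agreement from the two marginal score distributions
h_counts, m_counts = Counter(human), Counter(machine)
expected = sum(h_counts[s] * m_counts[s] for s in set(human) | set(machine)) / n**2

kappa = (observed - expected) / (1 - expected)
print(f"exact agreement: {observed:.2f}")
print(f"Cohen's kappa:   {kappa:.2f}")
```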

Bridgeman, Brent – Journal of Educational Measurement, 1992
Examinees in a regular administration of the quantitative portion of the Graduate Record Examination responded to particular items in a machine-scannable multiple-choice format. Volunteers (n=364) used a computer to answer open-ended counterparts of these items. Scores for both formats demonstrated similar correlational patterns. (SLD)
Descriptors: Answer Sheets, College Entrance Examinations, College Students, Comparative Testing
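
A correlational comparison of this kind checks whether the two formats relate to each other, and to an external criterion, in similar ways. A minimal sketch with invented score vectors (all data and the criterion variable are hypothetical):

```python
# Minimal sketch of comparing correlational patterns across item formats.
# All score vectors below are invented.
import statistics

def pearson(x, y):
    """Population Pearson correlation of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (statistics.pstdev(x) * statistics.pstdev(y) * len(x))

mc_scores   = [12, 15, 9, 18, 14, 11, 16, 13]   # multiple-choice format
open_scores = [10, 14, 8, 17, 13, 10, 15, 12]   # open-ended counterparts
criterion   = [500, 580, 450, 640, 560, 480, 600, 540]

print(f"MC vs open-ended:  r = {pearson(mc_scores, open_scores):.2f}")
print(f"MC vs criterion:   r = {pearson(mc_scores, criterion):.2f}")
print(f"open vs criterion: r = {pearson(open_scores, criterion):.2f}")
```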