ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	1
Since 2006 (last 20 years)	2

Descriptor

Scoring Formulas	10
Statistical Analysis	10
Test Validity	10
Testing	5
Multiple Choice Tests	4
Response Style (Tests)	4
Test Reliability	4
Comparative Analysis	3
Measurement Techniques	3
Test Construction	3
Weighted Scores	3
Computer Programs	2
Confidence Testing	2
Guessing (Tests)	2
Mathematical Applications	2
Probability	2
Accuracy	1
Achievement Tests	1
Admission (School)	1
Attitude Measures	1
Bias	1
Career Guidance	1
Cognitive Tests	1
College Entrance Examinations	1
College Students	1

Source

Educational and Psychological…	3
Applied Measurement in…	1
College Board	1

Author

Brown, Thomas A.	1
Cohen, Allan	1
Donlon, Thomas F.	1
Gleser, Leon Jay	1
Gordon, Leonard V.	1
Kimmel, Ernest W.	1
Kobrin, Jennifer L.	1
Raczynski, Kevin	1
Rippey, Robert M.	1
Sands, William A.	1
Scott, William A.	1
Shuford, Emir H., Jr.	1
Sibley, William L.	1

Publication Type

Reports - Research	6
Journal Articles	1

Education Level

Grade 7	1
High Schools	1
Higher Education	1
Postsecondary Education	1

Assessments and Surveys

SAT (College Admission Test)	1
Strong Vocational Interest…	1

Showing all 10 results

Appraising the Scoring Performance of Automated Essay Scoring Systems--Some Additional Considerations: Which Essays? Which Human Raters? Which Scores?

Peer reviewed

Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018

The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…

Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators

Are There Two Extremeness Response Sets?

Peer reviewed

Gordon, Leonard V. – Educational and Psychological Measurement, 1971

Results indicate that extremeness response sets at the two ends of the continuum differentially contribute to scale validity. (MS)

Descriptors: Attitude Measures, Rating Scales, Response Style (Tests), Scoring Formulas

The Distribution of Test Scores

Peer reviewed

Scott, William A. – Educational and Psychological Measurement, 1972

Descriptors: Item Sampling, Mathematical Applications, Scoring Formulas, Statistical Analysis

On Bounds for the Average Correlation Between Subtest Scores in Ipsatively Scored Tests

Peer reviewed

Gleser, Leon Jay – Educational and Psychological Measurement, 1972

Paper is concerned with the effect that ipsative scoring has upon a commonly used index of between-subtest correlation. (Author)

Descriptors: Comparative Analysis, Forced Choice Technique, Mathematical Applications, Measurement Techniques

Test Development and Technical Information on the Writing Section of the SAT Reasoning Test™. Research Notes RN-25

Kobrin, Jennifer L.; Kimmel, Ernest W. – College Board, 2006

Based on statistics from the first few administrations of the SAT writing section, the test is performing as expected. The reliability of the writing section is very similar to that of other writing assessments. Based on preliminary validity research, the writing section is expected to add modestly to the prediction of college performance when…

Descriptors: Test Construction, Writing Tests, Cognitive Tests, College Entrance Examinations

An Optimizing Weight For Wrong Scores.

Donlon, Thomas F. – 1975

This study empirically determined the optimizing weight to be applied to the Wrongs Total Score in scoring formulas of the general form S = R - kW, where S is the Score, R the Rights Total, k the weight and W the Wrongs Total, if reliability is to be maximized. As is well known, the traditional formula score rests on a theoretical framework which is…

Descriptors: Achievement Tests, Comparative Analysis, Guessing (Tests), Multiple Choice Tests
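
The general form S = R - kW quoted in the abstract above can be illustrated with a minimal sketch. The function names are hypothetical, and the weight k = 1/(n - 1) used here is the conventional correction-for-guessing value for items with n answer options. It is an assumption for illustration, not the reliability-maximizing weight reported by the study, which the truncated abstract does not state.

```python
def guessing_weight(n_options: int) -> float:
    """Conventional correction-for-guessing weight k = 1 / (n - 1).

    Assumption for illustration only; the study's optimizing weight
    is not given in the abstract.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two answer options")
    return 1.0 / (n_options - 1)


def formula_score(rights: int, wrongs: int, k: float) -> float:
    """Formula score S = R - k * W (omitted items are ignored)."""
    return rights - k * wrongs


# Example: 40 right and 10 wrong on five-option items.
k = guessing_weight(5)            # 0.25
score = formula_score(40, 10, k)  # 40 - 0.25 * 10 = 37.5
print(score)
```

Setting k = 0 reduces the formula to the plain rights-only score, so the same function covers both scoring conventions.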

An Experimental Implementation of Computer Assisted Admissible Probability Testing.

Sibley, William L. – 1974

This paper reviews the use of computers in testing, selection, and placement for military services' training programs. It also covers the motivational and theoretical foundation of admissible probability testing, the role of the computer in such testing, and the authors' experience…

Descriptors: Computer Oriented Programs, Computers, Interaction, Military Training

Alternative Item Response Weighting Procedures: Development and Evaluation.

Sands, William A. – 1975

In order to develop tools for use in the selection and vocational-educational guidance of U.S. Naval Academy midshipmen, three empirically-based scales, designed using the Strong Vocational Interest Blank (SVIB), were developed to predict three criteria: (1) disenrollment for academic reasons, (2) disenrollment for motivational reasons, and (3)…

Descriptors: Admission (School), Career Guidance, College Students, Comparative Analysis

Rationale of Computer-Administered Admissible Probability Measurement.

Shuford, Emir H., Jr.; Brown, Thomas A. – 1974

A student's choice of an answer to a test question is a coarse measure of his knowledge about the subject matter of the question. Much finer measurement might be achieved if the student were asked to estimate, for each possible answer, the probability that it is the correct one. Such a procedure could yield two classes of benefits: (a) students…

Descriptors: Bias, Computer Programs, Confidence Testing, Decision Making
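
The idea in the abstract above, that a student reports a probability for each possible answer, can be sketched with a scoring rule under which honest reporting maximizes the expected score. The logarithmic rule below is one well-known rule with that property and is used here as an assumption; the truncated abstract does not say which rule the authors used, and the function names are hypothetical.

```python
import math


def log_score(reported: list[float], correct_index: int) -> float:
    """Score a probability response as the log of the probability
    assigned to the correct answer (negative infinity if that is 0)."""
    p = reported[correct_index]
    return math.log(p) if p > 0 else float("-inf")


def expected_log_score(belief: list[float], reported: list[float]) -> float:
    """Expected score when the student's true beliefs are `belief`
    but the probabilities actually reported are `reported`."""
    return sum(
        b * log_score(reported, i)
        for i, b in enumerate(belief)
        if b > 0  # answers believed impossible contribute nothing
    )


belief = [0.7, 0.2, 0.1]
honest = expected_log_score(belief, belief)
hedged = expected_log_score(belief, [1 / 3, 1 / 3, 1 / 3])
overconfident = expected_log_score(belief, [0.98, 0.01, 0.01])

# Reporting true beliefs gives the highest expected score of the three.
print(honest > hedged, honest > overconfident)
```

Under this rule a student gains nothing in expectation by overstating or understating confidence, which is the property that makes fine-grained probability responses usable as a measurement.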

Scoring and Analyzing Confidence Tests. Final Report.

Rippey, Robert M. – 1971

Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)

Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level