Showing all 15 results
Peer reviewed
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Peer reviewed
Runco, Mark A.; Acar, Selcuk – Creativity Research Journal, 2012
Divergent thinking (DT) tests are very often used in creativity studies. Certainly DT does not guarantee actual creative achievement, but tests of DT are reliable and reasonably valid predictors of certain performance criteria. The validity of DT is described as reasonable because validity is not an all-or-nothing attribute, but is, instead, a…
Descriptors: Creativity, Creative Activities, Creative Thinking, Test Validity
Peer reviewed
Allen, Mary J.; And Others – Perceptual and Motor Skills, 1982
Adults took the Rod and Frame, Portable Rod and Frame, and Embedded Figures Tests. Absolute and algebraic frame-effect scores were more reliable and valid than rod-effect algebraic scores. Correlations with the Embedded Figures Test were so low that the interchangeability of these field articulation measures is questionable. (Author/RD)
Descriptors: Adults, Cognitive Style, Correlation, Measurement Techniques
Peer reviewed
Austin, Joe Dan – Psychometrika, 1981
On distractor-identification tests, students mark as many distractors as possible on each test item. A grading scale is developed for this type of testing. The score is optimal in that it yields an unbiased estimate of the score the student would have earned had no guessing occurred. (Author/JKS)
Descriptors: Guessing (Tests), Item Analysis, Measurement Techniques, Scoring Formulas
Peer reviewed
Frary, Robert B. – Journal of Educational Statistics, 1982
Six different approaches to scoring test data, including number right, correction for guessing, and answer-until-correct, were investigated using Monte Carlo techniques. Modes permitting multiple response showed higher internal consistency, but there was little difference among modes for a validity measure. (JKS)
Descriptors: Guessing (Tests), Measurement Techniques, Multiple Choice Tests, Scoring Formulas
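Two of the scoring modes Frary compares have standard closed forms: number right, and the classical correction for guessing, R - W/(k - 1) for k-option items, in which wrong answers are penalized but omits are not. A minimal sketch of those two formulas (function names are illustrative, not from the study):

```python
def number_right(responses, key):
    """Number-right score: count of correct answers."""
    return sum(r == k for r, k in zip(responses, key))

def formula_score(responses, key, n_options):
    """Correction-for-guessing (formula) score: R - W/(k - 1).
    Omitted items (None) count as neither right nor wrong."""
    right = sum(r == k for r, k in zip(responses, key) if r is not None)
    wrong = sum(r != k for r, k in zip(responses, key) if r is not None)
    return right - wrong / (n_options - 1)

# Example: 4-option items; 6 right, 2 wrong, 2 omitted.
key = list("ABCDABCDAB")
responses = ["A", "B", "C", "D", "A", "B", "D", "A", None, None]
```

With 6 right and 2 wrong on 4-option items, the formula score is 6 - 2/3, slightly below the number-right score of 6.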
Peer reviewed
Gleser, Leon Jay – Educational and Psychological Measurement, 1972
The paper examines the effect that ipsative scoring has on a commonly used index of between-subtest correlation. (Author)
Descriptors: Comparative Analysis, Forced Choice Technique, Mathematical Applications, Measurement Techniques
Peer reviewed
Aiken, Lewis R. – Educational and Psychological Measurement, 1980
Procedures for computing content validity and consistency reliability coefficients, and for determining the statistical significance of these coefficients, are described. Procedures employing the multinomial probability distribution for small samples and normal curve probability estimates for large samples can be used where judgments are made on…
Descriptors: Computer Programs, Measurement Techniques, Probability, Questionnaires
Hambleton, Ronald K.; Novick, Melvin R. – 1972
In this paper, an attempt has been made to synthesize some of the current thinking in the area of criterion-referenced testing as well as to provide the beginning of an integration of theory and method for such testing. Since criterion-referenced testing is viewed from a decision-theoretic point of view, approaches to reliability and validity…
Descriptors: Criterion Referenced Tests, Measurement Instruments, Measurement Techniques, Scaling
Koehler, Roger A. – 1974
A potentially valuable measure of overconfidence on probabilistic multiple-choice tests was evaluated. The measure of overconfidence was based on probabilistic responses to nonsense items embedded in a vocabulary test. The test was administered under both confidence response and conventional choice response directions to 208 undergraduate…
Descriptors: Confidence Testing, Guessing (Tests), Measurement Techniques, Multiple Choice Tests
Sabers, Darrell L.; White, Gordon W. – 1971
A procedure for scoring multiple-choice tests by assigning different weights to every option of a test item is investigated. The weighting method used was based on that proposed by Davis, which involves taking the upper and lower 27% of a sample, according to some criterion measure, and using the percentages of these groups marking an item option…
Descriptors: Computer Oriented Programs, Item Analysis, Measurement Techniques, Multiple Choice Tests
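The Davis-style weighting described above, contrasting the upper and lower 27% of examinees on a criterion measure, can be sketched as assigning each option a weight equal to the difference in the proportions of the two groups choosing it. The code below is a hedged reconstruction of that general idea, not the authors' exact procedure:

```python
def option_weights(criterion_scores, item_responses, options, frac=0.27):
    """Weight each option by the proportion of the upper criterion group
    choosing it minus the proportion of the lower group choosing it
    (a sketch of the Davis-style index; not the original formula)."""
    n = len(criterion_scores)
    k = max(1, round(frac * n))
    order = sorted(range(n), key=lambda i: criterion_scores[i])
    lower, upper = order[:k], order[-k:]
    weights = {}
    for opt in options:
        p_upper = sum(item_responses[i] == opt for i in upper) / k
        p_lower = sum(item_responses[i] == opt for i in lower) / k
        weights[opt] = p_upper - p_lower
    return weights
```

An option chosen mostly by high-criterion examinees gets a positive weight, one chosen mostly by low-criterion examinees a negative weight, so the weighted total score rewards discriminating responses.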
Peer reviewed
Milton, Ohmer – Journal of Veterinary Medical Education, 1979
The benefits of using essay tests rather than objective tests in professional education programs are discussed. Essay tests offer practice in writing, creativity, and formal communication. Guidelines for using and scoring a sample essay test in biology are presented. (BH)
Descriptors: Academic Achievement, Biology, Educational Objectives, Essay Tests
Echternacht, Gary – 1973
Estimates for the variance of empirically determined scoring weights are given. It is shown that test item writers should write distractors that discriminate on the criterion variable when this type of scoring is used. (Author)
Descriptors: Item Analysis, Measurement Techniques, Multiple Choice Tests, Performance Criteria
Echternacht, Gary – 1973
This study compares various item option scoring methods with respect to coefficient alpha and a concurrent validity coefficient. The scoring methods under consideration were: (1) formula scoring, (2) a priori scoring, (3) empirical scoring with an internal criterion, and (4) two modifications of formula scoring. The study indicates a clear…
Descriptors: Item Analysis, Measurement Techniques, Multiple Choice Tests, Performance Criteria
Shuford, Emir H., Jr.; Brown, Thomas A. – 1974
A student's choice of an answer to a test question is a coarse measure of his knowledge about the subject matter of the question. Much finer measurement might be achieved if the student were asked to estimate, for each possible answer, the probability that it is the correct one. Such a procedure could yield two classes of benefits: (a) students…
Descriptors: Bias, Computer Programs, Confidence Testing, Decision Making
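Scoring the probability estimates described above requires a proper scoring rule, one under which a student maximizes expected score only by reporting honest probabilities. The logarithmic rule sketched below is one standard choice, offered as an illustration rather than as the authors' exact system:

```python
import math

def log_score(probs, correct, floor=0.01):
    """Logarithmic (proper) score for a probabilistic response:
    the log of the probability assigned to the correct answer.
    Probabilities are floored to avoid -inf on a zero assignment."""
    p = max(probs[correct], floor)
    return math.log(p)

# A student who assigns 0.7 to the correct answer scores log(0.7);
# assigning less probability to the correct answer scores lower.
```

Because the log score is strictly proper, hedging toward probabilities the student does not actually believe can only lower the expected score.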
Roudabush, Glenn E. – 1975
The objective of this study was to show that standardized reading scores can be adequately estimated from scores on a criterion-referenced reading test. This would reduce classroom testing time while providing both the kinds of information teachers need to guide instruction and the kinds of information administrators require for…
Descriptors: Achievement Tests, Correlation, Criterion Referenced Tests, Equated Scores