Showing all 10 results
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
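The design-effect argument in the Phillips (2015) abstract can be illustrated with a short sketch. The cluster size, intraclass correlation, and score standard deviation below are invented for illustration, not values from the article; the sketch only shows how ignoring the design effect understates sampling error.

# Sketch: how a sampling design effect inflates the variance of a mean.
# DEFF = 1 + (m - 1) * rho for clusters of size m with intraclass
# correlation rho (a standard survey-sampling result; values invented).

import math

n = 2000          # total examinees
m = 25            # examinees per sampled school (assumed cluster size)
rho = 0.15        # assumed intraclass correlation within schools
sd = 30.0         # assumed score standard deviation

deff = 1 + (m - 1) * rho            # design effect
se_srs = sd / math.sqrt(n)          # naive SE assuming simple random sampling
se_true = se_srs * math.sqrt(deff)  # SE under the clustered design
n_eff = n / deff                    # effective sample size

print(f"DEFF = {deff:.2f}, naive SE = {se_srs:.2f}, "
      f"design-based SE = {se_true:.2f}, effective n = {n_eff:.0f}")

With these invented values the design effect is 4.6, so the naive standard error understates the design-based one by more than a factor of two.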
Peer reviewed
Whitely, Susan E.; Dawis, Rene V. – Journal of Educational Measurement, 1974
Descriptors: Error of Measurement, Item Analysis, Matrices, Measurement Techniques
Munoz-Colberg, Magda – 1977
The logical foundations of deduction and induction are outlined to form the rules for the construction of a set of tests of reasoning ability. Both deduction and induction involve the derivation of a conclusion from a set of premises. Deductive logic uses syllogisms and is abstract. Inductive logic is both empirical and abstract. Although…
Descriptors: Abstract Reasoning, Cognitive Tests, Deduction, Induction
Peer reviewed
Weber, Margaret B. – Educational and Psychological Measurement, 1977
Bilevel dimensionality of probability was examined via factor analysis, Rasch latent trait analysis, and classical item analysis. Results suggest that when nonstandardized measures are the criteria for achievement, relying solely on estimates of content validity may lead to erroneous interpretation of test score data. (JKS)
Descriptors: Achievement, Achievement Tests, Factor Analysis, Item Analysis
Peer reviewed
Aiken, Lewis R. – Educational and Psychological Measurement, 1985
Three numerical coefficients for analyzing the validity and reliability of ratings are described. Each coefficient is computed as the ratio of an obtained to a maximum sum of differences in ratings. The coefficients are also applicable to the item analysis, agreement analysis, and cluster or factor analysis of rating-scale data. (Author/BW)
Descriptors: Computer Software, Data Analysis, Factor Analysis, Item Analysis
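The "ratio of an obtained to a maximum sum of differences" in the Aiken (1985) abstract matches the general form of Aiken's validity coefficient V. A minimal sketch under that reading, assuming a rating scale running from lo to hi; the ratings are invented, and the paper's other two coefficients are not shown.

def aiken_v(ratings, lo, hi):
    """Aiken-style coefficient: obtained sum of differences from the
    lowest category, divided by the maximum possible sum n * (hi - lo)."""
    n = len(ratings)
    obtained = sum(r - lo for r in ratings)
    maximum = n * (hi - lo)
    return obtained / maximum

# Five raters judge one item's relevance on a 1-4 scale (invented data).
print(aiken_v([4, 3, 4, 4, 3], lo=1, hi=4))  # ~0.87 -> high rated validity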
PDF pending restoration
Kane, Michael T.; Moloney, James M. – 1976
The Answer-Until-Correct (AUC) procedure has been proposed in order to increase the reliability of multiple-choice items. A model for examinees' behavior when they must respond to each item until they answer it correctly is presented. An expression for the reliability of AUC items, as a function of the characteristics of the item and the scoring…
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests
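The behavior the Kane and Moloney abstract models can be simulated directly. The scoring rule and parameters below are an illustrative assumption (a common linear partial-credit rule from the answer-until-correct literature), not necessarily the model or expression derived in the paper.

import random

def auc_attempts(knows, k):
    """Number of attempts needed on a k-option AUC item: 1 if the
    answer is known, otherwise random guessing without replacement."""
    if knows:
        return 1
    options = list(range(k))
    random.shuffle(options)
    return options.index(0) + 1  # position of the correct option (option 0)

def auc_score(attempts, k):
    """Linear partial-credit rule: full credit on the first try, zero
    when every option was tried (one plausible AUC scoring rule)."""
    return (k - attempts) / (k - 1)

random.seed(1)
k, p_know = 4, 0.6  # invented item: 4 options, 60% of examinees know it
scores = [auc_score(auc_attempts(random.random() < p_know, k), k)
          for _ in range(10000)]
print(sum(scores) / len(scores))  # expected item score under this model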
Brennan, Robert L.; Lockwood, Robert E. – 1979
Procedures for determining cutting scores have been proposed by Angoff and by Nedelsky. Nedelsky's approach requires that a rater examine each distractor within a test item to determine the probability of a minimally competent examinee answering correctly, whereas Angoff uses a judgment based on the whole item, rather than each of its components.…
Descriptors: Achievement Tests, Comparative Analysis, Cutting Scores, Guessing (Tests)
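The contrast in the Brennan and Lockwood abstract is easy to see side by side. In the standard formulations of the two procedures: a Nedelsky rater eliminates the distractors a minimally competent examinee would rule out, and the item probability is the reciprocal of the options that remain, while an Angoff rater judges a single probability for the whole item. The judgments below are invented.

# Nedelsky: item probability = 1 / (options remaining after the rater
# eliminates distractors a minimally competent examinee would reject).
def nedelsky_item_p(n_options, n_eliminated):
    return 1.0 / (n_options - n_eliminated)

# Angoff: the rater directly judges, per whole item, the probability
# that a minimally competent examinee answers correctly.
angoff_ps = [0.6, 0.8, 0.5, 0.9, 0.7]   # invented whole-item judgments
nedelsky_ps = [nedelsky_item_p(4, e) for e in [2, 3, 1, 3, 2]]  # invented

# Either way, the cutting score is the sum of the item probabilities.
print(f"Angoff cut:   {sum(angoff_ps):.2f} of 5")
print(f"Nedelsky cut: {sum(nedelsky_ps):.2f} of 5")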
Rentz, R. Robert; Bashaw, W. L. – 1975
In order to determine if Rasch Model procedures have any utility for equating pre-existing tests, this study reanalyzed the data from the equating phase of the Anchor Test Study which used a variety of equipercentile and linear model methods. The tests involved included seven reading test batteries, each having from one to three levels and two…
Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement
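One attraction of Rasch procedures for the equating problem studied above is that, with common items, two forms can be placed on a single scale by one additive shift in difficulty. A minimal mean/mean linking sketch with invented difficulties; this is standard Rasch common-item linking, not necessarily the exact procedure applied to the Anchor Test Study data.

# Mean/mean Rasch linking: the same anchor items calibrated separately
# on Form X and Form Y differ (under the Rasch model) by a constant.
bx = [-1.2, -0.3, 0.4, 1.1]   # anchor difficulties from Form X (invented)
by = [-0.9, 0.1, 0.7, 1.5]    # same items calibrated on Form Y (invented)

shift = sum(by) / len(by) - sum(bx) / len(bx)  # Y-scale mean minus X-scale mean

# Any Form X parameter moves onto the Form Y scale by adding the shift.
theta_x = 0.25                # an ability estimated on the Form X scale
print(f"shift = {shift:.3f}, theta on Form Y scale = {theta_x + shift:.3f}")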
Rentz, R. Robert; Bashaw, W. L. – 1975
This volume contains tables of item analysis results obtained by following procedures associated with the Rasch Model for those reading tests used in the Anchor Test Study. Appendix I gives the test names and their corresponding analysis code numbers. Section I (Basic Item Analyses) presents data for the item analysis of each test in a two part…
Descriptors: Comparative Analysis, Elementary Education, Equated Scores, Error of Measurement
Levine, Michael V.; Rubin, Donald B. – 1976
Appropriateness indexes (statistical formulas) for detecting suspiciously high or low scores on aptitude tests were presented, based on a simulation of the Scholastic Aptitude Test (SAT) with 3,000 simulated scores--2,800 normal and 200 suspicious. The traditional index--marginal probability--uses a model for the normal examinee's test-taking…
Descriptors: Academic Ability, Aptitude Tests, College Entrance Examinations, High Schools
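The marginal-probability index referenced in the Levine and Rubin abstract is likelihood-based: a response pattern with unusually low probability under the model for normal examinees is flagged. A minimal sketch of a generic log-likelihood person-fit index under a Rasch model; the difficulties, abilities, and response patterns are invented, and this is not the specific SAT simulation or formulas from the report.

import math

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_likelihood(responses, theta, difficulties):
    """Log-likelihood of a 0/1 response pattern at ability theta; a value
    far below those of other examinees flags a suspicious pattern."""
    ll = 0.0
    for u, b in zip(responses, difficulties):
        p = rasch_p(theta, b)
        ll += u * math.log(p) + (1 - u) * math.log(1 - p)
    return ll

b = [-1.0, -0.5, 0.0, 0.5, 1.0]   # invented item difficulties, easy to hard
normal = [1, 1, 1, 0, 0]          # misses only the hard items
suspicious = [0, 0, 1, 1, 1]      # misses only the easy items
print(log_likelihood(normal, 0.0, b), log_likelihood(suspicious, 0.0, b))

The suspicious pattern scores a much lower log-likelihood (about -5.3 versus -2.3) even though both examinees answered three of five items correctly.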