Descriptor
Difficulty Level | 12 |
Error of Measurement | 12 |
Mathematical Models | 12 |
Test Items | 11 |
Test Reliability | 5 |
Item Analysis | 4 |
Latent Trait Theory | 4 |
Monte Carlo Methods | 4 |
Cutting Scores | 3 |
Test Construction | 3 |
Test Theory | 3 |
More ▼ |
Author
Benson, Jeri | 1 |
Carlson, James E. | 1 |
Curry, Allen R. | 1 |
De Ayala, R. J. | 1 |
Divgi, D. R. | 1 |
Feldt, Leonard S. | 1 |
Huck, Schuyler W. | 1 |
Jones, Patricia B. | 1 |
Livingston, Samuel A. | 1 |
Patience, Wayne M. | 1 |
Reckase, Mark D. | 1 |
More ▼ |
Publication Type
Reports - Research | 11 |
Speeches/Meeting Papers | 6 |
Journal Articles | 3 |
Reports - Evaluative | 1 |
Education Level
Audience
Researchers | 4 |
Location
Laws, Policies, & Programs
Assessments and Surveys
ACT Assessment | 1 |
Medical College Admission Test | 1 |
What Works Clearinghouse Rating

Feldt, Leonard S. – Educational and Psychological Measurement, 1984
The binomial error model includes form-to-form difficulty differences as error variance and leads to Ruder-Richardson formula 21 as an estimate of reliability. If the form-to-form component is removed from the estimate of error variance, the binomial model leads to KR 20 as the reliability estimate. (Author/BW)
Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Mathematical Formulas
Divgi, D. R. – 1978
One aim of criterion-referenced testing is to classify an examinee without reference to a norm group; therefore, any statements about the dependability of such classification ought to be group-independent also. A population-independent index is proposed in terms of the probability of incorrect classification near the cutoff true score. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Difficulty Level, Error of Measurement
deGruijter, Dato N. M. – 1980
The setting of standards involves subjective value judgments. The inherent arbitrariness of specific standards has been severely criticized by Glass. His antagonists agree that standard setting is a judgmental task but they have pointed out that arbitrariness in the positive sense of serious judgmental decisions is unavoidable. Further, small…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests

Huck, Schuyler W.; And Others – Educational and Psychological Measurement, 1981
Believing that examinee-by-item interaction should be conceptualized as true score variability rather than as a result of errors of measurement, Lu proposed a modification of Hoyt's analysis of variance reliability procedure. Via a computer simulation study, it is shown that Lu's approach does not separate interaction from error. (Author/RL)
Descriptors: Analysis of Variance, Comparative Analysis, Computer Programs, Difficulty Level
Jones, Patricia B.; And Others – 1987
In order to determine the effectiveness of multidimensional scaling (MDS) in recovering the dimensionality of a set of dichotomously-scored items, data were simulated in one, two, and three dimensions for a variety of correlations with the underlying latent trait. Similarity matrices were constructed from these data using three margin-sensitive…
Descriptors: Cluster Analysis, Correlation, Difficulty Level, Error of Measurement
Livingston, Samuel A. – 1986
This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…
Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models

De Ayala, R. J. – Educational and Psychological Measurement, 1992
Effects of dimensionality on ability estimation of an adaptive test were examined using generated data in Bayesian computerized adaptive testing (CAT) simulations. Generally, increasing interdimensional difficulty association produced a slight decrease in test length and an increase in accuracy of ability estimation as assessed by root mean square…
Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Computer Simulation
Wise, Lauress L. – 1986
A primary goal of this study was to determine the extent to which item difficulty was related to item position and, if a significant relationship was found, to suggest adjustments to predicted item difficulty that reflect differences in item position. Item response data from the Medical College Admission Test (MCAT) were analyzed. A data set was…
Descriptors: College Entrance Examinations, Difficulty Level, Educational Research, Error of Measurement
Curry, Allen R.; And Others – 1978
The efficacy of employing subsets of items from a calibrated item pool to estimate the Rasch model person parameters was investigated. Specifically, the degree of invariance of Rasch model ability-parameter estimates was examined across differing collections of simulated items. The ability-parameter estimates were obtained from a simulation of…
Descriptors: Career Development, Difficulty Level, Equated Scores, Error of Measurement
A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.
Benson, Jeri; Wilson, Michael – 1979
Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…
Descriptors: Comparative Analysis, Difficulty Level, Efficiency, Error of Measurement
Patience, Wayne M.; Reckase, Mark D. – 1979
An experiment was performed with computer-generated data to investigate some of the operational characteristics of tailored testing as they are related to various provisions of the computer program and item pool. With respect to the computer program, two characteristics were varied: the size of the step of increase or decrease in item difficulty…
Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Error of Measurement
Carlson, James E.; Spray, Judith A. – 1986
This paper discussed methods currently under study for use with multiple-response data. Besides using Bonferroni inequality methods to control type one error rate over a set of inferences involving multiple response data, a recently proposed methodology of plotting the p-values resulting from multiple significance tests was explored. Proficiency…
Descriptors: Cutting Scores, Data Analysis, Difficulty Level, Error of Measurement