ERIC - Search Results

Showing all 12 results

Some Relationships between the Binomial Error Model and Classical Test Theory.

Peer reviewed

Feldt, Leonard S. – Educational and Psychological Measurement, 1984

The binomial error model includes form-to-form difficulty differences as error variance and leads to Kuder-Richardson formula 21 as an estimate of reliability. If the form-to-form component is removed from the estimate of error variance, the binomial model leads to KR 20 as the reliability estimate. (Author/BW)

Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Mathematical Formulas
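
As a rough illustration of the two coefficients named in the abstract above, the sketch below computes KR 20 and KR 21 from a matrix of dichotomous item scores using the standard textbook formulas. It is not Feldt's derivation; the function names `kr20` and `kr21`, the sample data, and the use of the population variance (dividing by n) are assumptions made for this example.

```python
def _total_score_stats(scores):
    """Return (n, k, mean, population variance) of examinee total scores."""
    n = len(scores)
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    # Assumption: population variance (divide by n), applied to both formulas.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return n, k, mean_total, var_total


def kr20(scores):
    """KR 20: uses each item's own proportion correct p_i."""
    n, k, _, var_total = _total_score_stats(scores)
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return k / (k - 1) * (1 - sum_pq / var_total)


def kr21(scores):
    """KR 21: treats all items as equally difficult (uses only the mean)."""
    _, k, mean_total, var_total = _total_score_stats(scores)
    return k / (k - 1) * (1 - mean_total * (k - mean_total) / (k * var_total))


# Hypothetical data: 5 examinees (rows) by 4 items (columns), scored 0/1.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(data), kr21(data))
```

Because the items in this sample differ in difficulty, KR 21 comes out lower than KR 20; the two coincide only when every item has the same proportion correct.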

A New Index for the Accuracy of a Criterion-Referenced Test.

Divgi, D. R. – 1978

One aim of criterion-referenced testing is to classify an examinee without reference to a norm group; therefore, any statements about the dependability of such classification ought to be group-independent also. A population-independent index is proposed in terms of the probability of incorrect classification near the cutoff true score. The…

Descriptors: Criterion Referenced Tests, Cutting Scores, Difficulty Level, Error of Measurement

Accounting for the Uncertainty in Performance Standards.

deGruijter, Dato N. M. – 1980

The setting of standards involves subjective value judgments. The inherent arbitrariness of specific standards has been severely criticized by Glass. His antagonists agree that standard setting is a judgmental task but they have pointed out that arbitrariness in the positive sense of serious judgmental decisions is unavoidable. Further, small…

Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests

An Empirical Investigation of Lu's Method of Reliability Estimation.

Peer reviewed

Huck, Schuyler W.; And Others – Educational and Psychological Measurement, 1981

Believing that examinee-by-item interaction should be conceptualized as true score variability rather than as a result of errors of measurement, Lu proposed a modification of Hoyt's analysis of variance reliability procedure. Via a computer simulation study, it is shown that Lu's approach does not separate interaction from error. (Author/RL)

Descriptors: Analysis of Variance, Comparative Analysis, Computer Programs, Difficulty Level

Dimensionality Assessment for Dichotomously Scored Items Using Multidimensional Scaling.

Jones, Patricia B.; And Others – 1987

In order to determine the effectiveness of multidimensional scaling (MDS) in recovering the dimensionality of a set of dichotomously-scored items, data were simulated in one, two, and three dimensions for a variety of correlations with the underlying latent trait. Similarity matrices were constructed from these data using three margin-sensitive…

Descriptors: Cluster Analysis, Correlation, Difficulty Level, Error of Measurement

Adjusting Scores on Examinations Offering a Choice of Questions.

Livingston, Samuel A. – 1986

This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…

Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models

The Influence of Dimensionality on CAT Ability Estimation.

Peer reviewed

De Ayala, R. J. – Educational and Psychological Measurement, 1992

Effects of dimensionality on ability estimation of an adaptive test were examined using generated data in Bayesian computerized adaptive testing (CAT) simulations. Generally, increasing interdimensional difficulty association produced a slight decrease in test length and an increase in accuracy of ability estimation as assessed by root mean square…

Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Computer Simulation

Latent Trait Models for Partially Speeded Tests.

Wise, Lauress L. – 1986

A primary goal of this study was to determine the extent to which item difficulty was related to item position and, if a significant relationship was found, to suggest adjustments to predicted item difficulty that reflect differences in item position. Item response data from the Medical College Admission Test (MCAT) were analyzed. A data set was…

Descriptors: College Entrance Examinations, Difficulty Level, Educational Research, Error of Measurement

Invariance of Rasch Model Ability Parameter Estimates Over Different Collections of Items.

Curry, Allen R.; And Others – 1978

The efficacy of employing subsets of items from a calibrated item pool to estimate the Rasch model person parameters was investigated. Specifically, the degree of invariance of Rasch model ability-parameter estimates was examined across differing collections of simulated items. The ability-parameter estimates were obtained from a simulation of…

Descriptors: Career Development, Difficulty Level, Equated Scores, Error of Measurement
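
For readers unfamiliar with how a Rasch person parameter is obtained from calibrated items, the following is a minimal sketch of maximum-likelihood ability estimation by Newton-Raphson, assuming the item difficulties are already known. It is not the procedure used in the study; the function names `rasch_prob` and `estimate_ability` and the difficulty values are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch model: probability of a correct response given ability theta
    and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, max_iter=50, tol=1e-10):
    """Maximum-likelihood ability estimate for one examinee.

    responses: list of 0/1 item scores; difficulties: matching list of
    calibrated item difficulties (assumed known).
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        # The likelihood has no finite maximum for these patterns.
        raise ValueError("no finite ML estimate for all-0 or all-1 patterns")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = raw - sum(probs)                      # first derivative
        information = sum(p * (1.0 - p) for p in probs)  # minus the second derivative
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical item subset: three items with difficulties -1, 0, and 1.
theta_hat = estimate_ability([1, 1, 0], [-1.0, 0.0, 1.0])
print(theta_hat)
```

Invariance across item collections could then be examined by calling `estimate_ability` once per subset of the pool for the same examinee and comparing the resulting estimates.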

A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.

Benson, Jeri; Wilson, Michael – 1979

Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…

Descriptors: Comparative Analysis, Difficulty Level, Efficiency, Error of Measurement

Operational Characteristics of a One-Parameter Tailored Testing Procedure. Research Report 79-2.

Patience, Wayne M.; Reckase, Mark D. – 1979

An experiment was performed with computer-generated data to investigate some of the operational characteristics of tailored testing as they are related to various provisions of the computer program and item pool. With respect to the computer program, two characteristics were varied: the size of the step of increase or decrease in item difficulty…

Descriptors: Adaptive Testing, Computer Assisted Testing, Difficulty Level, Error of Measurement

Analysis of Contingency Tables Involving Multiple-Response Data.

Carlson, James E.; Spray, Judith A. – 1986

This paper discussed methods currently under study for use with multiple-response data. Besides using Bonferroni inequality methods to control the Type I error rate over a set of inferences involving multiple-response data, a recently proposed methodology of plotting the p-values resulting from multiple significance tests was explored. Proficiency…

Descriptors: Cutting Scores, Data Analysis, Difficulty Level, Error of Measurement
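
The Bonferroni approach mentioned in the abstract above can be sketched in a few lines: with m tests and a familywise error rate alpha, each test is judged against alpha / m. This shows only the general technique, not the authors' analysis; the function name `bonferroni_reject` and the p-values are made up for the example.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per test: True if the null hypothesis is rejected
    at familywise level alpha under the Bonferroni correction."""
    m = len(p_values)
    threshold = alpha / m  # per-test significance level
    return [p <= threshold for p in p_values]


# Hypothetical p-values from three significance tests.
decisions = bonferroni_reject([0.001, 0.02, 0.04], alpha=0.05)
print(decisions)
```

With three tests the per-test threshold is about 0.0167, so only the first p-value leads to rejection even though all three fall below the unadjusted 0.05 level.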