Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 0
Since 2006 (last 20 years): 6
Descriptor
Cutting Scores: 10
Classification: 5
Item Response Theory: 5
Computation: 3
Test Format: 3
Accuracy: 2
Achievement Tests: 2
Correlation: 2
Equated Scores: 2
Mastery Tests: 2
Mathematics Tests: 2
Source
Applied Psychological Measurement: 10
Author
Brennan, Robert L.: 2
Wyse, Adam E.: 2
Cheng, Ying: 1
Divgi, D. R.: 1
Gao, Rui: 1
Hao, Shiqi: 1
Jones, Andrew T.: 1
Kim, Seonghoon: 1
Lathrop, Quinn N.: 1
Lockwood, Robert E.: 1
Mellenbergh, Gideon J.: 1
Publication Type
Journal Articles: 9
Reports - Research: 6
Reports - Evaluative: 2
Reports - Descriptive: 1
Education Level
High Schools: 1
Higher Education: 1
Secondary Education: 1
Location
Michigan: 1
Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013
Within the framework of item response theory (IRT), there are two recent lines of work on estimating the classification accuracy (CA) rate. One approach estimates CA when decisions are based on total sum scores; the other, when they are based on latent trait estimates. The former is referred to as the Lee approach and the latter as the Rudner approach,…
Descriptors: Item Response Theory, Accuracy, Classification, Computation
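A minimal sketch of the Rudner-style CA estimate described above, under a normal measurement-error model; the trait estimates, standard errors, and cut score are hypothetical, and this is not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def rudner_ca(theta_hat, se, cut):
    """Marginal CA: average probability that a fresh estimate of each
    examinee's theta would fall on the same side of the cut as theta_hat."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    # P(estimate >= cut) for each examinee, treating theta_hat as true theta
    p_above = 1.0 - norm.cdf((cut - theta_hat) / se)
    # Probability of the decision actually made for each examinee
    p_correct = np.where(theta_hat >= cut, p_above, 1.0 - p_above)
    return float(p_correct.mean())

rng = np.random.default_rng(1)
theta_hat = rng.normal(0.0, 1.0, 500)   # trait estimates from an IRT scoring run
se = np.full(500, 0.3)                  # conditional standard errors of measurement
print(f"Rudner-style CA at cut 0.5: {rudner_ca(theta_hat, se, 0.5):.3f}")
```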
Jones, Andrew T. – Applied Psychological Measurement, 2011
Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them, including the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research to date has…
Descriptors: Test Items, Item Analysis, Cutting Scores, Statistics
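The four statistics named above can be sketched in a few lines. The definitions follow common usage (B index as the difficulty difference between passers and failers, agreement as the match rate between item score and pass/fail status); the responses are simulated, not the article's data:

```python
import numpy as np

def item_stats(item, total, cut):
    """item: 0/1 scores on one item; total: total test scores;
    cut: passing score on the total-score scale."""
    passed = (total >= cut).astype(float)
    pbis = np.corrcoef(item, total)[0, 1]          # point-biserial correlation
    b_index = item[passed == 1].mean() - item[passed == 0].mean()
    agree = np.mean(item == passed)                # item score matches pass/fail
    phi = np.corrcoef(item, passed)[0, 1]          # phi coefficient
    return pbis, b_index, agree, phi

# Simulated 40-item test: response probability rises with ability (Rasch-like)
rng = np.random.default_rng(7)
ability = rng.normal(size=1000)
difficulty = rng.normal(0, 1, 40)
probs = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
resp = (rng.random((1000, 40)) < probs).astype(int)
total = resp.sum(axis=1)                           # note: includes the item itself
print(item_stats(resp[:, 0], total, cut=20))
```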
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
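A Rudner-style consistency index can be sketched the same way: the probability that two independent trait estimates for the same examinee would fall on the same side of the cut. Inputs here are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def rudner_cc(theta_hat, se, cut):
    # P(pass) for each examinee under a normal measurement-error model
    p_pass = 1.0 - norm.cdf((cut - np.asarray(theta_hat, float))
                            / np.asarray(se, float))
    # Same decision on two independent replications: both pass or both fail
    return float(np.mean(p_pass**2 + (1.0 - p_pass)**2))

rng = np.random.default_rng(8)
theta_hat, se = rng.normal(0, 1, 500), np.full(500, 0.3)
print(f"classification consistency at cut 0.5: {rudner_cc(theta_hat, se, 0.5):.3f}")
```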
Wyse, Adam E. – Applied Psychological Measurement, 2011
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other; instead, the forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics
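One way to see the effect is a small simulation, assuming a Rasch model and a uniform 0.10-logit difficulty shift between forms; both assumptions are illustrative, not the article's design:

```python
import numpy as np

rng = np.random.default_rng(11)
n, k = 2000, 40
theta = rng.normal(0, 1, n)

def administer(b):
    """Simulate number-correct scores on a form with difficulties b (Rasch)."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random((n, k)) < p).sum(axis=1)

b_a = rng.normal(0.0, 1, k)
b_b = b_a + 0.10                       # form B uniformly 0.10 logits harder
pass_a = administer(b_a) >= 24         # same raw cut applied to both forms
pass_b = administer(b_b) >= 24
print("pass rate A:", pass_a.mean(), "pass rate B:", pass_b.mean())
print("decision agreement between forms:", (pass_a == pass_b).mean())
```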
Yang, Wen-Ling; Gao, Rui – Applied Psychological Measurement, 2008
This study investigates whether the functions linking number-correct scores to the College-Level Examination Program (CLEP) scaled scores remain invariant over gender groups, using test data on the 16 testlet-based forms of the CLEP College Algebra exam. To be consistent with operational practice, linking of various test forms to a common…
Descriptors: Mathematics Tests, Algebra, Item Response Theory, Testing Programs
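A population-invariance check of this kind can be sketched by computing the linking function separately within each group and comparing them; equipercentile linking on simulated scores stands in here for the operational IRT-based procedure:

```python
import numpy as np

def equipercentile_link(x_scores, y_scores, grid):
    """Map each raw score in `grid` on the new form to the reference-form
    score with the same percentile rank."""
    pct = np.array([np.mean(x_scores <= g) for g in grid])
    return np.quantile(y_scores, pct)

rng = np.random.default_rng(5)
grid = np.arange(0, 41)
for name, shift in [("group 1", 0.00), ("group 2", 0.05)]:
    x = rng.binomial(40, 0.60 + shift, 3000)   # new form, this group
    y = rng.binomial(40, 0.63 + shift, 3000)   # reference form, this group
    # If linking is invariant, the two printed rows should be close
    print(name, equipercentile_link(x, y, grid)[20:26].round(1))
```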
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
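One way to compute such an index, assuming a Rasch model with hypothetical item difficulties and cut scores (not the authors' exact formulation), is to build each form's number-correct distribution with the Lord-Wingersky recursion and average the probability of matching decisions:

```python
import numpy as np

def score_dist(theta, b):
    """Number-correct score distribution given theta under a Rasch model,
    built with the Lord-Wingersky recursion."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    dist = np.array([1.0])                        # P(0 correct | 0 items)
    for pj in p:
        dist = np.concatenate([dist * (1 - pj), [0.0]]) \
             + np.concatenate([[0.0], dist * pj])
    return dist

def p_pass(theta, b, cut):
    return score_dist(theta, b)[cut:].sum()       # P(number-correct >= cut)

rng = np.random.default_rng(9)
b_a = rng.normal(0.00, 1, 30)                     # form A item difficulties
b_b = rng.normal(0.05, 1, 30)                     # form B: slightly harder
thetas = rng.normal(0, 1, 200)
cons = 0.0
for t in thetas:
    pa, pb = p_pass(t, b_a, 18), p_pass(t, b_b, 18)
    cons += pa * pb + (1 - pa) * (1 - pb)         # same decision on both forms
print(f"expected decision consistency across forms: {cons / len(thetas):.3f}")
```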

Divgi, D. R. – Applied Psychological Measurement, 1980
The dependence of reliability indices for mastery tests on mean and cutoff scores was examined in the case of three decision-theoretic indices. Dependence of kappa on mean and cutoff scores was opposite to that of the proportion of correct decisions, which was linearly related to average threshold loss. (Author/BW)
Descriptors: Classification, Cutting Scores, Mastery Tests, Test Reliability
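The contrast can be illustrated with simulated true and observed scores under a simple classical model: as the cut moves away from the mean, the proportion of correct decisions rises while kappa falls. The error variance and cut values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
true = rng.normal(0, 1, 5000)                 # true scores
obs = true + rng.normal(0, 0.5, 5000)         # observed = true + error

for cut in (0.0, 1.0, 2.0):
    d_true, d_obs = true >= cut, obs >= cut
    p0 = np.mean(d_true == d_obs)             # proportion of correct decisions
    pe = (d_true.mean() * d_obs.mean()
          + (1 - d_true.mean()) * (1 - d_obs.mean()))  # chance agreement
    kappa = (p0 - pe) / (1 - pe)
    print(f"cut={cut:.1f}: P0={p0:.3f}, kappa={kappa:.3f}")
```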

Brennan, Robert L.; Lockwood, Robert E. – Applied Psychological Measurement, 1980
Generalizability theory is used to characterize and quantify expected variance in cutting scores and to compare the Nedelsky and Angoff procedures for establishing a cutting score. Results suggest that the restricted nature of the Nedelsky (inferred) probability scale may limit its applicability in certain contexts. (Author/BW)
Descriptors: Cutting Scores, Generalization, Statistical Analysis, Test Reliability
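The two procedures can be sketched with hypothetical judge data. The key difference is that Nedelsky ratings are restricted to 1/(options remaining), i.e. {1/4, 1/3, 1/2, 1} for four-option items, while Angoff ratings can take any value in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(4)
n_judges, n_items = 5, 20

# Angoff: each judge rates P(correct) for a minimally competent examinee
angoff = rng.uniform(0.3, 0.9, (n_judges, n_items))
angoff_cut = angoff.sum(axis=1).mean()       # mean over judges of summed ratings

# Nedelsky: judges eliminate 0-3 of the 3 distractors on 4-option items
eliminated = rng.integers(0, 4, (n_judges, n_items))
nedelsky = 1.0 / (4 - eliminated)            # implied P(correct): 1/4 ... 1
nedelsky_cut = nedelsky.sum(axis=1).mean()

print(f"Angoff cut: {angoff_cut:.1f}, Nedelsky cut: {nedelsky_cut:.1f}")
# Judge-to-judge variance in the cut is the kind of quantity that
# generalizability theory partitions into judge and item facets
print("judge SDs:", angoff.sum(axis=1).std(ddof=1).round(2),
      nedelsky.sum(axis=1).std(ddof=1).round(2))
```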

van der Linden, Wim J.; Mellenbergh, Gideon J. – Applied Psychological Measurement, 1977
Using a linear loss function, a procedure is described for computing a cutting score that minimizes the risk for a given decision rule. The procedure is demonstrated with a criterion-referenced achievement test of elementary statistics administered to 167 students. (Author/CTM)
Descriptors: Cutting Scores, Higher Education, Latent Trait Theory, Mastery Tests
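The idea can be illustrated with a grid search over candidate cuts on simulated data, using losses linear in the distance from the criterion cutoff for false passes and false failures; the loss slopes and distributions are hypothetical, and this is not the authors' closed-form procedure:

```python
import numpy as np

rng = np.random.default_rng(6)
tau = rng.normal(60, 10, 4000)            # true scores
x = tau + rng.normal(0, 5, 4000)          # observed = true + error
tau_c = 60.0                              # criterion ("true mastery") cutoff
a_pass, a_fail = 1.0, 1.0                 # loss slopes for false pass / false fail

def risk(cut):
    false_pass = (x >= cut) & (tau < tau_c)   # advanced but not a master
    false_fail = (x < cut) & (tau >= tau_c)   # retained although a master
    loss = np.where(false_pass, a_pass * (tau_c - tau), 0.0) \
         + np.where(false_fail, a_fail * (tau - tau_c), 0.0)
    return loss.mean()

grid = np.arange(50, 71)
best = grid[np.argmin([risk(c) for c in grid])]
print("risk-minimizing observed-score cut:", best)
```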

Norcini, John; And Others – Applied Psychological Measurement, 1991
Effects of numbers of experts (NOEs) and common items (CIs) on the scaling of cutting scores from expert judgments were studied for 11,917 physicians taking 2 forms of a medical specialty examination. Increasing NOEs and CIs reduced error; beyond 5 experts and 25 CIs, error differences were small. (SLD)
Descriptors: Comparative Testing, Cutting Scores, Equated Scores, Estimation (Mathematics)
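The diminishing-returns pattern follows from the 1/sqrt(n) behavior of a mean's standard error; a sketch with a hypothetical judge SD (the same logic applies to averaging over common items):

```python
import numpy as np

sd_judges = 4.0   # assumed SD of individual judges' cut-score recommendations
for n in (2, 5, 10, 20):
    # SE of the mean judgment shrinks as 1/sqrt(n), flattening quickly
    print(f"{n:2d} experts -> SE of mean cut ≈ {sd_judges / np.sqrt(n):.2f}")
```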