Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 0
Since 2006 (last 20 years): 6
Descriptor
Cutting Scores: 10
Classification: 5
Item Response Theory: 5
Computation: 3
Test Format: 3
Accuracy: 2
Achievement Tests: 2
Correlation: 2
Equated Scores: 2
Mastery Tests: 2
Mathematics Tests: 2
Source
Applied Psychological Measurement: 10
Author
Brennan, Robert L.: 2
Wyse, Adam E.: 2
Cheng, Ying: 1
Divgi, D. R.: 1
Gao, Rui: 1
Hao, Shiqi: 1
Jones, Andrew T.: 1
Kim, Seonghoon: 1
Lathrop, Quinn N.: 1
Lockwood, Robert E.: 1
Mellenbergh, Gideon J.: 1
Publication Type
Journal Articles: 9
Reports - Research: 6
Reports - Evaluative: 2
Reports - Descriptive: 1
Education Level
High Schools: 1
Higher Education: 1
Secondary Education: 1
Location
Michigan: 1
Lathrop, Quinn N.; Cheng, Ying – Applied Psychological Measurement, 2013
Within the framework of item response theory (IRT), there are two recent lines of work on estimating the classification accuracy (CA) rate. One approach estimates CA when decisions are based on total sum scores; the other, when they are based on latent trait estimates. The former is referred to as the Lee approach and the latter as the Rudner approach,…
Descriptors: Item Response Theory, Accuracy, Classification, Computation
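A minimal sketch of the Rudner-style CA estimate described above, under a normal measurement-error model; the trait estimates, standard errors, and cut score are hypothetical, and this is not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def rudner_ca(theta_hat, se, cut):
    """Marginal CA: average probability that a fresh estimate of each
    examinee's theta would fall on the same side of the cut as theta_hat."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    # P(estimate >= cut) for each examinee, treating theta_hat as true theta
    p_above = 1.0 - norm.cdf((cut - theta_hat) / se)
    # Probability of the decision actually made for each examinee
    p_correct = np.where(theta_hat >= cut, p_above, 1.0 - p_above)
    return float(p_correct.mean())

rng = np.random.default_rng(1)
theta_hat = rng.normal(0.0, 1.0, 500)   # trait estimates from an IRT scoring run
se = np.full(500, 0.3)                  # conditional standard errors of measurement
print(f"Rudner-style CA at cut 0.5: {rudner_ca(theta_hat, se, 0.5):.3f}")
```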
Jones, Andrew T. – Applied Psychological Measurement, 2011
Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them, including the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research to date has…
Descriptors: Test Items, Item Analysis, Cutting Scores, Statistics
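The four statistics named above can be sketched in a few lines. The definitions follow common usage (B index as the difficulty difference between passers and failers, agreement as the match rate between item score and pass/fail status); the responses are simulated, not the article's data:

```python
import numpy as np

def item_stats(item, total, cut):
    """item: 0/1 scores on one item; total: total test scores;
    cut: passing score on the total-score scale."""
    passed = (total >= cut).astype(float)
    pbis = np.corrcoef(item, total)[0, 1]          # point-biserial correlation
    b_index = item[passed == 1].mean() - item[passed == 0].mean()
    agree = np.mean(item == passed)                # item score matches pass/fail
    phi = np.corrcoef(item, passed)[0, 1]          # phi coefficient
    return pbis, b_index, agree, phi

# Simulated 40-item test: response probability rises with ability (Rasch-like)
rng = np.random.default_rng(7)
ability = rng.normal(size=1000)
difficulty = rng.normal(0, 1, 40)
probs = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
resp = (rng.random((1000, 40)) < probs).astype(int)
total = resp.sum(axis=1)                           # note: includes the item itself
print(item_stats(resp[:, 0], total, cut=20))
```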
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
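A Rudner-style consistency index can be sketched the same way: the probability that two independent trait estimates for the same examinee would fall on the same side of the cut. Inputs here are hypothetical:

```python
import numpy as np
from scipy.stats import norm

def rudner_cc(theta_hat, se, cut):
    # P(pass) for each examinee under a normal measurement-error model
    p_pass = 1.0 - norm.cdf((cut - np.asarray(theta_hat, float))
                            / np.asarray(se, float))
    # Same decision on two independent replications: both pass or both fail
    return float(np.mean(p_pass**2 + (1.0 - p_pass)**2))

rng = np.random.default_rng(8)
theta_hat, se = rng.normal(0, 1, 500), np.full(500, 0.3)
print(f"classification consistency at cut 0.5: {rudner_cc(theta_hat, se, 0.5):.3f}")
```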
Wyse, Adam E. – Applied Psychological Measurement, 2011
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other; instead, the forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics
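One way to see the effect is a small simulation, assuming a Rasch model and a uniform 0.10-logit difficulty shift between forms; both assumptions are illustrative, not the article's design:

```python
import numpy as np

rng = np.random.default_rng(11)
n, k = 2000, 40
theta = rng.normal(0, 1, n)

def administer(b):
    """Simulate number-correct scores on a form with difficulties b (Rasch)."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random((n, k)) < p).sum(axis=1)

b_a = rng.normal(0.0, 1, k)
b_b = b_a + 0.10                       # form B uniformly 0.10 logits harder
pass_a = administer(b_a) >= 24         # same raw cut applied to both forms
pass_b = administer(b_b) >= 24
print("pass rate A:", pass_a.mean(), "pass rate B:", pass_b.mean())
print("decision agreement between forms:", (pass_a == pass_b).mean())
```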
Yang, Wen-Ling; Gao, Rui – Applied Psychological Measurement, 2008
This study investigates whether the functions linking number-correct scores to the College-Level Examination Program (CLEP) scaled scores remain invariant over gender groups, using test data on the 16 testlet-based forms of the CLEP College Algebra exam. To be consistent with operational practice, linking of various test forms to a common…
Descriptors: Mathematics Tests, Algebra, Item Response Theory, Testing Programs
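A population-invariance check of this kind can be sketched by computing the linking function separately within each group and comparing them; equipercentile linking on simulated scores stands in here for the operational IRT-based procedure:

```python
import numpy as np

def equipercentile_link(x_scores, y_scores, grid):
    """Map each raw score in `grid` on the new form to the reference-form
    score with the same percentile rank."""
    pct = np.array([np.mean(x_scores <= g) for g in grid])
    return np.quantile(y_scores, pct)

rng = np.random.default_rng(5)
grid = np.arange(0, 41)
for name, shift in [("group 1", 0.00), ("group 2", 0.05)]:
    x = rng.binomial(40, 0.60 + shift, 3000)   # new form, this group
    y = rng.binomial(40, 0.63 + shift, 3000)   # reference form, this group
    # If linking is invariant, the two printed rows should be close
    print(name, equipercentile_link(x, y, grid)[20:26].round(1))
```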
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
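One way to compute such an index, assuming a Rasch model with hypothetical item difficulties and cut scores (not the authors' exact formulation), is to build each form's number-correct distribution with the Lord-Wingersky recursion and average the probability of matching decisions:

```python
import numpy as np

def score_dist(theta, b):
    """Number-correct score distribution given theta under a Rasch model,
    built with the Lord-Wingersky recursion."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    dist = np.array([1.0])                        # P(0 correct | 0 items)
    for pj in p:
        dist = np.concatenate([dist * (1 - pj), [0.0]]) \
             + np.concatenate([[0.0], dist * pj])
    return dist

def p_pass(theta, b, cut):
    return score_dist(theta, b)[cut:].sum()       # P(number-correct >= cut)

rng = np.random.default_rng(9)
b_a = rng.normal(0.00, 1, 30)                     # form A item difficulties
b_b = rng.normal(0.05, 1, 30)                     # form B: slightly harder
thetas = rng.normal(0, 1, 200)
cons = 0.0
for t in thetas:
    pa, pb = p_pass(t, b_a, 18), p_pass(t, b_b, 18)
    cons += pa * pb + (1 - pa) * (1 - pb)         # same decision on both forms
print(f"expected decision consistency across forms: {cons / len(thetas):.3f}")
```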

Divgi, D. R. – Applied Psychological Measurement, 1980
The dependence of reliability indices for mastery tests on mean and cutoff scores was examined in the case of three decision-theoretic indices. Dependence of kappa on mean and cutoff scores was opposite to that of the proportion of correct decisions, which was linearly related to average threshold loss. (Author/BW)
Descriptors: Classification, Cutting Scores, Mastery Tests, Test Reliability
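The contrast can be illustrated with simulated true and observed scores under a simple classical model: as the cut moves away from the mean, the proportion of correct decisions rises while kappa falls. The error variance and cut values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
true = rng.normal(0, 1, 5000)                 # true scores
obs = true + rng.normal(0, 0.5, 5000)         # observed = true + error

for cut in (0.0, 1.0, 2.0):
    d_true, d_obs = true >= cut, obs >= cut
    p0 = np.mean(d_true == d_obs)             # proportion of correct decisions
    pe = (d_true.mean() * d_obs.mean()
          + (1 - d_true.mean()) * (1 - d_obs.mean()))  # chance agreement
    kappa = (p0 - pe) / (1 - pe)
    print(f"cut={cut:.1f}: P0={p0:.3f}, kappa={kappa:.3f}")
```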

Brennan, Robert L.; Lockwood, Robert E. – Applied Psychological Measurement, 1980
Generalizability theory is used to characterize and quantify expected variance in cutting scores and to compare the Nedelsky and Angoff procedures for establishing a cutting score. Results suggest that the restricted nature of the Nedelsky (inferred) probability scale may limit its applicability in certain contexts. (Author/BW)
Descriptors: Cutting Scores, Generalization, Statistical Analysis, Test Reliability
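The two procedures can be sketched with hypothetical judge data. The key difference is that Nedelsky ratings are restricted to 1/(options remaining), i.e. {1/4, 1/3, 1/2, 1} for four-option items, while Angoff ratings can take any value in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(4)
n_judges, n_items = 5, 20

# Angoff: each judge rates P(correct) for a minimally competent examinee
angoff = rng.uniform(0.3, 0.9, (n_judges, n_items))
angoff_cut = angoff.sum(axis=1).mean()       # mean over judges of summed ratings

# Nedelsky: judges eliminate 0-3 of the 3 distractors on 4-option items
eliminated = rng.integers(0, 4, (n_judges, n_items))
nedelsky = 1.0 / (4 - eliminated)            # implied P(correct): 1/4 ... 1
nedelsky_cut = nedelsky.sum(axis=1).mean()

print(f"Angoff cut: {angoff_cut:.1f}, Nedelsky cut: {nedelsky_cut:.1f}")
# Judge-to-judge variance in the cut is the kind of quantity that
# generalizability theory partitions into judge and item facets
print("judge SDs:", angoff.sum(axis=1).std(ddof=1).round(2),
      nedelsky.sum(axis=1).std(ddof=1).round(2))
```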

van der Linden, Wim J.; Mellenbergh, Gideon J. – Applied Psychological Measurement, 1977
Using a linear loss function, a procedure is described for computing a cutting score that minimizes the risk for a given decision rule. The procedure is demonstrated with a criterion-referenced achievement test of elementary statistics administered to 167 students. (Author/CTM)
Descriptors: Cutting Scores, Higher Education, Latent Trait Theory, Mastery Tests
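The idea can be illustrated with a grid search over candidate cuts on simulated data, using losses linear in the distance from the criterion cutoff for false passes and false failures; the loss slopes and distributions are hypothetical, and this is not the authors' closed-form procedure:

```python
import numpy as np

rng = np.random.default_rng(6)
tau = rng.normal(60, 10, 4000)            # true scores
x = tau + rng.normal(0, 5, 4000)          # observed = true + error
tau_c = 60.0                              # criterion ("true mastery") cutoff
a_pass, a_fail = 1.0, 1.0                 # loss slopes for false pass / false fail

def risk(cut):
    false_pass = (x >= cut) & (tau < tau_c)   # advanced but not a master
    false_fail = (x < cut) & (tau >= tau_c)   # retained although a master
    loss = np.where(false_pass, a_pass * (tau_c - tau), 0.0) \
         + np.where(false_fail, a_fail * (tau - tau_c), 0.0)
    return loss.mean()

grid = np.arange(50, 71)
best = grid[np.argmin([risk(c) for c in grid])]
print("risk-minimizing observed-score cut:", best)
```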

Norcini, John; And Others – Applied Psychological Measurement, 1991
Effects of numbers of experts (NOEs) and common items (CIs) on the scaling of cutting scores from expert judgments were studied for 11,917 physicians taking 2 forms of a medical specialty examination. Increasing NOEs and CIs reduced error; beyond 5 experts and 25 CIs, error differences were small. (SLD)
Descriptors: Comparative Testing, Cutting Scores, Equated Scores, Estimation (Mathematics)
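The diminishing-returns pattern follows from the 1/sqrt(n) behavior of a mean's standard error; a sketch with a hypothetical judge SD (the same logic applies to averaging over common items):

```python
import numpy as np

sd_judges = 4.0   # assumed SD of individual judges' cut-score recommendations
for n in (2, 5, 10, 20):
    # SE of the mean judgment shrinks as 1/sqrt(n), flattening quickly
    print(f"{n:2d} experts -> SE of mean cut ≈ {sd_judges / np.sqrt(n):.2f}")
```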