Publication Date
  In 2025: 0
  Since 2024: 0
  Since 2021 (last 5 years): 0
  Since 2016 (last 10 years): 1
  Since 2006 (last 20 years): 4
Descriptor
  Sampling: 4
  Test Items: 4
  Error of Measurement: 3
  Cutting Scores: 2
  Equated Scores: 2
  Item Response Theory: 2
  Licensing Examinations…: 2
  Sample Size: 2
  Statistical Inference: 2
  Accuracy: 1
  Computation: 1
Source
  Applied Measurement in Education: 4
Author
  Haertel, Edward H.: 1
  Jones, Andrew T.: 1
  Kannan, Priya: 1
  Katz, Irvin R.: 1
  Kim, Sooyeon: 1
  Kopp, Jason P.: 1
  Michaelides, Michalis P.: 1
  Sgammato, Adrienne: 1
  Tannenbaum, Richard J.: 1
  Walker, Michael: 1
Publication Type
  Journal Articles: 4
  Reports - Research: 4
Education Level
  Grade 8: 1
  Junior High Schools: 1
  Middle Schools: 1
  Secondary Education: 1
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
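As a rough illustration of the small-sample claim in the abstract above, the sketch below simulates dichotomous Rasch responses and recovers item difficulties with a PROX-style normal approximation, comparing n = 25 against n = 500 examinees. The simulation design, the PROX shortcut, and all parameter values are assumptions chosen for illustration, not the estimation method used in the study.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_rasch(thetas, bs):
    """Dichotomous responses under the Rasch model: P(x=1) = logistic(theta - b)."""
    p = 1.0 / (1.0 + np.exp(-(thetas[:, None] - bs[None, :])))
    return (rng.random(p.shape) < p).astype(int)

def prox_difficulties(X):
    """PROX-style item difficulty estimates, centered at zero (an approximation)."""
    p = X.mean(axis=0).clip(0.02, 0.98)        # item proportion correct
    logits = np.log((1 - p) / p)               # raw difficulty logits
    # Expansion factor based on the spread of person raw-score logits (PROX uses 1.7^2 = 2.89)
    r = X.mean(axis=1).clip(0.02, 0.98)
    person_logits = np.log(r / (1 - r))
    expand = np.sqrt(1 + np.var(person_logits) / 2.89)
    b = expand * logits
    return b - b.mean()

true_b = np.linspace(-2, 2, 20)                # 20 items, assumed difficulties
for n in (25, 500):                            # small vs. conventional sample
    X = simulate_rasch(rng.normal(0, 1, n), true_b)
    est = prox_difficulties(X)
    print(f"n={n:3d}  r(true, est) = {np.corrcoef(true_b, est)[0, 1]:.3f}")
```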
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine whether a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
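The following sketch illustrates the kind of G-theory decision study the abstract describes: variance components from a raters-by-items Angoff design, then the standard error of the recommended cut score as the number of rated items shrinks. The crossed r × i design and the simulated effect sizes are assumptions for illustration; the study's actual design and estimates may differ.

```python
import numpy as np

rng = np.random.default_rng(11)

# Simulated Angoff ratings, raters crossed with items (r x i design).
# Each cell is a judged probability that a minimally competent examinee succeeds.
n_r, n_i = 12, 60
ratings = (0.60
           + rng.normal(0, 0.05, n_r)[:, None]    # rater leniency (assumed)
           + rng.normal(0, 0.12, n_i)[None, :]    # item difficulty (assumed)
           + rng.normal(0, 0.08, (n_r, n_i))      # rater-by-item residual
           ).clip(0, 1)

# Variance components from expected mean squares, one observation per cell.
grand = ratings.mean()
ss_r = n_i * ((ratings.mean(axis=1) - grand) ** 2).sum()
ss_i = n_r * ((ratings.mean(axis=0) - grand) ** 2).sum()
ss_res = ((ratings - grand) ** 2).sum() - ss_r - ss_i

ms_res = ss_res / ((n_r - 1) * (n_i - 1))
var_res = ms_res
var_r = max((ss_r / (n_r - 1) - ms_res) / n_i, 0.0)
var_i = max((ss_i / (n_i - 1) - ms_res) / n_r, 0.0)

# D-study: standard error of the cut score (the mean over raters and items)
# when only a subset of the items is rated.
for n_i_prime in (10, 20, 30, 60):
    se = np.sqrt(var_r / n_r + var_i / n_i_prime + var_res / (n_r * n_i_prime))
    print(f"{n_i_prime:2d} items: SE(cut score) = {se:.4f}")
```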
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
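A minimal resampling sketch of the point above: when the common-item set is treated as fixed, a bootstrap over examinees alone understates the standard error of the linking constant relative to also resampling the common items. The mean-mean linking via raw difficulty logits, the Rasch data-generating model, and all sample sizes are illustrative assumptions, not the study's procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

n_items, n_x, n_y = 15, 300, 300
b = rng.normal(0, 1, n_items)                  # common-item difficulties (assumed)

def responses(n, mean_theta):
    """Rasch responses to the common items for a group with the given mean ability."""
    th = rng.normal(mean_theta, 1, n)
    p = 1 / (1 + np.exp(-(th[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(int)

X, Y = responses(n_x, 0.0), responses(n_y, 0.4)   # group Y assumed 0.4 SD stronger

def link_constant(X, Y, items):
    """Mean-mean linking constant from raw difficulty logits on the chosen items."""
    px = X[:, items].mean(axis=0).clip(0.02, 0.98)
    py = Y[:, items].mean(axis=0).clip(0.02, 0.98)
    bx, by = np.log((1 - px) / px), np.log((1 - py) / py)
    return (bx - by).mean()

B = 2000
all_items = np.arange(n_items)
fixed, random_items = [], []
for _ in range(B):
    xi = rng.integers(0, n_x, n_x)             # resample examinees only ...
    yi = rng.integers(0, n_y, n_y)
    fixed.append(link_constant(X[xi], Y[yi], all_items))
    ii = rng.integers(0, n_items, n_items)     # ... vs. also resample common items
    random_items.append(link_constant(X[xi], Y[yi], ii))

print("SE, common items fixed  :", round(float(np.std(fixed)), 4))
print("SE, common items sampled:", round(float(np.std(random_items)), 4))
```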
Kim, Sooyeon; Walker, Michael – Applied Measurement in Education, 2012
This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b)…
Descriptors: Multiple Choice Tests, Test Format, Test Items, Equated Scores
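A hedged sketch of one way such subpopulation invariance indices can be computed: a chained linear linking function through an anchor in a NEAT design, evaluated for each subgroup and compared with the total-group function via a standardized RMSD. The simulated scores, the single generic anchor (standing in for the MC-only versus MC-plus-CR anchor sets the study compares), and the linear method are assumptions, not the study's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(5)

def chained_linear(y, y_q, a_q, a_p, x_p):
    """Chained linear linking of form-Y scores to the form-X scale through
    anchor scores: Y -> anchor within group Q, anchor -> X within group P."""
    a = a_q.mean() + (a_q.std() / y_q.std()) * (y - y_q.mean())
    return x_p.mean() + (x_p.std() / a_p.std()) * (a - a_p.mean())

def simulate(n, mu):
    """Illustrative total and anchor scores driven by one latent ability."""
    theta = rng.normal(mu, 1, n)
    total  = 40 + 8 * theta + rng.normal(0, 3, n)
    anchor = 20 + 4 * theta + rng.normal(0, 2, n)
    sub = rng.integers(0, 2, n)                # two subpopulations, assigned at random
    return total, anchor, sub

x_p, a_p, g_p = simulate(2000, 0.0)            # group P took form X
y_q, a_q, g_q = simulate(2000, 0.3)            # group Q took form Y, assumed stronger

grid = np.linspace(np.percentile(y_q, 1), np.percentile(y_q, 99), 41)
total_fn = chained_linear(grid, y_q, a_q, a_p, x_p)

for g in (0, 1):
    sub_fn = chained_linear(grid, y_q[g_q == g], a_q[g_q == g],
                            a_p[g_p == g], x_p[g_p == g])
    rmsd = np.sqrt(np.mean((sub_fn - total_fn) ** 2)) / x_p.std()
    print(f"subgroup {g}: standardized RMSD vs. total-group linking = {rmsd:.4f}")
```

Re-running such a comparison with a different anchor column would stand in for the study's manipulation of anchor composition; here the subgroups are assigned at random, so the index should come out near zero.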