Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 3
Descriptor
Error of Measurement: 4
Probability: 4
Item Response Theory: 3
Test Items: 3
Sampling: 2
Test Length: 2
Ability: 1
Adaptive Testing: 1
Bayesian Statistics: 1
Change: 1
College Students: 1
Source
Applied Measurement in Education: 4
Author
Bergstrom, Betty A.: 1
Kannan, Priya: 1
Katz, Irvin R.: 1
Kim, Stella Yun: 1
Lee, Won-Chan: 1
Phillips, Gary W.: 1
Sgammato, Adrienne: 1
Tannenbaum, Richard J.: 1
Publication Type
Journal Articles: 4
Reports - Research: 4
Speeches/Meeting Papers: 1
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods, including number-correct scoring, IRT theta scoring, and hybrid scoring, in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
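For readers unfamiliar with the distinction the abstract draws, the minimal sketch below contrasts number-correct scoring with IRT theta scoring under a Rasch model. The item difficulties, responses, and function name are hypothetical and are not taken from the study, which compares five scoring methods in a linking chain.

```python
# Minimal sketch (not the study's method): number-correct vs. Rasch theta scoring
# for a single response vector, with item difficulties assumed known.
import math

difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical item difficulties
responses    = [1, 1, 1, 0, 0]               # hypothetical 0/1 item responses

# Number-correct (raw) score: simply the count of correct responses.
number_correct = sum(responses)

# IRT theta score: maximum-likelihood estimate under the Rasch model,
# obtained here with a few Newton-Raphson steps.
def rasch_theta(resp, b, iters=25):
    theta = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(theta - bi))) for bi in b]
        grad = sum(x - pi for x, pi in zip(resp, p))   # derivative of log-likelihood
        info = sum(pi * (1.0 - pi) for pi in p)        # Fisher information
        theta += grad / info
    return theta

print("number-correct score:", number_correct)
print("theta estimate:", round(rasch_theta(responses, difficulties), 3))
```

A hybrid method would typically convert one of these scores to the reporting scale before comparing moments across forms; the study's specific scale-score transformations are not reproduced here.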
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
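As a rough illustration of the Angoff aggregation the abstract describes (per-item probability judgments combined into a cut score), here is a minimal sketch. The panelist labels, ratings, and the simple mean-then-sum aggregation are assumptions for illustration, not the study's procedure or its G-theory analysis of item subsets.

```python
# Minimal sketch of an Angoff-style aggregation: each panelist judges, for each
# item, the probability that a minimally qualified candidate answers correctly;
# the cut score is the sum of the per-item mean ratings.
ratings = {
    "panelist_1": [0.6, 0.7, 0.5, 0.8],   # hypothetical probability judgments
    "panelist_2": [0.5, 0.8, 0.6, 0.7],
    "panelist_3": [0.7, 0.6, 0.5, 0.9],
}

n_items = len(next(iter(ratings.values())))
item_means = [sum(r[i] for r in ratings.values()) / len(ratings) for i in range(n_items)]
cut_score = sum(item_means)   # expected number-correct score for a borderline candidate

print("per-item mean ratings:", [round(m, 2) for m in item_means])
print("recommended cut score:", round(cut_score, 2))
```

The question the study addresses is whether a cut score computed from only a subset of items generalizes to the full test, which the sketch above does not attempt to answer.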
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
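The design-effect arithmetic behind the abstract's claim can be shown with a short sketch. The cluster size, intraclass correlation, and sample size below are made-up values, not the article's data; the formula is the standard cluster-sampling design effect.

```python
# Minimal sketch of how a cluster-sampling design effect inflates sampling error
# (illustrative values only).
m   = 25     # assumed average cluster size (e.g., students per classroom)
rho = 0.20   # assumed intraclass correlation

deff = 1 + (m - 1) * rho          # design effect for cluster sampling
n_nominal = 2000                  # students actually sampled
n_effective = n_nominal / deff    # equivalent simple-random-sample size

# Standard errors computed as if the sample were a simple random sample
# understate the true standard errors by roughly sqrt(deff).
se_inflation = deff ** 0.5
print(f"design effect: {deff:.1f}")
print(f"effective sample size: {n_effective:.0f}")
print(f"SRS-based standard errors are too small by a factor of about {se_inflation:.2f}")
```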

Bergstrom, Betty A.; And Others – Applied Measurement in Education, 1992
Effects of altering test difficulty on examinee ability measures and test length in a computer adaptive test were studied for 225 medical technology students in 3 test difficulty conditions. Results suggest that, with an item pool of sufficient depth and breadth, acceptable targeting to test difficulty is possible. (SLD)
Descriptors: Ability, Adaptive Testing, Change, College Students
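As context for the adaptive-testing design described above, the sketch below shows one generic CAT item-selection step under a Rasch model, with an optional difficulty offset to target an easier or harder test. The item pool, offset value, and maximum-information rule are illustrative assumptions, not the 1992 study's algorithm.

```python
# Minimal sketch of a single adaptive item-selection step under a Rasch model:
# choose the unused item with maximum Fisher information at the (offset) ability
# estimate. All names and values are hypothetical.
import math

item_pool = {"item_a": -1.2, "item_b": -0.4, "item_c": 0.3, "item_d": 1.1}  # difficulties
administered = {"item_a"}
theta_hat = 0.2           # current provisional ability estimate
difficulty_offset = -0.5  # negative offset targets an easier test for this examinee

def information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

target = theta_hat + difficulty_offset
next_item = max(
    (name for name in item_pool if name not in administered),
    key=lambda name: information(target, item_pool[name]),
)
print("next item to administer:", next_item)
```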