Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 1
Since 2006 (last 20 years): 3
Descriptor
Error of Measurement: 4
Probability: 4
Item Response Theory: 3
Test Items: 3
Sampling: 2
Test Length: 2
Ability: 1
Adaptive Testing: 1
Bayesian Statistics: 1
Change: 1
College Students: 1
Source
Applied Measurement in Education: 4
Author
Bergstrom, Betty A.: 1
Kannan, Priya: 1
Katz, Irvin R.: 1
Kim, Stella Yun: 1
Lee, Won-Chan: 1
Phillips, Gary W.: 1
Sgammato, Adrienne: 1
Tannenbaum, Richard J.: 1
Publication Type
Journal Articles: 4
Reports - Research: 4
Speeches/Meeting Papers: 1
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods, including number-correct scoring, IRT theta scoring, and hybrid scoring, in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
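For readers unfamiliar with the distinction the abstract draws, the minimal sketch below contrasts number-correct scoring with IRT theta scoring under a Rasch model. The item difficulties, responses, and function name are hypothetical and are not taken from the study, which compares five scoring methods in a linking chain.

```python
# Minimal sketch (not the study's method): number-correct vs. Rasch theta scoring
# for a single response vector, with item difficulties assumed known.
import math

difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical item difficulties
responses    = [1, 1, 1, 0, 0]               # hypothetical 0/1 item responses

# Number-correct (raw) score: simply the count of correct responses.
number_correct = sum(responses)

# IRT theta score: maximum-likelihood estimate under the Rasch model,
# obtained here with a few Newton-Raphson steps.
def rasch_theta(resp, b, iters=25):
    theta = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(theta - bi))) for bi in b]
        grad = sum(x - pi for x, pi in zip(resp, p))   # derivative of log-likelihood
        info = sum(pi * (1.0 - pi) for pi in p)        # Fisher information
        theta += grad / info
    return theta

print("number-correct score:", number_correct)
print("theta estimate:", round(rasch_theta(responses, difficulties), 3))
```

A hybrid method would typically convert one of these scores to the reporting scale before comparing moments across forms; the study's specific scale-score transformations are not reproduced here.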
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
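As a rough illustration of the Angoff aggregation the abstract describes (per-item probability judgments combined into a cut score), here is a minimal sketch. The panelist labels, ratings, and the simple mean-then-sum aggregation are assumptions for illustration, not the study's procedure or its G-theory analysis of item subsets.

```python
# Minimal sketch of an Angoff-style aggregation: each panelist judges, for each
# item, the probability that a minimally qualified candidate answers correctly;
# the cut score is the sum of the per-item mean ratings.
ratings = {
    "panelist_1": [0.6, 0.7, 0.5, 0.8],   # hypothetical probability judgments
    "panelist_2": [0.5, 0.8, 0.6, 0.7],
    "panelist_3": [0.7, 0.6, 0.5, 0.9],
}

n_items = len(next(iter(ratings.values())))
item_means = [sum(r[i] for r in ratings.values()) / len(ratings) for i in range(n_items)]
cut_score = sum(item_means)   # expected number-correct score for a borderline candidate

print("per-item mean ratings:", [round(m, 2) for m in item_means])
print("recommended cut score:", round(cut_score, 2))
```

The question the study addresses is whether a cut score computed from only a subset of items generalizes to the full test, which the sketch above does not attempt to answer.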
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
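The design-effect arithmetic behind the abstract's claim can be shown with a short sketch. The cluster size, intraclass correlation, and sample size below are made-up values, not the article's data; the formula is the standard cluster-sampling design effect.

```python
# Minimal sketch of how a cluster-sampling design effect inflates sampling error
# (illustrative values only).
m   = 25     # assumed average cluster size (e.g., students per classroom)
rho = 0.20   # assumed intraclass correlation

deff = 1 + (m - 1) * rho          # design effect for cluster sampling
n_nominal = 2000                  # students actually sampled
n_effective = n_nominal / deff    # equivalent simple-random-sample size

# Standard errors computed as if the sample were a simple random sample
# understate the true standard errors by roughly sqrt(deff).
se_inflation = deff ** 0.5
print(f"design effect: {deff:.1f}")
print(f"effective sample size: {n_effective:.0f}")
print(f"SRS-based standard errors are too small by a factor of about {se_inflation:.2f}")
```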

Bergstrom, Betty A.; And Others – Applied Measurement in Education, 1992
Effects of altering test difficulty on examinee ability measures and test length in a computer adaptive test were studied for 225 medical technology students in 3 test difficulty conditions. Results suggest that, with an item pool of sufficient depth and breadth, acceptable targeting to test difficulty is possible. (SLD)
Descriptors: Ability, Adaptive Testing, Change, College Students
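As context for the adaptive-testing design described above, the sketch below shows one generic CAT item-selection step under a Rasch model, with an optional difficulty offset to target an easier or harder test. The item pool, offset value, and maximum-information rule are illustrative assumptions, not the 1992 study's algorithm.

```python
# Minimal sketch of a single adaptive item-selection step under a Rasch model:
# choose the unused item with maximum Fisher information at the (offset) ability
# estimate. All names and values are hypothetical.
import math

item_pool = {"item_a": -1.2, "item_b": -0.4, "item_c": 0.3, "item_d": 1.1}  # difficulties
administered = {"item_a"}
theta_hat = 0.2           # current provisional ability estimate
difficulty_offset = -0.5  # negative offset targets an easier test for this examinee

def information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

target = theta_hat + difficulty_offset
next_item = max(
    (name for name in item_pool if name not in administered),
    key=lambda name: information(target, item_pool[name]),
)
print("next item to administer:", next_item)
```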