ERIC - Search Results

Showing all 4 results

Evaluating the Consistency of Angoff-Based Cut Scores Using Subsets of Items within a Generalizability Theory Framework

Peer reviewed

Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015

The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine whether a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…

Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items

Estimating the Internal Consistency Reliability of Tests Composed of Testlets Varying in Length.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the degree of bias in testlet-based alpha (internal consistency reliability) through hypothetical examples and real test data from four tests of the Iowa Tests of Basic Skills. Presents a simple formula for computing a testlet-based congeneric coefficient. (SLD)

Descriptors: Estimation (Mathematics), Reliability, Statistical Bias, Test Format

The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items.

Peer reviewed

Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001

Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…

Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability

Increasing Score Reliability with Item-Pattern Scoring: An Empirical Study in Five Score Metrics.

Peer reviewed

Yen, Wendy M.; Candell, Gregory L. – Applied Measurement in Education, 1991

Empirical reliabilities of scores based on item-pattern scoring, using 3-parameter item-response theory and number-correct scoring, were compared within each of 5 score metrics for at least 900 elementary school students for 5 content areas. Average increases in reliability were produced by item-pattern scoring. (SLD)

Descriptors: Elementary Education, Elementary School Students, Grade Equivalent Scores, Item Response Theory
