ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	2
Since 2006 (last 20 years)	4

Source

Applied Measurement in…

Author

Bolt, Daniel M.	1
Chuah, Siang Chee	1
De Champlain, Andre	1
Drasgow, Fritz	1
Fitzpatrick, Anne R.	1
Gessaroli, Marc E.	1
Hau, Kit-Tai	1
Luecht, Richard	1
Sinharay, Sandip	1
Wells, Craig S.	1
Xiao, Leifeng	1
Yen, Wendy M.	1
More ▼

Publication Type

Journal Articles	6
Reports - Research	4
Reports - Evaluative	2
Speeches/Meeting Papers	1

Education Level

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

What Works Clearinghouse Rating

Showing all 6 results Save | Export

Accuracy and Sensitivity of Coefficient Alpha and Its Alternatives with Unidimensional and Contaminated Scales

Peer reviewed

Direct link

Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023

We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…

Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length

Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

Peer reviewed

Direct link

Sinharay, Sandip – Applied Measurement in Education, 2017

Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis

Investigation of a Nonparametric Procedure for Assessing Goodness-of-Fit in Item Response Theory

Peer reviewed

Direct link

Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008

Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…

Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics

Assessing the Dimensionality of Item Response Matrices with Small Sample Sizes and Short Test Lengths.

Peer reviewed

De Champlain, Andre; Gessaroli, Marc E. – Applied Measurement in Education, 1998

Type I error rates and rejection rates for three-dimensionality assessment procedures were studied with data sets simulated to reflect short tests and small samples. Results show that the G-squared difference test (D. Bock, R. Gibbons, and E. Muraki, 1988) suffered from a severely inflated Type I error rate at all conditions simulated. (SLD)

Descriptors: Item Response Theory, Matrices, Sample Size, Simulation

How Big Is Big Enough? Sample Size Requirements for CAST Item Parameter Estimation

Peer reviewed

Direct link

Chuah, Siang Chee; Drasgow, Fritz; Luecht, Richard – Applied Measurement in Education, 2006

Adaptive tests offer the advantages of reduced test length and increased accuracy in ability estimation. However, adaptive tests require large pools of precalibrated items. This study looks at the development of an item pool for 1 type of adaptive administration: the computer-adaptive sequential test. An important issue is the sample size required…

Descriptors: Test Length, Sample Size, Adaptive Testing, Item Response Theory

The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items.

Peer reviewed

Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001

Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…

Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability

Simulation	6
Test Length	6
Sample Size	5
Item Response Theory	4
Goodness of Fit	2
Nonparametric Statistics	2
Accuracy	1
Adaptive Testing	1
Comparative Analysis	1
Computation	1
Computer Assisted Testing	1
Constructed Response	1
Design Requirements	1
Equated Scores	1
Error Patterns	1
Evaluation Methods	1
Evaluation Problems	1
Factor Analysis	1
Formative Evaluation	1
Mathematics Tests	1
Matrices	1
Measures (Individuals)	1
Monte Carlo Methods	1
Reliability	1
Replication (Evaluation)	1
More ▼