Showing all 11 results
Peer reviewed
Köhler, Carmen; Robitzsch, Alexander; Hartig, Johannes – Journal of Educational and Behavioral Statistics, 2020
Testing whether items fit the assumptions of an item response theory model is an important step in evaluating a test. In the literature, numerous item fit statistics exist, many of which show severe limitations. The current study investigates the root mean squared deviation (RMSD) item fit statistic, which is used for evaluating item fit in…
Descriptors: Test Items, Goodness of Fit, Statistics, Bias
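The RMSD statistic investigated above compares an item's observed response function against the model-implied one. A minimal sketch of how such a statistic can be computed, assuming a 2PL item characteristic curve, a discrete grid of ability points with normalized weights, and hypothetical observed proportions (these specifics are illustrative, not taken from the article):

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Model-implied probability of a correct response under the 2PL."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def rmsd_item_fit(theta_points, weights, observed_p, a, b):
    """Weighted root mean squared deviation between the observed and
    model-implied item response functions (weights sum to 1)."""
    expected_p = icc_2pl(theta_points, a, b)
    return float(np.sqrt(np.sum(weights * (observed_p - expected_p) ** 2)))

# Hypothetical data: observed proportions deviate from the ICC by 0.01.
theta = np.linspace(-3, 3, 7)
w = np.exp(-0.5 * theta ** 2)
w /= w.sum()                                   # normal-like weights
obs = icc_2pl(theta, a=1.2, b=0.0) + 0.01      # small uniform misfit
fit = rmsd_item_fit(theta, w, obs, a=1.2, b=0.0)
```

A uniform deviation of 0.01 yields an RMSD of exactly 0.01 here, since the weights sum to one.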
Peer reviewed
Kang, Hyeon-Ah; Zheng, Yi; Chang, Hua-Hua – Journal of Educational and Behavioral Statistics, 2020
With the widespread use of computers in modern assessment, online calibration has become increasingly popular as a way of replenishing an item pool. The present study discusses online calibration strategies for a joint model of responses and response times. The study proposes likelihood inference methods for item parameter estimation and evaluates…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Reaction Time
Peer reviewed
Maeda, Hotaka; Zhang, Bo – International Journal of Testing, 2017
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Descriptors: Cheating, Test Items, Mathematics, Statistics
Kim, YoungKoung; DeCarlo, Lawrence T. – College Board, 2016
Because of concerns about test security, different test forms are typically used across different testing occasions. As a result, equating is necessary in order to get scores from the different test forms that can be used interchangeably. In order to assure the quality of equating, multiple equating methods are often examined. Various equity…
Descriptors: Equated Scores, Evaluation Methods, Sampling, Statistical Inference
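Most equating methods compared in studies like the one above build on the equipercentile idea: a score on form X is mapped to the form-Y score holding the same percentile rank. A minimal sketch with hypothetical score arrays (the function name and data are illustrative, not from the paper):

```python
import numpy as np

def equipercentile_equate(scores_x, scores_y, x_value):
    """Map a form-X score to the form-Y score with the same percentile rank."""
    px = np.mean(np.asarray(scores_x) <= x_value)   # percentile rank on form X
    return float(np.quantile(scores_y, px))          # matching form-Y quantile

# Hypothetical forms: form Y is uniformly 5 points harder-scored than form X,
# so an equated score should land roughly 5 points above the form-X score.
scores_x = np.arange(100)
scores_y = scores_x + 5
equated = equipercentile_equate(scores_x, scores_y, 50)
```

Real applications smooth the score distributions first (e.g., kernel or log-linear presmoothing); this sketch uses the raw empirical distributions.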
Peer reviewed
Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Educational and Psychological Measurement, 2014
When item parameter estimates are used to estimate the ability parameter in item response models, the standard error (SE) of the ability estimate must be corrected to reflect the error carried over from item calibration. For maximum likelihood (ML) ability estimates, a corrected asymptotic SE is available, but it requires a long test and the…
Descriptors: Sampling, Statistical Inference, Maximum Likelihood Statistics, Computation
Peer reviewed
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
Woodruff, David; Wu, Yi-Fang – ACT, Inc., 2012
The purpose of this paper is to illustrate alpha's robustness and usefulness, using actual and simulated educational test data. The sampling properties of alpha are compared with the sampling properties of several other reliability coefficients: Guttman's λ₂, λ₄, and λ₆; test-retest reliability;…
Descriptors: Sampling, Test Reliability, Item Response Theory, Statistical Inference
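Coefficient alpha itself has a simple closed form: α = k/(k−1) · (1 − Σ item variances / variance of total scores), for k items. A minimal sketch (the function name and data matrix are illustrative):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Coefficient alpha from an (examinees x items) score matrix."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    sum_item_vars = x.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)          # variance of total scores
    return k / (k - 1) * (1.0 - sum_item_vars / total_var)

# Hypothetical perfectly parallel items: every item gives the same score,
# so alpha should equal 1.
parallel = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(parallel)
```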
Peer reviewed
Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F. – International Journal of Testing, 2016
The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…
Descriptors: Achievement Tests, Foreign Countries, Elementary Secondary Education, Science Achievement
Peer reviewed
Gu, Fei; Skorupski, William P.; Hoyle, Larry; Kingston, Neal M. – Applied Psychological Measurement, 2011
Ramsay-curve item response theory (RC-IRT) is a nonparametric procedure that estimates the latent trait using splines, and no distributional assumption about the latent trait is required. For item parameters of the two-parameter logistic (2-PL), three-parameter logistic (3-PL), and polytomous IRT models, RC-IRT can provide more accurate estimates…
Descriptors: Intervals, Item Response Theory, Models, Evaluation Methods
Peer reviewed
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Choi, Sae Il – ProQuest LLC, 2009
This study used simulation (a) to compare the kernel equating method to traditional equipercentile equating methods under the equivalent-groups (EG) design and the nonequivalent-groups with anchor test (NEAT) design and (b) to apply the parametric bootstrap method for estimating standard errors of equating. A two-parameter logistic item response…
Descriptors: Item Response Theory, Comparative Analysis, Sampling, Statistical Inference
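The parametric bootstrap applied above estimates a standard error by simulating data from a fitted model and re-computing the statistic on each replicate. A minimal sketch using a sample mean in place of an equating function (all names, parameters, and the data-generating model here are illustrative, not from the dissertation):

```python
import numpy as np

rng = np.random.default_rng(0)

def parametric_bootstrap_se(estimator, simulate, n_boot=200):
    """Parametric bootstrap SE: repeatedly simulate data from the fitted
    model, re-estimate the statistic, and take the standard deviation
    of the replicate estimates."""
    estimates = [estimator(simulate()) for _ in range(n_boot)]
    return float(np.std(estimates, ddof=1))

# Hypothetical fitted model: N(0, 1) with n = 100 examinees per replicate.
# The bootstrap SE of the mean should be close to 1/sqrt(100) = 0.1.
se = parametric_bootstrap_se(np.mean, lambda: rng.normal(0.0, 1.0, size=100))
```

In an equating application, `simulate` would generate item responses from the fitted IRT model and `estimator` would re-run the equating procedure.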