Showing all 12 results
Peer reviewed
Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023
This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…
Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores
Peer reviewed
Draxler, Clemens; Kurz, Andreas; Gürer, Can; Nolte, Jan Philipp – Journal of Educational and Behavioral Statistics, 2024
A modified and improved inductive inferential approach to evaluate item discriminations in a conditional maximum likelihood and Rasch modeling framework is suggested. The new approach involves the derivation of four hypothesis tests. It implies a linear restriction of the assumed set of probability distributions in the classical approach that…
Descriptors: Inferences, Test Items, Item Analysis, Maximum Likelihood Statistics
Peer reviewed
PDF on ERIC
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Peer reviewed
PDF on ERIC
Zopluoglu, Cengiz – International Journal of Assessment Tools in Education, 2019
Unusual response similarity among test takers may occur in testing data and be an indicator of potential test fraud (e.g., examinees copy responses from other examinees, send text messages or pre-arranged signals among themselves for the correct response, item pre-knowledge). One index to measure the degree of similarity between two response…
Descriptors: Item Response Theory, Computation, Cheating, Measurement Techniques
Peer reviewed
PDF on ERIC
Li, Zhen; Cai, Li – Grantee Submission, 2017
In standard item response theory (IRT) applications, the latent variable is typically assumed to be normally distributed. If the normality assumption is violated, the item parameter estimates can become biased. Summed score likelihood based statistics may be useful for testing latent variable distribution fit. We develop Satorra-Bentler type…
Descriptors: Scores, Goodness of Fit, Statistical Distributions, Item Response Theory
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015
Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the χ² statistic, both of which have been used for tests…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics
Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
We propose a new limited-information goodness of fit test statistic C₂ for ordinal IRT models. The construction of the new statistic lies formally between the M₂ statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*₂ statistic of Cai and Hansen…
Descriptors: Item Response Theory, Models, Goodness of Fit, Probability
Peer reviewed
Hayes, Kevin – Teaching Statistics: An International Journal for Teachers, 2004
This article demonstrates that the lower bound for the most deviant Z score and the upper bound for the sample standard deviation are attained simultaneously.
Descriptors: Statistical Analysis, Scores, Item Response Theory, Probability
Peer reviewed
Harwell, Michael R.; Baker, Frank B. – Applied Psychological Measurement, 1991
Previous work on the mathematical and implementation details of the marginalized maximum likelihood estimation procedure is extended to encompass the marginalized Bayesian procedure for estimating item parameters of R. J. Mislevy (1986) and to communicate this procedure to users of the BILOG computer program. (SLD)
Descriptors: Bayesian Statistics, Equations (Mathematics), Estimation (Mathematics), Item Response Theory
Wang, Tianyou; And Others – 1996
M. J. Kolen, B. A. Hanson, and R. L. Brennan (1992) presented a procedure for assessing the conditional standard error of measurement (CSEM) of scale scores using a strong true-score model. They also investigated the ways of using nonlinear transformation from number-correct raw score to scale score to equalize the conditional standard error along…
Descriptors: Ability, Classification, Error of Measurement, Goodness of Fit
Kim, Seock-Ho; And Others – 1992
Hierarchical Bayes procedures were compared for estimating item and ability parameters in item response theory. Simulated data sets from the two-parameter logistic model were analyzed using three different hierarchical Bayes procedures: (1) the joint Bayesian with known hyperparameters (JB1); (2) the joint Bayesian with information hyperpriors…
Descriptors: Ability, Bayesian Statistics, Comparative Analysis, Equations (Mathematics)
Peer reviewed
Camilli, Gregory – Applied Psychological Measurement, 1992
A mathematical model is proposed to describe how group differences in distributions of abilities, which are distinct from the target ability, influence the probability of a correct item response. In the multidimensional approach, differential item functioning is considered a function of the educational histories of the examinees. (SLD)
Descriptors: Ability, Comparative Analysis, Equations (Mathematics), Factor Analysis