NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 8 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Han, Kyung T. – Practical Assessment, Research & Evaluation, 2012
For several decades, the "three-parameter logistic model" (3PLM) has been the dominant choice for practitioners in the field of educational measurement for modeling examinees' response data from multiple-choice (MC) items. Past studies, however, have pointed out that the c-parameter of 3PLM should not be interpreted as a guessing…
Descriptors: Statistical Analysis, Models, Multiple Choice Tests, Guessing (Tests)
Peer reviewed Peer reviewed
Direct linkDirect link
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added asessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Peer reviewed Peer reviewed
Dwyer, Carol Anne – Psychological Assessment, 1996
The uses and abuses of cut scores are examined. The article demonstrates (1) that cut scores always entail judgment; (2) that cut scores inherently result in misclassification; (3) that cut scores impose an artificial dichotomy on an essentially continuous distribution of knowledge, skill, or ability; and (4) that no true cut scores exist. (SLD)
Descriptors: Classification, Cutting Scores, Educational Testing, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
Peer reviewed Peer reviewed
Brink, Nicholas E. – Educational and Psychological Measurement, 1972
Study compares the Rasch and the Guttman models of measurement and thus adds to the description of the characteristics of Rasch's logistic model. Such knowledge is of importance in making decisions as to which model and which statistics should be used in evaluations of tests. (Author/CB)
Descriptors: Comparative Analysis, Educational Testing, Error of Measurement, Goodness of Fit
PDF pending restoration PDF pending restoration
Gilmer, Jerry S.; Feldt, Leonard S. – 1982
The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…
Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models