Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015
The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…
Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
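As a rough illustration of the MH procedure mentioned in the record above, the sketch below computes the Mantel-Haenszel common odds ratio across matched score strata. It is a minimal sketch and not the authors' implementation. The function name, the tuple layout (reference correct, reference incorrect, focal correct, focal incorrect), and the example counts are assumptions made for illustration only.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Return the MH common odds ratio over matched strata.

    Each stratum is a hypothetical 4-tuple of counts:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_correct, ref_incorrect, focal_correct, focal_incorrect in strata:
        total = ref_correct + ref_incorrect + focal_correct + focal_incorrect
        if total == 0:
            continue  # an empty stratum contributes nothing
        numerator += ref_correct * focal_incorrect / total
        denominator += ref_incorrect * focal_correct / total
    if denominator == 0:
        raise ValueError("odds ratio is undefined: denominator is zero")
    return numerator / denominator


def mh_d_dif(odds_ratio):
    """Convert the odds ratio to the delta-scale index, -2.35 * ln(alpha)."""
    return -2.35 * math.log(odds_ratio)


# Illustrative (made-up) counts for two score strata with equal odds in both groups.
example_strata = [(10, 10, 10, 10), (20, 5, 20, 5)]
alpha = mantel_haenszel_odds_ratio(example_strata)
```

An odds ratio near 1 (a delta-scale value near 0) suggests the item behaves similarly for both groups once examinees are matched on total score. How finely the strata are defined corresponds to the thin versus thick matching choice the abstract refers to.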
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
Han, Kyung T. – Practical Assessment, Research & Evaluation, 2012
For several decades, the "three-parameter logistic model" (3PLM) has been the dominant choice for practitioners in the field of educational measurement for modeling examinees' response data from multiple-choice (MC) items. Past studies, however, have pointed out that the c-parameter of 3PLM should not be interpreted as a guessing…
Descriptors: Statistical Analysis, Models, Multiple Choice Tests, Guessing (Tests)
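For readers unfamiliar with the 3PLM named in the record above, the following is a minimal sketch of its item response function in the common logistic form. The function and argument names are assumptions for illustration, and the optional scaling constant (often written D = 1.7) is omitted. The c-parameter is labeled here as a lower asymptote, in keeping with the abstract's caution against reading it as a guessing probability.

```python
import math


def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (the "c-parameter")
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# At theta == b the probability sits halfway between c and 1.
p_at_difficulty = three_pl_probability(theta=0.0, a=1.0, b=0.0, c=0.2)
```

As ability falls far below the item difficulty, the probability approaches c rather than 0, which is the feature that distinguishes this model from the two-parameter form.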
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement

Dwyer, Carol Anne – Psychological Assessment, 1996
The uses and abuses of cut scores are examined. The article demonstrates (1) that cut scores always entail judgment; (2) that cut scores inherently result in misclassification; (3) that cut scores impose an artificial dichotomy on an essentially continuous distribution of knowledge, skill, or ability; and (4) that no true cut scores exist. (SLD)
Descriptors: Classification, Cutting Scores, Educational Testing, Error of Measurement
Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006
A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…
Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
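The record above builds its test on Cohen's kappa. The sketch below computes only the plain kappa agreement coefficient between two examinees' answer strings. It is not the authors' copying statistic, whose null distribution is not given in the truncated abstract. The function name and the example responses are hypothetical.

```python
from collections import Counter


def cohens_kappa(responses_a, responses_b):
    """Plain Cohen's kappa for two equal-length sequences of chosen options."""
    if len(responses_a) != len(responses_b) or not responses_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(responses_a)
    # Observed proportion of items on which both examinees chose the same option.
    observed = sum(x == y for x, y in zip(responses_a, responses_b)) / n
    # Chance agreement from each examinee's marginal option frequencies.
    counts_a = Counter(responses_a)
    counts_b = Counter(responses_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical answer strings for a four-item multiple-choice test.
kappa_identical = cohens_kappa(["A", "B", "C", "D"], ["A", "B", "C", "D"])
```

Kappa equals 1 for identical answer strings and 0 when agreement matches what the marginal frequencies alone would produce.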

Brink, Nicholas E. – Educational and Psychological Measurement, 1972
Study compares the Rasch and the Guttman models of measurement and thus adds to the description of the characteristics of Rasch's logistic model. Such knowledge is of importance in making decisions as to which model and which statistics should be used in evaluations of tests. (Author/CB)
Descriptors: Comparative Analysis, Educational Testing, Error of Measurement, Goodness of Fit

Gilmer, Jerry S.; Feldt, Leonard S. – 1982
The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…
Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models