ERIC - Search Results

Showing all 8 results

Differential Item Functioning Detection with the Mantel-Haenszel Procedure: The Effects of Matching Types and Other Factors

Peer reviewed

Socha, Alan; DeMars, Christine E.; Zilberberg, Anna; Phan, Ha – International Journal of Testing, 2015

The Mantel-Haenszel (MH) procedure is commonly used to detect items that function differentially for groups of examinees from various demographic and linguistic backgrounds--for example, in international assessments. As in some other DIF methods, the total score is used to match examinees on ability. In thin matching, each of the total score…

Descriptors: Test Items, Educational Testing, Evaluation Methods, Ability Grouping
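
As a rough illustration of the statistic this abstract refers to, the sketch below computes the standard Mantel-Haenszel common odds ratio over score-matched 2x2 tables, together with the conventional delta-scale transformation (-2.35 times the natural log of the odds ratio). It is not the authors' procedure or their matching scheme. The function name `mantel_haenszel_dif` and the `(a, b, c, d)` tuple layout are assumptions made for this example.

```python
import math


def mantel_haenszel_dif(strata):
    """Return (alpha_MH, MH D-DIF) for one item.

    `strata` is an iterable of (a, b, c, d) counts, one tuple per matched
    score level (hypothetical layout chosen for this sketch):
      a = reference group, correct    b = reference group, incorrect
      c = focal group, correct        d = focal group, incorrect
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these tables")
    alpha = num / den
    # Delta-scale index: negative values indicate the item favors the reference group.
    return alpha, -2.35 * math.log(alpha)


alpha, delta = mantel_haenszel_dif([(20, 10, 10, 20), (10, 5, 5, 10)])
print(alpha, delta)
```

An odds ratio of 1 (delta of 0) corresponds to no differential functioning after matching. How score levels are pooled into strata is a separate choice that this sketch leaves to the caller.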

A Review of ETS Differential Item Functioning Assessment Procedures: Flagging Rules, Minimum Sample Size Requirements, and Criterion Refinement. Research Report. ETS RR-12-08

Peer reviewed
PDF on ERIC

Zwick, Rebecca – ETS Research Report Series, 2012

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…

Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods

Fixing the c Parameter in the Three-Parameter Logistic Model

Peer reviewed
PDF on ERIC

Han, Kyung T. – Practical Assessment, Research & Evaluation, 2012

For several decades, the "three-parameter logistic model" (3PLM) has been the dominant choice for practitioners in the field of educational measurement for modeling examinees' response data from multiple-choice (MC) items. Past studies, however, have pointed out that the c-parameter of 3PLM should not be interpreted as a guessing…

Descriptors: Statistical Analysis, Models, Multiple Choice Tests, Guessing (Tests)
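
For readers unfamiliar with the model named in this entry, the following minimal sketch shows the usual form of the three-parameter logistic item response function, in which c acts as the lower asymptote. It assumes the logistic metric without the 1.7 scaling constant that some treatments include. The function name `p_correct_3pl` is hypothetical, and nothing here reflects the article's recommendations about fixing c.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (the "c parameter")
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# When ability equals difficulty, the probability sits halfway between c and 1.
print(p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
```

Setting c to 0 reduces the function to the two-parameter logistic model.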

Measuring Test Measurement Error: A General Approach

Peer reviewed

Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013

Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…

Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement

Cut Scores and Testing: Statistics, Judgment, Truth, and Error.

Peer reviewed

Dwyer, Carol Anne – Psychological Assessment, 1996

The uses and abuses of cut scores are examined. The article demonstrates (1) that cut scores always entail judgment; (2) that cut scores inherently result in misclassification; (3) that cut scores impose an artificial dichotomy on an essentially continuous distribution of knowledge, skill, or ability; and (4) that no true cut scores exist. (SLD)

Descriptors: Classification, Cutting Scores, Educational Testing, Error of Measurement

Detecting Answer Copying Using the Kappa Statistic

Peer reviewed

Sotaridona, Leonardo S.; van der Linden, Wim J.; Meijer, Rob R. – Applied Psychological Measurement, 2006

A statistical test for detecting answer copying on multiple-choice tests based on Cohen's kappa is proposed. The test is free of any assumptions on the response processes of the examinees suspected of copying and having served as the source, except for the usual assumption that these processes are probabilistic. Because the asymptotic null and…

Descriptors: Cheating, Test Items, Simulation, Statistical Analysis
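
The sketch below computes plain Cohen's kappa between two examinees' answer strings, the agreement coefficient on which the proposed test is said to be based. It is only the descriptive statistic and does not implement the authors' significance test or its null distribution. The function name `cohens_kappa` and the one-character-per-item encoding are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(answers_a, answers_b):
    """Cohen's kappa for two equal-length sequences of chosen options."""
    if len(answers_a) != len(answers_b) or len(answers_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(answers_a)
    # Observed proportion of items on which both examinees chose the same option.
    p_observed = sum(x == y for x, y in zip(answers_a, answers_b)) / n
    # Chance agreement from each examinee's marginal option frequencies.
    counts_a = Counter(answers_a)
    counts_b = Counter(answers_b)
    p_chance = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


print(cohens_kappa("ABCDABCD", "ABCDABCA"))
```

Values near 1 indicate agreement well beyond what the two examinees' option frequencies alone would produce. Whether a given value is suspicious is the question the article's test addresses.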

Rasch's Logistic Model Vs. the Guttman Model

Peer reviewed

Brink, Nicholas E. – Educational and Psychological Measurement, 1972

Study compares the Rasch and the Guttman models of measurement and thus adds to the description of the characteristics of Rasch's logistic model. Such knowledge is of importance in making decisions as to which model and which statistics should be used in evaluations of tests. (Author/CB)

Descriptors: Comparative Analysis, Educational Testing, Error of Measurement, Goodness of Fit

The Standard Errors of the Feldt-Gilmer Congeneric Reliability Coefficients: Iowa Testing Programs Occasional Papers. Number 31.

Gilmer, Jerry S.; Feldt, Leonard S. – 1982

The Feldt-Gilmer congeneric reliability coefficients make it possible to estimate the reliability of a test composed of parts of unequal, unknown length. The approximate standard errors of the Feldt-Gilmer coefficients are derived via a method using the multivariate Taylor's expansion. Monte Carlo simulation is employed to corroborate the…

Descriptors: Educational Testing, Error of Measurement, Mathematical Formulas, Mathematical Models