Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 3 |
Descriptor
Error of Measurement | 4 |
Test Theory | 4 |
Item Response Theory | 3 |
Statistical Analysis | 3 |
Test Items | 2 |
Accuracy | 1 |
Adaptive Testing | 1 |
Causal Models | 1 |
Classification | 1 |
College Entrance Examinations | 1 |
Computation | 1 |
Source
ETS Research Report Series | 4 |
Author
Dorans, Neil J. | 1 |
Haberman, Shelby J. | 1 |
Kim, Sooyeon | 1 |
Li, Feifei | 1 |
Livingston, Samuel A. | 1 |
Mapuranga, Raymond | 1 |
Middleton, Kyndra | 1 |
Publication Type
Journal Articles | 4 |
Reports - Research | 4 |
Education Level
Grade 3 | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Assessments and Surveys
SAT (College Admission Test) | 1 |
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
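The full multistage design and routing rules are described in the report itself; as a rough illustration of the general simulation idea only (not the authors' actual procedure), the sketch below draws 2PL item parameters for two hypothetical parallel forms, simulates responses, and estimates alternate-forms reliability as the correlation between the two number-correct scores. The multistage routing step is omitted, and all distributions and sample sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' procedure): estimate alternate-forms
# reliability as the correlation between number-correct scores on two
# simulated parallel forms under a 2PL IRT model.
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items = 5000, 30

def draw_form():
    # Illustrative item parameters: same generating distributions for each
    # "parallel" form, different random draws.
    a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)   # discrimination
    b = rng.normal(loc=0.0, scale=1.0, size=n_items)       # difficulty
    return a, b

theta = rng.normal(size=n_examinees)                        # proficiency

def simulate_scores(a, b, theta):
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))     # 2PL probabilities
    responses = rng.uniform(size=p.shape) < p
    return responses.sum(axis=1)                             # number-correct score

x1 = simulate_scores(*draw_form(), theta)
x2 = simulate_scores(*draw_form(), theta)

# Alternate-forms reliability estimate: correlation of scores on the two forms.
print(np.corrcoef(x1, x2)[0, 1])
```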
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
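The report's GT-based correction itself is not reproduced here. As context only, the sketch below shows the standard conditional-independence quantities it adjusts: under a 2PL model, test information is the sum of item informations and the conditional standard error of the proficiency estimate is 1/sqrt(I(theta)); when testlet items are locally dependent, summing item informations overstates the information available and so understates the measurement error. Item parameters below are illustrative assumptions.

```python
# Minimal sketch of the conditional-independence baseline: 2PL test
# information and the resulting standard error of measurement at theta.
import numpy as np

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

# Illustrative parameters (assumptions, not from the report).
a = np.array([1.2, 0.8, 1.0, 1.5, 0.9])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

theta = 0.0
info = item_information(theta, a, b).sum()   # test information at theta
se = 1.0 / np.sqrt(info)                     # conditional SEM, assuming local independence
print(f"I(theta)={info:.3f}, SE(theta)={se:.3f}")
```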
Mapuranga, Raymond; Dorans, Neil J.; Middleton, Kyndra – ETS Research Report Series, 2008
In many practical settings, essentially the same differential item functioning (DIF) procedures have been in use since the late 1980s. Since then, examinee populations have become more heterogeneous, and tests have included more polytomously scored items. This paper summarizes and classifies new DIF methods and procedures that have appeared since…
Descriptors: Test Bias, Educational Development, Evaluation Methods, Statistical Analysis
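As a point of reference for the procedures "in use since the late 1980s" that the paper contrasts with newer methods, the sketch below implements the classical Mantel-Haenszel DIF statistic for a dichotomous item, stratifying on total score, together with the conventional ETS delta-scale transformation. It illustrates the older baseline, not any of the new methods the paper classifies; the function names and data layout are hypothetical.

```python
# Minimal sketch of the classical Mantel-Haenszel DIF procedure for a
# dichotomous item, with strata formed by total test score.
import numpy as np

def mantel_haenszel_alpha(item, group, total):
    """MH common odds ratio (reference vs. focal) across score strata.

    item  : 0/1 responses to the studied item
    group : 0 = reference group, 1 = focal group
    total : stratifying variable (e.g., total test score)
    """
    num, den = 0.0, 0.0
    for s in np.unique(total):
        mask = total == s
        n = mask.sum()
        a = np.sum((item == 1) & (group == 0) & mask)   # reference correct
        b = np.sum((item == 0) & (group == 0) & mask)   # reference incorrect
        c = np.sum((item == 1) & (group == 1) & mask)   # focal correct
        d = np.sum((item == 0) & (group == 1) & mask)   # focal incorrect
        num += a * d / n
        den += b * c / n
    return num / den  # note: a sketch only; no guard for empty strata

def mh_d_dif(alpha_mh):
    """MH D-DIF on the ETS delta scale: -2.35 * ln(alpha_MH)."""
    return -2.35 * np.log(alpha_mh)
```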
Haberman, Shelby J. – ETS Research Report Series, 2005
In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean-squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…
Descriptors: Scores, Test Items, Error of Measurement, Computation
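Below is a minimal sketch in the spirit of the mean-squared-error comparison the report proposes, using standard classical-test-theory identities: the subscore's reliability serves as the proportional reduction in mean-squared error (PRMSE) of the observed subscore, and a disattenuated correlation is used to evaluate the total-score-based estimate. The exact guidelines, and the combined estimator mentioned in the abstract, are developed in the report itself; the function and example values here are illustrative assumptions.

```python
# Minimal sketch: does the observed subscore estimate the true subscore
# better (in PRMSE terms) than an estimate based on the total score?
import math

def subscore_worth_reporting(rel_sub, rel_total, obs_corr_sub_total):
    """Compare PRMSE from two estimators of the true subscore.

    rel_sub            : reliability of the subscore
    rel_total          : reliability of the total score
    obs_corr_sub_total : observed correlation between subscore and total score
    """
    # PRMSE of the observed subscore equals the subscore's reliability.
    prmse_sub = rel_sub

    # Disattenuate the observed correlation to approximate the correlation
    # between true subscore and true total score.
    true_corr = obs_corr_sub_total / math.sqrt(rel_sub * rel_total)

    # PRMSE of the total-score-based estimate of the true subscore.
    prmse_total = (true_corr ** 2) * rel_total

    return prmse_sub > prmse_total, prmse_sub, prmse_total

# Example (hypothetical values): a fairly reliable subscore that is highly
# correlated with the total score adds little beyond the total score.
print(subscore_worth_reporting(rel_sub=0.75, rel_total=0.92, obs_corr_sub_total=0.80))
```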