Wang, Shaojie; Lee, Won-Chan; Zhang, Minqiang; Yuan, Lixin – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
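For readers unfamiliar with the observed-score family mentioned in the abstract above, the following is a minimal sketch of the standard Mantel-Haenszel common odds ratio and its conversion to the ETS delta scale (MH D-DIF = -2.35 ln α). It is not taken from the report itself; the function name `mantel_haenszel_ddif` and the input layout (one 2x2 table of counts per matched score level) are assumptions made for illustration.

```python
import math


def mantel_haenszel_ddif(strata):
    """Return (alpha_MH, MH D-DIF) for one studied item.

    `strata` is an iterable of 4-tuples, one per matched total-score level
    (hypothetical layout chosen for this sketch):
        (ref_correct, ref_wrong, focal_correct, focal_wrong)
    """
    num = 0.0  # sum over levels of ref_correct * focal_wrong / N
    den = 0.0  # sum over levels of ref_wrong * focal_correct / N
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # an empty score level contributes nothing
        num += ref_right * foc_wrong / n
        den += ref_wrong * foc_right / n
    alpha = num / den  # raises ZeroDivisionError if no level is informative
    # ETS delta metric: negative values mean the item is relatively
    # harder for the focal group than for matched reference examinees.
    return alpha, -2.35 * math.log(alpha)


# Two score levels with identical odds in both groups: no DIF expected.
alpha, ddif = mantel_haenszel_ddif([(30, 10, 15, 5), (20, 20, 10, 10)])
print(alpha, ddif)
```

A value of α near 1 (MH D-DIF near 0) indicates that matched reference and focal examinees have similar odds of answering the item correctly.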
Gu, Lixiong; Ling, Guangming; Qu, Yanxuan – ETS Research Report Series, 2019
Research has found that the "a"-stratified item selection strategy (STR) for computerized adaptive tests (CATs) may lead to insufficient use of high "a" items at later stages of the tests and thus to reduced measurement precision. A refined approach, unequal item selection across strata (USTR), effectively improves test precision over the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Use, Test Items
Yao, Lihua – Psychometrika, 2012
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…
Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing
Huo, Yan – ProQuest LLC, 2009
Variable-length computerized adaptive testing (CAT) can provide examinees with tailored test lengths. With the fixed standard error of measurement ("SEM") termination rule, variable-length CAT can achieve predetermined measurement precision by using relatively shorter tests compared to fixed-length CAT. To explore the application of…
Descriptors: Test Length, Test Items, Adaptive Testing, Item Analysis
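The fixed-SEM termination rule described in the abstract above can be illustrated with a short sketch. It assumes a two-parameter logistic (2PL) model, treats the examinee's ability as known rather than re-estimated after each response, and always administers the most informative remaining item; none of these choices comes from the dissertation, and the names `item_information` and `items_needed` are hypothetical.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def items_needed(theta, pool, target_sem):
    """Count the items given before SEM = 1 / sqrt(total information) <= target_sem.

    `pool` is a list of (a, b) pairs. Simplifying assumption for this sketch:
    theta is fixed, so items can simply be taken in order of decreasing
    information. Returns (items administered, final SEM); if the pool runs
    out first, the whole pool is used and the SEM stays above the target.
    """
    infos = sorted((item_information(theta, a, b) for a, b in pool), reverse=True)
    total = 0.0
    sem = math.inf
    for count, info in enumerate(infos, start=1):
        total += info
        sem = 1.0 / math.sqrt(total) if total > 0 else math.inf
        if sem <= target_sem:
            return count, sem
    return len(infos), sem


# Twenty identical items (a = 1, b = 0) each give 0.25 information at theta = 0,
# so the SEM after n items is 2 / sqrt(n).
print(items_needed(0.0, [(1.0, 0.0)] * 20, target_sem=0.5))
```

Under this rule the test length varies by examinee: where the pool is highly informative the target SEM is reached quickly, whereas examinees at the extremes of the ability scale may exhaust the pool without reaching it.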
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Psychological Methods, 2007
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level,…
Descriptors: Psychiatry, Patients, Error of Measurement, Test Length
Finch, Holmes – Applied Psychological Measurement, 2005
This study compares the ability of the multiple indicators, multiple causes (MIMIC) confirmatory factor analysis model to correctly identify cases of differential item functioning (DIF) with more established methods. Although the MIMIC model might have application in identifying DIF for multiple grouping variables, there has been little…
Descriptors: Identification, Factor Analysis, Test Bias, Models
Bejar, Isaac I.; And Others – 1977
Information provided by typical and improved conventional classroom achievement tests was compared with information provided by an adaptive test covering the same subject matter. Both tests were administered to over 700 college students in a general biology course. Using the same scoring method, adaptive testing was found to yield substantially…
Descriptors: Academic Achievement, Achievement Tests, Adaptive Testing, Biology