Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
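For orientation, IRT equating typically places form X scores on the form Y scale through a linear transformation of the latent trait; a minimal mean-sigma sketch (generic background, not necessarily the equating approach this study proposes) is

  \theta^{Y} = A\,\theta^{X} + B, \qquad A = \frac{\sigma(b^{Y})}{\sigma(b^{X})}, \qquad B = \mu(b^{Y}) - A\,\mu(b^{X}),

where b^{X} and b^{Y} are difficulty estimates for the common items on each form. In a rater-mediated setting, rater error adds variance to these estimates beyond what this sketch captures.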
Carmen Köhler; Lale Khorramdel; Artur Pokropek; Johannes Hartig – Journal of Educational Measurement, 2024
For assessment scales applied to different groups (e.g., students from different states; patients in different countries), multigroup differential item functioning (MG-DIF) must be evaluated to ensure that respondents with the same trait level but from different groups have equal response probabilities on a particular item. The…
Descriptors: Measures (Individuals), Test Bias, Models, Item Response Theory
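As a sketch of the MG-DIF condition under a two-parameter logistic model (a generic formulation, not necessarily the models compared in this article), the response probability for a respondent in group g is

  P(X_{pi} = 1 \mid \theta_p, g) = \frac{\exp[a_i(\theta_p - b_{ig})]}{1 + \exp[a_i(\theta_p - b_{ig})]},

and item i is free of MG-DIF when a_i and b_{ig} = b_i are invariant across all groups g, so that respondents with equal \theta_p respond alike regardless of group membership.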
Shear, Benjamin R. – Journal of Educational Measurement, 2018
When contextual features of test-taking environments differentially affect item responding for different test takers, and these features vary across test administrations, they may cause differential item functioning (DIF) that likewise varies across administrations. Because many common DIF detection methods ignore this potential DIF variance, this article…
Descriptors: Test Bias, Regression (Statistics), Hierarchical Linear Modeling
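One common way to let DIF vary across administrations t (a hedged sketch, not necessarily the article's exact specification) is a logistic regression with a random DIF coefficient,

  \operatorname{logit} P(y_{pt} = 1) = \beta_0 + \beta_1 X_p + (\beta_2 + u_t)\,G_p, \qquad u_t \sim N(0, \tau^2),

where X_p is the matching score and G_p a group indicator; \beta_2 is the average DIF effect, and \tau^2 is the DIF variance that fixed-effect detection methods ignore.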
Palermo, Corey; Bunch, Michael B.; Ridge, Kirk – Journal of Educational Measurement, 2019
Although much attention has been given to rater effects in rater-mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large-scale, multi-state summative assessment program…
Descriptors: Scoring, Interrater Reliability, Measurement, Summative Evaluation
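Leniency and severity are often parameterized with a many-facet Rasch model (a standard formulation given for orientation, not necessarily the one used in this program),

  \log \frac{P_{pir}(1)}{P_{pir}(0)} = \theta_p - \delta_i - \lambda_r,

where \lambda_r is rater r's severity; stability over time can then be assessed by comparing each rater's \lambda_r estimates across administrations.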
Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…
Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level
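A minimal way to express such position effects in an IRT framework (an illustrative sketch assuming a linear effect) augments item difficulty with a position term,

  \operatorname{logit} P(X_{pi} = 1) = \theta_p - (b_i + \delta \cdot \mathrm{pos}_{pi}),

where \mathrm{pos}_{pi} is the position at which person p encounters item i; \delta > 0 would indicate fatigue effects (items grow harder when placed later) and \delta < 0 practice effects.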
Brückner, Sebastian; Pellegrino, James W. – Journal of Educational Measurement, 2016
The Standards for Educational and Psychological Testing indicate that validation of assessments should include analyses of participants' response processes. However, such analyses are typically conducted only to supplement quantitative field studies with qualitative data, and such data are seldom connected to quantitative data on student or item…
Descriptors: Hierarchical Linear Modeling, Test Validity, Statistical Analysis, College Students
Naumann, Alexander; Hochweber, Jan; Hartig, Johannes – Journal of Educational Measurement, 2014
Students' performance in assessments is commonly attributed to more or less effective teaching. This implies that students' responses are significantly affected by instruction. However, the assumption that outcome measures are indeed instructionally sensitive has scarcely been investigated empirically. In the present study, we propose a…
Descriptors: Test Bias, Longitudinal Studies, Hierarchical Linear Modeling, Test Items
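In a multilevel framing (a generic illustration, not necessarily the approach this study proposes), an item's instructional sensitivity can be expressed as classroom-level variation in its difficulty,

  \operatorname{logit} P(y_{pij} = 1) = \theta_{pj} - (b_i + u_{ij}), \qquad u_{ij} \sim N(0, \sigma_i^2),

where j indexes classrooms; a nonzero \sigma_i^2, especially one that covaries with instruction, marks item i as instructionally sensitive.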
Schmidt, Susanne; Zlatkin-Troitschanskaia, Olga; Fox, Jean-Paul – Journal of Educational Measurement, 2016
Longitudinal research in higher education faces several challenges. Appropriate methods for analyzing students' competence growth are needed to address these challenges and obtain valid results. In this article, a pretest-posttest-posttest multivariate multilevel IRT model for repeated measures is introduced that is designed to address…
Descriptors: Foreign Countries, Pretests Posttests, Hierarchical Linear Modeling, Item Response Theory
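The general structure of a multilevel IRT model for repeated measures (a sketch of the common two-level form, leaving aside the authors' multivariate specification) is

  Level 1 (measurement):  P(y_{pit} = 1) = \frac{\exp(\theta_{pt} - b_i)}{1 + \exp(\theta_{pt} - b_i)}
  Level 2 (growth):       \theta_{pt} = \pi_{0p} + \pi_{1p}\, t + \varepsilon_{pt},

where t = 0, 1, 2 indexes the pretest and two posttests, and the person-specific intercepts \pi_{0p} and slopes \pi_{1p} capture initial competence and growth.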
Albano, Anthony D. – Journal of Educational Measurement, 2013
In many testing programs it is assumed that the context or position in which an item is administered does not have a differential effect on examinee responses to the item. Violations of this assumption may bias item response theory estimates of item and person parameters. This study examines the potentially biasing effects of item position. A…
Descriptors: Test Items, Item Response Theory, Test Format, Questioning Techniques
Jiao, Hong; Wang, Shudong; He, Wei – Journal of Educational Measurement, 2013
This study demonstrated the equivalence between the Rasch testlet model and the three-level one-parameter testlet model and explored the Markov Chain Monte Carlo (MCMC) method for model parameter estimation in WINBUGS. The estimation accuracy from the MCMC method was compared with that from the marginalized maximum likelihood estimation (MMLE)…
Descriptors: Computation, Item Response Theory, Models, Monte Carlo Methods
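The Rasch testlet model at issue has the standard form

  P(y_{pi} = 1) = \frac{\exp(\theta_p - b_i + \gamma_{p\,d(i)})}{1 + \exp(\theta_p - b_i + \gamma_{p\,d(i)})}, \qquad \gamma_{pd} \sim N(0, \sigma_d^2),

where d(i) is the testlet containing item i and \gamma_{pd} absorbs the person-by-testlet dependence; the equivalence the study demonstrates follows from treating this testlet effect as the random intercept of a three-level model (responses nested in testlets nested in persons). This is a sketch of the common formulation, not a transcription of the article's notation.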