NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 7 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018
Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…
Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating
Peer reviewed Peer reviewed
Direct linkDirect link
Kim, Yoon Jeon; Almond, Russell G.; Shute, Valerie J. – International Journal of Testing, 2016
Game-based assessment (GBA) is a specific use of educational games that employs game activities to elicit evidence for educationally valuable skills and knowledge. While this approach can provide individualized and diagnostic information about students, the design and development of assessment mechanics for a GBA is a nontrivial task. In this…
Descriptors: Design, Evidence Based Practice, Test Construction, Physics
Peer reviewed Peer reviewed
Direct linkDirect link
Sen, Sedat – International Journal of Testing, 2018
Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics
Peer reviewed Peer reviewed
Direct linkDirect link
Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min – International Journal of Testing, 2016
The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…
Descriptors: Educational Diagnosis, Classification, Models, Educational Assessment
Peer reviewed Peer reviewed
Direct linkDirect link
Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F. – International Journal of Testing, 2016
The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…
Descriptors: Achievement Tests, Foreign Countries, Elementary Secondary Education, Science Achievement
Peer reviewed Peer reviewed
Direct linkDirect link
Levy, Roy; Mislevy, Robert J. – International Journal of Testing, 2004
The challenges of modeling students' performance in computer-based interactive assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance. This article describes a Bayesian approach to modeling and estimating cognitive models…
Descriptors: Computer Assisted Testing, Markov Processes, Computer Networks, Bayesian Statistics