ERIC - Search Results

Showing all 7 results

Examining Severity and Centrality Effects in TestDaF Writing and Speaking Assessments: An Extended Bayesian Many-Facet Rasch Analysis

Peer reviewed

Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021

Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…

Descriptors: Language Tests, German, Second Languages, Writing Tests

Response Time Based Nonparametric Kullback-Leibler Divergence Measure for Detecting Aberrant Test-Taking Behavior

Peer reviewed

Man, Kaiwen; Harring, Jeffery R.; Ouyang, Yunbo; Thomas, Sarah L. – International Journal of Testing, 2018

Many important high-stakes decisions--college admission, academic performance evaluation, and even job promotion--depend on accurate and reliable scores from valid large-scale assessments. However, examinees sometimes cheat by copying answers from other test-takers or practicing with test items ahead of time, which can undermine the effectiveness…

Descriptors: Reaction Time, High Stakes Tests, Test Wiseness, Cheating

Applying Evidence-Centered Design for the Development of Game-Based Assessments in Physics Playground

Peer reviewed

Kim, Yoon Jeon; Almond, Russell G.; Shute, Valerie J. – International Journal of Testing, 2016

Game-based assessment (GBA) is a specific use of educational games that employs game activities to elicit evidence for educationally valuable skills and knowledge. While this approach can provide individualized and diagnostic information about students, the design and development of assessment mechanics for a GBA is a nontrivial task. In this…

Descriptors: Design, Evidence Based Practice, Test Construction, Physics

Spurious Latent Class Problem in the Mixed Rasch Model: A Comparison of Three Maximum Likelihood Estimation Methods under Different Ability Distributions

Peer reviewed

Sen, Sedat – International Journal of Testing, 2018

Recent research has shown that over-extraction of latent classes can be observed in the Bayesian estimation of the mixed Rasch model when the distribution of ability is non-normal. This study examined the effect of non-normal ability distributions on the number of latent classes in the mixed Rasch model when estimated with maximum likelihood…

Descriptors: Item Response Theory, Comparative Analysis, Computation, Maximum Likelihood Statistics

Fitting the Reduced RUM with Mplus: A Tutorial

Peer reviewed

Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min – International Journal of Testing, 2016

The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…

Descriptors: Educational Diagnosis, Classification, Models, Educational Assessment

Grain Size and Parameter Recovery with TIMSS and the General Diagnostic Model

Peer reviewed

Skaggs, Gary; Wilkins, Jesse L. M.; Hein, Serge F. – International Journal of Testing, 2016

The purpose of this study was to explore the degree of grain size of the attributes and the sample sizes that can support accurate parameter recovery with the General Diagnostic Model (GDM) for a large-scale international assessment. In this resampling study, bootstrap samples were obtained from the 2003 Grade 8 TIMSS in Mathematics at varying…

Descriptors: Achievement Tests, Foreign Countries, Elementary Secondary Education, Science Achievement

Specifying and Refining a Measurement Model for a Computer-Based Interactive Assessment

Peer reviewed

Levy, Roy; Mislevy, Robert J. – International Journal of Testing, 2004

The challenges of modeling students' performance in computer-based interactive assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance. This article describes a Bayesian approach to modeling and estimating cognitive models…

Descriptors: Computer Assisted Testing, Markov Processes, Computer Networks, Bayesian Statistics
