Huan Liu – ProQuest LLC, 2024
In many large-scale testing programs, examinees are frequently categorized into different performance levels. These classifications are then used to make high-stakes decisions about examinees in contexts such as licensure, certification, and educational assessment. Numerous approaches to estimating the consistency and accuracy of this…
Descriptors: Classification, Accuracy, Item Response Theory, Decision Making
Carly Oddleifson; Stephen Kilgus; David A. Klingbeil; Alexander D. Latham; Jessica S. Kim; Ishan N. Vengurlekar – Grantee Submission, 2025
The purpose of this study was to conduct a conceptual replication of Pendergast et al.'s (2018) study that examined the diagnostic accuracy of a nomogram procedure, also known as a naive Bayesian approach. The specific naive Bayesian approach combined academic and social-emotional and behavioral (SEB) screening data to predict student performance…
Descriptors: Bayesian Statistics, Accuracy, Social Emotional Learning, Diagnostic Tests
The AI Teacher Test: Measuring the Pedagogical Ability of Blender and GPT-3 in Educational Dialogues
Tack, Anaïs; Piech, Chris – International Educational Data Mining Society, 2022
How can we test whether state-of-the-art generative models, such as Blender and GPT-3, are good AI teachers, capable of replying to a student in an educational dialogue? Designing an AI teacher test is challenging: although evaluation methods are much-needed, there is no off-the-shelf solution to measuring pedagogical ability. This paper reports…
Descriptors: Artificial Intelligence, Dialogs (Language), Bayesian Statistics, Decision Making
Yang, Yanyun; Xia, Yan – Educational and Psychological Measurement, 2019
When item scores are ordered categorical, categorical omega can be computed based on the parameter estimates from a factor analysis model using frequentist estimators such as diagonally weighted least squares. When the sample size is relatively small and thresholds are different across items, using diagonally weighted least squares can yield a…
Descriptors: Scores, Sample Size, Bayesian Statistics, Item Analysis
Kim, Do-Hong; Lambert, Richard G.; Durham, Sean; Burts, Diane C. – Early Education and Development, 2018
Research Findings: This study builds on prior work related to the assessment of young dual language learners (DLLs). The purposes of the study were to (a) determine whether latent subgroups of preschool DLLs would replicate those found previously and (b) examine the validity of GOLD® by Teaching Strategies with empirically derived subgroups.…
Descriptors: Preschool Education, Teaching Methods, Bilingualism, Bilingual Education
Fu, Jianbin; Zapata, Diego; Mavronikolas, Elia – ETS Research Report Series, 2014
Simulation or game-based assessments produce outcome data and process data. In this article, some statistical models that can potentially be used to analyze data from simulation or game-based assessments are introduced. Specifically, cognitive diagnostic models that can be used to estimate latent skills from outcome data so as to scale these…
Descriptors: Simulation, Evaluation Methods, Games, Data Collection
Ligtvoet, Rudy – Psychometrika, 2012
In practice, the sum of the item scores is often used as a basis for comparing subjects. For items that have more than two ordered score categories, only the partial credit model (PCM) and special cases of this model imply that the subjects are stochastically ordered on the common latent variable. However, the PCM is very restrictive with respect…
Descriptors: Simulation, Item Response Theory, Comparative Analysis, Scores
Levy, Roy; Mislevy, Robert J. – US Department of Education, 2004
The challenges of modeling students' performance in simulation-based assessments include accounting for multiple aspects of knowledge and skill that arise in different situations and the conditional dependencies among multiple aspects of performance in a complex assessment. This paper describes a Bayesian approach to modeling and estimating…
Descriptors: Probability, Markov Processes, Monte Carlo Methods, Bayesian Statistics