Michael Nagel; Lukas Fischer; Tim Pawlowski; Augustin Kelava – Structural Equation Modeling: A Multidisciplinary Journal, 2024
Bayesian estimations of complex regression models with high-dimensional parameter spaces require advanced priors, capable of addressing both sparsity and multicollinearity in the data. The Dirichlet-horseshoe, a new prior distribution that combines and expands on the concepts of the regularized horseshoe and the Dirichlet-Laplace priors, is a…
Descriptors: Bayesian Statistics, Regression (Statistics), Computation, Statistical Distributions
Guido Schwarzer; Gerta Rücker; Cristina Semaca – Research Synthesis Methods, 2024
The "LFK" index has been promoted as an improved method to detect bias in meta-analysis. Putatively, its performance does not depend on the number of studies in the meta-analysis. We conducted a simulation study, comparing the "LFK" index test to three standard tests for funnel plot asymmetry in settings with smaller or larger…
Descriptors: Bias, Meta Analysis, Simulation, Evaluation Methods
Ebru Balta; Celal Deha Dogan – SAGE Open, 2024
As computer-based testing becomes more prevalent, the attention paid to response time (RT) in assessment practice and psychometric research correspondingly increases. This study explores the rate of Type I error in detecting preknowledge cheating behaviors, the power of the Kullback-Leibler (KL) divergence measure, and the L person fit statistic…
Descriptors: Cheating, Accuracy, Reaction Time, Computer Assisted Testing
Jean-Paul Fox – Journal of Educational and Behavioral Statistics, 2025
Popular item response theory (IRT) models are considered complex, mainly due to the inclusion of a random factor variable (latent variable). The random factor variable represents the incidental parameter problem since the number of parameters increases when including data of new persons. Therefore, IRT models require a specific estimation method…
Descriptors: Sample Size, Item Response Theory, Accuracy, Bayesian Statistics
Steffen Zitzmann; Lisa Bardach; Kai T. Horstmann; Matthias Ziegler; Martin Hecht – Structural Equation Modeling: A Multidisciplinary Journal, 2024
We investigated three different approaches for quantifying individual change and reporting it back to persons: (a) the common change score, which is obtained by first computing scale scores from two consecutive measurements and then subtracting these scores from one another, (b) the ad-hoc approach, which is similar to the former approach but uses…
Descriptors: Personality Change, Personality Measures, Regression (Statistics), Evaluation Methods
Reese Butterfuss; Harold Doran – Educational Measurement: Issues and Practice, 2025
Large language models are increasingly used in educational and psychological measurement activities. Their rapidly evolving sophistication and ability to detect language semantics make them viable tools to supplement subject matter experts and their reviews of large amounts of text statements, such as educational content standards. This paper…
Descriptors: Alignment (Education), Academic Standards, Content Analysis, Concept Mapping
A. M. Sadek; Fahad Al-Muhlaki – Measurement: Interdisciplinary Research and Perspectives, 2024
In this study, the accuracy of the artificial neural network (ANN) was assessed considering the uncertainties associated with the randomness of the data and the lack of learning. The Monte-Carlo algorithm was applied to simulate the randomness of the input variables and evaluate the output distribution. It has been shown that under certain…
Descriptors: Monte Carlo Methods, Accuracy, Artificial Intelligence, Guidelines
Lingbo Tong; Wen Qu; Zhiyong Zhang – Grantee Submission, 2025
Factor analysis is widely utilized to identify latent factors underlying the observed variables. This paper presents a comprehensive comparative study of two widely used methods for determining the optimal number of factors in factor analysis, the K1 rule, and parallel analysis, along with a more recently developed method, the bass-ackward method.…
Descriptors: Factor Analysis, Monte Carlo Methods, Statistical Analysis, Sample Size
Kylie Gorney; Mark D. Reckase – Journal of Educational Measurement, 2025
In computerized adaptive testing, item exposure control methods are often used to provide a more balanced usage of the item pool. Many of the most popular methods, including the restricted method (Revuelta and Ponsoda), use a single maximum exposure rate to limit the proportion of times that each item is administered. However, Barrada et al.…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Item Banks
Sean Guo; Briony Swire-Thompson; Xiaoqing Hu – Cognitive Research: Principles and Implications, 2025
Images generated using artificial intelligence (AI) have become increasingly realistic, sparking discussions and fears about an impending "infodemic" where we can no longer trust what we see on the internet. In this preregistered study, we examine whether providing specific media literacy tips about how to spot AI-generated images can…
Descriptors: Media Literacy, Artificial Intelligence, Technology Uses in Education, Visual Stimuli
Huan Liu – ProQuest LLC, 2024
In many large-scale testing programs, examinees are frequently categorized into different performance levels. These classifications are then used to make high-stakes decisions about examinees in contexts such as in licensure, certification, and educational assessments. Numerous approaches to estimating the consistency and accuracy of this…
Descriptors: Classification, Accuracy, Item Response Theory, Decision Making
Alan J. Kinsella – ProQuest LLC, 2024
An accurate self-assessment repertoire is crucial for maintaining high standards of practice, or a scope of competence, among behavior analysts. However, procedural means to achieve this remain underexplored. Medical communities have investigated these effects and largely found that accuracy in self-assessment is poor, with an inverse relation…
Descriptors: Self Evaluation (Individuals), Accuracy, Behavior, Evaluation Methods
Jiaying Xiao; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Accurate item parameters and standard errors (SEs) are crucial for many multidimensional item response theory (MIRT) applications. A recent study proposed the Gaussian Variational Expectation Maximization (GVEM) algorithm to improve computational efficiency and estimation accuracy (Cho et al., 2021). However, the SE estimation procedure has yet to…
Descriptors: Error of Measurement, Models, Evaluation Methods, Item Analysis
Hillary E. Merzdorf; Donna Jaison; Morgan B. Weaver; Julie Linsey; Tracy Hammond; Kerrie A. Douglas – Journal of Engineering Education, 2024
Background: Sketching exists in many disciplines and varies in how it is assessed, making it challenging to define fundamental sketching skills and the characteristics of a high-quality sketch. For instructors to apply effective strategies for teaching and assessing engineering sketching, a clear summary of the constructs, metrics, and objectives…
Descriptors: Freehand Drawing, Engineering Education, Educational Research, Design
Caspar J. Van Lissa; Eli-Boaz Clapper; Rebecca Kuiper – Research Synthesis Methods, 2024
The product Bayes factor (PBF) synthesizes evidence for an informative hypothesis across heterogeneous replication studies. It can be used when fixed- or random-effects meta-analysis falls short, for example when effect sizes are incomparable and cannot be pooled, or when studies diverge significantly in the populations, study designs, and…
Descriptors: Hypothesis Testing, Evaluation Methods, Replication (Evaluation), Sample Size