ERIC Number: ED677868
Record Type: Non-Journal
Publication Date: 2025
Pages: 8
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Available Date: N/A
Exploring Ranking Consistency of Generative AI in MOOC Platform Evaluation: A Non-Parametric Approach
Victor K. Y. Chan
International Association for Development of the Information Society, Paper presented at the International Association for Development of the Information Society (IADIS) International Conference on Cognition and Exploratory Learning in the Digital Age (CELDA) (22nd, Porto, Portugal, Nov 1-3, 2025)
This paper extends a prior study on the consistency of generative Artificial Intelligence (AI) models in evaluating Massive Open Online Course (MOOC) platforms. While the original work focused on the consistency of direct numerical scores, this research investigates the consistency of the rankings derived from those scores. When evaluating platforms, the relative order (i.e., which platform is better than another) is often more critical to a decision-maker than the absolute scores, which may be subject to systematic biases. This study analyzes the scores of 31 MOOC platforms across eight dimensions as evaluated by two AI models, Claude+ and Dragonfly. A suite of non-parametric statistical methods is employed, including Spearman's rank correlation coefficient (ρ), Kendall's tau (τ), and the top-weighted Rank-Biased Overlap (RBO), to measure the concordance of the platform rankings produced by each model. The Wilcoxon signed-rank test is used to assess systematic differences in scoring. Results indicate a moderate to strong monotonic correlation in rankings for dimensions such as (2) pedagogical design, (1) content/course quality, and (6) learner engagement, reinforcing the original study's findings of consistency. However, the RBO analysis reveals that this agreement is weaker for the top-ranked platforms, providing a more nuanced understanding of AI evaluation consistency. The systematic scoring bias found in the original study is also reaffirmed here. This rank-based analysis offers a robust alternative to score-based comparisons, mitigating the effects of differing internal scoring scales and highlighting the practical utility of AI evaluations for comparative decision-making. By shifting the focus from absolute scores to relative rankings, this study underscores the practical value of generative AI as a decision-support tool in educational technology evaluation. The findings not only enhance methodological rigor in AI-based assessments but also provide actionable insights for learners and institutions navigating an increasingly complex MOOC landscape. [For the complete proceedings, "Proceedings of the International Association for Development of the Information Society (IADIS) International Conference on Cognition and Exploratory Learning in the Digital Age (CELDA) (22nd, Porto, Portugal, November 1-3, 2025)," see ED677812.]
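The statistical pipeline named in the abstract can be illustrated with a minimal, self-contained Python sketch. The scores below are synthetic stand-ins (the paper's actual Claude+ and Dragonfly scores are not reproduced in this record), and rank_biased_overlap is an illustrative implementation of extrapolated RBO in the sense of Webber et al. (2010), not code from the paper; only the SciPy calls are established library API.

```python
# Minimal sketch of the abstract's non-parametric comparisons, using
# synthetic scores for two hypothetical evaluators of 31 platforms.
import numpy as np
from scipy.stats import spearmanr, kendalltau, wilcoxon

rng = np.random.default_rng(0)
scores_a = rng.uniform(5, 10, size=31)                # illustrative model-A scores
scores_b = scores_a + rng.normal(0.3, 0.5, size=31)   # model B, with a built-in bias

# Monotonic rank agreement between the two sets of scores.
rho, p_rho = spearmanr(scores_a, scores_b)
tau, p_tau = kendalltau(scores_a, scores_b)

# Paired test for a systematic difference in scoring levels.
w_stat, p_w = wilcoxon(scores_a, scores_b)

def rank_biased_overlap(list_a, list_b, p=0.9):
    """Extrapolated rank-biased overlap for two rankings; smaller p
    weights the top of the lists more heavily (illustrative helper)."""
    seen_a, seen_b = set(), set()
    overlap, rbo_sum = 0, 0.0
    depth = min(len(list_a), len(list_b))
    for d in range(1, depth + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        overlap = len(seen_a & seen_b)
        rbo_sum += (overlap / d) * p ** (d - 1)   # agreement A_d, geometrically weighted
    # Extrapolate assuming the agreement at full depth persists.
    return (overlap / depth) * p ** depth + (1 - p) * rbo_sum

# Rankings: platform indices ordered from best to worst score.
rank_a = list(np.argsort(-scores_a))
rank_b = list(np.argsort(-scores_b))
print(f"Spearman rho={rho:.3f}, Kendall tau={tau:.3f}, "
      f"Wilcoxon p={p_w:.3g}, RBO={rank_biased_overlap(rank_a, rank_b):.3f}")
```

The persistence parameter p controls how top-weighted the RBO comparison is; lowering it concentrates the measure on the highest-ranked platforms, which is where the abstract reports the weakest agreement between the two models.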
Descriptors: Artificial Intelligence, MOOCs, Reliability, Evaluation Methods, Educational Technology, Online Courses
International Association for the Development of the Information Society. e-mail: secretariat@iadis.org; Web site: http://www.iadisportal.org
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Author Affiliations: N/A

Peer reviewed
