Publication Date
In 2025 | 2 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 3 |
Descriptor
Source
Educational Measurement:… | 3 |
Author
Alina A. von Davier | 1 |
Deborah J. Harris | 1 |
Guher Gorgun | 1 |
Harold Doran | 1 |
Jiangang Hao | 1 |
Matthias von Davier | 1 |
Okan Bulut | 1 |
Reese Butterfuss | 1 |
Susan Lottridge | 1 |
Victoria Yaneva | 1 |
Publication Type
Journal Articles | 3 |
Reports - Research | 2 |
Reports - Descriptive | 1 |
Education Level
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Reese Butterfuss; Harold Doran – Educational Measurement: Issues and Practice, 2025
Large language models are increasingly used in educational and psychological measurement activities. Their rapidly evolving sophistication and ability to detect language semantics make them viable tools to supplement subject matter experts and their reviews of large amounts of text statements, such as educational content standards. This paper…
Descriptors: Alignment (Education), Academic Standards, Content Analysis, Concept Mapping
Jiangang Hao; Alina A. von Davier; Victoria Yaneva; Susan Lottridge; Matthias von Davier; Deborah J. Harris – Educational Measurement: Issues and Practice, 2024
The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting-edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely,…
Descriptors: Evaluation Methods, Artificial Intelligence, Educational Change, Computer Software
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation