Publication Date
In 2025 | 2 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 11 |
Author
Chung, H. C. | 1 |
Crowson, H. Michael | 1 |
Drake, Samuel | 1 |
Eva Ulrychová | 1 |
Gu, Lixiong | 1 |
Hamid Mohammadi | 1 |
Hardre, Patricia L. | 1 |
Hsiung, C. M. | 1 |
Jonas Flodén | 1 |
Luo, L. F. | 1 |
Mark J. Gierl | 1 |
Publication Type
Journal Articles | 10 |
Reports - Research | 10 |
Dissertations/Theses -… | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 11 |
Postsecondary Education | 11 |
Elementary Education | 1 |
Elementary Secondary Education | 1 |
Grade 5 | 1 |
Location
China | 1 |
Czech Republic | 1 |
South Korea | 1 |
Taiwan | 1 |
Assessments and Surveys
Graduate Record Examinations | 1 |
Jonas Flodén – British Educational Research Journal, 2025
This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…
Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Eva Ulrychová; Renata Majovská; Petr Tesar – Journal on Efficiency and Responsibility in Education and Science, 2024
The article deals with the results of mathematics examinations at the University of Finance and Administration in Prague before, during, and immediately after the COVID-19 pandemic-related restrictions. The first objective is to evaluate whether the non-standard forms of testing (correspondence and online), used on an emergency basis during the…
Descriptors: Foreign Countries, COVID-19, Pandemics, Mathematics Tests
On-Soon Lee – Journal of Pan-Pacific Association of Applied Linguistics, 2024
Despite the increasing interest in using AI tools as assistant agents in instructional settings, the effectiveness of ChatGPT, the generative pretrained AI, for evaluating the accuracy of second language (L2) writing has been largely unexplored in formative assessment. Therefore, the current study aims to examine how ChatGPT, as an evaluator,…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Razieh Fathi – ProQuest LLC, 2021
This dissertation describes an experiment to investigate how learners with different levels of background in computer science learn core concepts of computer science, in particular, algorithms. We designed a study to focus on cognitive task analysis for eliciting the empirical mental elements of learning two graph algorithms. Cognitive workload…
Descriptors: Undergraduate Students, Computer Science Education, Algorithms, Cognitive Development
Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016
Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…
Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis
Hsiung, C. M.; Luo, L. F.; Chung, H. C. – Journal of Computer Assisted Learning, 2014
Cooperative learning has many pedagogical benefits. However, if the cooperative learning teams become ineffective, these benefits are lost. Accordingly, this study developed a computer-aided assessment method for identifying ineffective teams at their early stage of dysfunction by using the Mahalanobis distance metric to examine the difference…
Descriptors: Cooperative Learning, Teamwork, Identification, Instructional Effectiveness
Morrison, Keith – Educational Research and Evaluation, 2013
This paper reviews the literature on comparing online and paper course evaluations in higher education and provides a case study of a very large randomised trial on the topic. It presents a mixed but generally optimistic picture of online course evaluations with respect to response rates, what they indicate, and how to increase them. The paper…
Descriptors: Literature Reviews, Course Evaluation, Case Studies, Higher Education
Park, Jooyong – British Journal of Educational Technology, 2010
The newly developed computerized Constructive Multiple-choice Testing system is introduced. The system combines short answer (SA) and multiple-choice (MC) formats by asking examinees to respond to the same question twice, first in the SA format, and then in the MC format. This manipulation was employed to collect information about the two…
Descriptors: Grade 5, Evaluation Methods, Multiple Choice Tests, Scores
Hardre, Patricia L.; Crowson, H. Michael; Xie, Kui – Journal of Educational Computing Research, 2010
Questionnaire instruments are routinely translated to digital administration systems; however, few studies have compared the differential effects of these administrative methods, and fewer yet in authentic contexts-of-use. In this study, 326 university students were randomly assigned to one of two administration conditions, paper-based (PBA) or…
Descriptors: Internet, Computer Assisted Testing, Questionnaires, College Students
Gu, Lixiong; Drake, Samuel; Wolfe, Edward W. – Journal of Technology, Learning, and Assessment, 2006
This study seeks to determine whether item features are related to observed differences in item difficulty (DIF) between computer- and paper-based test delivery media. Examinees responded to 60 quantitative items similar to those found on the GRE general test in either a computer-based or paper-based medium. Thirty-eight percent of the items were…
Descriptors: Test Bias, Test Items, Educational Testing, Student Evaluation