ERIC - Search Results

Publication Date

In 2026	0
Since 2025	2
Since 2022 (last 5 years)	4
Since 2017 (last 10 years)	5
Since 2007 (last 20 years)	10

Descriptor

Comparative Testing	11
Computer Assisted Testing	11
College Students	5
Evaluation Methods	4
Foreign Countries	4
Test Format	4
Test Reliability	4
Comparative Analysis	3
Educational Technology	3
Interrater Reliability	3
Test Validity	3
Artificial Intelligence	2
Course Evaluation	2
Printed Materials	2
Scores	2
Undergraduate Students	2
Accuracy	1
Algorithms	1
Business Administration…	1
COVID-19	1
Case Studies	1
Cognitive Development	1
Cognitive Processes	1
College Entrance Examinations	1
College Faculty	1

Source

British Educational Research…	1
British Journal of…	1
Educational Research and…	1
Journal of Computer Assisted…	1
Journal of Education for…	1
Journal of Educational…	1
Journal of Educational…	1
Journal of Pan-Pacific…	1
Journal of Technology,…	1
Journal on Efficiency and…	1
ProQuest LLC	1

Publication Type

Journal Articles	10
Reports - Research	10
Dissertations/Theses -…	1
Tests/Questionnaires	1

Education Level

Higher Education	11
Postsecondary Education	11
Elementary Education	1
Elementary Secondary Education	1
Grade 5	1

Location

China	1
Czech Republic	1
South Korea	1
Taiwan	1

Assessments and Surveys

Graduate Record Examinations

Showing all 11 results

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Using Automated Procedures to Score Educational Essays Written in Three Languages

Peer reviewed

Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025

The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…

Descriptors: College Students, Slavic Languages, German, Italian

Results of Mathematics Examinations before, during, and after the COVID-19 Related Restrictions

Peer reviewed
PDF on ERIC

Eva Ulrychová; Renata Majovská; Petr Tesar – Journal on Efficiency and Responsibility in Education and Science, 2024

The article deals with the results of mathematics examinations at the University of Finance and Administration in Prague before, during, and immediately after the COVID-19 pandemic-related restrictions. The first objective is to evaluate whether the non-standard forms of testing (correspondence and online), used on an emergency basis during the…

Descriptors: Foreign Countries, COVID-19, Pandemics, Mathematics Tests

Examining AI-Based Accuracy Assessment in L2 Learners' Writing

Peer reviewed

On-Soon Lee – Journal of Pan-Pacific Association of Applied Linguistics, 2024

Despite the increasing interest in using AI tools as assistant agents in instructional settings, the effectiveness of ChatGPT, the generative pretrained AI, for evaluating the accuracy of second language (L2) writing has been largely unexplored in formative assessment. Therefore, the current study aims to examine how ChatGPT, as an evaluator,…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

The Application of Cognitive Task Analysis and Cognitive Load Methods in the Process of Learning Algorithms

Razieh Fathi – ProQuest LLC, 2021

This dissertation describes an experiment to investigate how learners with different levels of background in computer science learn core concepts of computer science, in particular, algorithms. We designed a study to focus on cognitive task analysis for eliciting the empirical mental elements of learning two graph algorithms. Cognitive workload…

Descriptors: Undergraduate Students, Computer Science Education, Algorithms, Cognitive Development

Does MTV Really Do a Good Job of Evaluating Professors? An Empirical Test of the Internet Site Ratemyprofessors.com

Peer reviewed

Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016

Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…

Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis

Early Identification of Ineffective Cooperative Learning Teams

Peer reviewed

Hsiung, C. M.; Luo, L. F.; Chung, H. C. – Journal of Computer Assisted Learning, 2014

Cooperative learning has many pedagogical benefits. However, if the cooperative learning teams become ineffective, these benefits are lost. Accordingly, this study developed a computer-aided assessment method for identifying ineffective teams at their early stage of dysfunction by using the Mahalanobis distance metric to examine the difference…

Descriptors: Cooperative Learning, Teamwork, Identification, Instructional Effectiveness

Online and Paper Evaluations of Courses: A Literature Review and Case Study

Peer reviewed

Morrison, Keith – Educational Research and Evaluation, 2013

This paper reviews the literature on comparing online and paper course evaluations in higher education and provides a case study of a very large randomised trial on the topic. It presents a mixed but generally optimistic picture of online course evaluations with respect to response rates, what they indicate, and how to increase them. The paper…

Descriptors: Literature Reviews, Course Evaluation, Case Studies, Higher Education

Constructive Multiple-Choice Testing System

Peer reviewed

Park, Jooyong – British Journal of Educational Technology, 2010

The newly developed computerized Constructive Multiple-choice Testing system is introduced. The system combines short answer (SA) and multiple-choice (MC) formats by asking examinees to respond to the same question twice, first in the SA format, and then in the MC format. This manipulation was employed to collect information about the two…

Descriptors: Grade 5, Evaluation Methods, Multiple Choice Tests, Scores

Differential Effects of Web-Based and Paper-Based Administration of Questionnaire Research Instruments in Authentic Contexts-of-Use

Peer reviewed

Hardre, Patricia L.; Crowson, H. Michael; Xie, Kui – Journal of Educational Computing Research, 2010

Questionnaire instruments are routinely translated to digital administration systems; however, few studies have compared the differential effects of these administrative methods, and fewer yet in authentic contexts-of-use. In this study, 326 university students were randomly assigned to one of two administration conditions, paper-based (PBA) or…

Descriptors: Internet, Computer Assisted Testing, Questionnaires, College Students

Differential Item Functioning of GRE Mathematics Items across Computerized and Paper-and-Pencil Testing Media

Peer reviewed
PDF on ERIC

Gu, Lixiong; Drake, Samuel; Wolfe, Edward W. – Journal of Technology, Learning, and Assessment, 2006

This study seeks to determine whether item features are related to observed differences in item difficulty (DIF) between computer- and paper-based test delivery media. Examinees responded to 60 quantitative items similar to those found on the GRE general test in either a computer-based or paper-based medium. Thirty-eight percent of the items were…

Descriptors: Test Bias, Test Items, Educational Testing, Student Evaluation

Author

Chung, H. C.	1
Crowson, H. Michael	1
Drake, Samuel	1
Eva Ulrychová	1
Gu, Lixiong	1
Hamid Mohammadi	1
Hardre, Patricia L.	1
Hsiung, C. M.	1
Jonas Flodén	1
Luo, L. F.	1
Mark J. Gierl	1
Morrison, Keith	1
Murray, Keith B.	1
On-Soon Lee	1
Park, Jooyong	1
Petr Tesar	1
Razieh Fathi	1
Renata Majovská	1
Tahereh Firoozi	1
Wolfe, Edward W.	1
Xie, Kui	1
Zdravkovic, Srdan	1