NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 202524
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 24 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Stefan O'Grady – TESOL Journal, 2025
Task-based language assessment represents a major component of task-based language teaching syllabi. Current perspectives emphasise the importance of tasks in the assessment process, suggesting that adherence to influential models of language production during task design yields predictable test outcomes. The current study contends that the…
Descriptors: Task Analysis, Language Tests, Evaluators, Rating Scales
Peer reviewed Peer reviewed
Direct linkDirect link
Zebo Xu; Prerit S. Mittal; Mohd. Mohsin Ahmed; Chandranath Adak; Zhenguang G. Cai – Reading and Writing: An Interdisciplinary Journal, 2025
The rise of the digital era has led to a decline in handwriting as the primary mode of communication, resulting in negative effects on handwriting literacy, particularly in complex writing systems such as Chinese. The marginalization of handwriting has contributed to the deterioration of penmanship, defined as the ability to write aesthetically…
Descriptors: Handwriting, Writing Skills, Chinese, Ideography
Akif Avcu – Malaysian Online Journal of Educational Technology, 2025
This scope-review presents the milestones of how Hierarchical Rater Models (HRMs) become operable to used in automated essay scoring (AES) to improve instructional evaluation. Although essay evaluations--a useful instrument for evaluating higher-order cognitive abilities--have always depended on human raters, concerns regarding rater bias,…
Descriptors: Automation, Scoring, Models, Educational Assessment
Peer reviewed Peer reviewed
Direct linkDirect link
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Peer reviewed Peer reviewed
Direct linkDirect link
Catherine Davies; Holly Ingram – Research Evaluation, 2025
As part of the shift towards a more equitable research culture, funders are reconsidering traditional approaches to peer review. In doing so, they seek to minimize bias towards certain research ideas and researcher profiles, to ensure greater inclusion of disadvantaged groups, to improve review quality, to reduce burden, and to enable more…
Descriptors: Resource Allocation, Research, Culture, Probability
Peer reviewed Peer reviewed
Direct linkDirect link
Boris Forthmann; Benjamin Goecke; Roger E. Beaty – Creativity Research Journal, 2025
Human ratings are ubiquitous in creativity research. Yet, the process of rating responses to creativity tasks -- typically several hundred or thousands of responses, per rater -- is often time-consuming and expensive. Planned missing data designs, where raters only rate a subset of the total number of responses, have been recently proposed as one…
Descriptors: Creativity, Research, Researchers, Research Methodology
Peer reviewed Peer reviewed
Direct linkDirect link
Ngoc My Bui; Jessie S. Barrot – Education and Information Technologies, 2025
With the generative artificial intelligence (AI) tool's remarkable capabilities in understanding and generating meaningful content, intriguing questions have been raised about its potential as an automated essay scoring (AES) system. One such tool is ChatGPT, which is capable of scoring any written work based on predefined criteria. However,…
Descriptors: Artificial Intelligence, Natural Language Processing, Technology Uses in Education, Automation
Peer reviewed Peer reviewed
Direct linkDirect link
Tong Wu; Stella Y. Kim; Carl Westine; Michelle Boyer – Journal of Educational Measurement, 2025
While significant attention has been given to test equating to ensure score comparability, limited research has explored equating methods for rater-mediated assessments, where human raters inherently introduce error. If not properly addressed, these errors can undermine score interchangeability and test validity. This study proposes an equating…
Descriptors: Item Response Theory, Evaluators, Error of Measurement, Test Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Zhongzhou Chen; Tong Wan – Physical Review Physics Education Research, 2025
This study examines the feasibility and potential advantages of using large language models, in particular GPT-4o, to perform partial credit grading of large numbers of student written responses to introductory level physics problems. Students were instructed to write down verbal explanations of their reasoning process when solving one conceptual…
Descriptors: Grading, Technology Uses in Education, Student Evaluation, Science Education
Peer reviewed Peer reviewed
Direct linkDirect link
Katerina Guba; Angelika Tsivinskaya – Studies in Higher Education, 2025
This paper explores the regulators' perspective and demonstrates how legitimacy deficits of private universities outweigh performance results in decisions regarding university inspections. We examined the period when the regulator had an urgent claim on Russian universities, particularly during the campaign to 'clean the system of higher…
Descriptors: Private Sector, Private Education, Universities, Private Colleges
Peer reviewed Peer reviewed
Direct linkDirect link
Ping-Lin Chuang – Language Testing, 2025
This experimental study explores how source use features impact raters' judgment of argumentation in a second language (L2) integrated writing test. One hundred four experienced and novice raters were recruited to complete a rating task that simulated the scoring assignment of a local English Placement Test (EPT). Sixty written responses were…
Descriptors: Interrater Reliability, Evaluators, Information Sources, Primary Sources
Peer reviewed Peer reviewed
Direct linkDirect link
Audrey Doyle – Irish Educational Studies, 2025
For the first time in the history of the high stakes Leaving Certificate Established examination in Ireland, teachers graded and ranked their own students due to COVID-19 restrictions. In the wake of the process, a questionnaire and focus group interviews explored how teachers engaged with the Leaving Certificate Calculated Grades 2020 (CG2020)…
Descriptors: Foreign Countries, Exit Examinations, Teacher Role, Evaluators
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025
The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…
Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Yubin Xu; Lin Liu; Jianwen Xiong; Guangtian Zhu – Journal of Baltic Science Education, 2025
As the development and application of large language models (LLMs) in physics education progress, the well-known AI-based chatbot ChatGPT4 has presented numerous opportunities for educational assessment. Investigating the potential of AI tools in practical educational assessment carries profound significance. This study explored the comparative…
Descriptors: Physics, Artificial Intelligence, Computer Software, Accuracy
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Makiko Kato – Journal of Education and Learning, 2025
This study aims to examine whether differences exist in the factors influencing the difficulty of scoring English summaries and determining scores based on the raters' attributes, and to collect candid opinions, considerations, and tentative suggestions for future improvements to the analytic rubric of summary writing for English learners. In this…
Descriptors: Writing Evaluation, Scoring, Writing Skills, English (Second Language)
Previous Page | Next Page »
Pages: 1  |  2