Publication Date
In 2025 | 4 |
Since 2024 | 8 |
Since 2021 (last 5 years) | 18 |
Since 2016 (last 10 years) | 35 |
Since 2006 (last 20 years) | 53 |
Descriptor
Source
Author
Coniam, David | 3 |
Attali, Yigal | 2 |
Bridgeman, Brent | 2 |
Casabianca, Jodi M. | 2 |
Davis, Larry | 2 |
Kunnan, Antony John | 2 |
Mollaun, Pamela | 2 |
Sebrechts, Marc M. | 2 |
Wolfe, Edward W. | 2 |
Xi, Xiaoming | 2 |
Yan, Zi | 2 |
More ▼ |
Publication Type
Reports - Research | 59 |
Journal Articles | 52 |
Tests/Questionnaires | 8 |
Speeches/Meeting Papers | 4 |
Collected Works - Proceedings | 1 |
Education Level
Higher Education | 18 |
Postsecondary Education | 17 |
Secondary Education | 5 |
Elementary Education | 3 |
High Schools | 2 |
Elementary Secondary Education | 1 |
Grade 5 | 1 |
Intermediate Grades | 1 |
Middle Schools | 1 |
Audience
Location
China | 6 |
Hong Kong | 3 |
Germany | 2 |
Taiwan | 2 |
Australia | 1 |
California | 1 |
China (Beijing) | 1 |
Cyprus | 1 |
Europe | 1 |
Iran | 1 |
Japan | 1 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 17 |
Graduate Record Examinations | 2 |
International English… | 2 |
ACTFL Oral Proficiency… | 1 |
Foreign Language Classroom… | 1 |
Test of English for… | 1 |
Torrance Tests of Creative… | 1 |
What Works Clearinghouse Rating
Zebo Xu; Prerit S. Mittal; Mohd. Mohsin Ahmed; Chandranath Adak; Zhenguang G. Cai – Reading and Writing: An Interdisciplinary Journal, 2025
The rise of the digital era has led to a decline in handwriting as the primary mode of communication, resulting in negative effects on handwriting literacy, particularly in complex writing systems such as Chinese. The marginalization of handwriting has contributed to the deterioration of penmanship, defined as the ability to write aesthetically…
Descriptors: Handwriting, Writing Skills, Chinese, Ideography
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Casabianca, Jodi M.; Donoghue, John R.; Shin, Hyo Jeong; Chao, Szu-Fu; Choi, Ikkyu – Journal of Educational Measurement, 2023
Using item-response theory to model rater effects provides an alternative solution for rater monitoring and diagnosis, compared to using standard performance metrics. In order to fit such models, the ratings data must be sufficiently connected in order to estimate rater effects. Due to popular rating designs used in large-scale testing scenarios,…
Descriptors: Item Response Theory, Alternative Assessment, Evaluators, Research Problems
Yishen Song; Qianta Zhu; Huaibo Wang; Qinhua Zheng – IEEE Transactions on Learning Technologies, 2024
Manually scoring and revising student essays has long been a time-consuming task for educators. With the rise of natural language processing techniques, automated essay scoring (AES) and automated essay revising (AER) have emerged to alleviate this burden. However, current AES and AER models require large amounts of training data and lack…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Chan, Kinnie Kin Yee; Bond, Trevor; Yan, Zi – Language Testing, 2023
We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into…
Descriptors: Computer Assisted Testing, Essays, Scoring, Scores
Ahmet Can Uyar; Dilek Büyükahiska – International Journal of Assessment Tools in Education, 2025
This study explores the effectiveness of using ChatGPT, an Artificial Intelligence (AI) language model, as an Automated Essay Scoring (AES) tool for grading English as a Foreign Language (EFL) learners' essays. The corpus consists of 50 essays representing various types including analysis, compare and contrast, descriptive, narrative, and opinion…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, Teaching Methods
Selcuk Acar; Denis Dumas; Peter Organisciak; Kelly Berthiaume – Grantee Submission, 2024
Creativity is highly valued in both education and the workforce, but assessing and developing creativity can be difficult without psychometrically robust and affordable tools. The open-ended nature of creativity assessments has made them difficult to score, expensive, often imprecise, and therefore impractical for school- or district-wide use. To…
Descriptors: Thinking Skills, Elementary School Students, Artificial Intelligence, Measurement Techniques
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Yuko Hayashi; Yusuke Kondo; Yutaka Ishii – Innovation in Language Learning and Teaching, 2024
Purpose: This study builds a new system for automatically assessing learners' speech elicited from an oral discourse completion task (DCT), and evaluates the prediction capability of the system with a view to better understanding factors deemed influential in predicting speaking proficiency scores and the pedagogical implications of the system.…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Japanese
LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020
Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short answer responses. Scoring tests that capture open-ended, short answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short answer responses to the Consequences Test, a measure…
Descriptors: Semantics, Evaluators, Essays, Scoring
Ockey, Gary J.; Chukharev-Hudilainen, Evgeny – Applied Linguistics, 2021
A challenge of large-scale oral communication assessments is to feasibly assess a broad construct that includes interactional competence. One possible approach in addressing this challenge is to use a spoken dialog system (SDS), with the computer acting as a peer to elicit a ratable speech sample. With this aim, an SDS was built and four trained…
Descriptors: Oral Language, Grammar, Language Fluency, Language Tests
Cox, Troy L.; Brown, Alan V.; Thompson, Gregory L. – Language Testing, 2023
The rating of proficiency tests that use the Inter-agency Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines claims that each major level is based on hierarchal linguistic functions that require mastery of multidimensional traits in such a way that each level subsumes the levels beneath it. These…
Descriptors: Oral Language, Language Fluency, Scoring, Cues
Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021
Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…
Descriptors: Language Tests, German, Second Languages, Writing Tests
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring