Publication Date
In 2025 | 3 |
Since 2024 | 5 |
Since 2021 (last 5 years) | 14 |
Since 2016 (last 10 years) | 35 |
Since 2006 (last 20 years) | 71 |
Descriptor
Computer Assisted Testing | 79 |
Correlation | 79 |
Scoring | 65 |
Second Language Learning | 25 |
English (Second Language) | 24 |
Language Tests | 24 |
Scores | 24 |
Comparative Analysis | 19 |
Essays | 18 |
Evaluators | 18 |
Accuracy | 16 |
More ▼ |
Source
Author
Attali, Yigal | 4 |
Bridgeman, Brent | 4 |
Breyer, F. Jay | 3 |
Anna-Maria Fall | 2 |
Bennett, Randy Elliot | 2 |
Beula M. Magimairaj | 2 |
Crossley, Scott | 2 |
Denis Dumas | 2 |
Dikici, Ayhan | 2 |
Gentile, Claudia | 2 |
Greg Roberts | 2 |
More ▼ |
Publication Type
Education Level
Audience
Location
California | 2 |
China | 2 |
Australia | 1 |
Canada | 1 |
Denmark | 1 |
Germany | 1 |
Hong Kong | 1 |
Iran | 1 |
Israel | 1 |
Massachusetts | 1 |
North Carolina (Greensboro) | 1 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 17 |
Graduate Record Examinations | 4 |
Torrance Tests of Creative… | 2 |
Foreign Language Classroom… | 1 |
NEO Personality Inventory | 1 |
SAT (College Admission Test) | 1 |
United States Medical… | 1 |
What Works Clearinghouse Rating
Selcuk Acar; Peter Organisciak; Denis Dumas – Journal of Creative Behavior, 2025
In this three-study investigation, we applied various approaches to score drawings created in response to both Form A and Form B of the Torrance Tests of Creative Thinking-Figural (broadly TTCT-F) as well as the Multi-Trial Creative Ideation task (MTCI). We focused on TTCT-F in Study 1, and utilizing a random forest classifier, we achieved 79% and…
Descriptors: Scoring, Computer Assisted Testing, Models, Correlation
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Themistocleous, Charalambos; Neophytou, Kyriaki; Rapp, Brenda; Tsapkini, Kyrana – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The evaluation of spelling performance in aphasia reveals deficits in written language and can facilitate the design of targeted writing treatments. Nevertheless, manual scoring of spelling performance is time-consuming, laborious, and error prone. We propose a novel method based on the use of distance metrics to automatically score…
Descriptors: Computer Assisted Testing, Scoring, Spelling, Scores
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
Eran Hadas; Arnon Hershkovitz – Journal of Learning Analytics, 2025
Creativity is an imperative skill for today's learners, one that has important contributions to issues of inclusion and equity in education. Therefore, assessing creativity is of major importance in educational contexts. However, scoring creativity based on traditional tools suffers from subjectivity and is heavily time- and labour-consuming. This…
Descriptors: Creativity, Evaluation Methods, Computer Assisted Testing, Artificial Intelligence
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
Selcuk Acar; Denis Dumas; Peter Organisciak; Kelly Berthiaume – Grantee Submission, 2024
Creativity is highly valued in both education and the workforce, but assessing and developing creativity can be difficult without psychometrically robust and affordable tools. The open-ended nature of creativity assessments has made them difficult to score, expensive, often imprecise, and therefore impractical for school- or district-wide use. To…
Descriptors: Thinking Skills, Elementary School Students, Artificial Intelligence, Measurement Techniques
Pfordresher, Peter Q.; Demorest, Steven M. – Journal of Research in Music Education, 2021
The purpose of this study was to analyze a large sample of volunteers from the general population who were tested with an identical online measure of singing accuracy. A sample of 632 participants completed the Seattle Singing Accuracy Protocol (SSAP), a standardized measure of singing accuracy, available online, that includes a test of pitch…
Descriptors: Correlation, Accuracy, Singing, Computer Assisted Testing
Lottridge, Susan; Wood, Scott; Shaw, Dan – Applied Measurement in Education, 2018
This study sought to provide a framework for evaluating machine score-ability of items using a new score-ability rating scale, and to determine the extent to which ratings were predictive of observed automated scoring performance. The study listed and described a set of factors that are thought to influence machine score-ability; these factors…
Descriptors: Program Effectiveness, Computer Assisted Testing, Test Scoring Machines, Scoring
Beaty, Roger E.; Johnson, Dan R.; Zeitlen, Daniel C.; Forthmann, Boris – Creativity Research Journal, 2022
Semantic distance is increasingly used for automated scoring of originality on divergent thinking tasks, such as the Alternate Uses Task (AUT). Despite some psychometric support for semantic distance -- including positive correlations with human creativity ratings -- additional work is needed to optimize its reliability and validity, including…
Descriptors: Semantics, Scoring, Creative Thinking, Creativity
LaVoie, Noelle; Parker, James; Legree, Peter J.; Ardison, Sharon; Kilcullen, Robert N. – Educational and Psychological Measurement, 2020
Automated scoring based on Latent Semantic Analysis (LSA) has been successfully used to score essays and constrained short answer responses. Scoring tests that capture open-ended, short answer responses poses some challenges for machine learning approaches. We used LSA techniques to score short answer responses to the Consequences Test, a measure…
Descriptors: Semantics, Evaluators, Essays, Scoring
Goecke, Benjamin; Schmitz, Florian; Wilhelm, Oliver – Journal of Intelligence, 2021
Performance in elementary cognitive tasks is moderately correlated with fluid intelligence and working memory capacity. These correlations are higher for more complex tasks, presumably due to increased demands on working memory capacity. In accordance with the binding hypothesis, which states that working memory capacity reflects the limit of a…
Descriptors: Intelligence, Cognitive Processes, Short Term Memory, Reaction Time
Dalton, Sarah Grace; Stark, Brielle C.; Fromm, Davida; Apple, Kristen; MacWhinney, Brian; Rensch, Amanda; Rowedder, Madyson – Journal of Speech, Language, and Hearing Research, 2022
Purpose: The aim of this study was to advance the use of structured, monologic discourse analysis by validating an automated scoring procedure for core lexicon (CoreLex) using transcripts. Method: Forty-nine transcripts from persons with aphasia and 48 transcripts from persons with no brain injury were retrieved from the AphasiaBank database. Five…
Descriptors: Validity, Discourse Analysis, Databases, Scoring