ERIC - Search Results

Showing 1 to 15 of 16 results

The Vulnerability of AI-Based Scoring Systems to Gaming Strategies: A Case Study

Peer reviewed

Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025

Recent developments in the use of large language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…

Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy

The Impact of Scoring Later on Mixed Format Adaptive Testing

Jing Ma – ProQuest LLC, 2024

This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…

Descriptors: Scoring, Adaptive Testing, Test Items, Classification

Automated Short Answer Scoring Using an Ensemble of Neural Networks and Latent Semantic Analysis Classifiers

Peer reviewed

Ormerod, Christopher; Lottridge, Susan; Harris, Amy E.; Patel, Milan; van Wamelen, Paul; Kodeswaran, Balaji; Woolf, Sharon; Young, Mackenzie – International Journal of Artificial Intelligence in Education, 2023

We introduce a short answer scoring engine made up of an ensemble of deep neural networks and a Latent Semantic Analysis-based model to score short constructed responses for a large suite of questions from a national assessment program. We evaluate the performance of the engine and show that the engine achieves above-human-level performance on a…

Descriptors: Computer Assisted Testing, Scoring, Artificial Intelligence, Semantics

Impact of Categorization and Scaling on Classification Agreement and Prediction Accuracy Statistics. Research Report. ETS RR-21-26

Peer reviewed

Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021

Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…

Descriptors: Classification, Scaling, Prediction, Accuracy

Identifying Enemy Item Pairs Using Natural Language Processing

Peer reviewed

Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022

Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…

Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring

Claim Detection and Relationship with Writing Quality

Peer reviewed

Wan, Qian; Crossley, Scott; Allen, Laura; McNamara, Danielle – Grantee Submission, 2020

In this paper, we extracted content-based and structure-based features of text to predict human annotations for claims and nonclaims in argumentative essays. We compared Logistic Regression, Bernoulli Naive Bayes, Gaussian Naive Bayes, Linear Support Vector Classification, Random Forest, and Neural Networks to train classification models. Random…

Descriptors: Persuasive Discourse, Essays, Writing Evaluation, Natural Language Processing

Predicting CEFR Levels in Learners of English: The Use of Microsystem Criterial Features in a Machine Learning Approach

Peer reviewed

Gaillat, Thomas; Simpkin, Andrew; Ballier, Nicolas; Stearns, Bernardo; Sousa, Annanda; Bouyé, Manon; Zarrouk, Manel – ReCALL, 2021

This paper focuses on automatically assessing language proficiency levels according to linguistic complexity in learner English. We implement a supervised learning approach as part of an automatic essay scoring system. The objective is to uncover Common European Framework of Reference for Languages (CEFR) criterial features in writings by learners…

Descriptors: Prediction, Rating Scales, English (Second Language), Second Language Learning

Home-Grown Automated Essay Scoring in the Literature Classroom: A Solution for Managing the Crowd?

Peer reviewed

Uzun, Kutay – Contemporary Educational Technology, 2018

Managing crowded classes in terms of classroom assessment is a difficult task due to the amount of time which needs to be devoted to providing feedback to student products. In this respect, the present study aimed to develop an automated essay scoring environment as a potential means to overcome this problem. Secondarily, the study aimed to test…

Descriptors: Computer Assisted Testing, Essays, Scoring, English Literature

Automated Scoring of L2 Spoken English with Random Forests

Peer reviewed

Kobayashi, Yuichiro; Abe, Mariko – Journal of Pan-Pacific Association of Applied Linguistics, 2016

The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and…

Descriptors: Second Language Learning, Computer Assisted Testing, Scoring, Automation

Comparison between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test

Peer reviewed

Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry – Educational and Psychological Measurement, 2012

This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…

Descriptors: Test Items, Computer Assisted Testing, Measures (Individuals), Scoring

Automated Essay Feedback Generation and Its Impact on Revision

Peer reviewed

Liu, Ming; Li, Yi; Xu, Weiwei; Liu, Li – IEEE Transactions on Learning Technologies, 2017

Writing an essay is a very important skill for students to master, but a difficult task for them to overcome. This is particularly true for English as a Second Language (ESL) students in China. It would be very useful if students could receive timely and effective feedback about their writing. Automatic essay feedback generation is a challenging task,…

Descriptors: Foreign Countries, College Students, Second Language Learning, English (Second Language)

A Comparison of Two Scoring Methods for an Automated Speech Scoring System

Peer reviewed

Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David – Language Testing, 2012

This paper compares two alternative scoring methods--multiple regression and classification trees--for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models…

Descriptors: Scoring, Classification, Weighted Scores, Comparative Analysis

Proceedings of the International Conference on Educational Data Mining (EDM) (10th, Wuhan, China, June 25-28, 2017)

Peer reviewed

Hu, Xiangen, Ed.; Barnes, Tiffany, Ed.; Hershkovitz, Arnon, Ed.; Paquette, Luc, Ed. – International Educational Data Mining Society, 2017

The 10th International Conference on Educational Data Mining (EDM 2017) is held under the auspices of the International Educational Data Mining Society at the Optics Valley Kingdom Plaza Hotel, Wuhan, Hubei Province, in China. This year's conference features two invited talks by: Dr. Jie Tang, Associate Professor with the Department of Computer…

Descriptors: Data Analysis, Data Collection, Graphs, Data Use

Toward an Understanding of the Role of Speech Recognition in Nonnative Speech Assessment. TOEFL iBT Research Report. TOEFL iBT-02. ETS RR-07-02

Peer reviewed

Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007

The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…

Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language

Computer-Based Assessment in E-Learning: A Framework for Constructing "Intermediate Constraint" Questions and Tasks for Technology Platforms

Peer reviewed

Scalise, Kathleen; Gifford, Bernard – Journal of Technology, Learning, and Assessment, 2006

Technology today offers many new opportunities for innovation in educational assessment through rich new assessment tasks and potentially powerful scoring, reporting and real-time feedback mechanisms. One potential limitation for realizing the benefits of computer-based assessment in both instructional assessment and large scale testing comes in…

Descriptors: Electronic Learning, Educational Assessment, Information Technology, Classification
