Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 6 |
Since 2016 (last 10 years) | 11 |
Since 2006 (last 20 years) | 15 |
Descriptor
Classification | 16 |
Computer Assisted Testing | 16 |
Scoring | 16 |
Accuracy | 6 |
Automation | 6 |
Essays | 6 |
Second Language Learning | 5 |
Artificial Intelligence | 4 |
Computer Software | 4 |
English (Second Language) | 4 |
Prediction | 4 |
More ▼ |
Source
Author
Zechner, Klaus | 2 |
Abe, Mariko | 1 |
Alex J. Mechaber | 1 |
Allen, Laura | 1 |
Ballier, Nicolas | 1 |
Barnes, Tiffany, Ed. | 1 |
Becker, Kirk A. | 1 |
Bejar, Isaac I. | 1 |
Bennett, Randy Elliot | 1 |
Bouyé, Manon | 1 |
Brian E. Clauser | 1 |
More ▼ |
Publication Type
Journal Articles | 12 |
Reports - Research | 10 |
Reports - Evaluative | 3 |
Collected Works - Proceedings | 1 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Higher Education | 3 |
Postsecondary Education | 2 |
Early Childhood Education | 1 |
Elementary Secondary Education | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 2 |
Graduate Record Examinations | 1 |
What Works Clearinghouse Rating
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, number and location of polytomous item. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Ormerod, Christopher; Lottridge, Susan; Harris, Amy E.; Patel, Milan; van Wamelen, Paul; Kodeswaran, Balaji; Woolf, Sharon; Young, Mackenzie – International Journal of Artificial Intelligence in Education, 2023
We introduce a short answer scoring engine made up of an ensemble of deep neural networks and a Latent Semantic Analysis-based model to score short constructed responses for a large suite of questions from a national assessment program. We evaluate the performance of the engine and show that the engine achieves above-human-level performance on a…
Descriptors: Computer Assisted Testing, Scoring, Artificial Intelligence, Semantics
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022
Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…
Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring
Wan, Qian; Crossley, Scott; Allen, Laura; McNamara, Danielle – Grantee Submission, 2020
In this paper, we extracted content-based and structure-based features of text to predict human annotations for claims and nonclaims in argumentative essays. We compared Logistic Regression, Bernoulli Naive Bayes, Gaussian Naive Bayes, Linear Support Vector Classification, Random Forest, and Neural Networks to train classification models. Random…
Descriptors: Persuasive Discourse, Essays, Writing Evaluation, Natural Language Processing
Gaillat, Thomas; Simpkin, Andrew; Ballier, Nicolas; Stearns, Bernardo; Sousa, Annanda; Bouyé, Manon; Zarrouk, Manel – ReCALL, 2021
This paper focuses on automatically assessing language proficiency levels according to linguistic complexity in learner English. We implement a supervised learning approach as part of an automatic essay scoring system. The objective is to uncover Common European Framework of Reference for Languages (CEFR) criterial features in writings by learners…
Descriptors: Prediction, Rating Scales, English (Second Language), Second Language Learning
Uzun, Kutay – Contemporary Educational Technology, 2018
Managing crowded classes in terms of classroom assessment is a difficult task due to the amount of time which needs to be devoted to providing feedback to student products. In this respect, the present study aimed to develop an automated essay scoring environment as a potential means to overcome this problem. Secondarily, the study aimed to test…
Descriptors: Computer Assisted Testing, Essays, Scoring, English Literature
Kobayashi, Yuichiro; Abe, Mariko – Journal of Pan-Pacific Association of Applied Linguistics, 2016
The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and…
Descriptors: Second Language Learning, Computer Assisted Testing, Scoring, Automation
Jiao, Hong; Liu, Junhui; Haynie, Kathleen; Woo, Ada; Gorham, Jerry – Educational and Psychological Measurement, 2012
This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test…
Descriptors: Test Items, Computer Assisted Testing, Measures (Individuals), Scoring
Liu, Ming; Li, Yi; Xu, Weiwei; Liu, Li – IEEE Transactions on Learning Technologies, 2017
Writing an essay is a very important skill for students to master, but a difficult task for them to overcome. It is particularly true for English as Second Language (ESL) students in China. It would be very useful if students could receive timely and effective feedback about their writing. Automatic essay feedback generation is a challenging task,…
Descriptors: Foreign Countries, College Students, Second Language Learning, English (Second Language)
Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David – Language Testing, 2012
This paper compares two alternative scoring methods--multiple regression and classification trees--for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models…
Descriptors: Scoring, Classification, Weighted Scores, Comparative Analysis
Hu, Xiangen, Ed.; Barnes, Tiffany, Ed.; Hershkovitz, Arnon, Ed.; Paquette, Luc, Ed. – International Educational Data Mining Society, 2017
The 10th International Conference on Educational Data Mining (EDM 2017) is held under the auspices of the International Educational Data Mining Society at the Optics Velley Kingdom Plaza Hotel, Wuhan, Hubei Province, in China. This years conference features two invited talks by: Dr. Jie Tang, Associate Professor with the Department of Computer…
Descriptors: Data Analysis, Data Collection, Graphs, Data Use
Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007
The increasing availability and performance of computer-based testing has prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring speaking prompts from the LanguEdge field test of 2002. We first…
Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language
Scalise, Kathleen; Gifford, Bernard – Journal of Technology, Learning, and Assessment, 2006
Technology today offers many new opportunities for innovation in educational assessment through rich new assessment tasks and potentially powerful scoring, reporting and real-time feedback mechanisms. One potential limitation for realizing the benefits of computer-based assessment in both instructional assessment and large scale testing comes in…
Descriptors: Electronic Learning, Educational Assessment, Information Technology, Classification
Previous Page | Next Page »
Pages: 1 | 2