Publication Date
In 2025 | 2 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 32 |
Since 2016 (last 10 years) | 78 |
Since 2006 (last 20 years) | 139 |
Descriptor
Classification | 179 |
Computer Assisted Testing | 179 |
Test Items | 42 |
Accuracy | 39 |
Adaptive Testing | 38 |
Foreign Countries | 38 |
Comparative Analysis | 30 |
Item Response Theory | 26 |
Probability | 26 |
Statistical Analysis | 25 |
Cutting Scores | 23 |
Author
Spray, Judith A. | 5 |
Kalohn, John C. | 4 |
Wang, Wen-Chung | 4 |
Bennett, Randy Elliot | 3 |
Kim, Jiseon | 3 |
Thompson, Nathan A. | 3 |
Barnes, Tiffany, Ed. | 2 |
Chung, Hyewon | 2 |
Deane, Paul | 2 |
Dodd, Barbara G. | 2 |
Huang, Chi-Yu | 2 |
Education Level
Higher Education | 32 |
Postsecondary Education | 22 |
Elementary Education | 19 |
Secondary Education | 17 |
Grade 3 | 16 |
Grade 4 | 16 |
Grade 8 | 14 |
Grade 6 | 13 |
Grade 7 | 13 |
Early Childhood Education | 12 |
Grade 5 | 12 |
Audience
Researchers | 3 |
Location
Texas | 6 |
Canada | 4 |
Florida | 4 |
United Kingdom | 4 |
China | 3 |
Germany | 3 |
Greece | 3 |
Israel | 3 |
Netherlands | 3 |
North Carolina | 3 |
Pennsylvania | 3 |
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Putnikovic, Marko; Jovanovic, Jelena – IEEE Transactions on Learning Technologies, 2023
Automatic grading of short answers is an important task in computer-assisted assessment (CAA). Recently, embeddings, as semantic-rich textual representations, have been increasingly used to represent short answers and predict the grade. Despite the recent trend of applying embeddings in automatic short answer grading (ASAG), there are no…
Descriptors: Automation, Computer Assisted Testing, Grading, Natural Language Processing
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, and numbers and locations of polytomous items. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025
Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…
Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment
Demir, Seda – Journal of Educational Technology and Online Learning, 2022
The purpose of this research was to evaluate the effect of item pool and selection algorithms on computerized classification testing (CCT) performance in terms of some classification evaluation metrics. For this purpose, 1000 examinees' response patterns using the R package were generated and eight item pools with 150, 300, 450, and 600 items…
Descriptors: Test Items, Item Banks, Mathematics, Computer Assisted Testing
Ormerod, Christopher; Lottridge, Susan; Harris, Amy E.; Patel, Milan; van Wamelen, Paul; Kodeswaran, Balaji; Woolf, Sharon; Young, Mackenzie – International Journal of Artificial Intelligence in Education, 2023
We introduce a short answer scoring engine made up of an ensemble of deep neural networks and a Latent Semantic Analysis-based model to score short constructed responses for a large suite of questions from a national assessment program. We evaluate the performance of the engine and show that the engine achieves above-human-level performance on a…
Descriptors: Computer Assisted Testing, Scoring, Artificial Intelligence, Semantics
Sebastian Moncaleano – ProQuest LLC, 2021
The growth of computer-based testing over the last two decades has motivated the creation of innovative item formats. It is often argued that technology-enhanced items (TEIs) provide better measurement of test-takers' knowledge, skills, and abilities by increasing the authenticity of tasks presented to test-takers (Sireci & Zenisky, 2006).…
Descriptors: Computer Assisted Testing, Test Format, Test Items, Classification
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022
Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…
Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring
Ifenthaler, Dirk; Sahin, Muhittin – Interactive Technology and Smart Education, 2023
Purpose: This study aims to focus on providing a computerized classification testing (CCT) system that can easily be embedded as a self-assessment feature into the existing legacy environment of a higher education institution, empowering students with self-assessments to monitor their learning progress and following strict data protection…
Descriptors: College Students, Classification, Self Evaluation (Individuals), Progress Monitoring
Sun, Bo; Zhu, Yunzong; Yao, Zeng; Xiao, Rong; Xiao, Yongkang; Wei, Yungang – IEEE Transactions on Learning Technologies, 2020
Reading comprehension tasks are commonly used for developing students' reading ability. In order to adaptively recommend reading comprehension materials to students engaged in computerized testing, the information in an item bank (a collection of test items stored in a dataset) must be effectively indexed. Familiarity with the topics present in…
Descriptors: Automation, Indexing, Item Banks, Classification
Zur, Amir; Applebaum, Isaac; Nardo, Jocelyn Elizabeth; DeWeese, Dory; Sundrani, Sameer; Salehi, Shima – International Educational Data Mining Society, 2023
Detailed learning objectives foster an effective and equitable learning environment by clarifying what instructors expect students to learn, rather than requiring students to use prior knowledge to infer these expectations. When questions are labeled with relevant learning goals, students understand which skills are tested by those questions.…
Descriptors: Equal Education, Prior Learning, Educational Objectives, Chemistry
Charalampos-S Charitsis – ProQuest LLC, 2023
The employment rate of software developers has risen significantly over the last 30 years. As a result, more students are considering computer science as a potential career path. Over the last 15 years, introductory programming course (CS1) enrollment has been increasing at a much faster rate than the increase in the number of CS faculty, with no…
Descriptors: Computer Science Education, Programming, Natural Language Processing, Computer Software
Wan, Qian; Crossley, Scott; Allen, Laura; McNamara, Danielle – Grantee Submission, 2020
In this paper, we extracted content-based and structure-based features of text to predict human annotations for claims and nonclaims in argumentative essays. We compared Logistic Regression, Bernoulli Naive Bayes, Gaussian Naive Bayes, Linear Support Vector Classification, Random Forest, and Neural Networks to train classification models. Random…
Descriptors: Persuasive Discourse, Essays, Writing Evaluation, Natural Language Processing
Luz, Yael; Yerushalmy, Michal – Journal for Research in Mathematics Education, 2023
We report on an innovative design of algorithmic analysis that supports automatic online assessment of students' exploration of geometry propositions in a dynamic geometry environment. We hypothesized that difficulties with and misuse of terms or logic in conjectures are rooted in the early exploration stages of inquiry. We developed a generic…
Descriptors: Algorithms, Computer Assisted Testing, Geometry, Mathematics Instruction