Publication Date
In 2025 | 2 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 28 |
Since 2016 (last 10 years) | 57 |
Since 2006 (last 20 years) | 113 |
Descriptor
Computer Assisted Testing | 128 |
Classification | 120 |
Foreign Countries | 33 |
Accuracy | 31 |
Test Items | 30 |
Adaptive Testing | 25 |
Comparative Analysis | 21 |
Second Language Learning | 21 |
Models | 20 |
Item Response Theory | 19 |
Evaluation Methods | 16 |
Author
Wang, Wen-Chung | 4 |
Thompson, Nathan A. | 3 |
Chung, Hyewon | 2 |
Deane, Paul | 2 |
Dodd, Barbara G. | 2 |
Huebner, Alan | 2 |
Kim, Jiseon | 2 |
Liu, Chen-Wei | 2 |
Park, Ryoungsun | 2 |
Spray, Judith A. | 2 |
Xi, Xiaoming | 2 |
Publication Type
Journal Articles | 128 |
Reports - Research | 84 |
Reports - Evaluative | 25 |
Reports - Descriptive | 15 |
Tests/Questionnaires | 4 |
Information Analyses | 3 |
Opinion Papers | 1 |
Reports - General | 1 |
Audience
Researchers | 1 |
Location
Canada | 4 |
China | 3 |
Taiwan | 3 |
Texas | 3 |
United Kingdom | 3 |
Florida | 2 |
Germany | 2 |
Greece | 2 |
Australia | 1 |
California (Los Angeles) | 1 |
Europe | 1 |
Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Putnikovic, Marko; Jovanovic, Jelena – IEEE Transactions on Learning Technologies, 2023
Automatic grading of short answers is an important task in computer-assisted assessment (CAA). Recently, embeddings, as semantic-rich textual representations, have been increasingly used to represent short answers and predict the grade. Despite the recent trend of applying embeddings in automatic short answer grading (ASAG), there are no…
Descriptors: Automation, Computer Assisted Testing, Grading, Natural Language Processing
Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025
Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…
Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment
Demir, Seda – Journal of Educational Technology and Online Learning, 2022
The purpose of this research was to evaluate the effect of item pool and selection algorithms on computerized classification testing (CCT) performance in terms of some classification evaluation metrics. For this purpose, 1000 examinees' response patterns using the R package were generated and eight item pools with 150, 300, 450, and 600 items…
Descriptors: Test Items, Item Banks, Mathematics, Computer Assisted Testing
Ormerod, Christopher; Lottridge, Susan; Harris, Amy E.; Patel, Milan; van Wamelen, Paul; Kodeswaran, Balaji; Woolf, Sharon; Young, Mackenzie – International Journal of Artificial Intelligence in Education, 2023
We introduce a short answer scoring engine made up of an ensemble of deep neural networks and a Latent Semantic Analysis-based model to score short constructed responses for a large suite of questions from a national assessment program. We evaluate the performance of the engine and show that the engine achieves above-human-level performance on a…
Descriptors: Computer Assisted Testing, Scoring, Artificial Intelligence, Semantics
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
Becker, Kirk A.; Kao, Shu-chuan – Journal of Applied Testing Technology, 2022
Natural Language Processing (NLP) offers methods for understanding and quantifying the similarity between written documents. Within the testing industry these methods have been used for automatic item generation, automated scoring of text and speech, modeling item characteristics, automatic question answering, machine translation, and automated…
Descriptors: Item Banks, Natural Language Processing, Computer Assisted Testing, Scoring
Ifenthaler, Dirk; Sahin, Muhittin – Interactive Technology and Smart Education, 2023
Purpose: This study aims to focus on providing a computerized classification testing (CCT) system that can easily be embedded as a self-assessment feature into the existing legacy environment of a higher education institution, empowering students with self-assessments to monitor their learning progress and following strict data protection…
Descriptors: College Students, Classification, Self Evaluation (Individuals), Progress Monitoring
Sun, Bo; Zhu, Yunzong; Yao, Zeng; Xiao, Rong; Xiao, Yongkang; Wei, Yungang – IEEE Transactions on Learning Technologies, 2020
Reading comprehension tasks are commonly used for developing students' reading ability. In order to adaptively recommend reading comprehension materials to students engaged in computerized testing, the information in an item bank (a collection of test items stored in a dataset) must be effectively indexed. Familiarity with the topics present in…
Descriptors: Automation, Indexing, Item Banks, Classification
Luz, Yael; Yerushalmy, Michal – Journal for Research in Mathematics Education, 2023
We report on an innovative design of algorithmic analysis that supports automatic online assessment of students' exploration of geometry propositions in a dynamic geometry environment. We hypothesized that difficulties with and misuse of terms or logic in conjectures are rooted in the early exploration stages of inquiry. We developed a generic…
Descriptors: Algorithms, Computer Assisted Testing, Geometry, Mathematics Instruction
Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing of the examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
Shukla, Vishakha; Long, Madeleine; Bhatia, Vrinda; Rubio-Fernandez, Paula – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2022
While most research on scalar implicature has focused on the lexical scale "some" vs "all," here we investigated an understudied scale formed by two syntactic constructions: categorizations (e.g., "Wilma is a nurse") and comparisons ("Wilma is like a nurse"). An experimental study by Rubio-Fernandez et al.…
Descriptors: Cues, Pragmatics, Comparative Analysis, Syntax
Mertens, Ute; Finn, Bridgid; Lindner, Marlit Annalena – Journal of Educational Psychology, 2022
Feedback is one of the most important factors for successful learning. Contemporary computer-based learning and testing environments allow the implementation of automated feedback in a simple and efficient manner. Previous meta-analyses suggest that different types of feedback are not equally effective. This heterogeneity might depend on learner…
Descriptors: Computer Assisted Testing, Feedback (Response), Electronic Learning, Network Analysis
Carioti, Desiré; Stucchi, Natale Adolfo; Toneatto, Carlo; Masia, Marta Franca; Del Monte, Milena; Stefanelli, Silvia; Travellini, Simona; Marcelli, Antonella; Tettamanti, Marco; Vernice, Mirta; Guasti, Maria Teresa; Berlingeri, Manuela – Annals of Dyslexia, 2023
In this study, we validated the "ReadFree tool", a computerised battery of 12 visual and auditory tasks developed to identify poor readers also in minority-language children (MLC). We tested the task-specific discriminant power on 142 Italian-monolingual participants (8-13 years old) divided into monolingual poor readers (N = 37) and…
Descriptors: Language Minorities, Task Analysis, Italian, Monolingualism
Fadillah, Sarah Meilani; Ha, Minsu; Nuraeni, Eni; Indriyanti, Nurma Yunita – Malaysian Journal of Learning and Instruction, 2023
Purpose: Researchers discovered that when students were given the opportunity to change their answers, a majority changed their responses from incorrect to correct, and this change often increased the overall test results. What prompts students to modify their answers? This study aims to examine the modification of scientific reasoning test, with…
Descriptors: Science Tests, Multiple Choice Tests, Test Items, Decision Making