Peter Baldwin; Victoria Yaneva; Kai North; Le An Ha; Yiyun Zhou; Alex J. Mechaber; Brian E. Clauser – Journal of Educational Measurement, 2025
Recent developments in the use of large-language models have led to substantial improvements in the accuracy of content-based automated scoring of free-text responses. The reported accuracy levels suggest that automated systems could have widespread applicability in assessment. However, before they are used in operational testing, other aspects of…
Descriptors: Artificial Intelligence, Scoring, Computational Linguistics, Accuracy
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Wallace N. Pinto Jr.; Jinnie Shin – Journal of Educational Measurement, 2025
In recent years, the application of explainability techniques to automated essay scoring and automated short-answer grading (ASAG) models, particularly those based on transformer architectures, has gained significant attention. However, the reliability and consistency of these techniques remain underexplored. This study systematically investigates…
Descriptors: Automation, Grading, Computer Assisted Testing, Scoring
Casabianca, Jodi M.; Donoghue, John R.; Shin, Hyo Jeong; Chao, Szu-Fu; Choi, Ikkyu – Journal of Educational Measurement, 2023
Using item response theory to model rater effects provides an alternative to standard performance metrics for rater monitoring and diagnosis. To fit such models, however, the ratings data must be sufficiently connected to estimate rater effects. Due to popular rating designs used in large-scale testing scenarios,…
Descriptors: Item Response Theory, Alternative Assessment, Evaluators, Research Problems
Dorsey, David W.; Michaels, Hillary R. – Journal of Educational Measurement, 2022
Advances in technology have dramatically improved our ability to create rich, complex, and effective assessments across a range of uses. Artificial Intelligence (AI) enabled assessments represent one such area of advancement--one that has captured our collective interest and imagination. Scientists and practitioners within the domains…
Descriptors: Validity, Ethics, Artificial Intelligence, Evaluation Methods
Shermis, Mark D.; Lottridge, Sue; Mayfield, Elijah – Journal of Educational Measurement, 2015
This study investigated the impact of anonymizing text on predicted scores made by two kinds of automated scoring engines: one that incorporates elements of natural language processing (NLP) and one that does not. Eight data sets (N = 22,029) were used to form both training and test sets in which the scoring engines had access to both text and…
Descriptors: Scoring, Essays, Computer Assisted Testing, Natural Language Processing

Davey, Tim; And Others – Journal of Educational Measurement, 1997
The development and scoring of a recently introduced computer-based writing skills test is described. The test asks the examinee to edit a writing passage presented on a computer screen. Scoring difficulties are addressed through the combined use of option weighting and the sequential probability ratio test. (SLD)
Descriptors: Computer Assisted Testing, Educational Innovation, Probability, Scoring
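The sequential probability ratio test mentioned in the abstract above classifies an examinee by accumulating a log-likelihood ratio over scored items until it crosses a decision boundary. The sketch below is an illustrative, generic SPRT for pass/fail classification, not the specific procedure Davey et al. used; the hypothesized per-item probabilities `p0` and `p1` are assumed inputs.

```python
import math

def sprt_decision(responses, p0, p1, alpha=0.05, beta=0.05):
    """Sequential probability ratio test for pass/fail classification.

    responses: list of 0/1 item scores.
    p0, p1: per-item probabilities of a correct response under the
            'fail' (H0) and 'pass' (H1) hypotheses, respectively.
    alpha, beta: tolerated Type I / Type II error rates.
    """
    lower = math.log(beta / (1 - alpha))   # accept H0 (fail) at or below this
    upper = math.log((1 - beta) / alpha)   # accept H1 (pass) at or above this
    llr = 0.0
    for x, q0, q1 in zip(responses, p0, p1):
        # Add this item's contribution to the log-likelihood ratio
        llr += math.log((q1 if x else 1 - q1) / (q0 if x else 1 - q0))
        if llr <= lower:
            return "fail"
        if llr >= upper:
            return "pass"
    return "undecided"  # item pool exhausted before a boundary was crossed
```

A consistently strong response pattern crosses the upper boundary after only a few items, which is why the SPRT pairs naturally with computer-based delivery: the test can stop as soon as a confident classification is reached.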

Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999
Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)
Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges

Thissen, David; And Others – Journal of Educational Measurement, 1989
An approach to scoring reading comprehension based on the concept of the testlet is described, using models developed for items in multiple categories. The model is illustrated using data from 3,866 examinees. Application of testlet scoring to multiple category models developed for individual items is discussed. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Response Theory, Mathematical Models

Clauser, Brian E.; Margolis, Melissa J.; Clyman, Stephen G.; Ross, Linette P. – Journal of Educational Measurement, 1997
Research on automated scoring is extended by comparing alternative automated systems for scoring a computer simulation of physicians' patient management skills. A regression-based system is more highly correlated with experts' evaluations than a system that uses complex rules to map performances into score levels, but both approaches are feasible.…
Descriptors: Algorithms, Automation, Comparative Analysis, Computer Assisted Testing
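A regression-based scoring system of the kind compared in the record above fits weights that map features of a performance onto expert holistic ratings. The sketch below is a minimal illustration under assumed data: the feature names (counts of beneficial, neutral, and risky actions) and the numbers are hypothetical, not taken from the study.

```python
import numpy as np

# Hypothetical features extracted from simulated patient-management
# performances (counts of beneficial, neutral, and risky actions),
# paired with expert holistic ratings. Illustrative data only.
X = np.array([[9, 2, 0],
              [6, 3, 1],
              [3, 4, 3],
              [1, 2, 5]], dtype=float)
expert_scores = np.array([9.0, 7.0, 4.5, 2.0])

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(len(X)), X])
weights, *_ = np.linalg.lstsq(A, expert_scores, rcond=None)

def score(features):
    """Predict an expert-style score from performance features."""
    return float(weights[0] + np.dot(weights[1:], features))
```

The rule-based alternative the study describes instead maps performances to score levels through hand-built logical rules; the regression approach trades that transparency for a closer fit to the experts' ratings.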

Bennett, Randy Elliot; Steffen, Manfred; Singley, Mark Kevin; Morley, Mary; Jacquemin, Daniel – Journal of Educational Measurement, 1997
Scoring accuracy and item functioning were studied for an open-ended response type test in which correct answers can take many different surface forms. Results with 1,864 graduate school applicants showed automated scoring to approximate the accuracy of multiple-choice scoring. Items functioned similarly to other item types being considered. (SLD)
Descriptors: Adaptive Testing, Automation, College Applicants, Computer Assisted Testing

Wainer, Howard; Lewis, Charles – Journal of Educational Measurement, 1990
Three different applications of the testlet concept are presented, and the psychometric models most suitable for each application are described. Difficulties that testlets can help overcome include (1) context effects; (2) item ordering; and (3) content balancing. Implications for test construction are discussed. (SLD)
Descriptors: Algorithms, Computer Assisted Testing, Elementary Secondary Education, Item Response Theory

Patience, Wayne – Journal of Educational Measurement, 1990
The four main subsystems of the MicroCAT Testing System for developing, administering, scoring, and analyzing computerized tests using conventional or item response theory methods are described. Judgments of three users of the system are included in the evaluation of this software. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Software, Computer Software Reviews

Braun, Henry I.; And Others – Journal of Educational Measurement, 1990
The accuracy with which expert systems (ESs) score a new nonmultiple-choice free-response test item was investigated, using 734 high school students who were administered an advanced-placement computer science examination. ESs produced scores for 82 percent to 95 percent of the responses and displayed high agreement with a human reader on the…
Descriptors: Advanced Placement, Computer Assisted Testing, Computer Science, Constructed Response