ERIC - Search Results

Publication Date

In 2025	1
Since 2024	1
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	8
Since 2006 (last 20 years)	11

Descriptor

Computer Assisted Testing	14
Error of Measurement	14
Scoring	14
Interrater Reliability	5
Psychometrics	5
Adaptive Testing	4
Comparative Analysis	4
Correlation	4
Item Response Theory	4
Test Items	4
Automation	3
Diagnostic Tests	3
Language Tests	3
Reliability	3
Response Style (Tests)	3
Test Validity	3
Accuracy	2
At Risk Students	2
Children	2
Comparative Testing	2
English (Second Language)	2
Evaluation Methods	2
Evaluators	2
Goodness of Fit	2
Item Banks	2
More ▼

Source

ETS Research Report Series	2
British Educational Research…	1
CALICO Journal	1
Educational Measurement:…	1
Educational Testing Service	1
Educational and Psychological…	1
Grantee Submission	1
International Journal of…	1
Journal of Psychoeducational…	1
Language, Speech, and Hearing…	1
Perceptual and Motor Skills	1
ProQuest LLC	1
More ▼

Publication Type

Journal Articles	11
Reports - Research	11
Tests/Questionnaires	2
Dissertations/Theses -…	1
Guides - Non-Classroom	1
Reports - Descriptive	1

Education Level

Higher Education	2
Postsecondary Education	2
Elementary Education	1
Elementary Secondary Education	1

Audience

Practitioners

Location

China

Laws, Policies, & Programs

Assessments and Surveys

Rod and Frame Test	1
Test of English as a Foreign…	1

What Works Clearinghouse Rating

Showing all 14 results Save | Export

Grading Exams Using Large Language Models: A Comparison between Human and AI Grading of Exams in Higher Education Using ChatGPT

Peer reviewed

Direct link

Jonas Flodén – British Educational Research Journal, 2025

This study compares how the generative AI (GenAI) large language model (LLM) ChatGPT performs in grading university exams compared to human teachers. Aspects investigated include consistency, large discrepancies and length of answer. Implications for higher education, including the role of teachers and ethics, are also discussed. Three…

Descriptors: College Faculty, Artificial Intelligence, Comparative Testing, Scoring

Automated Essay Scoring Effect on Test Equating Errors in Mixed-Format Test

Peer reviewed
PDF on ERIC

Download full text

Uysal, Ibrahim; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021

Scoring constructed-response items can be highly difficult, time-consuming, and costly in practice. Improvements in computer technology have enabled automated scoring of constructed-response items. However, the application of automated scoring without an investigation of test equating can lead to serious problems. The goal of this study was to…

Descriptors: Computer Assisted Testing, Scoring, Item Response Theory, Test Format

Digital Module 18: Automated Scoring

Peer reviewed

Direct link

Lottridge, Sue; Burkhardt, Amy; Boyer, Michelle – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows…

Descriptors: Computer Assisted Testing, Scoring, Automation, Educational Assessment

Online Administration of the Test of Narrative Language--Second Edition: Psychometrics and Considerations for Remote Assessment

Peer reviewed
PDF on ERIC

Download full text

Direct link

Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022

Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…

Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments

Imputation Methods to Deal with Missing Responses in Computerized Adaptive Multistage Testing

Peer reviewed

Direct link

Cetin-Berber, Dee Duygu; Sari, Halil Ibrahim; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019

Routing examinees to modules based on their ability level is a very important aspect in computerized adaptive multistage testing. However, the presence of missing responses may complicate estimation of examinee ability, which may result in misrouting of individuals. Therefore, missing responses should be handled carefully. This study investigated…

Descriptors: Computer Assisted Testing, Adaptive Testing, Error of Measurement, Research Problems

Online Administration of the Test of Narrative Language--Second Edition: Psychometrics and Considerations for Remote Assessment

Peer reviewed
PDF on ERIC

Download full text

Direct link

Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022

Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments

A Fair Comparison of the Performance of Computerized Adaptive Testing and Multistage Adaptive Testing

Direct link

Wang, Keyin – ProQuest LLC, 2017

The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same…

Descriptors: Comparative Analysis, Computer Assisted Testing, Adaptive Testing, Test Items

Internet Administration of the Paper-and-Pencil Gifted Rating Scale: Assessing Psychometric Equivalence

Peer reviewed

Direct link

Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015

The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…

Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing

Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of "WriteToLearn"

Peer reviewed
PDF on ERIC

Download full text

Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016

This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

On-the-Fly Customization of Automated Essay Scoring. Research Report. ETS RR-07-42

Peer reviewed
PDF on ERIC

Download full text

Attali, Yigal – ETS Research Report Series, 2007

Because there is no commonly accepted view of what makes for good writing, automated essay scoring (AES) ideally should be able to accommodate different theoretical positions, certainly at the level of state standards but also perhaps among teachers at the classroom level. This paper presents a practical approach and an interactive computer…

Descriptors: Computer Assisted Testing, Automation, Essay Tests, Scoring

Scoring Rod-and-Frame Tests: Quantitative and Qualitative Considerations.

Peer reviewed

Haller, Otto; Edgington, Eugene S. – Perceptual and Motor Skills, 1982

Current scoring procedures depend on unrealistic assumptions about subjects' performance on the rod-and-frame test. A procedure is presented which corrects for constant error, is sensitive to response strategy and consistency, and examines qualitative and quantitative aspects of performance and individual differences in laterality bias as defined…

Descriptors: Computer Assisted Testing, Cues, Error of Measurement, Individual Differences

Tolerable Variation in Item Parameter Estimates for Linear and Adaptive Computer-Based Testing. Research Report No. 04-28

Download full text

Rizavi, Saba; Way, Walter D.; Davey, Tim; Herbert, Erin – Educational Testing Service, 2004

Item parameter estimates vary for a variety of reasons, including estimation error, characteristics of the examinee samples, and context effects (e.g., item location effects, section location effects, etc.). Although we expect variation based on theory, there is reason to believe that observed variation in item parameter estimates exceeds what…

Descriptors: Adaptive Testing, Test Items, Computation, Context Effect

Investigating the Utility of Analytic Scoring for the TOEFL Academic Speaking Test (TAST). TOEFL iBT Research Report. TOEFL iBT-01. ETS RR-06-07

Peer reviewed
PDF on ERIC

Download full text

Xi, Xiaoming; Mollaun, Pam – ETS Research Report Series, 2006

This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language

An Information Comparison of Conventional and Adaptive Tests in the Measurement of Classroom Achievement. Research Report 77-7.

Download full text

Bejar, Isaac I.; And Others – 1977

Information provided by typical and improved conventional classroom achievement tests was compared with information provided by an adaptive test covering the same subject matter. Both tests were administered to over 700 college students in a general biology course. Using the same scoring method, adaptive testing was found to yield substantially…

Descriptors: Academic Achievement, Achievement Tests, Adaptive Testing, Biology

Anna-Maria Fall	2
Beula M. Magimairaj	2
Greg Roberts	2
Philip Capin	2
Ronald B. Gillam	2
Sandra L. Gillam	2
Sharon Vaughn	2
Attali, Yigal	1
Bejar, Isaac I.	1
Boyer, Michelle	1
Burkhardt, Amy	1
Cetin-Berber, Dee Duygu	1
Davey, Tim	1
Dogan, Nuri	1
Edgington, Eugene S.	1
Haller, Otto	1
Herbert, Erin	1
Huggins-Manley, Anne Corinne	1
Jonas Flodén	1
Kunnan, Antony John	1
Liu, Sha	1
Lottridge, Sue	1
Mollaun, Pam	1
Pfeiffer, Steven I.	1
More ▼