NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 11 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Jing Miao; Yi Cao; Michael E. Walker – ETS Research Report Series, 2024
Studies of test score comparability have been conducted at different stages in the history of testing to ensure that test results carry the same meaning regardless of test conditions. The expansion of at-home testing via remote proctoring sparked another round of interest. This study uses data from three licensure tests to assess potential mode…
Descriptors: Testing, Test Format, Computer Assisted Testing, Home Study
Peer reviewed Peer reviewed
PDF on ERIC Download full text
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Blair Lehman; Jesse R. Sparks; Jonathan Steinberg – ETS Research Report Series, 2024
Over the last 20 years, many methods have been proposed to use process data (e.g., response time) to detect changes in engagement during the test-taking process. However, many of these methods were developed and evaluated in highly similar testing contexts: 30 or more single-select multiple-choice items presented in a linear, fixed sequence in…
Descriptors: National Competency Tests, Secondary School Mathematics, Secondary School Students, Mathematics Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Carol Eckerly; Yue Jia; Paul Jewsbury – ETS Research Report Series, 2022
Testing programs have explored the use of technology-enhanced items alongside traditional item types (e.g., multiple-choice and constructed-response items) as measurement evidence of latent constructs modeled with item response theory (IRT). In this report, we discuss considerations in applying IRT models to a particular type of adaptive testlet…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Scoring
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Olivera-Aguilar, Margarita; Lee, Hee-Sun; Pallant, Amy; Belur, Vinetha; Mulholland, Matthew; Liu, Ou Lydia – ETS Research Report Series, 2022
This study uses a computerized formative assessment system that provides automated scoring and feedback to help students write scientific arguments in a climate change curriculum. We compared the effect of contextualized versus generic automated feedback on students' explanations of scientific claims and attributions of uncertainty to those…
Descriptors: Computer Assisted Testing, Formative Evaluation, Automation, Scoring
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Patrick Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Report Series, 2024
Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…
Descriptors: Assessment Literacy, Testing, Test Bias, Test Construction
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Choi, Ikkyu; Hao, Jiangang; Deane, Paul; Zhang, Mo – ETS Research Report Series, 2021
"Biometrics" are physical or behavioral human characteristics that can be used to identify a person. It is widely known that keystroke or typing dynamics for short, fixed texts (e.g., passwords) could serve as a behavioral biometric. In this study, we investigate whether keystroke data from essay responses can lead to a reliable…
Descriptors: Accuracy, High Stakes Tests, Writing Tests, Benchmarking
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Lopez, Alexis A.; Guzman-Orth, Danielle; Zapata-Rivera, Diego; Forsyth, Carolyn M.; Luce, Christine – ETS Research Report Series, 2021
Substantial progress has been made toward applying technology enhanced conversation-based assessments (CBAs) to measure the English-language proficiency of English learners (ELs). CBAs are conversation-based systems that use conversations among computer-animated agents and a test taker. We expanded the design and capability of prior…
Descriptors: Accuracy, English Language Learners, Language Proficiency, Language Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Kyle, Kristopher; Choe, Ann Tai; Eguchi, Masaki; LaFlair, Geoff; Ziegler, Nicole – ETS Research Report Series, 2021
A key piece of a validity argument for a language assessment tool is clear overlap between assessment tasks and the target language use (TLU) domain (i.e., the domain description inference). The TOEFL 2000 Spoken and Written Academic Language (T2K-SWAL) corpus, which represents a variety of academic registers and disciplines in traditional…
Descriptors: Comparative Analysis, Second Language Learning, English (Second Language), Language Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Owen, Nathaniel; Shrestha, Prithvi N.; Hultgren, Anna Kristina – ETS Research Report Series, 2021
This project examined academic reading in two contrasting English as a medium of instruction (EMI) university settings in Nepal and Sweden and the unique challenges facing students who are studying in a language other than their primary language. The motivation for the project was to explore the role of high-stakes testing in EMI contexts and the…
Descriptors: English for Academic Purposes, Reading Tests, Language of Instruction, Computer Assisted Testing