Showing all 9 results
Peer reviewed
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Peer reviewed
Yamamoto, Kentaro; He, Qiwei; Shin, Hyo Jeong; von Davier, Mattias – ETS Research Report Series, 2017
Approximately a third of the Programme for International Student Assessment (PISA) items in the core domains (math, reading, and science) are constructed-response items and require human coding (scoring). This process is time-consuming, expensive, and prone to error as often (a) humans code inconsistently, and (b) coding reliability in…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Edward Paul Getman – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Smarter Balanced Assessment Consortium, 2016
The goal of this study was to gather comprehensive evidence about the alignment of the Smarter Balanced summative assessments to the Common Core State Standards (CCSS). Alignment of the Smarter Balanced summative assessments to the CCSS is a critical piece of evidence regarding the validity of inferences students, teachers and policy makers can…
Descriptors: Alignment (Education), Summative Evaluation, Common Core State Standards, Test Content
Peer reviewed
Crossley, Scott; Clevinger, Amanda; Kim, YouJin – Language Assessment Quarterly, 2014
There has been a growing interest in the use of integrated tasks in the field of second language testing to enhance the authenticity of language tests. However, the role of text integration in test takers' performance has not been widely investigated. The purpose of the current study is to examine the effects of text-based relational (i.e.,…
Descriptors: Language Proficiency, Connected Discourse, Language Tests, English (Second Language)
Peer reviewed
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Bennett, Randy Elliot; Rock, Donald A. – 1993
Formulating-Hypotheses (F-H) items present a situation and ask the examinee to generate as many explanations for it as possible. This study examined the generalizability, validity, and examinee perceptions of a computer-delivered version of the task. Eight F-H questions were administered to 192 graduate students. Half of the items restricted…
Descriptors: Computer Assisted Testing, Difficulty Level, Generalizability Theory, Graduate Students
Peer reviewed
McGhee, Debbie E.; Lowell, Nana – New Directions for Teaching and Learning, 2003
This study compares mean ratings, inter-rater reliabilities, and the factor structure of items for online and paper student-rating forms from the University of Washington's Instructional Assessment System. (Contains 3 figures and 2 tables.)
Descriptors: Psychometrics, Factor Structure, Student Evaluation of Teacher Performance, Test Items
Stansfield, Charles W., Ed. – 1986
This collection of essays on measurement theory and language testing includes: "Computerized Adaptive Testing: Implications for Language Test Developers" (Peter Tung); "The Promise and Threat of Computerized Adaptive Assessment of Reading Comprehension" (Michael Canale); "Computerized Rasch Analysis of Item Bias in ESL…
Descriptors: Chinese, Cloze Procedure, Computer Assisted Testing, Computer Software