Showing 1 to 15 of 38 results
Peer reviewed
Stefanie A. Wind; Yuan Ge – Measurement: Interdisciplinary Research and Perspectives, 2024
Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item types are combined to estimate examinee achievement, information about the psychometric quality of each component can depend on that of the other. For…
Descriptors: Interrater Reliability, Test Bias, Multiple Choice Tests, Responses
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Yarbro, Jeffrey T.; Olney, Andrew M. – Grantee Submission, 2021
This paper explores the concept of dynamically generating definitions using a deep-learning model. We do this by creating a dataset that contains definition entries and contexts associated with each definition. We then fine-tune a GPT-2 based model on the dataset to allow the model to generate contextual definitions. We evaluate our model with…
Descriptors: Definitions, Learning Processes, Models, Context Effect
Peer reviewed
Full text PDF on ERIC
Caspari-Sadeghi, Sima; Mille, Elena; Epperlein, Hella; Forster-Heinlein, Brigitte – Mathematics Teaching Research Journal, 2022
This collaborative action research highlights the need for developing students' evaluative competence and self-reflection by embedding self-and-peer assessment into online instruction. Over the course of a semester in an online master program in mathematics and computer sciences, students conducted research on assigned topics, held presentations,…
Descriptors: Graduate Students, Masters Programs, College Mathematics, Mathematics Education
Jay Schyler Raadt – ProQuest LLC, 2020
In response to concerns about using only standardized multiple-choice assessments, some school districts have moved to using alternative ratings of student achievement with authentic assessments. However, such assessments are often limited in terms of the psychometric validity data supporting their use. The present study mixed quantitative and…
Descriptors: Performance Based Assessment, Middle School Students, Scoring Rubrics, Content Validity
Peer reviewed
Full text PDF on ERIC
Kim, Kerry J.; Meir, Eli; Pope, Denise S.; Wendel, Daniel – Journal of Educational Data Mining, 2017
Computerized classification of student answers offers the possibility of instant feedback and improved learning. Open response (OR) questions provide greater insight into student thinking and understanding than more constrained multiple choice (MC) questions, but development of automated classifiers is more difficult, often requiring training a…
Descriptors: Classification, Computer Assisted Testing, Multiple Choice Tests, Test Format
Peer reviewed
Milner-Bolotin, Marina; Egersdorfer, Davor; Vinayagam, Murugan – Physical Review Physics Education Research, 2016
This paper describes the second year of a multi-year study on the implementation of Peer Instruction and PeerWise-inspired pedagogies in a physics methods course in a teacher education program at a large research university in Western Canada. In the first year of this study, Peer Instruction was implemented consistently in the physics methods…
Descriptors: Pedagogical Content Knowledge, Teacher Education Programs, Peer Teaching, Physics
Peer reviewed
Thompson, Andrew R.; O'Loughlin, Valerie D. – Anatomical Sciences Education, 2015
Bloom's taxonomy is a resource commonly used to assess the cognitive level associated with course assignments and examination questions. Although widely utilized in educational research, Bloom's taxonomy has received limited attention as an analytical tool in the anatomical sciences. Building on previous research, the Blooming Anatomy Tool (BAT)…
Descriptors: Anatomy, Classification, Scoring Rubrics, Multiple Choice Tests
Peer reviewed
Slepkov, Aaron D.; Shiell, Ralph C. – Physical Review Special Topics - Physics Education Research, 2014
Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…
Descriptors: Science Tests, Physics, Responses, Multiple Choice Tests
Peer reviewed
Kuo, Bor-Chen; Chen, Chun-Hua; Yang, Chih-Wei; Mok, Magdalena Mo Ching – Educational Psychology, 2016
Traditionally, teachers evaluate students' abilities via their total test scores. Recently, cognitive diagnostic models (CDMs) have begun to provide information about the presence or absence of students' skills or misconceptions. Nevertheless, CDMs are typically applied to tests with multiple-choice (MC) items, which provide less diagnostic…
Descriptors: Multiple Choice Tests, Responses, Test Items, Models
Peer reviewed
Prevost, Luanna B.; Smith, Michelle K.; Knight, Jennifer K. – CBE - Life Sciences Education, 2016
Previous work has shown that students have persistent difficulties in understanding how central dogma processes can be affected by a stop codon mutation. To explore these difficulties, we modified two multiple-choice questions from the Genetics Concept Assessment into three open-ended questions that asked students to write about how a stop codon…
Descriptors: Science Instruction, Genetics, Scientific Concepts, Scoring
Peer reviewed
Osa-Melero, Lucia – Hispania, 2012
Reading texts with historical and sociopolitical content in a foreign language is often a challenge for second language students. The obstacles encountered by students should be of concern to language instructors. Lack of background knowledge frequently causes the reader to abandon the reading activity with a sense of disappointment and…
Descriptors: Reading Comprehension, Textbooks, Multiple Choice Tests, Comparative Analysis
Peer reviewed
Full text PDF on ERIC
Ato, Manuel; Lopez, Juan Jose; Benavente, Ana – Psicologica: International Journal of Methodology and Experimental Psychology, 2011
A comparison between six rater agreement measures obtained using three different approaches was achieved by means of a simulation study. Rater coefficients suggested by Bennet's [sigma] (1954), Scott's [pi] (1955), Cohen's [kappa] (1960) and Gwet's [gamma] (2008) were selected to represent the classical, descriptive approach, [alpha] agreement…
Descriptors: Interrater Reliability, Measurement, Comparative Analysis, Statistical Analysis
Peer reviewed
Dankbaar, Mary E. W.; Alsma, Jelmer; Jansen, Els E. H.; van Merrienboer, Jeroen J. G.; van Saase, Jan L. C. M.; Schuit, Stephanie C. E. – Advances in Health Sciences Education, 2016
Simulation games are becoming increasingly popular in education, but more insight in their critical design features is needed. This study investigated the effects of fidelity of open patient cases in adjunct to an instructional e-module on students' cognitive skills and motivation. We set up a three-group randomized post-test-only design: a…
Descriptors: Experimental Groups, Thinking Skills, Computer Games, Motivation
Peer reviewed
Hsieh, Mingchuan – Language Assessment Quarterly, 2013
The Yes/No Angoff and Bookmark methods for setting standards on educational assessments are currently two of the most popular standard-setting methods. However, there is no research into the comparability of these two methods in the context of language assessment. This study compared results from the Yes/No Angoff and Bookmark methods as applied to…
Descriptors: Standard Setting (Scoring), Comparative Analysis, Language Tests, Multiple Choice Tests