Showing all 12 results
Peer reviewed
Direct link
Bush, Martin – Assessment & Evaluation in Higher Education, 2015
The humble multiple-choice test is very widely used within education at all levels, but its susceptibility to guesswork makes it a suboptimal assessment tool. The reliability of a multiple-choice test is partly governed by the number of items it contains; however, longer tests are more time-consuming to take, and for some subject areas, it can be…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Format, Test Reliability
Peer reviewed
Direct link
Beserra, Vagner; Nussbaum, Miguel; Grass, Antonio – Interactive Learning Environments, 2017
When using educational video games, particularly drill-and-practice video games, there are several ways of providing an answer to a quiz. The majority of paper-based options can be classified as being either multiple-choice or constructed-response. Therefore, in the process of creating an educational drill-and-practice video game, one fundamental…
Descriptors: Multiple Choice Tests, Drills (Practice), Educational Games, Video Games
Peer reviewed
Direct link
Kim, Ahyoung Alicia; Lee, Shinhye; Chapman, Mark; Wilmes, Carsten – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2019
This study aimed to investigate how Grade 1-2 English language learners (ELLs) differ in their performance on a writing test in two test modes: paper and online. Participants were 139 ELLs in the United States. They completed three writing tasks, representing three test modes: (1) a paper in which students completed their writing using a…
Descriptors: Elementary School Students, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Direct link
Stevenson, Claire E.; Heiser, Willem J.; Resing, Wilma C. M. – Journal of Psychoeducational Assessment, 2016
Multiple-choice (MC) analogy items are often used in cognitive assessment. However, in dynamic testing, where the aim is to provide insight into potential for learning and the learning process, constructed-response (CR) items may be of benefit. This study investigated whether training with CR or MC items leads to differences in the strategy…
Descriptors: Logical Thinking, Multiple Choice Tests, Test Items, Cognitive Tests
Peer reviewed
PDF on ERIC Download full text
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
Peer reviewed
PDF on ERIC Download full text
Kim, Sooyeon; Walker, Michael E. – ETS Research Report Series, 2009
We examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. We derived linking functions in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b) a mix of…
Descriptors: Test Format, Equated Scores, Test Items, Multiple Choice Tests
Peer reviewed
PDF on ERIC Download full text
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias
Peer reviewed
PDF on ERIC Download full text
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – ETS Research Report Series, 2008
This study examined variations of a nonequivalent groups equating design used with constructed-response (CR) tests to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, the study investigated the use of anchor CR item rescoring in the context of classical…
Descriptors: Equated Scores, Comparative Analysis, Test Format, Responses
Peer reviewed
Direct link
Kim, Seonghoon; Kolen, Michael J. – Applied Measurement in Education, 2006
Four item response theory linking methods (2 moment methods and 2 characteristic curve methods) were compared to concurrent (CO) calibration with the focus on the degree of robustness to format effects (FEs) when applying the methods to multidimensional data that reflected the FEs associated with mixed-format tests. Based on the quantification of…
Descriptors: Item Response Theory, Robustness (Statistics), Test Format, Comparative Analysis
Johanson, George A.; Gips, Crystal J. – 1993
The decision to use a forced-choice test item format versus an item format where choice is not forced (e.g., a Likert scale) might best be determined by the nature of the information sought, since the difficult decisions required by forced-choice formats may result in different scaling than an unforced method would. If a forced choice is desired,…
Descriptors: Administrator Attitudes, Comparative Analysis, Likert Scales, Principals
Peer reviewed
Jaeger, Richard M.; Wolf, Marian B. – Journal of Educational Measurement, 1982
The effectiveness of a Likert-scale format and three paired-choice presentation formats in discriminating among parents' preferences for curriculum elements was compared. Paired-choice formats gave more reliable discriminations, which increased with stimulus specificity. Similarities and differences in preference orderings are discussed. (Author/CM)
Descriptors: Comparative Analysis, Elementary Education, Parent Attitudes, Parent School Relationship
Park, Chung; Allen, Nancy L. – 1994
This study is part of continuing research into the meaning of future National Assessment of Educational Progress (NAEP) science scales. In this study, the test framework, as examined by NAEP's consensus process, and attributes of the items, identified by science experts, cognitive scientists, and measurement specialists, are examined. Preliminary…
Descriptors: Communication (Thought Transfer), Comparative Analysis, Construct Validity, Content Validity