Showing all 11 results
Arneson, Amy – ProQuest LLC, 2019
This three-paper dissertation explores item cluster-based assessments, first in general as they relate to modeling, and then through specific issues surrounding a particular item cluster-based assessment. There should be a reasonable analogy between the structure of a psychometric model and the cognitive theory the assessment is based upon…
Descriptors: Item Response Theory, Test Items, Critical Thinking, Cognitive Tests
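For readers new to the topic, item cluster (testlet) models typically handle the dependence among items that share a common stimulus by adding a cluster-specific person effect to a standard IRT model. A minimal sketch in the spirit of the Bradlow, Wainer, and Wang testlet model (notation chosen here for illustration, not taken from the dissertation):

$$P(X_{ij}=1 \mid \theta_i) \;=\; \frac{\exp\!\left(\theta_i - b_j + \gamma_{i\,d(j)}\right)}{1+\exp\!\left(\theta_i - b_j + \gamma_{i\,d(j)}\right)}$$

Here $\theta_i$ is examinee ability, $b_j$ is item difficulty, and $\gamma_{i\,d(j)}$ is the examinee-specific effect of the cluster $d(j)$ containing item $j$; setting all $\gamma$ terms to zero recovers the Rasch model.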
Peer reviewed
Direct link
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
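As a concrete reference point for the nonequivalent groups (NEAT) anchor design whose variations the study compares, the sketch below implements Tucker linear equating, one standard NEAT method; the data and function names are illustrative, not taken from the paper.

```python
import numpy as np

def tucker_linear(x_p, v_p, y_q, v_q, w_p=0.5):
    """Tucker linear equating for the NEAT (anchor-test) design.

    x_p, v_p: new-form and anchor scores observed in population P
    y_q, v_q: old-form and anchor scores observed in population Q
    w_p: synthetic-population weight for P (w_q = 1 - w_p)
    Returns a function mapping form-X scores onto the form-Y scale.
    """
    w_q = 1.0 - w_p
    g_p = np.cov(x_p, v_p, ddof=0)[0, 1] / np.var(v_p)  # slope of X on V in P
    g_q = np.cov(y_q, v_q, ddof=0)[0, 1] / np.var(v_q)  # slope of Y on V in Q
    dv = v_p.mean() - v_q.mean()                         # anchor mean difference

    # synthetic-population moments (Kolen & Brennan's standard formulas)
    mu_x = x_p.mean() - w_q * g_p * dv
    mu_y = y_q.mean() + w_p * g_q * dv
    var_x = (x_p.var() - w_q * g_p**2 * (v_p.var() - v_q.var())
             + w_p * w_q * (g_p * dv)**2)
    var_y = (y_q.var() + w_p * g_q**2 * (v_p.var() - v_q.var())
             + w_p * w_q * (g_q * dv)**2)

    return lambda x: np.sqrt(var_y / var_x) * (x - mu_x) + mu_y

# illustrative use with simulated scores
rng = np.random.default_rng(0)
v1 = rng.binomial(20, 0.60, 2000).astype(float)
v2 = rng.binomial(20, 0.55, 2000).astype(float)
x = 2 * v1 + rng.normal(0, 4, 2000)
y = 2 * v2 + rng.normal(0, 4, 2000)
print(tucker_linear(x, v1, y, v2)(30.0))  # form-Y equivalent of an X score of 30
```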
Hagge, Sarah Lynn – ProQuest LLC, 2010
Mixed-format tests containing both multiple-choice and constructed-response items are widely used in educational testing. Such tests combine the broad content coverage and efficient scoring of multiple-choice items with the assessment of higher-order thinking skills thought to be provided by constructed-response items. However, the combination of…
Descriptors: Test Format, True Scores, Equated Scores, Psychometrics
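For context, equipercentile equating, one of the methods commonly examined in work on mixed-format equating, maps a form-X score to the form-Y score with the same percentile rank, $e_Y(x) = F_Y^{-1}(F_X(x))$. A bare single-group sketch of that idea (operational equating adds presmoothing and an anchor design):

```python
import numpy as np

def equipercentile(x_scores, y_scores, x):
    """Map a form-X score to the form-Y score with the same percentile rank."""
    pr = np.mean(np.asarray(x_scores) <= x)   # percentile rank of x on form X
    return np.quantile(y_scores, pr)          # form-Y score at that rank

# e.g., equipercentile(x_sample, y_sample, 42) for two observed-score samples
```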
Xu, Zeyu; Nichols, Austin – National Center for Analysis of Longitudinal Data in Education Research, 2010
The gold standard in making causal inference on program effects is a randomized trial. Most randomization designs in education randomize classrooms or schools rather than individual students. Such "clustered randomization" designs have one principal drawback: They tend to have limited statistical power or precision. This study aims to…
Descriptors: Test Format, Reading Tests, Norm Referenced Tests, Research Design
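The power cost of clustered randomization that the abstract mentions is conventionally summarized by the design effect $1 + (m-1)\rho$, where $m$ is the cluster size and $\rho$ the intraclass correlation. A small sketch with illustrative numbers (not the study's data):

```python
def design_effect(m, icc):
    """Variance inflation from randomizing intact clusters of size m."""
    return 1 + (m - 1) * icc

def effective_n(n_total, m, icc):
    """Number of independently randomized students a clustered sample is worth."""
    return n_total / design_effect(m, icc)

# e.g., 40 schools of 50 students with an intraclass correlation of 0.15:
# 1 + 49 * 0.15 = 8.35, so 2,000 students carry the information of ~240.
print(round(effective_n(2000, 50, 0.15)))
```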
Peer reviewed
Direct link
Frey, Andreas; Hartig, Johannes; Rupp, Andre A. – Educational Measurement: Issues and Practice, 2009
In most large-scale assessments of student achievement, several broad content domains are tested. Because more items are needed to cover the content domains than can be presented to each individual student in the limited testing time, multiple test forms or booklets are used to distribute the items among the students. The construction of an…
Descriptors: Measures (Individuals), Test Construction, Theory Practice Relationship, Design
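A common way to distribute item blocks across booklets is a balanced incomplete block design. The classic 7-block example below (the Fano plane) is a textbook illustration of the balance properties such designs provide, not necessarily one of the designs the authors discuss:

```python
from collections import Counter
from itertools import combinations

# A balanced incomplete block design: 7 item blocks spread over 7 booklets,
# 3 blocks per booklet, each block appearing in 3 booklets, and every pair
# of blocks sharing exactly one booklet (lambda = 1).
booklets = [
    (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
    (5, 6, 1), (6, 7, 2), (7, 1, 3),
]

# verify the balance property
pairs = Counter(frozenset(p) for b in booklets for p in combinations(b, 2))
assert len(pairs) == 21 and all(n == 1 for n in pairs.values())
```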
Zheng, Ying; Cheng, Liying; Klinger, Don A. – TESL Canada Journal, 2007
Large-scale testing in English affects second-language students not only strongly but also differently from first-language learners. The research literature reports that confounding factors in such testing, such as varying test formats, may differentially affect the performance of students from diverse backgrounds. An investigation of…
Descriptors: Reading Comprehension, Reading Tests, Test Format, Educational Testing
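A standard tool for checking whether an item or format differentially affects one group of students at the same ability level is the Mantel-Haenszel DIF statistic; a minimal sketch (the grouping and stratification variables are placeholders, not the study's):

```python
import numpy as np

def mantel_haenszel_dif(correct, group, strata):
    """Mantel-Haenszel common odds ratio for differential item functioning.

    correct: 0/1 scores on the studied item
    group:   0 = reference group, 1 = focal group
    strata:  ability-matching variable (e.g., banded total test score)
    Returns (alpha_MH, delta_MH) with delta on the ETS scale.
    """
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference, right
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference, wrong
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal, right
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal, wrong
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    alpha = num / den
    # |delta| beyond roughly 1.5 is conventionally flagged as large DIF
    return alpha, -2.35 * np.log(alpha)
```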
Peer reviewed
Direct link
Wiliam, Dylan – Review of Research in Education, 2010
The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…
Descriptors: Educational Assessment, Validity, Inferences, Construct Validity
Peer reviewed
PDF on ERIC
Scalise, Kathleen; Gifford, Bernard – Journal of Technology, Learning, and Assessment, 2006
Technology today offers many new opportunities for innovation in educational assessment through rich new assessment tasks and potentially powerful scoring, reporting and real-time feedback mechanisms. One potential limitation for realizing the benefits of computer-based assessment in both instructional assessment and large-scale testing comes in…
Descriptors: Electronic Learning, Educational Assessment, Information Technology, Classification
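The abstract points toward classifying computer-based item types. As one way to picture such a taxonomy in code, the sketch below orders item categories along the familiar selected-to-constructed response continuum; the category names echo that general continuum rather than reproducing the authors' framework, and the scoring rule is a hypothetical illustration:

```python
from enum import IntEnum

class ItemConstraint(IntEnum):
    """Item types ordered from fully selected to fully constructed responses."""
    MULTIPLE_CHOICE = 1          # fully selected
    SELECTION_IDENTIFICATION = 2
    REORDERING_REARRANGEMENT = 3
    SUBSTITUTION_CORRECTION = 4
    COMPLETION = 5
    CONSTRUCTION = 6
    PRESENTATION_PORTFOLIO = 7   # fully constructed

def easily_machine_scorable(item: ItemConstraint) -> bool:
    # Hypothetical rule of thumb: more constrained responses autoscore more easily.
    return item <= ItemConstraint.COMPLETION

print(easily_machine_scorable(ItemConstraint.REORDERING_REARRANGEMENT))  # True
```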
Peer reviewed
PDF on ERIC
Johnson, Martin; Green, Sylvia – Journal of Technology, Learning, and Assessment, 2006
The transition from paper-based to computer-based assessment raises a number of important issues about how mode might affect children's performance and question answering strategies. In this project 104 eleven-year-olds were given two sets of matched mathematics questions, one set on-line and the other on paper. Facility values were analyzed to…
Descriptors: Student Attitudes, Computer Assisted Testing, Program Effectiveness, Elementary School Students
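The facility values the study analyzes are simply proportions correct per item; a minimal sketch of the paired, matched-question comparison the design implies (the numbers are invented for illustration, not the study's data):

```python
import numpy as np

# paired facility values (proportion correct) for matched questions,
# one administration on screen and one on paper
paper  = np.array([0.82, 0.64, 0.71, 0.55])
online = np.array([0.79, 0.66, 0.65, 0.54])

diff = online - paper
print(diff.mean())       # average mode effect across matched items
print(diff.std(ddof=1))  # how much the effect varies from item to item
```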
Peer reviewed
PDF on ERIC
Ketterlin-Geller, Leanne R. – Journal of Technology, Learning, and Assessment, 2005
Universal design for assessment (UDA) is intended to increase participation of students with disabilities and English-language learners in general education assessments by addressing student needs through customized testing platforms. Computer-based testing provides an optimal format for creating individually tailored tests. However, although a…
Descriptors: Student Needs, Disabilities, Grade 3, Second Language Learning
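To make the idea of a customized testing platform concrete, the sketch below shows one hypothetical way per-student accommodation settings could be represented; the field names are assumptions for illustration, not drawn from the article:

```python
from dataclasses import dataclass

@dataclass
class TestProfile:
    """Hypothetical per-student settings a customized platform might store."""
    student_id: str
    read_aloud: bool = False          # audio presentation of item text
    large_print: bool = False
    extended_time_factor: float = 1.0
    popup_glossary: bool = False      # glossary support for English learners

profile = TestProfile("s-031", read_aloud=True, extended_time_factor=1.5)
print(profile)
```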
Peer reviewed
PDF on ERIC
Horkay, Nancy; Bennett, Randy Elliott; Allen, Nancy; Kaplan, Bruce; Yan, Fred – Journal of Technology, Learning, and Assessment, 2006
This study investigated the comparability of scores for paper and computer versions of a writing test administered to eighth grade students. Two essay prompts were given on paper to a nationally representative sample as part of the 2002 main NAEP writing assessment. The same two essay prompts were subsequently administered on computer to a second…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Program Effectiveness
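Score comparability across administration modes is often summarized with a standardized mean difference between the paper and computer samples; a minimal sketch with simulated scores (not NAEP data):

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference between two independent score samples."""
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * np.var(a, ddof=1) + (nb - 1) * np.var(b, ddof=1))
                     / (na + nb - 2))
    return (np.mean(a) - np.mean(b)) / pooled

# simulated essay scores for a paper sample and a computer sample
rng = np.random.default_rng(1)
paper_scores = rng.normal(3.0, 1.0, size=500)
computer_scores = rng.normal(2.9, 1.0, size=500)
print(cohens_d(computer_scores, paper_scores))
```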