Publication Date
In 2025 | 0
Since 2024 | 0
Since 2021 (last 5 years) | 12
Since 2016 (last 10 years) | 29
Since 2006 (last 20 years) | 57
Descriptor
Test Items | 69
Mathematics Tests | 25
Difficulty Level | 18
Scores | 18
Test Construction | 18
Science Tests | 15
Item Response Theory | 14
Test Validity | 12
Grade 4 | 11
Item Analysis | 11
Multiple Choice Tests | 11
Source
Educational Assessment | 69
Author
Lee, Hee-Sun | 3
Linn, Marcia C. | 3
Liu, Ou Lydia | 3
Russell, Michael | 3
Solano-Flores, Guillermo | 3
Briggs, Derek C. | 2
Bulut, Okan | 2
Cormier, Damien C. | 2
DeMars, Christine E. | 2
Huff, Kristen L. | 2
Katz, Irvin R. | 2
Publication Type
Journal Articles | 69
Reports - Research | 47
Reports - Evaluative | 17
Reports - Descriptive | 5
Tests/Questionnaires | 3
Information Analyses | 1
Speeches/Meeting Papers | 1
Education Level
Elementary Education | 21
Middle Schools | 20
Secondary Education | 16
Intermediate Grades | 15
Grade 4 | 14
Junior High Schools | 13
Elementary Secondary Education | 10
Grade 5 | 10
Grade 8 | 10
Grade 6 | 8
Grade 7 | 8
Location
California | 3
Massachusetts | 3
Washington | 3
Kansas | 2
Minnesota | 2
Oregon | 2
Turkey | 2
Alabama | 1
Canada | 1
Florida | 1
Georgia | 1
Laws, Policies, & Programs
Individuals with Disabilities… | 1
No Child Left Behind Act 2001 | 1
Assessments and Surveys
National Assessment of… | 3
Massachusetts Comprehensive… | 1
Motivated Strategies for… | 1
Trends in International… | 1
Washington Assessment of… | 1
Daniel P. Jurich; Matthew J. Madison – Educational Assessment, 2023
Diagnostic classification models (DCMs) are psychometric models that provide probabilistic classifications of examinees on a set of discrete latent attributes. When analyzing or constructing assessments scored by DCMs, understanding how each item influences attribute classifications can clarify the meaning of the measured constructs, facilitate…
Descriptors: Test Items, Models, Classification, Influences
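To make the classification step concrete, here is a minimal sketch of a DINA-type model, one common member of the DCM family (the Q-matrix, slip/guess parameters, and response vector below are invented for illustration and are not the authors' model):

```python
import numpy as np
from itertools import product

# Hypothetical DINA-model illustration: all values below are fabricated.
Q = np.array([[1, 0],   # item 1 requires attribute 1
              [0, 1],   # item 2 requires attribute 2
              [1, 1]])  # item 3 requires both attributes
slip = np.array([0.1, 0.1, 0.2])   # P(incorrect | all required attributes mastered)
guess = np.array([0.2, 0.2, 0.1])  # P(correct | some required attribute missing)

def profile_posterior(responses, Q, slip, guess):
    """Posterior over all 2^K attribute profiles, assuming a uniform prior."""
    K = Q.shape[1]
    profiles = np.array(list(product([0, 1], repeat=K)))
    post = []
    for alpha in profiles:
        # eta = 1 if the profile includes every attribute the item requires
        eta = np.all(Q <= alpha, axis=1).astype(float)
        p_correct = eta * (1 - slip) + (1 - eta) * guess
        lik = np.prod(np.where(responses == 1, p_correct, 1 - p_correct))
        post.append(lik)
    post = np.array(post)
    return profiles, post / post.sum()

responses = np.array([1, 0, 0])
profiles, post = profile_posterior(responses, Q, slip, guess)
# Marginal mastery probability for each attribute:
print(post @ profiles)  # [P(attr 1 mastered), P(attr 2 mastered)]
```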
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
Haladyna, Thomas M.; Rodriguez, Michael C. – Educational Assessment, 2021
Full-information item analysis provides item developers and reviewers comprehensive empirical evidence of item quality, including option response frequency, point-biserial index (PBI) for distractors, mean scores of respondents selecting each option, and option trace lines. The multi-serial index (MSI) is introduced as a more informative…
Descriptors: Test Items, Item Analysis, Reading Tests, Mathematics Tests
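As a concrete illustration of the option-level statistics the abstract names (not the authors' MSI, which the abstract only introduces), this sketch computes each option's selection frequency and its point-biserial correlation with the rest score, using fabricated response data:

```python
import numpy as np

# Fabricated multiple-choice data: 500 examinees, 10 items, 4 options each.
rng = np.random.default_rng(0)
n_examinees, n_items = 500, 10
keys = rng.integers(0, 4, n_items)                    # correct option per item
choices = rng.integers(0, 4, (n_examinees, n_items))  # selected options
scored = (choices == keys).astype(float)              # 0/1 item scores

def option_analysis(item, choices, scored):
    # Rest score: total score excluding the item under review.
    rest = scored.sum(axis=1) - scored[:, item]
    for opt in range(4):
        picked = (choices[:, item] == opt).astype(float)
        freq = picked.mean()
        # Point-biserial = Pearson correlation of the 0/1 choice indicator
        # with the rest score; negative values are expected for distractors.
        pbi = np.corrcoef(picked, rest)[0, 1]
        print(f"item {item}, option {opt}: freq={freq:.2f}, PBI={pbi:+.2f}")

option_analysis(0, choices, scored)
```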
Wind, Stefanie A.; Guo, Wenjing – Educational Assessment, 2021
Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and…
Descriptors: Scoring, Responses, Test Items, Test Format
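For readers unfamiliar with the routine checks the abstract mentions, here is a minimal sketch of two of them, exact agreement and quadratically weighted kappa, for two hypothetical raters scoring the same constructed responses on a 0-3 scale (all scores invented):

```python
import numpy as np

# Fabricated scores from two raters on ten constructed responses (0-3 scale).
rater_a = np.array([0, 1, 2, 3, 2, 1, 0, 3, 2, 2])
rater_b = np.array([0, 1, 3, 3, 2, 1, 1, 3, 2, 1])

exact = np.mean(rater_a == rater_b)  # proportion of identical scores

def weighted_kappa(a, b, n_cats=4):
    """Quadratically weighted kappa: chance-corrected, distance-penalized agreement."""
    obs = np.zeros((n_cats, n_cats))
    for i, j in zip(a, b):
        obs[i, j] += 1
    obs /= obs.sum()
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))  # expected under independence
    w = np.array([[(i - j) ** 2 for j in range(n_cats)] for i in range(n_cats)])
    return 1 - (w * obs).sum() / (w * exp).sum()

print(f"exact agreement = {exact:.2f}, weighted kappa = {weighted_kappa(rater_a, rater_b):.2f}")
```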
Waterbury, Glenn Thomas; DeMars, Christine E. – Educational Assessment, 2021
Vertical scaling is used to put tests of different difficulty onto a common metric. The Rasch model is often used to perform vertical scaling, despite its strict functional form. Few, if any, studies have examined anchor item choice when using the Rasch model to vertically scale data that do not fit the model. The purpose of this study was to…
Descriptors: Test Items, Equated Scores, Item Response Theory, Scaling
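Because the Rasch model fixes the item discrimination, linking two calibrations reduces to a single additive constant, which is why anchor item choice matters so much. A sketch of the common mean/mean anchor linking step, with invented difficulties:

```python
import numpy as np

# Illustrative numbers only, not from the study.
base_anchor_b = np.array([-0.50, 0.10, 0.80])   # anchor difficulties, base calibration
new_anchor_b = np.array([-1.10, -0.45, 0.20])   # same anchors, new-form calibration
new_form_b = np.array([-1.5, -0.8, 0.0, 0.6])   # remaining new-form item difficulties

# Mean/mean linking: shift the new form by the mean difficulty difference
# of the anchor set, placing it on the base metric.
shift = base_anchor_b.mean() - new_anchor_b.mean()
print(f"linking constant = {shift:+.3f}")
print("new-form difficulties on base metric:", new_form_b + shift)
```

Misfitting or unstable anchors distort this constant directly, since every non-anchor item inherits the shift.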
Russell, Michael; Szendey, Olivia; Kaplan, Larry – Educational Assessment, 2021
Differential Item Functioning (DIF) analysis is commonly employed to examine potential bias produced by a test item. Since its introduction, DIF analyses have focused on potential bias related to broad categories of oppression, including gender, racial stratification, economic class, and ableness. More recently, efforts to examine the effects of…
Descriptors: Test Bias, Achievement Tests, Individual Characteristics, Disadvantaged
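One standard DIF procedure (the abstract does not say which methods this study used) is the Mantel-Haenszel common odds ratio, computed across total-score strata; a sketch with fabricated counts:

```python
import numpy as np

# Fabricated 2x2 counts per total-score stratum:
# strata[k] = [[ref_right, ref_wrong], [focal_right, focal_wrong]]
strata = np.array([
    [[30, 20], [20, 25]],
    [[45, 15], [35, 20]],
    [[50, 10], [45, 12]],
], dtype=float)

num = den = 0.0
for (Rr, Rw), (Fr, Fw) in strata:
    n = Rr + Rw + Fr + Fw
    num += Rr * Fw / n
    den += Rw * Fr / n
alpha_mh = num / den                 # common odds ratio; 1.0 means no DIF
delta_mh = -2.35 * np.log(alpha_mh)  # ETS delta scale, used for A/B/C flagging
print(f"MH odds ratio = {alpha_mh:.2f}, MH D-DIF = {delta_mh:+.2f}")
```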
Sparks, Jesse R.; van Rijn, Peter W.; Deane, Paul – Educational Assessment, 2021
Effectively evaluating the credibility and accuracy of multiple sources is critical for college readiness. We developed 24 source evaluation tasks spanning four predicted difficulty levels of a hypothesized learning progression (LP) and piloted these tasks to evaluate the utility of an LP-based approach to designing formative literacy assessments.…
Descriptors: Middle School Students, Information Sources, Grade 6, Grade 7
Bulut, Okan; Bulut, Hatice Cigdem; Cormier, Damien C.; Ilgun Dibek, Munevver; Sahin Kursad, Merve – Educational Assessment, 2023
Some statewide testing programs allow students to receive corrective feedback and revise their answers during testing. Despite its pedagogical benefits, the effects of providing revision opportunities remain unknown in the context of alternate assessments. Therefore, this study examined student data from a large-scale alternate assessment that…
Descriptors: Error Correction, Alternative Assessment, Feedback (Response), Multiple Choice Tests
Aydin, Utkun; Birgili, Bengi – Educational Assessment, 2023
Internationally, mathematics education reform has been directed toward characterizing educational goals that go beyond topic/content/skill descriptions and develop students' problem solving. The Revised Bloom's Taxonomy and MATH (Mathematical Assessment Task Hierarchy) Taxonomy characterize such goals. University entrance examinations have been…
Descriptors: Critical Thinking, Thinking Skills, Skill Development, Mathematics Instruction
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
Tracy Noble; Craig S. Wells; Ann S. Rosebery – Educational Assessment, 2023
This article reports on two quantitative studies of English learners' (ELs) interactions with constructed-response items from a Grade 5 state science test. Study 1 investigated the relationships between the constructed-response item-level variables of English Reading Demand, English Writing Demand, and Background Knowledge Demand and the…
Descriptors: Grade 5, State Standards, Standardized Tests, Science Tests
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Assessment, 2020
We investigated how item formats influence test takers' response tendencies under uncertainty. Adult participants solved content-equivalent math items in three formats: multiple-selection multiple-choice, grid with forced-choice (true-false) options, and grid with non-forced-choice options. Participants showed a greater tendency to commit (rather…
Descriptors: College Students, Test Wiseness, Test Format, Test Items
Russell, Michael; Moncaleano, Sebastian – Educational Assessment, 2019
Over the past decade, large-scale testing programs have employed technology-enhanced items (TEIs) to improve the fidelity with which an item measures a targeted construct. This paper presents findings from a review of released TEIs employed by large-scale testing programs worldwide. Analyses examine the prevalence with which different types of TEIs…
Descriptors: Computer Assisted Testing, Fidelity, Elementary Secondary Education, Test Items
Walker, A. Adrienne; Jennings, Jeremy Kyle; Engelhard, George, Jr. – Educational Assessment, 2018
Individual person fit analyses provide important information regarding the validity of test score inferences for an "individual" test taker. In this study, we use data from an undergraduate statistics test (N = 1135) to illustrate a two-step method that researchers and practitioners can use to examine individual person fit. First, person…
Descriptors: Test Items, Test Validity, Scores, Statistics
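The abstract describes the two-step method only in part; for orientation, here is a sketch of one widely used person-fit index, the standardized log-likelihood l_z under the Rasch model (item difficulties and the response pattern are invented):

```python
import numpy as np

def lz(responses, theta, b):
    """Standardized log-likelihood person-fit statistic under the Rasch model."""
    p = 1 / (1 + np.exp(-(theta - b)))  # P(correct) for each item
    loglik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (loglik - expected) / np.sqrt(variance)

b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # item difficulties, easy to hard
aberrant = np.array([0, 0, 1, 1, 1])       # misses easy items, answers hard ones
print(f"l_z = {lz(aberrant, theta=0.0, b=b):.2f}")  # large negative => misfit
```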
Becker, Anthony; Nekrasova-Beker, Tatiana – Educational Assessment, 2018
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b)…
Descriptors: Reading Comprehension, Test Items, Reading Tests, Test Format