Source: Educational Assessment
Showing 1 to 15 of 69 results
Peer reviewed
Daniel P. Jurich; Matthew J. Madison – Educational Assessment, 2023
Diagnostic classification models (DCMs) are psychometric models that provide probabilistic classifications of examinees on a set of discrete latent attributes. When analyzing or constructing assessments scored by DCMs, understanding how each item influences attribute classifications can clarify the meaning of the measured constructs, facilitate…
Descriptors: Test Items, Models, Classification, Influences
Peer reviewed
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
Peer reviewed
Haladyna, Thomas M.; Rodriguez, Michael C. – Educational Assessment, 2021
Full-information item analysis provides item developers and reviewers comprehensive empirical evidence of item quality, including option response frequency, point-biserial index (PBI) for distractors, mean-scores of respondents selecting each option, and option trace lines. The multi-serial index (MSI) is introduced as a more informative…
Descriptors: Test Items, Item Analysis, Reading Tests, Mathematics Tests
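The abstract above cites the point-biserial index (PBI) for distractors as part of full-information item analysis; the multi-serial index (MSI) itself is not defined in the snippet, so only the standard PBI computation is illustrated here. This is a minimal sketch (function name and toy data are illustrative, not from the article): the PBI for an option is the Pearson correlation between a 0/1 indicator of selecting that option and examinees' total scores.

```python
from math import sqrt

def option_point_biserial(selected, totals):
    """Point-biserial correlation between a 0/1 option-selection
    indicator and total test scores.  A large positive value is
    desirable for the keyed option; a negative value is desirable
    for a distractor (low scorers choose it more often)."""
    n = len(selected)
    mx = sum(selected) / n
    my = sum(totals) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(selected, totals))
    vx = sum((x - mx) ** 2 for x in selected)
    vy = sum((y - my) ** 2 for y in totals)
    return cov / sqrt(vx * vy)

# Toy data: high scorers pick the key, low scorers pick the distractor.
totals = [10, 9, 8, 3, 2, 1]
key_choices = [1, 1, 1, 0, 0, 0]        # strongly positive PBI
distractor_choices = [0, 0, 0, 1, 1, 1]  # strongly negative PBI
```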
Peer reviewed
Wind, Stefanie A.; Guo, Wenjing – Educational Assessment, 2021
Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and…
Descriptors: Scoring, Responses, Test Items, Test Format
Peer reviewed
Waterbury, Glenn Thomas; DeMars, Christine E. – Educational Assessment, 2021
Vertical scaling is used to put tests of different difficulty onto a common metric. The Rasch model is often used to perform vertical scaling, despite its strict functional form. Few, if any, studies have examined anchor item choice when using the Rasch model to vertically scale data that do not fit the model. The purpose of this study was to…
Descriptors: Test Items, Equated Scores, Item Response Theory, Scaling
Peer reviewed
Russell, Michael; Szendey, Olivia; Kaplan, Larry – Educational Assessment, 2021
Differential Item Functioning (DIF) analysis is commonly employed to examine potential bias produced by a test item. Since its introduction, DIF analyses have focused on potential bias related to broad categories of oppression, including gender, racial stratification, economic class, and ableness. More recently, efforts to examine the effects of…
Descriptors: Test Bias, Achievement Tests, Individual Characteristics, Disadvantaged
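The abstract does not name the DIF procedure used, so the following is a generic illustration rather than the authors' method: a common approach to uniform DIF is the Mantel-Haenszel common odds ratio, computed over 2x2 tables of correct/incorrect responses for reference and focal groups, stratified by total score. All data below are toy values.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio for one item.

    Each stratum (one per total-score level) is a 2x2 table
    (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    A ratio near 1 suggests no uniform DIF; values far from 1
    flag the item for review."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Toy strata in which both groups have identical odds of success,
# so the common odds ratio should be 1 (no DIF signal).
no_dif = [(20, 10, 10, 5), (8, 4, 6, 3)]
```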
Peer reviewed
Sparks, Jesse R.; van Rijn, Peter W.; Deane, Paul – Educational Assessment, 2021
Effectively evaluating the credibility and accuracy of multiple sources is critical for college readiness. We developed 24 source evaluation tasks spanning four predicted difficulty levels of a hypothesized learning progression (LP) and piloted these tasks to evaluate the utility of an LP-based approach to designing formative literacy assessments.…
Descriptors: Middle School Students, Information Sources, Grade 6, Grade 7
Peer reviewed
Bulut, Okan; Bulut, Hatice Cigdem; Cormier, Damien C.; Ilgun Dibek, Munevver; Sahin Kursad, Merve – Educational Assessment, 2023
Some statewide testing programs allow students to receive corrective feedback and revise their answers during testing. Despite its pedagogical benefits, the effects of providing revision opportunities remain unknown in the context of alternate assessments. Therefore, this study examined student data from a large-scale alternate assessment that…
Descriptors: Error Correction, Alternative Assessment, Feedback (Response), Multiple Choice Tests
Peer reviewed
Aydin, Utkun; Birgili, Bengi – Educational Assessment, 2023
Internationally, mathematics education reform has been directed toward characterizing educational goals that go beyond topic/content/skill descriptions and develop students' problem solving. The Revised Bloom's Taxonomy and MATH (Mathematical Assessment Task Hierarchy) Taxonomy characterize such goals. University entrance examinations have been…
Descriptors: Critical Thinking, Thinking Skills, Skill Development, Mathematics Instruction
Peer reviewed
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
Peer reviewed
Tracy Noble; Craig S. Wells; Ann S. Rosebery – Educational Assessment, 2023
This article reports on two quantitative studies of English learners' (ELs) interactions with constructed-response items from a Grade 5 state science test. Study 1 investigated the relationships between the constructed-response item-level variables of English Reading Demand, English Writing Demand, and Background Knowledge Demand and the…
Descriptors: Grade 5, State Standards, Standardized Tests, Science Tests
Peer reviewed
Moon, Jung Aa; Keehner, Madeleine; Katz, Irvin R. – Educational Assessment, 2020
We investigated how item formats influence test takers' response tendencies under uncertainty. Adult participants solved content-equivalent math items in three formats: multiple-selection multiple-choice, grid with forced-choice (true-false) options, and grid with non-forced-choice options. Participants showed a greater tendency to commit (rather…
Descriptors: College Students, Test Wiseness, Test Format, Test Items
Peer reviewed
Russell, Michael; Moncaleano, Sebastian – Educational Assessment, 2019
Over the past decade, large-scale testing programs have employed technology-enhanced items (TEI) to improve the fidelity with which an item measures a targeted construct. This paper presents findings from a review of released TEIs employed by large-scale testing programs worldwide. Analyses examine the prevalence with which different types of TEIs…
Descriptors: Computer Assisted Testing, Fidelity, Elementary Secondary Education, Test Items
Peer reviewed
Walker, A. Adrienne; Jennings, Jeremy Kyle; Engelhard, George, Jr. – Educational Assessment, 2018
Individual person fit analyses provide important information regarding the validity of test score inferences for an "individual" test taker. In this study, we use data from an undergraduate statistics test (N = 1135) to illustrate a two-step method that researchers and practitioners can use to examine individual person fit. First, person…
Descriptors: Test Items, Test Validity, Scores, Statistics
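The snippet describes a two-step person-fit method but is cut off before the statistics are named, so this is not the authors' procedure. As a generic illustration of the idea behind person fit, a simple index is the count of Guttman errors: item pairs where an examinee answers a harder item correctly but an easier item incorrectly (names and data below are illustrative).

```python
def guttman_errors(scored, p_values):
    """Count Guttman errors for one examinee.

    scored:   0/1 item scores for the examinee.
    p_values: proportion-correct difficulty of each item in the
              whole sample (higher p = easier item).
    Returns the number of item pairs in which the easier item is
    wrong but the harder item is right; large counts suggest an
    aberrant (poorly fitting) response pattern."""
    order = sorted(range(len(scored)), key=lambda i: -p_values[i])
    x = [scored[i] for i in order]  # responses, easiest item first
    return sum(
        1
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[i] == 0 and x[j] == 1
    )

# A perfect Guttman pattern (all easy items right, all hard ones
# wrong) produces zero errors; a reversed pattern produces many.
p = [0.9, 0.8, 0.7, 0.4, 0.3]
```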
Peer reviewed
Becker, Anthony; Nekrasova-Beker, Tatiana – Educational Assessment, 2018
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b)…
Descriptors: Reading Comprehension, Test Items, Reading Tests, Test Format