Showing 1 to 15 of 65 results
Peer reviewed
Amirhossein Rasooli; Christopher DeLuca – Applied Measurement in Education, 2024
Inspired by recent 21st-century social and educational movements toward equity, diversity, and inclusion for disadvantaged groups, educational researchers have sought to conceptualize fairness in classroom assessment contexts. These efforts have produced promising theoretical foundations and empirical investigations examining fairness…
Descriptors: Test Bias, Student Evaluation, Social Justice, Equal Education
Peer reviewed
Chalmers, R. Philip; Zheng, Guoguo – Applied Measurement in Education, 2023
This article presents generalizations of the SIBTEST and crossing-SIBTEST statistics for differential item functioning (DIF) investigations involving more than two groups. After reviewing the original two-group setup for these statistics, a set of multigroup generalizations that support contrast matrices for joint tests of DIF is presented. To…
Descriptors: Test Bias, Test Items, Item Response Theory, Error of Measurement
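For readers unfamiliar with the statistic this article generalizes, here is a minimal sketch of a two-group SIBTEST-style effect estimate. It omits the regression correction the full procedure applies, and the function name and simplifications are illustrative, not from the article; the multigroup versions the authors propose replace this single reference-focal contrast with contrast matrices over several groups.

```python
import numpy as np

def sibtest_beta_uni(item, subtest, group):
    """Rough two-group SIBTEST-style effect: the focal-weighted mean
    difference on the studied item, conditioning on the matching
    (valid) subtest score. No regression correction applied.

    item    : 0/1 responses to the studied item
    subtest : integer scores on the matching subtest
    group   : 0 = reference, 1 = focal
    """
    item, subtest, group = map(np.asarray, (item, subtest, group))
    beta, n_focal = 0.0, (group == 1).sum()
    for k in np.unique(subtest):
        ref = item[(subtest == k) & (group == 0)]
        foc = item[(subtest == k) & (group == 1)]
        if len(ref) and len(foc):              # need both groups at level k
            beta += (len(foc) / n_focal) * (ref.mean() - foc.mean())
    return beta                                 # > 0 favors the reference group
```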
Peer reviewed
Yi-Hsin Chen – Applied Measurement in Education, 2024
This study aims to apply the differential item functioning (DIF) technique with the deterministic inputs, noisy "and" gate (DINA) model to validate the mathematics construct and diagnostic attribute profiles across American and Singaporean students. Even with the same ability level, every single item is expected to show uniform DIF…
Descriptors: Foreign Countries, Achievement Tests, Elementary Secondary Education, International Assessment
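As background, the DINA item response function underlying this study reduces to two parameters per item, slip and guess; a minimal sketch (variable names are illustrative). Uniform DIF in this framework amounts, roughly, to group-specific slip or guess parameters for an item.

```python
import numpy as np

def dina_prob(alpha, q, slip, guess):
    """P(correct) for one item under the DINA model.

    alpha : examinee attribute-mastery vector (0/1 per attribute)
    q     : the item's Q-matrix row (1 = attribute required)
    slip, guess : the item's s_j and g_j parameters
    """
    alpha, q = np.asarray(alpha), np.asarray(q)
    eta = np.all(alpha[q == 1] == 1)    # True iff all required attributes mastered
    return (1 - slip) if eta else guess

# A masters both required attributes; B lacks one:
print(dina_prob([1, 1, 0], [1, 1, 0], slip=0.1, guess=0.2))  # 0.9
print(dina_prob([1, 0, 0], [1, 1, 0], slip=0.1, guess=0.2))  # 0.2
```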
Peer reviewed
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed that for unidimensional scales, (a) all indices except omega h performed similarly well under most conditions; (b) alpha remained a good choice; (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
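Two of the compared indices are simple enough to state directly; a minimal sketch of coefficient alpha from raw scores and omega total from a one-factor solution (omega RT, omega h, GLB, and coefficient H require fuller factor-model machinery than shown here):

```python
import numpy as np

def cronbach_alpha(X):
    """Coefficient alpha from an (n_persons, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def omega_total(loadings, uniquenesses):
    """Omega total from a one-factor solution's loadings and uniquenesses:
    (sum of loadings)^2 / ((sum of loadings)^2 + sum of uniquenesses)."""
    lam, theta = np.asarray(loadings), np.asarray(uniquenesses)
    return lam.sum() ** 2 / (lam.sum() ** 2 + theta.sum())
```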
Peer reviewed
Rios, Joseph A. – Applied Measurement in Education, 2022
Testing programs are confronted with the decision of whether to report individual scores for examinees that have engaged in rapid guessing (RG). As noted by the "Standards for Educational and Psychological Testing," this decision should be based on a documented criterion that determines score exclusion. To this end, a number of heuristic…
Descriptors: Testing, Guessing (Tests), Academic Ability, Scores
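A minimal sketch of the kind of documented exclusion criterion the article discusses: flag responses faster than a per-item time threshold as rapid guesses, then exclude any score whose RG proportion exceeds a cutoff. The threshold rule and the 10% cutoff here are illustrative assumptions, not the article's recommendations.

```python
import numpy as np

def rg_exclusion(resp_times, thresholds, max_rg_rate=0.10):
    """Flag rapid-guessing (RG) responses and decide score exclusion.

    resp_times : (n_persons, n_items) response times in seconds
    thresholds : per-item RG thresholds (e.g., from a simple 3-second rule)
    max_rg_rate: exclude a score when the examinee's RG proportion
                 exceeds this cutoff (10% is assumed for illustration)
    """
    rg = np.asarray(resp_times) < np.asarray(thresholds)   # RG flags
    rg_rate = rg.mean(axis=1)                              # per-examinee proportion
    return rg_rate, rg_rate > max_rg_rate                  # True = exclude score
```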
Peer reviewed
Lions, Séverin; Monsalve, Carlos; Dartnell, Pablo; Blanco, María Paz; Ortega, Gabriel; Lemarié, Julie – Applied Measurement in Education, 2022
Multiple-choice tests are widely used in education, often for high-stakes assessment purposes. Consequently, these tests should be constructed following the highest standards. Many efforts have been undertaken to advance item-writing guidelines intended to improve tests. One important issue is the unwanted effects of the options' position on test…
Descriptors: Multiple Choice Tests, High Stakes Tests, Test Construction, Guidelines
Peer reviewed
Daniel Katz; Anne Corinne Huggins-Manley; Walter Leite – Applied Measurement in Education, 2022
According to the "Standards for Educational and Psychological Testing" (2014), one aspect of test fairness concerns examinees having comparable opportunities to learn prior to taking tests. Meanwhile, many researchers are developing platforms enhanced by artificial intelligence (AI) that can personalize curriculum to individual student…
Descriptors: High Stakes Tests, Test Bias, Testing Problems, Prior Learning
Peer reviewed
Mehrazmay, Roghayeh; Ghonsooly, Behzad; de la Torre, Jimmy – Applied Measurement in Education, 2021
The present study aims to examine gender differential item functioning (DIF) in the reading comprehension section of a high stakes test using cognitive diagnosis models. Based on the multiple-group generalized deterministic, noisy "and" gate (MG G-DINA) model, the Wald test and likelihood ratio test are used to detect DIF. The flagged…
Descriptors: Test Bias, College Entrance Examinations, Gender Differences, Reading Tests
Peer reviewed
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
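A crude illustration of screening for item parameter drift between two administrations. Real applications would compare proper Rasch difficulty estimates (e.g., CMLE or JMLE); here, centered log-odds of item p-values stand in as a rough proxy, and the 0.5-logit flagging cutoff is a common rule of thumb rather than a value from the article.

```python
import numpy as np

def drift_check(resp_t1, resp_t2, cutoff=0.5):
    """Flag items whose (proxy) difficulty shifts more than `cutoff`
    logits between two administrations of the same items.

    resp_t1, resp_t2 : (n_persons, n_items) 0/1 response matrices
    """
    def difficulties(X):
        p = np.asarray(X).mean(axis=0).clip(1e-6, 1 - 1e-6)
        b = np.log((1 - p) / p)        # higher = harder
        return b - b.mean()            # center to fix the scale origin
    shift = difficulties(resp_t2) - difficulties(resp_t1)
    return shift, np.abs(shift) > cutoff
```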
Peer reviewed
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Peer reviewed
Sachse, Karoline A.; Haag, Nicole – Applied Measurement in Education, 2017
Standard errors computed according to the operational practices of international large-scale assessment studies such as the Programme for International Student Assessment (PISA) or the Trends in International Mathematics and Science Study (TIMSS) may be biased when cross-national differential item functioning (DIF) and item parameter drift are…
Descriptors: Error of Measurement, Test Bias, International Assessment, Computation
Peer reviewed
Finch, W. Holmes – Applied Measurement in Education, 2016
Differential item functioning (DIF) assessment is a crucial component in test construction, serving as the primary way in which instrument developers ensure that measures perform in the same way for multiple groups within the population. When such is not the case, scores may not accurately reflect the trait of interest for all individuals in the…
Descriptors: Test Bias, Monte Carlo Methods, Comparative Analysis, Population Groups
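Among the classical DIF procedures that simulation studies like this one evaluate, the Mantel-Haenszel index is the easiest to sketch: the common odds ratio across levels of the matching score, with the usual ETS delta transformation.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """Mantel-Haenszel DIF index for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching criterion (e.g., total test score)
    group : 0 = reference, 1 = focal

    Returns the MH common odds ratio and the ETS delta value
    (|delta| >= 1.5 is the conventional cutoff for non-negligible DIF).
    """
    item, total, group = map(np.asarray, (item, total, group))
    num = den = 0.0
    for k in np.unique(total):
        at_k = total == k
        A = ((group == 0) & at_k & (item == 1)).sum()  # reference correct
        B = ((group == 0) & at_k & (item == 0)).sum()  # reference incorrect
        C = ((group == 1) & at_k & (item == 1)).sum()  # focal correct
        D = ((group == 1) & at_k & (item == 0)).sum()  # focal incorrect
        N = A + B + C + D
        if N:
            num += A * D / N
            den += B * C / N
    odds = num / den
    return odds, -2.35 * np.log(odds)
```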
Peer reviewed
Oliveri, Maria Elena; Ercikan, Kadriye; Lyons-Thomas, Juliette; Holtzman, Steven – Applied Measurement in Education, 2016
Differential item functioning (DIF) analyses have been used as the primary method in large-scale assessments to examine fairness for subgroups. Currently, DIF analyses are conducted with manifest methods, which group examinees by observed characteristics (gender and race/ethnicity). Homogeneity of item responses is assumed, denoting that…
Descriptors: Test Bias, Language Minorities, Effect Size, Foreign Countries
Peer reviewed
Wise, Steven L.; Gao, Lingyun – Applied Measurement in Education, 2017
There has been an increased interest in the impact of unmotivated test taking on test performance and score validity. This has led to the development of new ways of measuring test-taking effort based on item response time. In particular, Response Time Effort (RTE) has been shown to provide an assessment of effort down to the level of individual…
Descriptors: Test Bias, Computer Assisted Testing, Item Response Theory, Achievement Tests
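Response Time Effort itself is simple to compute once per-item time thresholds have been chosen: it is the proportion of an examinee's responses classified as solution behavior, i.e., the complement of the rapid-guessing rate sketched earlier. The threshold choice is left open here, as it varies by application.

```python
import numpy as np

def response_time_effort(resp_times, thresholds):
    """RTE: proportion of an examinee's responses with response time at
    or above the item's threshold (solution behavior). 1.0 = fully
    effortful test taking; lower values indicate more rapid guessing."""
    sb = np.asarray(resp_times) >= np.asarray(thresholds)
    return sb.mean(axis=1)   # one RTE value per examinee
```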
Peer reviewed
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores