ERIC - Search Results

Showing 1 to 15 of 18 results

Ordinal Approaches to Decomposing Between-Group Test Score Disparities

Peer reviewed

Quinn, David M.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021

The estimation of test score "gaps" and gap trends plays an important role in monitoring educational inequality. Researchers decompose gaps and gap changes into within- and between-school portions to generate evidence on the role schools play in shaping these inequalities. However, existing decomposition methods assume an equal-interval…

Descriptors: Scores, Tests, Achievement Gap, Equal Education

Estimating Difference-Score Reliability in Pretest-Posttest Settings

Peer reviewed

Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021

Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…

Descriptors: Test Reliability, Scores, Pretests Posttests, Computation

Observed Scores as Matching Variables in Differential Item Functioning under the One- and Two-Parameter Logistic Models: Population Results. Research Report. ETS RR-19-06

Peer reviewed

Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019

We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…

Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory

Examining Differential Item Functioning: IRT-Based Detection in the Framework of Confirmatory Factor Analysis

Peer reviewed

Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017

This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.

Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods

Item Parameter Drift in a Time-Varying Predictor

Peer reviewed

Lee, HyeSun – Applied Measurement in Education, 2018

The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…

Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores

An Empirical Investigation of the Potential Impact of Item Misfit on Test Scores. Research Report. ETS RR-17-60

Peer reviewed

Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017

In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…

Descriptors: Scores, Test Items, Test Bias, International Assessment

Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Peer reviewed

Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

Descriptors: Test Bias, Test Reliability, Performance, Scores

Estimating Variance Components from Sparse Data Matrices in Large-Scale Educational Assessments

Peer reviewed

DeMars, Christine – Applied Measurement in Education, 2015

In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…

Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory

Rasch Mixture Models for DIF Detection: A Comparison of Old and New Score Specifications

Peer reviewed

Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015

Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…

Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores

An Examination of Construct Validity for the EARLI Numeracy Skill Measures

Peer reviewed

Cheng, Weiyi; Lei, Pui-Wa; DiPerna, James C. – Journal of Experimental Education, 2017

The purpose of the current study was to examine dimensionality and concurrent validity evidence of the EARLI numeracy measures (DiPerna, Morgan, & Lei, 2007), which were developed to assess key skills such as number identification, counting, and basic arithmetic. Two methods (NOHARM with approximate chi-square test and DIMTEST with DETECT…

Descriptors: Construct Validity, Numeracy, Mathematics Tests, Statistical Analysis

Test Review: Reynolds, C. R., Voress, J. V., Kamphaus, R. W. (2015), "Mathematics Fluency and Calculation Tests (MFaCTs) review." PRO-ED

Peer reviewed

Marbach, Joshua – Journal of Psychoeducational Assessment, 2017

The Mathematics Fluency and Calculation Tests (MFaCTs) are a series of measures designed to assess for arithmetic calculation skills and calculation fluency in children ages 6 through 18. There are five main purposes of the MFaCTs: (1) identifying students who are behind in basic math fact automaticity; (2) evaluating possible delays in arithmetic…

Descriptors: Mathematics Tests, Computation, Mathematics Skills, Arithmetic

Toward Increasing Fairness in Score Scale Calibrations Employed in International Large-Scale Assessments

Peer reviewed

Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014

In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…

Descriptors: Test Bias, Scores, International Programs, Educational Assessment

A Comparison of Methods for Estimating Conditional Item Score Differences in Differential Item Functioning (DIF) Assessments. Research Report. ETS RR-10-15

Moses, Tim; Miao, Jing; Dorans, Neil – Educational Testing Service, 2010

This study compared the accuracies of four differential item functioning (DIF) estimation methods, where each method makes use of only one of the following: raw data, logistic regression, loglinear models, or kernel smoothing. The major focus was on the estimation strategies' potential for estimating score-level, conditional DIF. A secondary focus…

Descriptors: Test Bias, Statistical Analysis, Computation, Scores

Summarizing Item Difficulty Variation with Parcel Scores

Peer reviewed

Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008

A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…

Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level

Evaluating the Effectiveness of a Full-Population Estimation Method. Research Report. ETS RR-08-18

Peer reviewed

Braun, Henry; Zhang, Jinming; Vezzu, Sailesh – ETS Research Report Series, 2008

At present, although the percentages of students with disabilities (SDs) and/or students who are English language learners (ELL) excluded from a NAEP administration are reported, no statistical adjustment is made for these excluded students in the calculation of NAEP results. However, the exclusion rates for both SD and ELL students vary…

Descriptors: Research Methodology, Computation, Disabilities, English Language Learners
