Quinn, David M.; Ho, Andrew D. – Journal of Educational and Behavioral Statistics, 2021
The estimation of test score "gaps" and gap trends plays an important role in monitoring educational inequality. Researchers decompose gaps and gap changes into within- and between-school portions to generate evidence on the role schools play in shaping these inequalities. However, existing decomposition methods assume an equal-interval…
Descriptors: Scores, Tests, Achievement Gap, Equal Education
Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021
Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…
Descriptors: Test Reliability, Scores, Pretests Posttests, Computation
Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019
We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…
Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory
Dimitrov, Dimiter M. – Measurement and Evaluation in Counseling and Development, 2017
This article offers an approach to examining differential item functioning (DIF) under its item response theory (IRT) treatment in the framework of confirmatory factor analysis (CFA). The approach is based on integrating IRT- and CFA-based testing of DIF and using bias-corrected bootstrap confidence intervals with a syntax code in Mplus.
Descriptors: Test Bias, Item Response Theory, Factor Analysis, Evaluation Methods
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores
Kim, Sooyeon; Robin, Frederic – ETS Research Report Series, 2017
In this study, we examined the potential impact of item misfit on the reported scores of an admission test from the subpopulation invariance perspective. The target population of the test consisted of 3 major subgroups with different geographic regions. We used the logistic regression function to estimate item parameters of the operational items…
Descriptors: Scores, Test Items, Test Bias, International Assessment
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
DeMars, Christine – Applied Measurement in Education, 2015
In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…
Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory
Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015
Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…
Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores
Cheng, Weiyi; Lei, Pui-Wa; DiPerna, James C. – Journal of Experimental Education, 2017
The purpose of the current study was to examine dimensionality and concurrent validity evidence of the EARLI numeracy measures (DiPerna, Morgan, & Lei, 2007), which were developed to assess key skills such as number identification, counting, and basic arithmetic. Two methods (NOHARM with approximate chi-square test and DIMTEST with DETECT…
Descriptors: Construct Validity, Numeracy, Mathematics Tests, Statistical Analysis
Marbach, Joshua – Journal of Psychoeducational Assessment, 2017
The Mathematics Fluency and Calculation Tests (MFaCTs) are a series of measures designed to assess for arithmetic calculation skills and calculation fluency in children ages 6 through 18. There are five main purposes of the MFaCTs: (1) identifying students who are behind in basic math fact automaticity; (2) evaluating possible delays in arithmetic…
Descriptors: Mathematics Tests, Computation, Mathematics Skills, Arithmetic
Oliveri, Maria Elena; von Davier, Matthias – International Journal of Testing, 2014
In this article, we investigate the creation of comparable score scales across countries in international assessments. We examine potential improvements to current score scale calibration procedures used in international large-scale assessments. Our approach seeks to improve fairness in scoring international large-scale assessments, which often…
Descriptors: Test Bias, Scores, International Programs, Educational Assessment
Moses, Tim; Miao, Jing; Dorans, Neil – Educational Testing Service, 2010
This study compared the accuracies of four differential item functioning (DIF) estimation methods, where each method makes use of only one of the following: raw data, logistic regression, loglinear models, or kernel smoothing. The major focus was on the estimation strategies' potential for estimating score-level, conditional DIF. A secondary focus…
Descriptors: Test Bias, Statistical Analysis, Computation, Scores
Camilli, Gregory; Prowker, Adam; Dossey, John A.; Lindquist, Mary M.; Chiu, Ting-Wei; Vargas, Sadako; de la Torre, Jimmy – Journal of Educational Measurement, 2008
A new method for analyzing differential item functioning is proposed to investigate the relative strengths and weaknesses of multiple groups of examinees. Accordingly, the notion of a conditional measure of difference between two groups (Reference and Focal) is generalized to a conditional variance. The objective of this article is to present and…
Descriptors: Test Bias, National Competency Tests, Grade 4, Difficulty Level
Braun, Henry; Zhang, Jinming; Vezzu, Sailesh – ETS Research Report Series, 2008
At present, although the percentages of students with disabilities (SDs) and/or students who are English language learners (ELL) excluded from a NAEP administration are reported, no statistical adjustment is made for these excluded students in the calculation of NAEP results. However, the exclusion rates for both SD and ELL students vary…
Descriptors: Research Methodology, Computation, Disabilities, English Language Learners