Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 5 |
Descriptor
Source
Applied Psychological… | 2 |
Applied Measurement in… | 1 |
Educational and Psychological… | 1 |
Journal of Educational… | 1 |
Publication Type
Journal Articles | 5 |
Reports - Research | 4 |
Reports - Evaluative | 1 |
Education Level
Grade 8 | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Audience
Practitioners | 1 |
Location
Laws, Policies, & Programs
Assessments and Surveys
National Assessment of… | 1 |
Program for International… | 1 |
Progress in International… | 1 |
Trends in International… | 1 |
What Works Clearinghouse Rating
Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018
Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Descriptors: Error of Measurement, Testing, Scores, Models
Rutkowski, Leslie – Applied Measurement in Education, 2014
Large-scale assessment programs such as the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA) use a sophisticated assessment administration design called matrix sampling that minimizes the testing burden on individual…
Descriptors: Measurement, Testing, Item Sampling, Computation
Puhan, Gautam – Journal of Educational Measurement, 2012
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis
Woods, Carol M. – Applied Psychological Measurement, 2011
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
Descriptors: Simulation, Item Response Theory, Testing, Questionnaires
Woods, Carol M. – Applied Psychological Measurement, 2009
Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…
Descriptors: Test Results, Testing, Item Response Theory, Test Bias