ERIC - Search Results

Showing all 5 results

Evaluation of Two Methods for Modeling Measurement Errors When Testing Interaction Effects with Observed Composite Scores

Peer reviewed

Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018

Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…

Descriptors: Error of Measurement, Testing, Scores, Models

Sensitivity of Achievement Estimation to Conditioning Model Misspecification

Peer reviewed

Rutkowski, Leslie – Applied Measurement in Education, 2014

Large-scale assessment programs such as the National Assessment of Educational Progress (NAEP), Trends in International Mathematics and Science Study (TIMSS), and Programme for International Student Assessment (PISA) use a sophisticated assessment administration design called matrix sampling that minimizes the testing burden on individual…

Descriptors: Error of Measurement, Testing, Item Sampling, Computation

Choosing among Tucker or Chained Linear Equating in Two Testing Situations: Rater Comparability Scoring and Randomly Equivalent Groups with an Anchor

Peer reviewed

Puhan, Gautam – Journal of Educational Measurement, 2012

Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…

Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis

Ramsay-Curve Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2011

Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…

Descriptors: Simulation, Item Response Theory, Testing, Questionnaires

Empirical Selection of Anchors for Tests of Differential Item Functioning

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2009

Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…

Descriptors: Test Results, Testing, Item Response Theory, Test Bias