ERIC - Search Results

Showing all 7 results

Combining Mokken Scale Analysis with Rasch Measurement Theory to Explore Differences in Measurement Quality between Subgroups

Peer reviewed

Stefanie A. Wind; Benjamin Lugu; Yurou Wang – International Journal of Testing, 2025

Mokken Scale Analysis (MSA) is a nonparametric approach that offers exploratory tools for understanding the nature of item responses while emphasizing invariance requirements. MSA is often discussed as it relates to Rasch measurement theory, which also emphasizes invariance, but uses parametric models. Researchers who have compared and combined…

Descriptors: Item Response Theory, Scaling, Surveys, Evaluation Methods

Validation Methods for Aggregate-Level Test Scale Linking: A Case Study Mapping School District Test Score Distributions to a Common Scale. CEPA Working Paper No. 16-09

Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019

Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…

Descriptors: Test Validity, Evaluation Methods, School Districts, Scores

Taking the Missing Propensity into Account When Estimating Competence Scores: Evaluation of Item Response Theory Models for Nonignorable Omissions

Peer reviewed

Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015

When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…

Descriptors: Competence, Tests, Evaluation Methods, Adults

Mokken Scale Analysis for Dichotomous Items Using Marginal Models

Peer reviewed

van der Ark, L. Andries; Croon, Marcel A.; Sijtsma, Klaas – Psychometrika, 2008

Scalability coefficients play an important role in Mokken scale analysis. For a set of items, scalability coefficients have been defined for each pair of items, for each individual item, and for the entire scale. Hypothesis testing with respect to these scalability coefficients has not been fully developed. This study introduces marginal modelling…

Descriptors: Hypothesis Testing, Item Response Theory, Error of Measurement, Scaling

Evaluation of the Magnitude of Differential Item Functioning in Polytomous Items. Program Statistics Research Technical Report No. 94-2.

Zwick, Rebecca; Thayer, Dorothy T. – 1994

Several recent studies have investigated the application of statistical inference procedures to the analysis of differential item functioning (DIF) in test items that are scored on an ordinal scale. Mantel's extension of the Mantel-Haenszel test is a possible hypothesis-testing method for this purpose. The development of descriptive statistics for…

Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias

Reducing Errors Due to the Use of Judges. ERIC/TM Digest.

Rudner, Lawrence M. – 1992

Several common sources of error in assessment that depends on the use of judges are identified, and ways to reduce the impact of rating errors are examined. Numerous threats to the validity of scores based on ratings exist. These threats include: (1) the halo effect; (2) stereotyping; (3) perception differences; (4) leniency/stringency error; and…

Descriptors: Alternative Assessment, Error of Measurement, Evaluation Methods, Evaluators

Cook, Linda L.; Petersen, Nancy S. – 1986

This paper examines how various equating methods are affected by: (1) sampling error; (2) sample characteristics; and (3) characteristics of anchor test items. It reviews empirical studies that investigated the invariance of equating transformations, and it discusses empirical and simulation studies that focus on how the properties of anchor tests…

Descriptors: Educational Research, Equated Scores, Error of Measurement, Evaluation Methods