Wind, Stefanie A.; Lugu, Benjamin; Wang, Yurou – International Journal of Testing, 2025
Mokken Scale Analysis (MSA) is a nonparametric approach that offers exploratory tools for understanding the nature of item responses while emphasizing invariance requirements. MSA is often discussed as it relates to Rasch measurement theory, which also emphasizes invariance, but uses parametric models. Researchers who have compared and combined…
Descriptors: Item Response Theory, Scaling, Surveys, Evaluation Methods
Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…
Descriptors: Test Validity, Evaluation Methods, School Districts, Scores
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015
When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…
Descriptors: Competence, Tests, Evaluation Methods, Adults
van der Ark, L. Andries; Croon, Marcel A.; Sijtsma, Klaas – Psychometrika, 2008
Scalability coefficients play an important role in Mokken scale analysis. For a set of items, scalability coefficients have been defined for each pair of items, for each individual item, and for the entire scale. Hypothesis testing with respect to these scalability coefficients has not been fully developed. This study introduces marginal modelling…
Descriptors: Hypothesis Testing, Item Response Theory, Error of Measurement, Scaling
Zwick, Rebecca; Thayer, Dorothy T. – 1994
Several recent studies have investigated the application of statistical inference procedures to the analysis of differential item functioning (DIF) in test items that are scored on an ordinal scale. Mantel's extension of the Mantel-Haenszel test is a possible hypothesis-testing method for this purpose. The development of descriptive statistics for…
Descriptors: Error of Measurement, Evaluation Methods, Hypothesis Testing, Item Bias
Rudner, Lawrence M. – 1992
Several common sources of error in assessment that depends on the use of judges are identified, and ways to reduce the impact of rating errors are examined. Numerous threats to the validity of scores based on ratings exist. These threats include: (1) the halo effect; (2) stereotyping; (3) perception differences; (4) leniency/stringency error; and…
Descriptors: Alternative Assessment, Error of Measurement, Evaluation Methods, Evaluators
Cook, Linda L.; Petersen, Nancy S. – 1986
This paper examines how various equating methods are affected by: (1) sampling error; (2) sample characteristics; and (3) characteristics of anchor test items. It reviews empirical studies that investigated the invariance of equating transformations, and it discusses empirical and simulation studies that focus on how the properties of anchor tests…
Descriptors: Educational Research, Equated Scores, Error of Measurement, Evaluation Methods