Publication Date

| Filter | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 0 |
| Since 2017 (last 10 years) | 3 |
| Since 2007 (last 20 years) | 8 |
Descriptor

| Filter | Records |
| --- | --- |
| Evaluation Methods | 8 |
| Sample Size | 8 |
| Test Validity | 8 |
| Error of Measurement | 3 |
| Item Analysis | 3 |
| Test Bias | 3 |
| Computation | 2 |
| Construct Validity | 2 |
| Equated Scores | 2 |
| Error Patterns | 2 |
| Factor Analysis | 2 |
Source

| Filter | Records |
| --- | --- |
| Applied Measurement in… | 1 |
| ETS Research Report Series | 1 |
| Educational and Psychological… | 1 |
| Measurement and Evaluation in… | 1 |
| Measurement:… | 1 |
| ProQuest LLC | 1 |
| Research Quarterly for… | 1 |
| Stanford Center for Education… | 1 |
Author

| Filter | Records |
| --- | --- |
| Garrett, Phyllis | 1 |
| Henson, Robert A. | 1 |
| Ho, Andrew D. | 1 |
| Kalogrides, Demetra | 1 |
| Lewis, Todd F. | 1 |
| Looney, Marilyn A. | 1 |
| Phillips, Gary W. | 1 |
| Reardon, Sean F. | 1 |
| Sessoms, John | 1 |
| Yoo, Jin Eun | 1 |
| Zwick, Rebecca | 1 |
Publication Type

| Filter | Records |
| --- | --- |
| Journal Articles | 6 |
| Reports - Research | 5 |
| Dissertations/Theses -… | 1 |
| Information Analyses | 1 |
| Reports - Descriptive | 1 |
| Reports - Evaluative | 1 |
Assessments and Surveys

| Filter | Records |
| --- | --- |
| National Assessment of… | 1 |
Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…
Descriptors: Test Validity, Evaluation Methods, School Districts, Scores
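As a toy illustration of the simplest form of score-scale linking (not the validation methods the authors propose), a mean-sigma linear link maps one scale onto another by matching means and standard deviations; all data below are simulated, not district scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical average scores on two state tests (illustrative only).
state_a = rng.normal(250, 20, size=500)                # state A's score scale
state_b = 0.8 * state_a + rng.normal(60, 8, size=500)  # correlated scale of state B

# Mean-sigma linear linking: map state A's scale onto state B's scale
# by matching means and standard deviations.
slope = state_b.std(ddof=1) / state_a.std(ddof=1)
intercept = state_b.mean() - slope * state_a.mean()
linked = intercept + slope * state_a

# Linked scores reproduce state B's mean and SD by construction.
print(round(float(linked.mean() - state_b.mean()), 6))
print(round(float(linked.std(ddof=1) - state_b.std(ddof=1)), 6))
```

Agreement in mean and SD holds here by construction; the article's point is that aggregate linkages need validation evidence beyond such distribution matching.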
Lewis, Todd F. – Measurement and Evaluation in Counseling and Development, 2017
American Educational Research Association (AERA) standards stipulate that researchers show evidence of the internal structure of instruments. Confirmatory factor analysis (CFA) is one structural equation modeling procedure designed to assess construct validity of assessments that has broad applicability for counselors interested in instrument…
Descriptors: Educational Research, Factor Analysis, Structural Equation Models, Construct Validity
Sessoms, John; Henson, Robert A. – Measurement: Interdisciplinary Research and Perspectives, 2018
Diagnostic classification models (DCMs) classify examinees based on the skills they have mastered given their test performance. This classification enables targeted feedback that can inform remedial instruction. Unfortunately, applications of DCMs have been criticized (e.g., for lacking validity support). Generally, these evaluations have been brief and…
Descriptors: Literature Reviews, Classification, Models, Criticism
Looney, Marilyn A. – Research Quarterly for Exercise and Sport, 2013
Given that equating/linking applications are now appearing in kinesiology literature, this article provides an overview of the different types of linked test scores: equated, concordant, and predicted. It also addresses the different types of evidence required to determine whether the scores from two different field tests (measuring the same…
Descriptors: Scores, Psychomotor Skills, Scoring, Measurement Techniques
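The distinction between the linked-score types the article surveys can be sketched numerically (simulated data, not from the article): an equating-style link preserves the score distribution, while a regression-based "predicted" link shrinks the spread toward the mean whenever the two tests are not perfectly correlated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical paired scores on two field tests of the same construct.
x = rng.normal(50, 10, 300)
y = 0.6 * x + rng.normal(20, 8, 300)

# Equating/concordance-style link: mean-sigma matching preserves y's spread.
a = y.std(ddof=1) / x.std(ddof=1)
equated = y.mean() + a * (x - x.mean())

# Predicted link: least-squares regression minimizes prediction error
# but shrinks the variance of the linked scores.
b = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
predicted = y.mean() + b * (x - x.mean())

# The regression slope equals the equating slope times the correlation,
# so predicted scores always have the smaller spread when r < 1.
print(equated.std(ddof=1) > predicted.std(ddof=1))  # True
```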
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
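The mechanism the abstract describes can be illustrated with Kish's approximate design effect for cluster samples; the cluster size, intraclass correlation, and sample size below are hypothetical, not taken from the article.

```python
import math

# Kish's approximate design effect for cluster sampling:
#   deff = 1 + (m - 1) * rho,
# where m is the average cluster size and rho the intraclass correlation.
m, rho = 25, 0.10   # e.g., 25 students per classroom, ICC of .10
n = 2000            # nominal sample size

deff = 1 + (m - 1) * rho
n_eff = n / deff    # effective sample size under clustering

# Ignoring the design effect understates the standard error
# by a factor of sqrt(deff).
se_inflation = math.sqrt(deff)
print(round(deff, 2))           # 3.4
print(round(n_eff, 1))          # 588.2
print(round(se_inflation, 2))   # 1.84
```

Even a modest ICC makes the effective sample a fraction of the nominal one, which is why unrecognized design effects lead to overstated precision.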
Zwick, Rebecca – ETS Research Report Series, 2012
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Descriptors: Test Bias, Sample Size, Bayesian Statistics, Evaluation Methods
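A minimal sketch of the Mantel-Haenszel statistic that underlies the flagging rules reviewed here: examinees are matched on total score, and a common odds ratio is pooled over score strata. The counts below are invented for illustration.

```python
import math

# Each stratum: (ref_right, ref_wrong, focal_right, focal_wrong)
# for one item, with examinees matched on total score.
strata = [
    (40, 10, 30, 20),
    (60, 15, 45, 25),
    (80, 10, 70, 15),
]

num = den = 0.0
for r1, r0, f1, f0 in strata:
    n_t = r1 + r0 + f1 + f0
    num += r1 * f0 / n_t   # reference-right, focal-wrong
    den += r0 * f1 / n_t   # reference-wrong, focal-right

alpha_mh = num / den                    # Mantel-Haenszel common odds ratio
mh_d_dif = -2.35 * math.log(alpha_mh)   # ETS delta metric; negative favors reference

print(round(alpha_mh, 2))   # 2.16
print(round(mh_d_dif, 2))   # -1.81
```

The ETS flagging rules the report examines classify items roughly by |MH D-DIF| (near 1.0 and 1.5 thresholds) combined with statistical significance; the exact conditions are part of what the project reviewed.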
Garrett, Phyllis – ProQuest LLC, 2009
The use of polytomous items in assessments has increased over the years, and as a result, the validity of these assessments has been a concern. Differential item functioning (DIF) and missing data are two factors that may adversely affect assessment validity. Both factors have been studied separately, but DIF and missing data are likely to occur…
Descriptors: Sample Size, Monte Carlo Methods, Test Validity, Effect Size
Yoo, Jin Eun – Educational and Psychological Measurement, 2009
This Monte Carlo study investigates the beneficial effect of including auxiliary variables during estimation of confirmatory factor analysis models with multiple imputation. Specifically, it examines the influence of sample size, missing rates, missingness mechanism combinations, missingness types (linear or convex), and the absence or presence…
Descriptors: Monte Carlo Methods, Research Methodology, Test Validity, Factor Analysis
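A simplified stand-in for the kind of Monte Carlo setup described (invented parameters, and without the CFA layer): when missingness is MAR through an auxiliary variable, complete-case estimates are biased, which is the motivation for carrying auxiliary variables into the imputation model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Auxiliary variable correlated with the analysis variable y.
aux = rng.normal(size=n)
y = 0.7 * aux + rng.normal(scale=0.714, size=n)  # Var(y) is roughly 1

# MAR missingness: y is more likely to be missing when aux is low,
# so missingness depends only on observed data, not on y itself.
miss = rng.random(n) < 1 / (1 + np.exp(2 * aux))
y_obs = y[~miss]

# Complete-case (listwise-deletion) mean is biased upward under this
# mechanism; the full-data mean is ~0.
print(round(float(y.mean()), 2))     # close to 0
print(round(float(y_obs.mean()), 2)) # noticeably above 0
```

An imputation model that includes `aux` can recover the lost information; one that omits it cannot, which is the "beneficial effect" the study quantifies.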