Publication Date
- In 2025: 0
- Since 2024: 0
- Since 2021 (last 5 years): 2
- Since 2016 (last 10 years): 4
- Since 2006 (last 20 years): 10
Descriptor
- Sampling: 8
- Item Response Theory: 7
- Equated Scores: 5
- Computation: 4
- Accuracy: 3
- Bayesian Statistics: 2
- Goodness of Fit: 2
- Item Sampling: 2
- Measurement Techniques: 2
- Monte Carlo Methods: 2
- Reliability: 2
Source
- Journal of Educational…: 10
Author
- Albano, Anthony D.: 1
- Baldwin, Peter: 1
- Brennan, Robert L.: 1
- Castellano, Katherine E.: 1
- Combs, Adam: 1
- Jiang, Yanlin: 1
- Kane, Michael T.: 1
- Kim, Hyung Jin: 1
- Kim, Sooyeon: 1
- Lee, Won-Chan: 1
- Li, Deping: 1
Publication Type
- Journal Articles: 10
- Reports - Research: 5
- Reports - Evaluative: 4
- Reports - Descriptive: 1
Combs, Adam – Journal of Educational Measurement, 2023
A common method of checking person fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular l*_z statistic. A new Bayesian model-checking method has also been proposed, based on pivotal…
Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods
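The posterior-predictive person-fit idea in the abstract above can be sketched in a few lines. This is a generic illustration, not the paper's method: it assumes a Rasch model with known item difficulties, mock posterior draws of ability, and a sum-of-squared-standardized-residuals discrepancy; every name here is hypothetical.

```python
import numpy as np

def pp_person_fit(resp, b_items, theta_draws, rng):
    """Posterior-predictive p-value for one examinee: the fraction of
    posterior draws for which a replicated response vector is at least
    as discrepant as the observed one under a Rasch model."""
    def discrepancy(y, p):
        # Sum of squared standardized residuals across items.
        return np.sum((y - p) ** 2 / (p * (1 - p)))
    exceed = 0
    for theta in theta_draws:
        p = 1.0 / (1.0 + np.exp(-(theta - b_items)))     # Rasch success probabilities
        y_rep = (rng.random(len(b_items)) < p).astype(float)
        exceed += discrepancy(y_rep, p) >= discrepancy(resp, p)
    return exceed / len(theta_draws)

rng = np.random.default_rng(0)
b = np.linspace(-2.0, 2.0, 40)                # hypothetical item difficulties
draws = rng.normal(0.0, 0.3, 500)             # mock posterior draws of ability
normal = (rng.random(40) < 1 / (1 + np.exp(b))).astype(float)  # model-consistent examinee
aberrant = (b > 0).astype(float)              # wrong on easy items, right on hard ones
```

A model-consistent examinee should get a mid-range p-value, while the aberrant pattern (easy items wrong, hard items right) produces an extreme discrepancy and a p-value near zero.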
Castellano, Katherine E.; McCaffrey, Daniel F.; Lockwood, J. R. – Journal of Educational Measurement, 2023
The simple average of student growth scores is often used in accountability systems, but it can be problematic for decision making. When computed from a small or moderate number of students, it can be sensitive to the sample, resulting in inaccurate representations of student growth, low year-to-year stability, and inequities for…
Descriptors: Academic Achievement, Accountability, Decision Making, Computation
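The sample-size sensitivity this abstract describes is easy to demonstrate with a small simulation. A minimal sketch on a hypothetical growth-score distribution (not the authors' data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical population of student growth scores (mean 5, SD 10).
population = rng.normal(loc=5.0, scale=10.0, size=100_000)

def sd_of_mean(n, reps=2000):
    """Standard deviation of the simple-average growth score
    across repeated samples of n students."""
    samples = rng.choice(population, size=(reps, n))
    return samples.mean(axis=1).std()

for n in (10, 30, 100, 500):
    print(f"n={n:>3}: SD of the average = {sd_of_mean(n):.2f}")
```

The SD of the average shrinks roughly as 10/sqrt(n), so an average over 10 students is about seven times noisier than one over 500 — the instability and year-to-year churn the abstract points to.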
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2020
In equating, smoothing techniques are frequently used to diminish sampling error. There are typically two types of smoothing: presmoothing and postsmoothing. For polynomial log-linear presmoothing, an optimum smoothing degree can be determined statistically based on the Akaike information criterion or Chi-square difference criterion. For…
Descriptors: Equated Scores, Sampling, Error of Measurement, Statistical Analysis
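Polynomial log-linear presmoothing with statistically chosen degree, as described above, can be sketched generically. This is an illustration under stated assumptions (Poisson maximum likelihood via Newton-Raphson on hypothetical frequencies, AIC selection), not the authors' procedure; all names are made up.

```python
import numpy as np
from math import lgamma

def presmooth(freq, degree, iters=50):
    """Polynomial log-linear presmoothing of score frequencies:
    fit log(mu_j) = beta_0 + beta_1*x_j + ... + beta_d*x_j^d by
    Poisson maximum likelihood and return (smoothed freqs, AIC)."""
    x = np.arange(len(freq)) / (len(freq) - 1)        # scores rescaled to [0, 1]
    X = np.vander(x, degree + 1, increasing=True)     # basis 1, x, ..., x^d
    beta = np.zeros(degree + 1)
    beta[0] = np.log(freq.mean())                     # start from the constant model
    for _ in range(iters):                            # Newton-Raphson updates
        mu = np.exp(np.clip(X @ beta, -30, 30))
        hess = X.T @ (mu[:, None] * X) + 1e-8 * np.eye(degree + 1)
        beta = beta + np.linalg.solve(hess, X.T @ (freq - mu))
    mu = np.exp(np.clip(X @ beta, -30, 30))
    loglik = np.sum(freq * np.log(mu) - mu - np.array([lgamma(f + 1) for f in freq]))
    return mu, 2 * (degree + 1) - 2 * loglik          # AIC = 2k - 2*loglik

rng = np.random.default_rng(7)
shape = np.exp(-0.5 * ((np.arange(21) - 12) / 4.0) ** 2)      # bell-shaped true distribution
freq = rng.poisson(400 * shape / shape.sum()).astype(float)   # observed frequencies

fits = {d: presmooth(freq, d) for d in (2, 3, 4)}
best = min(fits, key=lambda d: fits[d][1])                    # smallest AIC wins
```

A useful sanity check: because the model includes an intercept, the Poisson MLE reproduces the observed total frequency, so the smoothed distribution preserves the sample size.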
Sinharay, Sandip – Journal of Educational Measurement, 2016
De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the [math equation unavailable] statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…
Descriptors: Sampling, Research Methodology, Error Patterns, Monte Carlo Methods
Albano, Anthony D. – Journal of Educational Measurement, 2015
Research on equating with small samples has shown that methods with stronger assumptions and fewer statistical estimates can lead to decreased error in the estimated equating function. This article introduces a new approach to linear observed-score equating, one which provides flexible control over how form difficulty is assumed versus estimated…
Descriptors: Equated Scores, Sample Size, Sampling, Statistical Inference
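The trade-off this abstract describes — stronger assumptions and fewer estimated parameters in exchange for less sampling error — is visible in the classical linear family. A hedged sketch on hypothetical random-groups data (not Albano's method itself):

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Linear observed-score equating: match form means and SDs,
    estimating two parameters (slope and intercept)."""
    slope = scores_y.std(ddof=1) / scores_x.std(ddof=1)
    return scores_y.mean() + slope * (x - scores_x.mean())

def mean_equate(x, scores_x, scores_y):
    """Mean equating: assume equal spread and estimate only a shift.
    Fewer estimates -> less sampling error, but a stronger assumption."""
    return x + scores_y.mean() - scores_x.mean()

rng = np.random.default_rng(1)
form_x = rng.normal(50.0, 10.0, 200)   # hypothetical form-X scores
form_y = rng.normal(52.0, 9.0, 200)    # hypothetical form-Y scores
```

Both functions send the form-X mean to the form-Y mean; they differ in whether the slope is estimated (linear) or assumed to be 1 (mean), which is exactly the assumed-versus-estimated dial the article generalizes.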
Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…
Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling
Li, Deping; Jiang, Yanlin; von Davier, Alina A. – Journal of Educational Measurement, 2012
This study investigates a sequence of item response theory (IRT) true score equatings based on various scale transformation approaches and evaluates equating accuracy and consistency over time. The results show that the biases and sample variances for the IRT true score equating (both direct and indirect) are quite small (except for the mean/sigma…
Descriptors: True Scores, Equated Scores, Item Response Theory, Accuracy
Baldwin, Peter – Journal of Educational Measurement, 2011
Growing interest in fully Bayesian item response models raises the question: to what extent can model parameter posterior draws enhance existing practices? One practice that has traditionally relied on model parameter point estimates but may be improved by using posterior draws is the development of a common metric for two independently calibrated…
Descriptors: Item Response Theory, Bayesian Statistics, Computation, Sampling
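Developing a common metric from posterior draws rather than point estimates can be sketched with mean/sigma linking. This is a generic illustration under assumptions of my own (mock posterior draws, a known shift/rescale between metrics), not Baldwin's procedure:

```python
import numpy as np

def mean_sigma_link(b_new, b_ref):
    """Mean/sigma linking: constants (A, B) placing new-form item
    difficulties on the reference metric via b* = A*b_new + B."""
    A = b_ref.std(ddof=1) / b_new.std(ddof=1)
    return A, b_ref.mean() - A * b_new.mean()

rng = np.random.default_rng(2)
true_b = rng.normal(0.0, 1.0, 25)                 # common-item difficulties (hypothetical)
# Mock posterior draws from two independent calibrations; the new
# calibration's metric is shifted and rescaled relative to the reference.
draws_ref = true_b + rng.normal(0.0, 0.10, (500, 25))
draws_new = 0.8 * true_b - 0.5 + rng.normal(0.0, 0.08, (500, 25))

# Point-estimate linking (posterior means) gives a single (A, B); linking
# each posterior draw gives a whole distribution of linking constants.
A_pt, B_pt = mean_sigma_link(draws_new.mean(axis=0), draws_ref.mean(axis=0))
AB = np.array([mean_sigma_link(dn, dr) for dn, dr in zip(draws_new, draws_ref)])
```

The draw-wise distribution of (A, B) is what posterior draws add over point estimates: the linking constants come with uncertainty attached instead of being treated as known.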
Kane, Michael T. – Journal of Educational Measurement, 2013
To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Descriptors: Test Interpretation, Validity, Scores, Test Use
Kim, Sooyeon; Livingston, Samuel A. – Journal of Educational Measurement, 2010
Score equating based on small samples of examinees is often inaccurate for the examinee populations. We conducted a series of resampling studies to investigate the accuracy of five methods of equating in a common-item design. The methods were chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating,…
Descriptors: Equated Scores, Test Items, Item Sampling, Item Response Theory