Showing 1 to 15 of 22 results
Peer reviewed
Combs, Adam – Journal of Educational Measurement, 2023
A common method of checking person-fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular l*[subscript z] statistic. A new Bayesian model checking method has also been proposed based on pivotal…
Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods
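A minimal plug-in sketch of the PP person-fit idea this abstract describes, using the standardized log-likelihood statistic l[subscript z] under a Rasch model. All item and person parameters are simulated, and a fixed ability estimate stands in for full posterior draws, which is a simplification of the Bayesian procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def lz(resp, p):
    """Standardized log-likelihood person-fit statistic l_z."""
    l0 = np.sum(resp * np.log(p) + (1 - resp) * np.log(1 - p))
    e = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    v = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - e) / np.sqrt(v)

b = rng.normal(0.0, 1.0, size=40)   # invented item difficulties
theta = 0.5                          # plug-in ability (simplification)
p = rasch_p(theta, b)

# A hypothetical aberrant examinee: response probabilities mismatched to the model
observed = (rng.random(40) < p[::-1]).astype(int)
lz_obs = lz(observed, p)

# Posterior-predictive-style reference distribution (plug-in version)
lz_rep = np.array([lz((rng.random(40) < p).astype(int), p)
                   for _ in range(2000)])
pp_p = np.mean(lz_rep <= lz_obs)    # small value flags misfit
print(round(pp_p, 3))
```

In the full PP method each replication would use a fresh posterior draw of theta and the item parameters rather than the single plug-in values used here.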
Peer reviewed
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2020
In equating, smoothing techniques are frequently used to diminish sampling error. There are typically two types of smoothing: presmoothing and postsmoothing. For polynomial log-linear presmoothing, an optimum smoothing degree can be determined statistically based on the Akaike information criterion or Chi-square difference criterion. For…
Descriptors: Equated Scores, Sampling, Error of Measurement, Statistical Analysis
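The AIC-based choice of presmoothing degree mentioned above can be sketched as follows. The score distribution, degree range, and scaling of the score variable are all invented for illustration; the log-linear model is fit by Poisson maximum likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(1)

# Hypothetical observed frequencies of total scores 0..20 (simulated)
K = 20
scores = np.clip(np.round(rng.normal(12, 3, size=500)), 0, K).astype(int)
freq = np.bincount(scores, minlength=K + 1).astype(float)
x = np.arange(K + 1, dtype=float)

def fit_loglinear(freq, degree):
    """Fit log m_j = beta0 + beta1*z_j + ... + beta_d*z_j^d by Poisson ML."""
    z = (x - K / 2) / K                        # scaled score for stability
    X = np.vander(z, degree + 1, increasing=True)
    def negll(beta):
        eta = np.clip(X @ beta, -30, 30)
        return np.exp(eta).sum() - (freq * eta).sum()
    beta0 = np.zeros(degree + 1)
    beta0[0] = np.log(freq.mean() + 0.5)
    res = minimize(negll, beta0, method="BFGS")
    ll = -res.fun - gammaln(freq + 1.0).sum()  # full Poisson log-likelihood
    return res.x, ll

# AIC = 2k - 2 logL; smaller is better
aic = {d: 2 * (d + 1) - 2 * fit_loglinear(freq, d)[1] for d in range(2, 7)}
best = min(aic, key=aic.get)
print("degree chosen by AIC:", best)
```

A chi-square difference test between successive degrees, the other criterion the abstract names, would compare 2(logL_d+1 - logL_d) against a chi-square with one degree of freedom.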
Peer reviewed
Albano, Anthony D. – Journal of Educational Measurement, 2015
Research on equating with small samples has shown that methods with stronger assumptions and fewer statistical estimates can lead to decreased error in the estimated equating function. This article introduces a new approach to linear observed-score equating, one which provides flexible control over how form difficulty is assumed versus estimated…
Descriptors: Equated Scores, Sample Size, Sampling, Statistical Inference
Peer reviewed
Li, Deping; Jiang, Yanlin; von Davier, Alina A. – Journal of Educational Measurement, 2012
This study investigates a sequence of item response theory (IRT) true score equatings based on various scale transformation approaches and evaluates equating accuracy and consistency over time. The results show that the biases and sample variances for the IRT true score equating (both direct and indirect) are quite small (except for the mean/sigma…
Descriptors: True Scores, Equated Scores, Item Response Theory, Accuracy
Peer reviewed
Kim, Sooyeon; Livingston, Samuel A. – Journal of Educational Measurement, 2010
Score equating based on small samples of examinees is often inaccurate for the examinee populations. We conducted a series of resampling studies to investigate the accuracy of five methods of equating in a common-item design. The methods were chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating,…
Descriptors: Equated Scores, Test Items, Item Sampling, Item Response Theory
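The chained linear method in this comparison composes two linear linkings through the common (anchor) items: form X to anchor V in the first group, then V to form Y in the second. A toy sketch with simulated common-item data (all distributions invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated common-item (NEAT) data: each group takes its own form plus anchor V
n = 200
theta1 = rng.normal(0.0, 1.0, n)              # group 1 ability
theta2 = rng.normal(0.3, 1.0, n)              # group 2 slightly more able
x  = 30 + 5 * theta1 + rng.normal(0, 2, n)    # form X scores (group 1)
v1 = 15 + 3 * theta1 + rng.normal(0, 1, n)    # anchor scores, group 1
y  = 28 + 5 * theta2 + rng.normal(0, 2, n)    # form Y scores (group 2)
v2 = 15 + 3 * theta2 + rng.normal(0, 1, n)    # anchor scores, group 2

def linear_link(mu_from, sd_from, mu_to, sd_to):
    """Linear linking: match means and standard deviations."""
    return lambda s: mu_to + (sd_to / sd_from) * (s - mu_from)

# Chain X -> V (group 1 moments) then V -> Y (group 2 moments)
x_to_v = linear_link(x.mean(), x.std(), v1.mean(), v1.std())
v_to_y = linear_link(v2.mean(), v2.std(), y.mean(), y.std())
chained = lambda s: v_to_y(x_to_v(s))

print("X score 35 equated to Y scale:", round(chained(35.0), 2))
```

Chained mean equating is the same composition with both slopes fixed at 1, which is one way methods with "stronger assumptions and fewer statistical estimates" reduce sampling error in small samples.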
Peer reviewed
Lee, Guemin; Fitzpatrick, Anne R. – Journal of Educational Measurement, 2003
Studied three procedures for estimating the standard errors of school passing rates using a generalizability theory model and considered the effects of student sample size. Results show that procedures differ in terms of assumptions about the populations from which students were sampled, and student sample size was found to have a large effect on…
Descriptors: Error of Measurement, Estimation (Mathematics), Generalizability Theory, Sampling
Peer reviewed
Shaw, Dale G; And Others – Journal of Educational Measurement, 1987
Information loss occurs when continuous data are grouped in discrete intervals. After calculating the squared correlation coefficients between continuous data and corresponding grouped data for four population distributions, the effects of population distribution, number of intervals, and interval width on information loss and recovery were…
Descriptors: Intervals, Rating Scales, Sampling, Scaling
Peer reviewed
Callender, John C.; Osburn, H. G. – Journal of Educational Measurement, 1979
Some procedures for estimating internal consistency reliability may be mathematically superior to more commonly used methods such as Coefficient Alpha. One problem is computational difficulty; the other is the possibility of overestimation due to capitalization on chance. (Author/CTM)
Descriptors: Higher Education, Mathematical Formulas, Research Problems, Sampling
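For reference, the Coefficient Alpha baseline against which such procedures are compared is straightforward to compute: k/(k-1) times one minus the ratio of summed item variances to total-score variance. A small sketch with simulated dichotomous item data (the data-generating model is invented):

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (examinees x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

rng = np.random.default_rng(3)
# Items correlate only through a common "true score" (invented model)
true_score = rng.normal(0, 1, size=(300, 1))
p_correct = 1.0 / (1.0 + np.exp(-true_score))
items = (rng.random((300, 8)) < p_correct).astype(int)

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

The "capitalization on chance" concern in the abstract applies to procedures that select or weight items using the same sample's data, which alpha as computed here does not do.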
Peer reviewed
Baglin, Roger F. – Journal of Educational Measurement, 1981
While major test publishers randomly select school districts for their national norming studies, a survey of "accepting" and "declining" districts supports the hypothesis that self-selection bias results in overrepresentation of districts which already use a specific publisher's tests or instructional materials. (Author/BW)
Descriptors: National Norms, Norm Referenced Tests, Sampling, Standardized Tests
Peer reviewed
Norcini, John J.; And Others – Journal of Educational Measurement, 1988
Multiple matrix sampling is applied to a variation of Angoff's standard setting method. Thirty-six experts (internists) and 190 items were divided into five groups, and borderline examinee performance was estimated. There was some variability in the cutting scores produced by the individual groups, but various components were well estimated. (SLD)
Descriptors: Cutting Scores, Minimum Competency Testing, Physicians, Sampling
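The design described above can be sketched as follows: judges and items are split into matched groups, each item's judged borderline probability is averaged within its group, and the Angoff cut score is the sum of those per-item averages. The rating noise model below is hypothetical; only the counts (36 judges, 190 items, five groups) come from the abstract:

```python
import numpy as np

rng = np.random.default_rng(4)

n_judges, n_items, n_groups = 36, 190, 5
# Hypothetical "true" borderline-examinee probabilities per item
true_p = rng.uniform(0.3, 0.9, n_items)
# Judge ratings = true value plus rater noise, clipped to [0, 1]
ratings = np.clip(true_p + rng.normal(0, 0.08, (n_judges, n_items)), 0, 1)

# Multiple matrix sampling: each judge group rates only its own item block
judge_groups = np.array_split(np.arange(n_judges), n_groups)
item_groups = np.array_split(np.arange(n_items), n_groups)

item_means = np.empty(n_items)
for jg, ig in zip(judge_groups, item_groups):
    item_means[ig] = ratings[np.ix_(jg, ig)].mean(axis=0)

cut_score = item_means.sum()            # Angoff cut on the raw-score scale
full_cut = ratings.mean(axis=0).sum()   # cut if every judge rated every item
print(round(cut_score, 1), round(full_cut, 1))
```

The comparison between `cut_score` and `full_cut` mirrors the abstract's question of how much variability the group-level sampling introduces relative to a full rating design.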
Peer reviewed
Linn, Robert L. – Journal of Educational Measurement, 1983
When the precise basis of selection effect on correlation and regression equations is unknown but can be modeled by selection on a variable that is highly but not perfectly related to observed scores, the selection effects can lead to the commonly observed "overprediction" results in studies of predictive bias. (Author/PN)
Descriptors: Bias, Correlation, Higher Education, Prediction
Peer reviewed
Gross, Alan L.; Shulman, Vivian – Journal of Educational Measurement, 1980
The suitability of the beta binomial test model for criterion referenced testing was investigated, first by considering whether underlying assumptions are realistic, and second, by examining the robustness of the model. Results suggest that the model may have practical value. (Author/RD)
Descriptors: Criterion Referenced Tests, Goodness of Fit, Higher Education, Item Sampling
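A quick method-of-moments fit of the beta binomial test model to number-correct scores, in the spirit of the model examined above. The simulated ability distribution is invented; the fit matches the observed mean and variance of the score distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

n_items = 25
# Simulated domain scores: true proportions drawn from a Beta(6, 4) ability dist.
true_pi = rng.beta(6, 4, size=400)
scores = rng.binomial(n_items, true_pi)

# Method-of-moments fit: Var = n*p*(1-p)*(1 + (n-1)*rho), rho = 1/(a+b+1)
p_bar = scores.mean() / n_items
s2 = scores.var(ddof=1)
rho = (s2 / (n_items * p_bar * (1 - p_bar)) - 1) / (n_items - 1)
ab = 1 / rho - 1
a, b = p_bar * ab, (1 - p_bar) * ab

model = stats.betabinom(n_items, a, b)
observed = np.bincount(scores, minlength=n_items + 1) / scores.size
print("fitted a, b:", round(a, 2), round(b, 2))
print("P(score=15) model vs observed:",
      round(model.pmf(15), 3), round(observed[15], 3))
```

Checking the model-implied score frequencies against the observed ones, as in the last two lines, is one simple way to probe the goodness of fit the abstract investigates.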
Peer reviewed
Garg, Rashmi; And Others – Journal of Educational Measurement, 1986
For the purpose of obtaining data to use in test development, multiple matrix sampling plans were compared to examinee sampling plans. Data were simulated for examinees, sampled from a population with a normal distribution of ability, responding to items selected from an item universe. (Author/LMO)
Descriptors: Difficulty Level, Monte Carlo Methods, Sampling, Statistical Studies
Peer reviewed
Gressard, Risa P.; Loyd, Brenda H. – Journal of Educational Measurement, 1991
A Monte Carlo study, which simulated 10,000 examinees' responses to four tests, investigated the effect of item stratification on parameter estimation in multiple matrix sampling of achievement data. Practical multiple matrix sampling is based on item stratification by item discrimination and a sampling plan with a moderate number of subtests. (SLD)
Descriptors: Achievement Tests, Comparative Testing, Computer Simulation, Estimation (Mathematics)
Peer reviewed
Angoff, William H.; Cowell, William R. – Journal of Educational Measurement, 1986
Linear conversions were developed relating scores on recent forms of the Graduate Record Examinations. Conversions based on specially selected subpopulations were compared with total-group conversions and evaluated. Conclusions indicated that the data clearly support the assumption of population independence for homogeneous tests, but not quite…
Descriptors: College Entrance Examinations, Equated Scores, Groups, Higher Education