Publication Date
- In 2025: 0
- Since 2024: 0
- Since 2021 (last 5 years): 2
- Since 2016 (last 10 years): 4
- Since 2006 (last 20 years): 10
Descriptor
- Sampling: 8
- Item Response Theory: 7
- Equated Scores: 5
- Computation: 4
- Accuracy: 3
- Bayesian Statistics: 2
- Goodness of Fit: 2
- Item Sampling: 2
- Measurement Techniques: 2
- Monte Carlo Methods: 2
- Reliability: 2
Source
- Journal of Educational…: 10
Author
- Albano, Anthony D.: 1
- Baldwin, Peter: 1
- Brennan, Robert L.: 1
- Castellano, Katherine E.: 1
- Combs, Adam: 1
- Jiang, Yanlin: 1
- Kane, Michael T.: 1
- Kim, Hyung Jin: 1
- Kim, Sooyeon: 1
- Lee, Won-Chan: 1
- Li, Deping: 1
Publication Type
- Journal Articles: 10
- Reports - Research: 5
- Reports - Evaluative: 4
- Reports - Descriptive: 1
Combs, Adam – Journal of Educational Measurement, 2023
A common method of checking person fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular l*_z statistic. A new Bayesian model-checking method has also been proposed, based on pivotal…
Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods
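The posterior-predictive person-fit idea in the abstract above can be sketched in a few lines. This is a generic illustration, not the paper's method: it assumes a Rasch model with known item difficulties, mock posterior draws of ability, and a sum-of-squared-standardized-residuals discrepancy; every name here is hypothetical.

```python
import numpy as np

def pp_person_fit(resp, b_items, theta_draws, rng):
    """Posterior-predictive p-value for one examinee: the fraction of
    posterior draws for which a replicated response vector is at least
    as discrepant as the observed one under a Rasch model."""
    def discrepancy(y, p):
        # Sum of squared standardized residuals across items.
        return np.sum((y - p) ** 2 / (p * (1 - p)))
    exceed = 0
    for theta in theta_draws:
        p = 1.0 / (1.0 + np.exp(-(theta - b_items)))     # Rasch success probabilities
        y_rep = (rng.random(len(b_items)) < p).astype(float)
        exceed += discrepancy(y_rep, p) >= discrepancy(resp, p)
    return exceed / len(theta_draws)

rng = np.random.default_rng(0)
b = np.linspace(-2.0, 2.0, 40)                # hypothetical item difficulties
draws = rng.normal(0.0, 0.3, 500)             # mock posterior draws of ability
normal = (rng.random(40) < 1 / (1 + np.exp(b))).astype(float)  # model-consistent examinee
aberrant = (b > 0).astype(float)              # wrong on easy items, right on hard ones
```

A model-consistent examinee should get a mid-range p-value, while the aberrant pattern (easy items wrong, hard items right) produces an extreme discrepancy and a p-value near zero.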
Castellano, Katherine E.; McCaffrey, Daniel F.; Lockwood, J. R. – Journal of Educational Measurement, 2023
The simple average of student growth scores is often used in accountability systems, but it can be problematic for decision making. When computed from a small or moderate number of students, it can be sensitive to the sample, resulting in inaccurate representations of student growth, low year-to-year stability, and inequities for…
Descriptors: Academic Achievement, Accountability, Decision Making, Computation
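The sample-size sensitivity this abstract describes is easy to demonstrate with a small simulation. A minimal sketch on a hypothetical growth-score distribution (not the authors' data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical population of student growth scores (mean 5, SD 10).
population = rng.normal(loc=5.0, scale=10.0, size=100_000)

def sd_of_mean(n, reps=2000):
    """Standard deviation of the simple-average growth score
    across repeated samples of n students."""
    samples = rng.choice(population, size=(reps, n))
    return samples.mean(axis=1).std()

for n in (10, 30, 100, 500):
    print(f"n={n:>3}: SD of the average = {sd_of_mean(n):.2f}")
```

The SD of the average shrinks roughly as 10/sqrt(n), so an average over 10 students is about seven times noisier than one over 500 — the instability and year-to-year churn the abstract points to.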
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2020
In equating, smoothing techniques are frequently used to diminish sampling error. There are typically two types of smoothing: presmoothing and postsmoothing. For polynomial log-linear presmoothing, an optimum smoothing degree can be determined statistically based on the Akaike information criterion or Chi-square difference criterion. For…
Descriptors: Equated Scores, Sampling, Error of Measurement, Statistical Analysis
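Polynomial log-linear presmoothing with statistically chosen degree, as described above, can be sketched generically. This is an illustration under stated assumptions (Poisson maximum likelihood via Newton-Raphson on hypothetical frequencies, AIC selection), not the authors' procedure; all names are made up.

```python
import numpy as np
from math import lgamma

def presmooth(freq, degree, iters=50):
    """Polynomial log-linear presmoothing of score frequencies:
    fit log(mu_j) = beta_0 + beta_1*x_j + ... + beta_d*x_j^d by
    Poisson maximum likelihood and return (smoothed freqs, AIC)."""
    x = np.arange(len(freq)) / (len(freq) - 1)        # scores rescaled to [0, 1]
    X = np.vander(x, degree + 1, increasing=True)     # basis 1, x, ..., x^d
    beta = np.zeros(degree + 1)
    beta[0] = np.log(freq.mean())                     # start from the constant model
    for _ in range(iters):                            # Newton-Raphson updates
        mu = np.exp(np.clip(X @ beta, -30, 30))
        hess = X.T @ (mu[:, None] * X) + 1e-8 * np.eye(degree + 1)
        beta = beta + np.linalg.solve(hess, X.T @ (freq - mu))
    mu = np.exp(np.clip(X @ beta, -30, 30))
    loglik = np.sum(freq * np.log(mu) - mu - np.array([lgamma(f + 1) for f in freq]))
    return mu, 2 * (degree + 1) - 2 * loglik          # AIC = 2k - 2*loglik

rng = np.random.default_rng(7)
shape = np.exp(-0.5 * ((np.arange(21) - 12) / 4.0) ** 2)      # bell-shaped true distribution
freq = rng.poisson(400 * shape / shape.sum()).astype(float)   # observed frequencies

fits = {d: presmooth(freq, d) for d in (2, 3, 4)}
best = min(fits, key=lambda d: fits[d][1])                    # smallest AIC wins
```

A useful sanity check: because the model includes an intercept, the Poisson MLE reproduces the observed total frequency, so the smoothed distribution preserves the sample size.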
Sinharay, Sandip – Journal of Educational Measurement, 2016
De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the [math equation unavailable] statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…
Descriptors: Sampling, Research Methodology, Error Patterns, Monte Carlo Methods
Albano, Anthony D. – Journal of Educational Measurement, 2015
Research on equating with small samples has shown that methods with stronger assumptions and fewer statistical estimates can lead to decreased error in the estimated equating function. This article introduces a new approach to linear observed-score equating, one which provides flexible control over how form difficulty is assumed versus estimated…
Descriptors: Equated Scores, Sample Size, Sampling, Statistical Inference
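The trade-off this abstract describes — stronger assumptions and fewer estimated parameters in exchange for less sampling error — is visible in the classical linear family. A hedged sketch on hypothetical random-groups data (not Albano's method itself):

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Linear observed-score equating: match form means and SDs,
    estimating two parameters (slope and intercept)."""
    slope = scores_y.std(ddof=1) / scores_x.std(ddof=1)
    return scores_y.mean() + slope * (x - scores_x.mean())

def mean_equate(x, scores_x, scores_y):
    """Mean equating: assume equal spread and estimate only a shift.
    Fewer estimates -> less sampling error, but a stronger assumption."""
    return x + scores_y.mean() - scores_x.mean()

rng = np.random.default_rng(1)
form_x = rng.normal(50.0, 10.0, 200)   # hypothetical form-X scores
form_y = rng.normal(52.0, 9.0, 200)    # hypothetical form-Y scores
```

Both functions send the form-X mean to the form-Y mean; they differ in whether the slope is estimated (linear) or assumed to be 1 (mean), which is exactly the assumed-versus-estimated dial the article generalizes.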
Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…
Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling
Li, Deping; Jiang, Yanlin; von Davier, Alina A. – Journal of Educational Measurement, 2012
This study investigates a sequence of item response theory (IRT) true score equatings based on various scale transformation approaches and evaluates equating accuracy and consistency over time. The results show that the biases and sample variances for the IRT true score equating (both direct and indirect) are quite small (except for the mean/sigma…
Descriptors: True Scores, Equated Scores, Item Response Theory, Accuracy
Baldwin, Peter – Journal of Educational Measurement, 2011
Growing interest in fully Bayesian item response models raises the question: to what extent can model parameter posterior draws enhance existing practices? One practice that has traditionally relied on model parameter point estimates but may be improved by using posterior draws is the development of a common metric for two independently calibrated…
Descriptors: Item Response Theory, Bayesian Statistics, Computation, Sampling
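Developing a common metric from posterior draws rather than point estimates can be sketched with mean/sigma linking. This is a generic illustration under assumptions of my own (mock posterior draws, a known shift/rescale between metrics), not Baldwin's procedure:

```python
import numpy as np

def mean_sigma_link(b_new, b_ref):
    """Mean/sigma linking: constants (A, B) placing new-form item
    difficulties on the reference metric via b* = A*b_new + B."""
    A = b_ref.std(ddof=1) / b_new.std(ddof=1)
    return A, b_ref.mean() - A * b_new.mean()

rng = np.random.default_rng(2)
true_b = rng.normal(0.0, 1.0, 25)                 # common-item difficulties (hypothetical)
# Mock posterior draws from two independent calibrations; the new
# calibration's metric is shifted and rescaled relative to the reference.
draws_ref = true_b + rng.normal(0.0, 0.10, (500, 25))
draws_new = 0.8 * true_b - 0.5 + rng.normal(0.0, 0.08, (500, 25))

# Point-estimate linking (posterior means) gives a single (A, B); linking
# each posterior draw gives a whole distribution of linking constants.
A_pt, B_pt = mean_sigma_link(draws_new.mean(axis=0), draws_ref.mean(axis=0))
AB = np.array([mean_sigma_link(dn, dr) for dn, dr in zip(draws_new, draws_ref)])
```

The draw-wise distribution of (A, B) is what posterior draws add over point estimates: the linking constants come with uncertainty attached instead of being treated as known.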
Kane, Michael T. – Journal of Educational Measurement, 2013
To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Descriptors: Test Interpretation, Validity, Scores, Test Use
Kim, Sooyeon; Livingston, Samuel A. – Journal of Educational Measurement, 2010
Score equating based on small samples of examinees is often inaccurate for the examinee populations. We conducted a series of resampling studies to investigate the accuracy of five methods of equating in a common-item design. The methods were chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating,…
Descriptors: Equated Scores, Test Items, Item Sampling, Item Response Theory