Chen, Haiwen; Livingston, Samuel A. – ETS Research Report Series, 2013
This paper presents a new equating method for the nonequivalent groups with anchor test design: poststratification equating based on true anchor scores. The linear version of this method is shown to be equivalent, under certain conditions, to Levine observed score equating, in the same way that the linear version of poststratification equating is…
Descriptors: Equated Scores, Test Items, Methods
Lu, Ru; Haberman, Shelby; Guo, Hongwen; Liu, Jinghua – ETS Research Report Series, 2015
In this study, we apply jackknifing to anchor items to evaluate the impact of anchor selection on equating stability. In an ideal world, the choice of anchor items should have little impact on equating results. When this ideal does not correspond to reality, selection of anchor items can strongly influence equating results. This influence does not…
Descriptors: Test Construction, Equated Scores, Test Items, Sampling
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Puhan, Gautam – ETS Research Report Series, 2013
The purpose of this study was to demonstrate that the choice of sample weights when defining the target population under poststratification equating can be a critical factor in determining the accuracy of the equating results under a unique equating scenario, known as "rater comparability scoring and equating." The nature of data…
Descriptors: Scoring, Equated Scores, Sampling, Accuracy
von Davier, Alina A.; Chen, Haiwen – ETS Research Report Series, 2013
In the framework of the observed-score equating methods for the nonequivalent groups with anchor test design, there are 3 fundamentally different ways of using the information provided by the anchor scores to equate the scores of a new form to those of an old form. One method uses the anchor scores as a conditioning variable, such as the Tucker…
Descriptors: Equated Scores, Item Response Theory, True Scores, Methods
Oranje, Andreas; Li, Deping; Kandathil, Mathew – ETS Research Report Series, 2009
Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…
Descriptors: Error of Measurement, Computation, Regression (Statistics), National Competency Tests
Lee, Yi-Hsuan; von Davier, Alina A. – ETS Research Report Series, 2008
The kernel equating method (von Davier, Holland, & Thayer, 2004) is based on a flexible family of equipercentile-like equating functions that use a Gaussian kernel to continuize the discrete score distributions. While the classical equipercentile, or percentile-rank, equating method carries out the continuization step by linear interpolation,…
Descriptors: Equated Scores, Comparative Analysis, Methods, Accuracy
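The contrast drawn in the abstract above, Gaussian-kernel continuization versus linear interpolation of a discrete score distribution, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes the commonly cited form of the kernel-continuized CDF, and the score distribution, the bandwidth `h=0.6`, and the function names `kernel_cdf` and `interpolated_cdf` are all hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution.

    scores: possible score values x_j; probs: their probabilities r_j;
    h: bandwidth (an assumed, user-chosen value in this sketch).
    """
    mu = sum(r * s for s, r in zip(scores, probs))
    var = sum(r * (s - mu) ** 2 for s, r in zip(scores, probs))
    # Shrinkage factor chosen so the continuized distribution keeps
    # the mean and variance of the discrete one.
    a = math.sqrt(var / (var + h * h))
    return sum(r * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
               for s, r in zip(scores, probs))

def interpolated_cdf(x, scores, probs):
    """Percentile-rank style CDF: each integer score is spread
    uniformly over the interval [s - 0.5, s + 0.5]."""
    total = 0.0
    for s, r in zip(scores, probs):
        total += r * min(1.0, max(0.0, x - (s - 0.5)))
    return total

# Made-up five-point score distribution, for illustration only.
scores = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
for x in (0.5, 2.0, 3.5):
    print(x,
          round(kernel_cdf(x, scores, probs, h=0.6), 4),
          round(interpolated_cdf(x, scores, probs), 4))
```

With a small bandwidth the kernel CDF stays close to the discrete step function, while a large bandwidth pushes it toward a normal distribution with the same mean and variance.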
Liu, Ou Lydia; Rijmen, Frank; Kong, Nan – ETS Research Report Series, 2007
Parallel analysis has been well documented to be an effective and accurate method for determining the number of factors to retain in exploratory factor analysis. Despite its theoretical and empirical advantages, the popularity of parallel analysis has been thwarted by its limited access in statistical software such as SPSS and SAS, especially in…
Descriptors: Factor Analysis, Correlation, Computer Software, Computer Oriented Programs
Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008
The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…
Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability
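To make the starting point of the abstract above concrete, here is a minimal sketch of maximum-likelihood ability estimation with item parameters treated as known. It assumes a two-parameter logistic (2PL) model, which the abstract does not specify, and it does not implement the expected response approach the abstract goes on to mention. The item parameters, the response pattern, and the function names `p_correct` and `ml_ability` are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ml_ability(responses, items, theta=0.0, tol=1e-8, max_iter=50):
    """Newton-Raphson ML estimate of ability given known (a, b) pairs.

    responses: list of 0/1 item scores; items: list of (a, b) tuples.
    """
    if all(u == 1 for u in responses) or all(u == 0 for u in responses):
        raise ValueError("ML estimate is not finite for all-correct "
                         "or all-incorrect response patterns")
    for _ in range(max_iter):
        score = 0.0  # first derivative of the log-likelihood
        info = 0.0   # test information (negative second derivative)
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            score += a * (u - p)
            info += a * a * p * (1.0 - p)
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    return theta

# Made-up item parameters (discrimination a, difficulty b) and responses.
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0)]
responses = [1, 1, 1, 0, 0]
print(round(ml_ability(responses, items), 4))
```

In practice the `(a, b)` values would themselves be estimates from a calibration sample, which is the source of the extra uncertainty the abstract is concerned with.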
von Davier, Alina A.; Han, Ning – ETS Research Report Series, 2004
This study investigates the population sensitivity of the commonly used linear equating methods in the Non-Equivalent-groups with an Anchor Test (NEAT) design: the Tucker, the Levine observed-score, and the chain linear methods. For a detailed analysis of the subject, we apply three distinctive approaches to a real data set from a NEAT design: a)…
Descriptors: Equated Scores, Test Construction, Methods, Comparative Analysis
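Of the three linear methods named in the abstract above, chain linear equating is the simplest to sketch: link the new form X to the anchor A in the group that took X, link A to the old form Y in the group that took Y, and compose the two linear functions. The following is a minimal illustration under that textbook definition, not the study's code. The data and the function names `linear_link` and `chain_linear` are hypothetical, and the Tucker and Levine methods are not shown.

```python
from statistics import mean, pstdev

def linear_link(mean_from, sd_from, mean_to, sd_to):
    """Return a function mapping scores by matching mean and SD."""
    return lambda s: mean_to + (sd_to / sd_from) * (s - mean_from)

def chain_linear(x_p, a_p, a_q, y_q):
    """Chain linear equating of X to Y through anchor A.

    x_p, a_p: form X and anchor scores in population P (took X);
    a_q, y_q: anchor and form Y scores in population Q (took Y).
    """
    x_to_a = linear_link(mean(x_p), pstdev(x_p), mean(a_p), pstdev(a_p))
    a_to_y = linear_link(mean(a_q), pstdev(a_q), mean(y_q), pstdev(y_q))
    return lambda x: a_to_y(x_to_a(x))

# Made-up scores, for illustration only.
x_p = [22, 25, 30, 34, 39]
a_p = [8, 9, 11, 13, 14]
a_q = [7, 10, 11, 12, 15]
y_q = [20, 27, 31, 33, 40]

equate = chain_linear(x_p, a_p, a_q, y_q)
for x in (25, 30, 35):
    print(x, round(equate(x), 2))
```

Because each link uses statistics from only one group, differences between populations P and Q enter the result through the anchor means and standard deviations, which is where population sensitivity of the kind the study examines can arise.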
Zhang, Jinming – ETS Research Report Series, 2005
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Descriptors: Statistical Bias, Maximum Likelihood Statistics, Computation, Ability