Augustin Mutak; Robert Krause; Esther Ulitzsch; Sören Much; Jochen Ranger; Steffi Pohl – Journal of Educational Measurement, 2024
Understanding the intraindividual relation between an individual's speed and ability in testing scenarios is essential to ensure a fair assessment. Different approaches exist for estimating this relationship; they rely either on specific study designs or on specific assumptions. This paper aims to add to the toolbox of approaches for estimating…
Descriptors: Testing, Academic Ability, Time on Task, Correlation
Sinharay, Sandip – Journal of Educational Measurement, 2018
The value-added method of Haberman is arguably one of the most popular methods to evaluate the quality of subscores. The method is based on classical test theory and deems a subscore to be of added value if the subscore predicts the corresponding true subscore better than the total score does. Sinharay provided an interpretation of the added…
Descriptors: Scores, Value Added Models, Raw Scores, Item Response Theory
Lee, Woo-yeol; Cho, Sun-Joo – Journal of Educational Measurement, 2017
Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…
Descriptors: Test Items, Item Response Theory, Item Analysis, Simulation
Lee, Soo; Suh, Youngsuk – Journal of Educational Measurement, 2018
Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect…
Descriptors: Item Response Theory, Sample Size, Models, Error of Measurement
Moses, Tim – Journal of Educational Measurement, 2012
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
Descriptors: Error of Measurement, Prediction, Regression (Statistics), True Scores
Puhan, Gautam – Journal of Educational Measurement, 2012
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis
Seo, Minhee; Roussos, Louis A. – Journal of Educational Measurement, 2010
DIMTEST is a widely used and studied method for testing the hypothesis of test unidimensionality as represented by local item independence. However, DIMTEST does not report the amount of multidimensionality that exists in data when rejecting its null. To provide more information regarding the degree to which data depart from unidimensionality, a…
Descriptors: Effect Size, Statistical Bias, Computation, Test Length
Yao, Lihua – Journal of Educational Measurement, 2010
In educational assessment, overall scores obtained by simply averaging a number of domain scores are sometimes reported. However, simply averaging the domain scores ignores the fact that different domains have different score points, that scores from those domains are related, and that at different score points the relationship between overall…
Descriptors: Educational Assessment, Error of Measurement, Item Response Theory, Scores
Zwick, Rebecca; Himelfarb, Igor – Journal of Educational Measurement, 2011
Research has often found that, when high school grades and SAT scores are used to predict first-year college grade-point average (FGPA) via regression analysis, African-American and Latino students are, on average, predicted to earn higher FGPAs than they actually do. Under various plausible models, this phenomenon can be explained in terms of…
Descriptors: Socioeconomic Status, Grades (Scholastic), Error of Measurement, White Students

Winne, Philip H.; Belfry, M. Joan – Journal of Educational Measurement, 1982
This review of issues about correcting for attenuation concludes that the basic difficulty lies in being able to identify and equate sources of variance in estimates of validity and reliability. Recommendations are proposed for cautious use of correction for attenuation. (Author/CM)
Descriptors: Correlation, Error of Measurement, Research Methodology, Statistical Analysis

Seddon, G. M.; And Others – Journal of Educational Measurement, 1981
In a Monte Carlo simulation, a methodology was developed to investigate the existence of radex properties among objective test items. In an experiment with items covering four categories of Bloom's cognitive domain taxonomy, the items did not have the factorial properties of a radex with four levels of complexity. (Author/BW)
Descriptors: Correlation, Error of Measurement, Factor Analysis, Factor Structure