Edwards, Michael C.; Vevea, Jack L. – Journal of Educational and Behavioral Statistics, 2006
This article examines a subscore augmentation procedure. The approach uses empirical Bayes adjustments and is intended to improve the overall accuracy of measurement when information is scant. Simulations examined the impact of the method on subscale scores in a variety of realistic conditions. The authors focused on two popular scoring methods:…
Descriptors: Geometric Concepts, True Scores, Scoring, Item Response Theory
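As a rough illustration of the empirical Bayes idea mentioned in this abstract, the sketch below shows the simplest univariate case, Kelley's regressed true-score estimate, which shrinks an observed score toward the group mean in proportion to its unreliability. It is not the augmentation procedure the article studies, which borrows information across subscales; the function name and the example values are assumptions made for illustration.

```python
def kelley_estimate(observed: float, group_mean: float, reliability: float) -> float:
    """Kelley's regressed estimate of a true score.

    The less reliable the score, the more it is pulled toward the group mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return reliability * observed + (1.0 - reliability) * group_mean


# Hypothetical subscale: observed score 12, group mean 8, reliability 0.6.
shrunken = kelley_estimate(observed=12.0, group_mean=8.0, reliability=0.6)
print(round(shrunken, 2))
```

With a short, unreliable subscale the estimate sits noticeably closer to the group mean than the raw score does, which is the behavior that motivates augmentation when information is scant.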
Braun, Henry I.; Wainer, Howard – 1989
A desirable goal would be to develop a methodology for scoring essays so that the final grades are less affected by when or by whom each essay was read. It seems sensible to derive such grades by somehow adjusting the ratings originally given by each reader. This essay describes a solution that relies on statistical adjustment, using the context…
Descriptors: Essay Tests, Estimation (Mathematics), Interrater Reliability, Scoring
Klaas, Alan C. – 1975
Current usage and theory of standard error of measurement call for one standard error of measurement figure to be used across all levels of scoring. The study revealed that scoring variance across scoring levels is not constant. As scoring ability increases, scoring variance decreases. The assertion that low and high scoring subjects will…
Descriptors: Error of Measurement, Guessing (Tests), Scoring, Statistical Analysis
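To make the contrast in this abstract concrete, the sketch below compares the single classical standard error of measurement with a score-dependent alternative. The conditional formula used here is Lord's binomial error model, chosen only as a well-known example of an error estimate that varies by score level; it is not the analysis reported in the study, and the function names and numbers are illustrative assumptions.

```python
import math


def classical_sem(sd: float, reliability: float) -> float:
    """One SEM for all score levels: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def binomial_conditional_sem(score: int, n_items: int) -> float:
    """Score-dependent SEM under Lord's binomial error model."""
    if n_items < 2 or not 0 <= score <= n_items:
        raise ValueError("need n_items >= 2 and 0 <= score <= n_items")
    return math.sqrt(score * (n_items - score) / (n_items - 1))


# Hypothetical 40-item test with SD 10 and reliability 0.91.
print(round(classical_sem(10.0, 0.91), 3))
for raw_score in (20, 30, 36):
    print(raw_score, round(binomial_conditional_sem(raw_score, 40), 3))
```

The classical figure is the same for every examinee, while the conditional figure shrinks as scores move away from the middle of the scale.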
Lord, Frederic M.; Hamilton, Martha S. – 1972
A numerical procedure is outlined for obtaining an interval estimate of true score. The procedure is applied to several sets of test data. (Author)
Descriptors: Bayesian Statistics, Hypothesis Testing, Psychological Testing, Statistical Analysis
Peer reviewed: Jarjoura, David – Journal of Educational Statistics, 1985
Issues regarding tolerance and confidence intervals are discussed within the context of educational measurement, and conceptual distinctions are drawn between these two types of intervals. Points are raised about the advantages of tolerance intervals when the focus is on a particular observed score rather than a particular examinee. (Author/BW)
Descriptors: Comparative Analysis, Error of Measurement, Mathematical Models, Test Interpretation
Peer reviewed: Lu, K. H. – Educational and Psychological Measurement, 1971
Descriptors: Difficulty Level, Statistical Analysis, Statistical Significance, Test Items
Peer reviewed: Cureton, Edward E. – Educational and Psychological Measurement, 1971
A derivation of a formula for the stability coefficient is presented and discussed in terms of test reliability over time. (PR)
Descriptors: Error of Measurement, Raw Scores, Statistical Analysis, Test Reliability
Peer reviewed: Hsu, Tse-chi; Wu, Kuo-liang; Yu, Jya-yi Wu; Lee, Ming-yen – International Journal of Testing, 2002
Explored the feasibility of applying a method that incorporates collateral information to equate tests constructed for a college entrance examination by comparing its results with those of item response theory (IRT) true score equating. Simulation results suggest that, overall, equating results based on collateral information are relatively…
Descriptors: College Entrance Examinations, Equated Scores, Item Response Theory, Simulation
Peer reviewed: Atkinson, Leslie – Journal of School Psychology, 1990
Offers standard errors of prediction and confidence intervals for the Vineland Adaptive Behavior Scales (VABS) that help in deciding whether variation in obtained scores, when the scale is administered to the same person more than once, results from measurement error or reflects actual change in the examinee's functional level. Presented values were…
Descriptors: Error of Measurement, Foreign Countries, Raw Scores, Test Interpretation
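The kind of retest interval this abstract describes can be sketched with the common classical formulas: the predicted retest score regresses toward the mean, and the standard error of prediction is SD * sqrt(1 - r^2). Whether the article computes its values exactly this way is an assumption, and the function name, the z value, and the example numbers are hypothetical.

```python
import math


def retest_prediction_interval(observed: float, mean: float, sd: float,
                               reliability: float, z: float = 1.96):
    """Interval in which a retest score is expected to fall by error alone."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    predicted = mean + reliability * (observed - mean)  # regression to the mean
    se_prediction = sd * math.sqrt(1.0 - reliability ** 2)
    return predicted - z * se_prediction, predicted + z * se_prediction


# Hypothetical standard-score scale: mean 100, SD 15, reliability 0.90.
low, high = retest_prediction_interval(observed=85.0, mean=100.0, sd=15.0,
                                       reliability=0.90)
print(round(low, 1), round(high, 1))
```

A retest score outside the interval would suggest change beyond what measurement error alone would produce at the chosen confidence level.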
Gierl, Mark J.; Gotzmann, Andrea; Boughton, Keith A. – Applied Measurement in Education, 2004
Differential item functioning (DIF) analyses are used to identify items that operate differently between two groups, after controlling for ability. The Simultaneous Item Bias Test (SIBTEST) is a popular DIF detection method that matches examinees on a true score estimate of ability. However, in some testing situations, like test translation and…
Descriptors: True Scores, Simulation, Test Bias, Student Evaluation
Stone, Gregory Ethan; Beltyukova, Svetlana; Fox, Christine M. – International Journal of Testing, 2008
Judge-mediated examinations are defined as those for which expert evaluation (using rubrics) is required to determine correctness, completeness, and reasonability of test-taker responses. The use of multifaceted Rasch modeling has led to improvements in the reliability of scoring such examinations. The establishment of criterion-referenced…
Descriptors: Interrater Reliability, High Stakes Tests, Standard Setting, Minimum Competencies
Eignor, Daniel R.; And Others – 1995
Two recent simulation studies were conducted to aid in the diagnosis and interpretation of equating differences found between random and matched (nonrandom) samples for four commonly used equating procedures: (1) Tucker; (2) Levine equally reliable; (3) Chained equipercentile observed-score; and (4) three-parameter, item response theory true-score…
Descriptors: Criteria, Equated Scores, Item Response Theory, Raw Scores
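For readers unfamiliar with the fourth procedure listed in this abstract, the sketch below outlines three-parameter IRT true-score equating in its textbook form: find the ability at which form X's test characteristic curve equals the given true score, then evaluate form Y's curve at that ability. It assumes item parameters for both forms are already on a common scale; the item values, the scaling constant D = 1.7, and the bisection solver are illustrative assumptions, not details from the study.

```python
import math


def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)


def equate_true_score(x, form_x, form_y, lo=-8.0, hi=8.0, tol=1e-8):
    """Map a true score x on form X to the equivalent true score on form Y."""
    if not tcc(lo, form_x) < x < tcc(hi, form_x):
        raise ValueError("x is outside the range the curve reaches on [lo, hi]")
    # The TCC is increasing in theta when all a > 0, so bisection suffices.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, form_x) < x:
            lo = mid
        else:
            hi = mid
    return tcc((lo + hi) / 2.0, form_y)


# Hypothetical (a, b, c) item parameters; form Y is uniformly harder.
form_x = [(1.0, -0.5, 0.20), (1.2, 0.0, 0.20), (0.8, 0.7, 0.25)]
form_y = [(a, b + 0.5, c) for a, b, c in form_x]
print(round(equate_true_score(2.0, form_x, form_y), 3))
```

True scores below the sum of the guessing parameters have no corresponding ability under the 3PL model, which is why the function rejects them instead of extrapolating.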
Hanson, Bradley A. – 1991
This paper presents a detailed derivation of method of moments estimates used in computer programs for the four-parameter beta compound binomial strong true score model. A procedure is presented to deal with the case in which the usual method of moments estimates do not exist or result in invalid parameter estimates. The results presented…
Descriptors: Classification, Computation, Computer Software, Equations (Mathematics)
Brennan, Robert L. – 1990
In 1955, R. Levine introduced two linear equating procedures for the common-item non-equivalent populations design. His procedures make the same assumptions about true scores; they differ in terms of the nature of the equating function used. In this paper, two parameterizations of a classical congeneric model are introduced to model the variables…
Descriptors: Equated Scores, Equations (Mathematics), Mathematical Models, Research Design
Cook, Linda L.; And Others – 1983
The purpose of this study was to empirically examine the relationship between violations of the assumption of unidimensionality, as assessed by the factor analysis of item parcel data, and the quality of item response theory (IRT) true-score equating, as measured by score scale stability. The verbal section of the Scholastic Aptitude Test (SAT)…
Descriptors: College Entrance Examinations, Equated Scores, Factor Analysis, Latent Trait Theory

