Peer reviewed: Bowers, John – Educational and Psychological Measurement, 1971
Descriptors: Error of Measurement, Mathematical Models, Test Reliability, True Scores
Peer reviewed: Bond, Lloyd – Psychometrika, 1979
Tucker, Damarin, and Messick proposed a "base-free" measure of change which involves the computation of residual scores that are uncorrelated with true scores on the pretest. The present note discusses this change measure and demonstrates that properties they attribute to a are, in fact, properties of b. (Author/CTM)
Descriptors: Differences, Pretests Posttests, Research Reviews (Publications), Scores
Peer reviewed: Conger, Anthony J. – Educational and Psychological Measurement, 1980
Reliability maximizing weights are related to theoretically specified true score scaling weights to show a constant relationship that is invariant under separate linear transformations on each variable in the system. Test theoretic relations should be derived for the most general model available and not for unnecessarily constrained models.…
Descriptors: Mathematical Formulas, Scaling, Test Reliability, Test Theory
Peer reviewed: Wilcox, Rand R. – Applied Psychological Measurement, 1979
Using a new coefficient, a rescaling of the Bayes risk is examined and a modification of this coefficient is described which yields an index that always has a value between zero and one. (Author/MH)
Descriptors: Bayesian Statistics, Measurement Techniques, Scoring, Technical Reports
Peer reviewed: Dimitrov, Dimiter M. – Journal of Applied Measurement, 2003
Proposes formulas for expected true-score measures and reliability of binary items as a function of their Rasch difficulty when the trait (ability) distribution is normal or logistic. Provides an illustrative example for using the proposed formulas. (SLD)
Descriptors: Ability, Difficulty Level, Item Response Theory, Reliability
Peer reviewed: Tisak, John; Tisak, Marie S. – Applied Psychological Measurement, 1996
Dynamic generalizations of reliability and validity that will incorporate longitudinal or developmental models, using latent curve analysis, are discussed. A latent curve model formulated to depict change is incorporated into the classical definitions of reliability and validity. The approach is illustrated with sociological and psychological…
Descriptors: Definitions, Development, Longitudinal Studies, Models
Peer reviewed: Cliff, Norman – Psychometrika, 1989
This paper argues that: test data are ordinal; latent trait scores are only determined ordinally; and test data are used largely for ordinal purposes. A set of ordinal assumptions is presented, including an ordinal version of local independence. It is concluded that a purely ordinal test theory is possible. (TJH)
Descriptors: Equations (Mathematics), Latent Trait Theory, Regression (Statistics), True Scores
Peer reviewed: Krus, David J.; Helmstadter, Gerald C. – Educational and Psychological Measurement, 1993
Negative coefficients of reliability, sometimes returned by the standard formula for estimation of the internal-consistency reliability, are neither theoretically nor numerically correct. Alternative strategies for test development in this special case are suggested. (Author)
Descriptors: Estimation (Mathematics), Reliability, Test Construction, Test Use
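Illustrative note on the entry above: the abstract does not name the "standard formula", so the sketch below assumes it is coefficient alpha, a common internal-consistency estimate. The function name `cronbach_alpha` and both data sets are made up for illustration; the point is only to show how the estimate can fall below zero when items covary negatively.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of examinee rows, each holding k item scores.

    Assumption: this is the "standard formula" the abstract refers to.
    """
    k = len(scores[0])
    # Sample variance of each item across examinees.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two items that move in opposite directions.
negatively_related = [[1, 4], [2, 2], [3, 3], [4, 1]]
# Hypothetical data: two items that move together.
positively_related = [[1, 1], [2, 3], [3, 2], [4, 4]]

alpha_neg = cronbach_alpha(negatively_related)
alpha_pos = cronbach_alpha(positively_related)
print(alpha_neg, alpha_pos)
```

With these made-up scores the first estimate is negative because the summed item variances exceed the total-score variance, which happens when the items covary negatively.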
Peer reviewed: Jiang, Hai; Stout, William – Journal of Educational and Behavioral Statistics, 1998
Proposes a new regression correction for the SIBTEST statistical tests (R. Shealy and W. Stout, 1993) that essentially uses a two-segment piecewise linear regression of the true on observed matching subtest scores. A simulation study illustrates the approach. (SLD)
Descriptors: Estimation (Mathematics), Item Bias, Regression (Statistics), Simulation
Haberman, Shelby J. – ETS Research Report Series, 2008
In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…
Descriptors: Scores, Validity, Educational Testing, Correlation
Gaudron, Jean-Philippe; Vautier, Stephane – Journal of Vocational Behavior, 2007
This study aimed at estimating the correlation between true scores (true consistency) of vocational interest over a short time span in a sample of 1089 adults. Participants were administered 54 items assessing vocational, family, and leisure interests twice over a 1-month period. Responses were analyzed with a multitrait (MT) model, which supposes…
Descriptors: Vocational Interests, Correlation, True Scores, Longitudinal Studies
Stocking, Martha L.; And Others – 1988
A sequence of simulations was carried out to aid in the diagnosis and interpretation of equating differences found between random and matched (nonrandom) samples for four commonly used equating procedures: (1) Tucker linear observed-score equating; (2) Levine equally reliable linear observed-score equating; (3) equipercentile curvilinear…
Descriptors: Equated Scores, Item Response Theory, Sample Size, Simulation
Peer reviewed: Schulman, Robert S.; Haden, Richard L. – Psychometrika, 1975
A model is proposed for the description of ordinal test scores based on the definition of true score as expected rank; its deviations are compared with results from classical test theory. An unbiased estimator of population true score from sample data is calculated. Score variance and population reliability are examined. (Author/BJG)
Descriptors: Career Development, Mathematical Models, Test Reliability, Test Theory
Peer reviewed: Kearns, Jack; Meredith, William – Psychometrika, 1975
Examines the question of how large a sample must be in order to produce empirical Bayes estimates which are preferable to other commonly used estimates, such as proportion correct observed score. (Author/RC)
Descriptors: Bayesian Statistics, Item Analysis, Probability, Sampling
Peer reviewed: Ng, K. T. – Educational and Psychological Measurement, 1974
This paper is aimed at demonstrating that Charles Spearman postulated neither a platonic true-error distinction nor a requirement for constant true scores under repeated measurement. (Author/RC)
Descriptors: Career Development, Correlation, Models, Test Reliability

