Publication Date
| In 2026 | 0 |
| Since 2025 | 4 |
| Since 2022 (last 5 years) | 14 |
| Since 2017 (last 10 years) | 28 |
| Since 2007 (last 20 years) | 92 |
Descriptor
| True Scores | 418 |
| Error of Measurement | 122 |
| Test Reliability | 110 |
| Statistical Analysis | 107 |
| Mathematical Models | 97 |
| Item Response Theory | 87 |
| Correlation | 76 |
| Equated Scores | 76 |
| Reliability | 64 |
| Test Theory | 52 |
| Test Items | 51 |
Audience
| Researchers | 12 |
| Practitioners | 2 |
| Administrators | 1 |
| Teachers | 1 |
Location
| Australia | 1 |
| Canada | 1 |
| China | 1 |
| Colorado | 1 |
| Illinois | 1 |
| Israel | 1 |
| New York | 1 |
| Oregon | 1 |
| Taiwan | 1 |
| Texas | 1 |
| United Kingdom (England) | 1 |
Laws, Policies, & Programs
| Elementary and Secondary… | 1 |
Peer reviewed: Cureton, Edward E. – Educational and Psychological Measurement, 1971
A rebuttal of Frary's 1969 article in Educational and Psychological Measurement. (MS)
Descriptors: Error of Measurement, Guessing (Tests), Multiple Choice Tests, Scoring Formulas
Peer reviewed: Morrison, Donald G.; Brockway, George – Psychometrika, 1979
A modified beta binomial model is presented for use in analyzing random guessing multiple choice tests and taste tests. Detection probabilities for each item are distributed beta across the population subjects. Properties for the observable distribution of correct responses are derived. Two concepts of true score estimates are presented.…
Descriptors: Bayesian Statistics, Guessing (Tests), Mathematical Models, Multiple Choice Tests
Peer reviewed: Gupta, J. K.; And Others – Journal of Experimental Education, 1989
A model was developed for estimating individuals' true changes on test scores using the additional information provided by an auxiliary variable correlated with a trait being measured. This provides an improved and more precise estimate when compared with simple difference scores. It involves the reliability of "X" scores only. (SLD)
Descriptors: Change, Equations (Mathematics), Estimation (Mathematics), Mathematical Models
Peer reviewed: Zwick, Rebecca; And Others – Journal of Educational Measurement, 1995
In a simulation study of ability and estimation of differential item functioning (DIF) in computerized adaptive tests, Rasch-based DIF statistics were highly correlated with generating DIF, but DIF statistics tended to be slightly smaller than in the three-parameter logistic model analyses. (SLD)
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Computer Simulation
Lyu, C. Felicia; And Others – 1995
A smoothed version of standardization, which merges kernel smoothing with the traditional standardization differential item functioning (DIF) approach, was used to examine DIF for student-produced response (SPR) items on the Scholastic Assessment Test (SAT) I mathematics test at both the item and testlet levels. This nonparametric technique avoids…
Descriptors: Aptitude Tests, Item Bias, Mathematics Tests, Multiple Choice Tests
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores
Peer reviewed: Lord, Frederic M. – Psychometrika, 1975
For the six available sets of empirical data, the discrimination (slope) parameter of the logistic item characteristic curve was found to have a significant positive correlation over items with the difficulty (location) parameter. This unpleasant situation can be eliminated by a suitably chosen transformation of the ability scale. (Author/RC)
Descriptors: Ability, Aptitude Tests, Correlation, Item Analysis
Wilcox, Rand R. – 1980
Concern about passing those examinees who should pass, and retaining those who need remedial work, is one problem related to criterion-referenced testing. This paper deals with one aspect of that problem. When determining how many items to include on a criterion-referenced test, practitioners must resolve various non-statistical issues before a…
Descriptors: Bayesian Statistics, Criterion Referenced Tests, Latent Trait Theory, Mathematical Models
Livingston, Samuel A. – 1970
The procedure of estimating true scores by means of a transformation of the obtained score based on the reliability coefficient is compared with the use of the obtained score without transformation. Using the mean squared error as a criterion, the transformed score is a better estimate for most examinees but poorer for those whose true scores lie…
Descriptors: Analysis of Variance, Measurement, Raw Scores, Scores
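The reliability-based transformation described in this abstract is commonly written as Kelley's regression estimate, which pulls the obtained score toward the group mean in proportion to the test's unreliability. The sketch below assumes that form; the abstract does not give the formula, and the function and parameter names are illustrative only.

```python
def kelley_true_score(observed: float, reliability: float, group_mean: float) -> float:
    """Estimate a true score by shrinking the obtained score toward the group mean.

    Assumed form (Kelley's regression estimate, not quoted from the abstract):
        T_hat = r * X + (1 - r) * mean
    where r is the reliability coefficient of the test.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return reliability * observed + (1.0 - reliability) * group_mean


# Hypothetical numbers: an obtained score of 70 on a test with
# reliability 0.8 and a group mean of 50 gives an estimate of about 66.
estimate = kelley_true_score(observed=70.0, reliability=0.8, group_mean=50.0)
print(round(estimate, 2))
```

With perfect reliability the estimate equals the obtained score; with zero reliability it collapses to the group mean.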
Pravalpruk, Kowit; Porter, Andrew C. – 1974
When random assignment has been accomplished and an analysis of covariance (ANCOVA) is being used to correct for initial differences among treatment groups, use of unreliable covariables not only decreases the power of ANCOVA, but also causes ANCOVA to test biased treatment effects. Several correction procedures have been suggested for the single…
Descriptors: Analysis of Covariance, Mathematical Models, Research Problems, Statistical Analysis
Doppelt, Jerome E. – Test Service Bulletin, 1956
The standard error of measurement as a means for estimating the margin of error that should be allowed for in test scores is discussed. The true score measures the performance that is characteristic of the person tested; the variations, plus and minus, around the true score describe a characteristic of the test. When the standard deviation is used…
Descriptors: Bulletins, Error of Measurement, Measurement Techniques, Reliability
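As a rough illustration of the margin of error discussed in this abstract, the standard error of measurement is conventionally computed from the score standard deviation and the reliability coefficient. The sketch below uses that textbook formula, which the abstract itself does not state; the function names and the sample numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Textbook formula (an assumption here): SEM = SD * sqrt(1 - reliability)."""
    if sd < 0.0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sd: float, reliability: float, z: float = 1.0):
    """Return (low, high): the obtained score plus and minus z SEMs."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical test: SD of 10 and reliability of 0.91 give an SEM of about 3,
# so a band of one SEM around an obtained score of 100 runs from about 97 to 103.
low, high = score_band(observed=100.0, sd=10.0, reliability=0.91)
print(round(low, 2), round(high, 2))
```

The band describes the test's precision, not the examinee: the same SEM applies to every score on the test under this simple model.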
Peer reviewed: Zimmerman, Donald W. – Educational and Psychological Measurement, 1976
Using the concepts of conditional probability, conditional expectation, and conditional independence, the main results of the classical test theory model can be derived in a very few steps with minimal assumptions. The present effort explores the possibility that present classical test theories can be further condensed. (Author/RC)
Descriptors: Career Development, Correlation, Mathematical Models, Measurement
Peer reviewed: Wilcox, Rand R.; Harris, Chester W. – Journal of Educational Measurement, 1977
Emrick's proposed method for determining a mastery level cut-off score is questioned. Emrick's method is shown to be useful only in limited situations. (JKS)
Descriptors: Correlation, Cutting Scores, Mastery Tests, Mathematical Models
Peer reviewed: Blok, H. – Journal of Educational Measurement, 1985
Raters judged essays on two occasions making it possible to address the question of whether multiple ratings, however obtained, represent the same true scores. Multiple ratings of a given rater did represent the same true scores, but ratings of different raters did not. Reliability, validity, and invalidity coefficients were computed. (Author/DWH)
Descriptors: Analysis of Variance, Elementary Education, Essay Tests, Evaluators


