Peer reviewed: Gupta, J. K.; And Others – Journal of Experimental Education, 1989
A model was developed for estimating individuals' true changes on test scores using the additional information provided by an auxiliary variable correlated with a trait being measured. This provides an improved and more precise estimate when compared with simple difference scores. It involves the reliability of "X" scores only. (SLD)
Descriptors: Change, Equations (Mathematics), Estimation (Mathematics), Mathematical Models
Peer reviewed: Zwick, Rebecca; And Others – Journal of Educational Measurement, 1995
In a simulation study of ability and estimation of differential item functioning (DIF) in computerized adaptive tests, Rasch-based DIF statistics were highly correlated with generating DIF, but DIF statistics tended to be slightly smaller than in the three-parameter logistic model analyses. (SLD)
Descriptors: Ability, Adaptive Testing, Computer Assisted Testing, Computer Simulation
Hartig, Johannes; Holzel, Britta; Moosbrugger, Helfried – Multivariate Behavioral Research, 2007
Numerous studies have shown increasing item reliabilities as an effect of the item position in personality scales. Traditionally, these context effects are analyzed based on item-total correlations. This approach neglects that trends in item reliabilities can be caused either by an increase in true score variance or by a decrease in error…
Descriptors: True Scores, Error of Measurement, Structural Equation Models, Simulation
Lyu, C. Felicia; And Others – 1995
A smoothed version of standardization, which merges kernel smoothing with the traditional standardization differential item functioning (DIF) approach, was used to examine DIF for student-produced response (SPR) items on the Scholastic Assessment Test (SAT) I mathematics test at both the item and testlet levels. This nonparametric technique avoids…
Descriptors: Aptitude Tests, Item Bias, Mathematics Tests, Multiple Choice Tests
Wingersky, Marilyn S. – 1989
In a variable-length adaptive test with a stopping rule that relied on the asymptotic standard error of measurement of the examinee's estimated true score, M. S. Stocking (1987) discovered that it was sufficient to know the examinee's true score and the number of items administered to predict with some accuracy whether an examinee's true score was…
Descriptors: Adaptive Testing, Bayesian Statistics, Error of Measurement, Estimation (Mathematics)
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores
Peer reviewed: Lord, Frederic M. – Psychometrika, 1975
For the six available sets of empirical data, the discrimination (slope) parameter of the logistic item characteristic curve was found to have a significant positive correlation over items with the difficulty (location) parameter. This unpleasant situation can be eliminated by a suitably chosen transformation of the ability scale. (Author/RC)
Descriptors: Ability, Aptitude Tests, Correlation, Item Analysis
Wilcox, Rand R. – 1980
Concern about passing those examinees who should pass, and retaining those who need remedial work, is one problem related to criterion-referenced testing. This paper deals with one aspect of that problem. When determining how many items to include on a criterion-referenced test, practitioners must resolve various non-statistical issues before a…
Descriptors: Bayesian Statistics, Criterion Referenced Tests, Latent Trait Theory, Mathematical Models
Livingston, Samuel A. – 1970
The procedure of estimating true scores by means of a transformation of the obtained score based on the reliability coefficient is compared with the use of the obtained score without transformation. Using the mean squared error as a criterion, the transformed score is a better estimate for most examinees but poorer for those whose true scores lie…
Descriptors: Analysis of Variance, Measurement, Raw Scores, Scores
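The abstract above does not name the transformation it studies. As a minimal sketch, assuming it is the classical regressed (Kelley-type) estimate, the obtained score is pulled toward the group mean in proportion to the reliability coefficient. The function name, argument names, and numbers below are hypothetical illustrations, not values from the paper.

```python
def regressed_true_score(observed, group_mean, reliability):
    """Estimate a true score by shrinking the obtained score toward the group mean.

    Assumption: this is the textbook Kelley-type estimate,
        T_hat = mean + reliability * (observed - mean),
    which may or may not be the exact transformation the paper evaluates.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical example: a score 20 points above the mean, reliability 0.8.
estimate = regressed_true_score(observed=70.0, group_mean=50.0, reliability=0.8)

# With perfect reliability the obtained score is returned unchanged;
# with zero reliability every examinee is assigned the group mean.
unchanged = regressed_true_score(observed=70.0, group_mean=50.0, reliability=1.0)
all_mean = regressed_true_score(observed=70.0, group_mean=50.0, reliability=0.0)
```

Under this assumption the shrinkage grows with distance from the mean, so the adjustment is largest for examinees with extreme obtained scores.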
Pravalpruk, Kowit; Porter, Andrew C. – 1974
When random assignment has been accomplished and an analysis of covariance (ANCOVA) is being used to correct for initial differences among treatment groups, use of unreliable covariables not only decreases the power of ANCOVA, but also causes ANCOVA to test biased treatment effects. Several correction procedures have been suggested for the single…
Descriptors: Analysis of Covariance, Mathematical Models, Research Problems, Statistical Analysis
Doppelt, Jerome E. – Test Service Bulletin, 1956
The standard error of measurement as a means for estimating the margin of error that should be allowed for in test scores is discussed. The true score measures the performance that is characteristic of the person tested; the variations, plus and minus, around the true score describe a characteristic of the test. When the standard deviation is used…
Descriptors: Bulletins, Error of Measurement, Measurement Techniques, Reliability
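The abstract above is truncated before it states a formula. As a rough sketch, assuming the usual classical test theory definition, the standard error of measurement is the score standard deviation multiplied by the square root of one minus the reliability. The names and numbers below are hypothetical and are not taken from the bulletin.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Return SD * sqrt(1 - reliability), the classical SEM.

    Assumption: this is the standard textbook definition; the truncated
    abstract does not confirm the exact formula the bulletin uses.
    """
    if score_sd < 0:
        raise ValueError("score_sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical test with SD = 10 and reliability = 0.91.
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.91)

# A band of one SEM on either side of an obtained score of 100.
observed = 100.0
band = (observed - sem, observed + sem)
```

The band is only as wide as the test is unreliable: at a reliability of 1.0 it collapses to the obtained score itself.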
Peer reviewed: Zimmerman, Donald W. – Educational and Psychological Measurement, 1976
Using the concepts of conditional probability, conditional expectation, and conditional independence, the main results of the classical test theory model can be derived in a very few steps with minimal assumptions. The present effort explores the possibility that present classical test theories can be further condensed. (Author/RC)
Descriptors: Career Development, Correlation, Mathematical Models, Measurement
Peer reviewed: Wilcox, Rand R.; Harris, Chester W. – Journal of Educational Measurement, 1977
Emrick's proposed method for determining a mastery level cut-off score is questioned. Emrick's method is shown to be useful only in limited situations. (JKS)
Descriptors: Correlation, Cutting Scores, Mastery Tests, Mathematical Models
Peer reviewed: Blok, H. – Journal of Educational Measurement, 1985
Raters judged essays on two occasions, making it possible to address the question of whether multiple ratings, however obtained, represent the same true scores. Multiple ratings by a given rater did represent the same true scores, but ratings by different raters did not. Reliability, validity, and invalidity coefficients were computed. (Author/DWH)
Descriptors: Analysis of Variance, Elementary Education, Essay Tests, Evaluators
Peer reviewed: Andersen, Erling B. – Psychometrika, 1985
A model for longitudinal latent structure analysis was proposed that combined the values of a latent variable at two time points in a two-dimensional latent density. The correlation coefficient between the two values of the latent variable can then be estimated. (NSF)
Descriptors: Correlation, Latent Trait Theory, Mathematical Models, Maximum Likelihood Statistics

