Publication Date
| In 2026 | 3 |
| Since 2025 | 666 |
| Since 2022 (last 5 years) | 3167 |
| Since 2017 (last 10 years) | 7408 |
| Since 2007 (last 20 years) | 15046 |
Descriptor
| Test Reliability | 15036 |
| Test Validity | 10272 |
| Reliability | 9759 |
| Foreign Countries | 7141 |
| Test Construction | 4823 |
| Validity | 4191 |
| Measures (Individuals) | 3877 |
| Factor Analysis | 3825 |
| Psychometrics | 3525 |
| Interrater Reliability | 3124 |
| Correlation | 3039 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1327 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 252 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Peer reviewed: Raju, Nambury S. – Educational and Psychological Measurement, 1982
A necessary and sufficient condition for a perfectly homogeneous test in the sense of Loevinger is stated and proved. Using this result, a formula for computing the maximum possible KR-20 when the test variance is assumed fixed is presented. A new index of test homogeneity is also presented and discussed. (Author/BW)
Descriptors: Mathematical Formulas, Mathematical Models, Multiple Choice Tests, Test Reliability
Peer reviewed: Monsen, Randall B. – American Annals of the Deaf, 1981
Data on the validity, reliability, and internal consistency of the test are presented to show that it can reliably predict the intelligibility of running speech. (Author)
Descriptors: Deafness, Speech Evaluation, Test Reliability, Test Use
Peer reviewed: Melzer, Charles W.; And Others – Educational and Psychological Measurement, 1981
The magnitude of statistical bias for the phi-coefficient was investigated, using computer simulated examinations in which all the students had equal knowledge. Several modifications of phi were tested, but when applied to real examinations, none succeeded in improving its reproducibility when items are re-used on equivalent student groups.…
Descriptors: Correlation, Item Analysis, Mathematical Models, Multiple Choice Tests
Peer reviewed: Gibbons, Jean D.; And Others – Psychometrika, 1979
On a multiple-choice test in which each item has k alternative responses, the test taker is permitted to choose any subset which he believes contains the one correct answer. A scoring system is devised. (Author/CTM)
Descriptors: Confidence Testing, Efficiency, Multiple Choice Tests, Scoring
Peer reviewed: Dilworth, Collett B., Jr.; Reising, Robert W. – Clearing House, 1979
Three types of validity are discussed: content, criterion, and construct. It is suggested that those who teach and evaluate student writing should be aware of the need for valid and reliable measures of evaluating and grading student compositions. (KC)
Descriptors: Educational Philosophy, Evaluation Criteria, Grading, Reliability
Peer reviewed: Nishisato, Shizuhiko; Sheu, Wen-Jenn – Psychometrika, 1980
A modification of the method of reciprocal averages for scaling multiple choice data is proposed. The proposed method handles the data in a piecewise fashion and allows for faster convergence to a solution. (Author/JKS)
Descriptors: Item Analysis, Measurement Techniques, Multiple Choice Tests, Test Reliability
Peer reviewed: ten Berge, Jos M. F.; Zegers, Frits E. – Psychometrika, 1978
Two lower bounds to reliability in classical test theory, Guttman's lambda and Cronbach's alpha, are shown to be terms of an infinite series of lower bounds. All terms of this series are equal to reliability if and only if the test contains items which are tau-equivalent. (Author/JKS)
Descriptors: Mathematical Formulas, Psychometrics, Technical Reports, Test Interpretation
Peer reviewed: Huynh, Huynh – Journal of Educational Statistics, 1981
Simulated data based on five test score distributions indicate that a slight modification of the asymptotic normal theory for the estimation of the p and kappa indices in mastery testing will provide results which are in close agreement with those based on small samples from the beta-binomial distribution. (Author/BW)
Descriptors: Error of Measurement, Mastery Tests, Mathematical Models, Test Reliability
Peer reviewed: Tollefson, Nona; Tracy, D. B. – Education, 1980
The reliability and validity of essay scoring were investigated by comparing the mean scores assigned to good and poor quality essay responses of different lengths. Long responses had a significantly higher mean than responses of short or moderate length. Good quality responses were graded significantly higher than poor quality responses. (Author)
Descriptors: Essay Tests, Grade 10, Grading, Reliability
Clarke, B. R.; And Others – B. C. Journal of Special Education, 1979
Data showed high reliability for all nine syntactic structures and total screen scores. Reliability remained high when results were examined for different hearing loss categories. There was a significant decrease in scores across these hearing loss categories but a marked increase in the discriminating power of the screens. (Author/CL)
Descriptors: Exceptional Child Research, Hearing Impairments, Screening Tests, Test Reliability
Trieber, J. Marshall – Training and Development Journal, 1980
Aims to help instructors make more valid test questions, particularly multiple-choice ones. Emphasis is placed on multiple-choice questions to show the wealth of opportunities they offer for testing because of their uses, objectivity, and ease of grading. Discusses test scheduling, construction, and evaluation and follow-up. (CT)
Descriptors: Multiple Choice Tests, Test Construction, Test Reliability, Test Validity
Peer reviewed: Silverstein, A. B. – Educational and Psychological Measurement, 1980
An alternative derivation was given of Gaylord's formulas showing the relationships among the average item intercorrelation, the average item-test correlation, and test reliability. Certain parallels were also noted in analysis of variance and principal component analysis. (Author)
Descriptors: Analysis of Variance, Item Analysis, Mathematical Formulas, Test Reliability
Peer reviewed: Conger, Anthony J. – Educational and Psychological Measurement, 1980
Reliability-maximizing weights are related to theoretically specified true score scaling weights to show a constant relationship that is invariant under separate linear transformations on each variable in the system. Test theoretic relations should be derived for the most general model available and not for unnecessarily constrained models.…
Descriptors: Mathematical Formulas, Scaling, Test Reliability, Test Theory
Peer reviewed: Kaiser, Henry F. – Educational and Psychological Measurement, 1980
The use of Bayes' estimates for proportions in the Law of Comparative Judgment is suggested to avoid sample proportions of zero and one. (Author)
Descriptors: Bayesian Statistics, Comparative Analysis, Reliability, Statistical Analysis
Peer reviewed: Divgi, D. R. – Applied Psychological Measurement, 1980
The dependence of reliability indices for mastery tests on mean and cutoff scores was examined in the case of three decision-theoretic indices. Dependence of kappa on mean and cutoff scores was opposite to that of the proportion of correct decisions, which was linearly related to average threshold loss. (Author/BW)
Descriptors: Classification, Cutting Scores, Mastery Tests, Test Reliability


