[Peer reviewed] Williams, Richard H.; Zimmerman, Donald W. – Journal of Experimental Education, 1982
A mathematical link between test reliability and test validity is derived, taking into account the correlation between error scores on a test and error scores on a criterion measure. When this correlation is positive, the "paradoxical" nonmonotonic relation between test reliability and test validity occurs universally. (Author/BW)
Descriptors: Correlation, Error of Measurement, Mathematical Models, Test Reliability
Burton, Robert S. – New Directions for Testing and Measurement, 1980
Although Model A, the only norm-referenced evaluation procedure in the Title I Evaluation and Reporting System, requires no data other than the test scores themselves, it introduces two sources of bias and involves three test administrations. Roberts' two-test procedure offers the advantages of less bias and less testing. (RL)
Descriptors: Comparative Analysis, Mathematical Formulas, Scores, Statistical Bias
[Peer reviewed] Kraemer, Helena Chmura – Psychometrika, 1981
Limitations and extensions of Feldt's approach to testing the equality of Cronbach's alpha coefficients in independent and matched samples are discussed. In particular, this approach is used to test equality of intraclass correlation coefficients. (Author)
Descriptors: Analysis of Variance, Correlation, Hypothesis Testing, Mathematical Models
[Peer reviewed] Holland, Paul W. – Psychometrika, 1981
Deciding whether sets of test data are consistent with any of a large class of item response models is considered. The assumption of local independence is weakened to a new condition, local nonnegative dependence (LND). Necessary and sufficient conditions are derived for an LND item response model. (Author/JKS)
Descriptors: Item Analysis, Latent Trait Theory, Mathematical Models, Psychometrics
[Peer reviewed] Bentler, P. M.; Woodward, Arthur J. – Psychometrika, 1980
A chain of lower bound inequalities leading to the greatest lower bound to reliability is established for the internal consistency of a composite of unit-weighted scores (such as a test). Algorithms for obtaining various reliability coefficients are presented. (Author/JKS)
Descriptors: Factor Analysis, Item Analysis, Measurement Techniques, Test Construction
[Peer reviewed] Zdenek, Joseph W. – Foreign Language Annals, 1980
In spite of new methodologies in foreign language instruction, much testing is still of the traditional type. Paper-and-pencil tests are given, testing in exactly the same way the teachers themselves were tested. This article suggests 25 points for language teachers at all levels. (Author/PJM)
Descriptors: Second Language Instruction, Teaching Methods, Test Construction, Test Theory
[Peer reviewed] Johnson, D. G. – Journal of Visual Impairment and Blindness, 1989
Preliminary evaluation of a testing technique which might meet the need for a standardized, validated, and objective means of psychologically testing people with visual or reading impairments is reported. The test is intended to be administered via an audiocassette, with identical administration and response procedures for totally and partially…
Descriptors: Audiotape Cassettes, Blindness, Psychological Testing, Psychometrics
[Peer reviewed] Samejima, Fumiko – Applied Psychological Measurement, 1994
The reliability coefficient is predicted from the test information function (TIF) or two modified TIF formulas and a specific trait distribution. Examples illustrate the variability of the reliability coefficient across different trait distributions, and results are compared with empirical reliability coefficients. (SLD)
Descriptors: Adaptive Testing, Error of Measurement, Estimation (Mathematics), Reliability
[Peer reviewed] Campbell, N. Jo – Clearing House, 1994
Discusses the exact meaning and limitations of commonly used types of standardized test scores: grade equivalent scores, percentile ranks, and standard scores. (SR)
Descriptors: Elementary Secondary Education, Grade Equivalent Scores, Standardized Tests, Test Results
[Peer reviewed] Jones, W. Paul – Educational and Psychological Measurement, 1991
A Bayesian alternative to interpretations based on classical reliability theory is presented. Procedures are detailed for calculation of a posterior score and credible interval with joint consideration of item sample and occasion error. (Author/SLD)
Descriptors: Bayesian Statistics, Equations (Mathematics), Mathematical Models, Statistical Inference
[Peer reviewed] Vongumivitch, Viphavee; Carr, Nathan – Issues in Applied Linguistics, 2001
Includes an interview with a noted figure in the field of language assessment. Discusses his work on washback theory as well as his experiences with and views on the challenges and advantages of computer-based and Web-based testing. (Author/VWL)
Descriptors: Computer Assisted Testing, Interviews, Language Tests, Test Theory
Zimmerman, Donald W.; Williams, Richard H.; Zumbo, Bruno D.; Ross, Donald – International Journal of Testing, 2005
This article focuses on Louis Guttman's contributions to the classical theory of educational and psychological tests, one of the lesser known of his many contributions to quantitative methods in the social sciences. Guttman's work in this field provided a rigorous mathematical basis for ideas that, for many decades after Spearman's initial work,…
Descriptors: Evaluation Methods, Test Theory, Social Sciences, Psychological Testing
Raju, Nambury S.; Oshima, T.C. – Educational and Psychological Measurement, 2005
Two new prophecy formulas for estimating item response theory (IRT)-based reliability of a shortened or lengthened test are proposed. Some of the relationships between the two formulas, one of which is identical to the well-known Spearman-Brown prophecy formula, are examined and illustrated. The major assumptions underlying these formulas are…
Descriptors: Item Response Theory, Test Reliability, Evaluation Methods, Computation
Reeve, Charlie L.; Lam, Holly – Intelligence, 2005
The simple practice effects commonly observed when retaking general cognitive ability tests present a potential paradox. If observed score changes reflect real changes in g, we must revisit our understanding of its stability. Conversely, if observed score changes reflect something other than a true change in the underlying latent construct, this…
Descriptors: Psychometrics, Cognitive Ability, Cognitive Measurement, Test Theory
Corkum, Penny; Andreou, Pantelis; Schachar, Russell; Tannock, Rosemary; Cunningham, Charles – Educational and Psychological Measurement, 2007
With increasing interest in studies evaluating treatment outcome in children with attention deficit hyperactivity disorder (ADHD), there is a need for treatment-sensitive instruments that are feasible, yield valid and reliable scores, and measure outcome in a "time-locked" and "situation- and symptom-specific" manner. These instruments are needed…
Descriptors: Attention Deficit Disorders, Children, Evaluation Methods, Generalizability Theory

