Peer reviewed: Johnson, Richard W. – Journal of Vocational Behavior, 1974
The content of each of the Occupational and Nonoccupational scales on the Strong Vocational Interest Blank for Women was described in terms of the categories used for the Basic Interest scales. Several shortcomings of the SVIB-W were noted. (Author)
Descriptors: Females, Item Analysis, Occupational Aspiration, Test Construction
Peer reviewed: Huynh, Huynh – Psychometrika, 1978
The use of Cohen's kappa index as a measure of the reliability of multiple classifications is developed. Special cases of the index as well as the effects of test length on the index are also explored. (JKS)
Descriptors: Career Development, Classification, Mastery Tests, Test Length
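As a point of reference for the entry above, Cohen's kappa for two classifications of the same examinees can be sketched as follows. This is a minimal illustration of the general index only, not the estimation procedure developed in the article; the function name `cohen_kappa` and the 2x2 table of counts are hypothetical and chosen for illustration.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square agreement table of counts.

    table[i][j] = number of examinees placed in category i by the first
    classification and in category j by the second.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: proportion of examinees on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: 100 examinees classified as master / non-master
# on two administrations of a mastery test.
table = [[40, 10],
         [5, 45]]
kappa = cohen_kappa(table)
print(round(kappa, 3))
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is (0.85 - 0.50) / (1 - 0.50).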
Peer reviewed: Dorans, Neil J.; Kulick, Edward – Journal of Educational Measurement, 1986
The standardization method for assessing unexpected differential item performance or differential item functioning is introduced. Findings of five studies are summarized, in which the statistical method of standardization is used to look for unexpected differences in item performance across different subpopulations of the Scholastic Aptitude Test.…
Descriptors: Groups, Item Analysis, Sociometric Techniques, Standardized Tests
Peer reviewed: Lewis, Charles – Psychometrika, 1986
On the occasion of Psychometrika's fiftieth anniversary, the past twenty-five years' developments in mental test theory are reviewed. Psychometrika articles treating topics in test theory are listed in a bibliography. (Author/LMO)
Descriptors: Cognitive Measurement, Mathematical Models, Psychological Testing, Psychometrics
Peer reviewed: Davies, Alan – System, 1985
Evaluates John Oller's contribution to a theory of language testing, particularly his provision of detailed empirical work. Argues that Oller's work is of sufficient importance for serious flaws to be noted. Three flaws are discussed in the areas of communication, pragmatics, authenticity, and the unifactorial/one best test approach. (Author/SED)
Descriptors: Communicative Competence (Languages), Evaluation, Language Tests, Second Language Learning
Peer reviewed: Chambers, William V. – Social Behavior and Personality, 1985
Personal construct psychologists have suggested that various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)
Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability
Peer reviewed: Wilcox, Rand R. – Educational and Psychological Measurement, 1983
This article provides unbiased estimates of the proportion of items in an item domain that an examinee would answer correctly if every item were attempted, when a closed sequential testing procedure is used. (Author)
Descriptors: Estimation (Mathematics), Psychometrics, Scores, Sequential Approach
Peer reviewed: Attig, John – Social Studies Review, 1984
A study of several versions of the Educational Testing Service's Achievement Test in American History and Social Studies casts doubt upon the claim that the test accurately assesses historical knowledge. Suggestions for improving the Achievement Test are made. (RM)
Descriptors: Achievement Tests, Secondary Education, Test Theory, Test Validity
Chang, Shun-Wen; Hanson, Bradley A.; Harris, Deborah J. – 2001
The requirement of large sample sizes for calibrating items based on item response theory (IRT) models is not easily met in many practical pretesting situations. Although classical item statistics could be estimated with much smaller samples, the values may not be comparable across different groups of examinees. This study extended the authors'…
Descriptors: Item Response Theory, Pretests Posttests, Sample Size, Test Items
Hwang, Dae-Yeop – 2002
This study compared classical test theory (CTT) and item response theory (IRT). The behavior of the item and person statistics derived from these two measurement frameworks was examined analytically and empirically using a data set obtained from BILOG (R. Mislevy and D. Bock, 1997). The example was a 15-item test with a sample size of 600…
Descriptors: Comparative Analysis, Measurement Techniques, Scores, Statistical Distributions
Rogosa, David – 2000
In the reporting of individual student results from standardized tests in educational assessments, the percentile rank of the individual student is a major numerical indicator. This paper develops a formulation and presents calculations to examine the accuracy of the individual percentile rank score. Here, accuracy follows the common-sense…
Descriptors: Comparative Analysis, Elementary Secondary Education, Standardized Tests, Test Results
Hoffman, R. Gene; Wise, Lauress L. – 2000
Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is…
Descriptors: Achievement, Classification, Observation, Probability
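The true-score definition in the entry above can be written out in the standard notation of classical test theory. The symbols below (X for an observed score, T for the true score, E for the error, i for the examinee, j for a parallel test) are the conventional ones and are an assumption here, not necessarily the notation used in the paper.

```latex
% True score of examinee i: expected observed score over repeated parallel tests j
T_i = \mathbb{E}_j\left[ X_{ij} \right]

% A single observed score decomposes into true score plus error
X_{ij} = T_i + E_{ij}, \qquad \mathbb{E}_j\left[ E_{ij} \right] = 0
```

The second line follows directly from the first: because T is defined as the expectation of X, the difference between a single observed score and the true score has an expected value of zero.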
Peer reviewed: Rindskopf, David – Psychometrika, 1983
Various models have been proposed for analyzing dichotomous test or questionnaire items which were constructed to reflect an assumed underlying structure (e.g., hierarchical). This paper shows that many such models are special cases of latent class analysis and discusses a currently available computer program to analyze them. (Author/JKS)
Descriptors: Computer Programs, Item Analysis, Mathematical Models, Measurement Techniques
Peer reviewed: Frijters, J. E. R. – Psychometrika, 1981
The Triangular Constant Method was designed for the measurement of discriminability between sensory stimuli. Its original model assumes a steady excitatory detection state. The purpose of this paper is to elaborate on the consequences of assuming a variable excitatory state and to formulate the concomitant model. (Author)
Descriptors: Data Analysis, Mathematical Models, Measurement Techniques, Perception
Peer reviewed: Wilcox, Rand R. – Journal of Educational Statistics, 1981
Both the binomial and beta-binomial models are applied to various problems occurring in mental test theory. The paper reviews and critiques these models. The emphasis is on the extensions of the models that have been proposed in recent years, and that might not be familiar to many educators. (Author)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Test Reliability


