Johnson, D. G. – Journal of Visual Impairment and Blindness, 1989 (peer reviewed)
Preliminary evaluation of a testing technique which might meet the need for a standardized, validated, and objective means of psychologically testing people with visual or reading impairments is reported. The test is intended to be administered via an audiocassette, with identical administration and response procedures for totally and partially…
Descriptors: Audiotape Cassettes, Blindness, Psychological Testing, Psychometrics
Samejima, Fumiko – Applied Psychological Measurement, 1994 (peer reviewed)
The reliability coefficient is predicted from the test information function (TIF) or two modified TIF formulas and a specific trait distribution. Examples illustrate the variability of the reliability coefficient across different trait distributions, and results are compared with empirical reliability coefficients. (SLD)
Descriptors: Adaptive Testing, Error of Measurement, Estimation (Mathematics), Reliability
Campbell, N. Jo – Clearing House, 1994 (peer reviewed)
Discusses the exact meaning and limitations of commonly used types of standardized test scores: grade equivalent scores, percentile ranks, and standard scores. (SR)
Descriptors: Elementary Secondary Education, Grade Equivalent Scores, Standardized Tests, Test Results
Jones, W. Paul – Educational and Psychological Measurement, 1991 (peer reviewed)
A Bayesian alternative to interpretations based on classical reliability theory is presented. Procedures are detailed for calculation of a posterior score and credible interval with joint consideration of item sample and occasion error. (Author/SLD)
Descriptors: Bayesian Statistics, Equations (Mathematics), Mathematical Models, Statistical Inference
Vongumivitch, Viphavee; Carr, Nathan – Issues in Applied Linguistics, 2001 (peer reviewed)
Includes an interview with a noted figure in the field of language assessment. Discusses his work on washback theory as well as his experiences with and views on the challenges and advantages of computer-based and Web-based testing. (Author/VWL)
Descriptors: Computer Assisted Testing, Interviews, Language Tests, Test Theory
Zimmerman, Donald W.; Williams, Richard H.; Zumbo, Bruno D.; Ross, Donald – International Journal of Testing, 2005
This article focuses on Louis Guttman's contributions to the classical theory of educational and psychological tests, one of the lesser known of his many contributions to quantitative methods in the social sciences. Guttman's work in this field provided a rigorous mathematical basis for ideas that, for many decades after Spearman's initial work,…
Descriptors: Evaluation Methods, Test Theory, Social Sciences, Psychological Testing
Raju, Nambury S.; Oshima, T.C. – Educational and Psychological Measurement, 2005
Two new prophecy formulas for estimating item response theory (IRT)-based reliability of a shortened or lengthened test are proposed. Some of the relationships between the two formulas, one of which is identical to the well-known Spearman-Brown prophecy formula, are examined and illustrated. The major assumptions underlying these formulas are…
Descriptors: Item Response Theory, Test Reliability, Evaluation Methods, Computation
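For orientation, the classical Spearman-Brown prophecy formula mentioned in this abstract can be sketched as follows. This is only the well-known classical formula, not the two IRT-based formulas proposed in the article (the truncated abstract does not give them); the function name `spearman_brown` and its parameters are illustrative assumptions.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Classical Spearman-Brown prediction of reliability after a length change.

    reliability   -- reliability of the current test (between 0 and 1)
    length_factor -- new length divided by old length (2.0 doubles the test,
                     0.5 halves it); hypothetical parameter name
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    # rho_k = k * rho / (1 + (k - 1) * rho)
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


if __name__ == "__main__":
    # Doubling a test whose reliability is 0.80
    print(round(spearman_brown(0.80, 2.0), 4))
    # Halving the same test
    print(round(spearman_brown(0.80, 0.5), 4))
```

The formula assumes the added or removed items are parallel to the existing ones, which is part of why the assumptions behind any prophecy formula matter.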
Biswas, Ajoy Kumar – Applied Psychological Measurement, 2006
This article studies the ordinal reliability of (total) test scores. This study is based on a classical-type linear model of observed score (X), true score (T), and random error (E). Based on the idea of Kendall's tau-a coefficient, a measure of ordinal reliability for small-examinee populations is developed. This measure is extended to large…
Descriptors: True Scores, Test Theory, Test Reliability, Scores
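As background for this abstract, the standard Kendall's tau-a coefficient between two sets of scores can be sketched as below. This is the textbook coefficient only, not the ordinal reliability measure the article develops from it; the function name `kendall_tau_a` and the sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant, but they
    still contribute to the denominator n * (n - 1) / 2.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical scores of four examinees on two administrations
    first = [10, 12, 15, 18]
    second = [11, 16, 14, 19]
    print(round(kendall_tau_a(first, second), 4))
```

Because tau-a depends only on the ordering of examinees, it is a natural starting point for a reliability notion that concerns rank order rather than score magnitude.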
Reeve, Charlie L.; Lam, Holly – Intelligence, 2005
The simple practice effects commonly observed when retaking general cognitive ability tests present a potential paradox. If observed score changes reflect real changes in g, we must revisit our understanding of its stability. Conversely, if observed score changes reflect something other than a true change in the underlying latent construct, this…
Descriptors: Psychometrics, Cognitive Ability, Cognitive Measurement, Test Theory
Borsboom, Denny – Psychometrika, 2006
This paper analyzes the theoretical, pragmatic, and substantive factors that have hampered the integration between psychology and psychometrics. Theoretical factors include the operationalist mode of thinking which is common throughout psychology, the dominance of classical test theory, and the use of "construct validity" as a catch-all category…
Descriptors: Psychometrics, Psychology, Test Theory, Construct Validity
van der Linden, Wim J.; Sotaridona, Leonardo – Journal of Educational and Behavioral Statistics, 2006
A statistical test for detecting answer copying on multiple-choice items is presented. The test is based on the exact null distribution of the number of random matches between two test takers under the assumption that the response process follows a known response model. The null distribution can easily be generalized to the family of distributions…
Descriptors: Test Items, Multiple Choice Tests, Cheating, Responses
Reid, Christine A.; Kolakowsky-Hayner, Stephanie A.; Lewis, Allen N.; Armstrong, Amy J. – Rehabilitation Counseling Bulletin, 2007
Item response theory (IRT) methodology is introduced as a tool for improving assessment instruments used with people who have disabilities. Need for this approach in rehabilitation is emphasized; differences between IRT and classical test theory are clarified. Concepts essential to understanding IRT are defined, necessary data assumptions are…
Descriptors: Psychometrics, Methods, Item Response Theory, Aptitude Tests
van der Linden, Wim J.; Breithaupt, Krista; Chuah, Siang Chee; Zhang, Yanwei – Journal of Educational Measurement, 2007
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed…
Descriptors: Adaptive Testing, Evaluation Methods, Test Items, Reaction Time
Mapuranga, Raymond; Dorans, Neil J.; Middleton, Kyndra – ETS Research Report Series, 2008
In many practical settings, essentially the same differential item functioning (DIF) procedures have been in use since the late 1980s. Since then, examinee populations have become more heterogeneous, and tests have included more polytomously scored items. This paper summarizes and classifies new DIF methods and procedures that have appeared since…
Descriptors: Test Bias, Educational Development, Evaluation Methods, Statistical Analysis
Corkum, Penny; Andreou, Pantelis; Schachar, Russell; Tannock, Rosemary; Cunningham, Charles – Educational and Psychological Measurement, 2007
With increasing interest in studies evaluating treatment outcome in children with attention deficit hyperactivity disorder (ADHD), there is a need for treatment-sensitive instruments that are feasible, yield valid and reliable scores, and measure outcome in a "time-locked" and "situation- and symptom-specific" manner. These instruments are needed…
Descriptors: Attention Deficit Disorders, Children, Evaluation Methods, Generalizability Theory

