Loyd, Brenda H. – Applied Measurement in Education, 1988 (peer reviewed)
The impact of item response theory (IRT) on the measurement practitioner is discussed, with a review of potential benefits. The complexity of IRT and its procedures, and the lack of robustness of those procedures to violations of assumptions, must be recognized for the measurement practitioner to realize its advantages. (SLD)
Descriptors: Educational Researchers, Evaluation Methods, Evaluators, Latent Trait Theory
Messick, Samuel – Educational Researcher, 1989 (peer reviewed)
Presents a unified concept of test validity that integrates both the scientific and ethical considerations of test interpretation and use. Argues that the appropriateness, meaningfulness, and usefulness of score-based inferences are inseparable, and that this integration is based on construct validity. (FMW)
Descriptors: Construct Validity, Ethics, Scores, Social Influences
Gupta, J. K.; And Others – Journal of Experimental Education, 1988 (peer reviewed)
This study analyzes how the validity of gain scores varies with the standard deviations of pretest and posttest scores and with the correlation between the two. Earlier findings that difference scores can have excellent predictive value under realistic testing conditions are supported. Conditions under which gain scores have optimum validity are specified.…
Descriptors: Educational Change, Equations (Mathematics), Measures (Individuals), Predictive Validity
Anderson, Timothy; Dixon, Wallace E., Jr. – Journal of Research on Adolescence, 1995 (peer reviewed)
Tested one-, two-, three-, and four-factor models within normal and psychiatric adolescent inpatient groups to confirm the factor structure for the Wechsler Intelligence Scale for Children-Revised (WISC-R). For both samples, the Kaufman three-factor solution had the best overall fit of the WISC-R subtest covariance structure. Other models were…
Descriptors: Adolescents, Factor Analysis, Institutionalized Persons, Intelligence Tests
Shohamy, Elana – Annual Review of Applied Linguistics, 1990 (peer reviewed)
Reviews studies and tests that show how discourse analysis has contributed to the theory, research, and development of language testing, covering the relations among discourse analysis and competence and testing theory; research on language tests and tasks; and task development. A 60-citation unannotated bibliography is included. (CB)
Descriptors: Communicative Competence (Languages), Discourse Analysis, Language Research, Language Tests
Grigorenko, Elena L.; Sternberg, Robert J.; Ehrman, Madeline E. – Modern Language Journal, 2000 (peer reviewed)
Presents a rationale, description, and partial construct validation of a new theory of foreign language aptitude: CANAL-F--Cognitive Ability for Novelty in Acquisition of Language (foreign). The theory was applied and implemented in a test of foreign language aptitude (CANAL-FT). Outlines the CANAL-F theory and details of its instrumentation…
Descriptors: Construct Validity, Language Aptitude, Language Tests, Second Language Instruction
Allen, Nancy L.; Holland, Paul W.; Thayer, Dorothy T. – Journal of Educational Measurement, 2005
Allowing students to choose the question(s) that they will answer from among several possible alternatives is often viewed as a mechanism for increasing fairness in certain types of assessments. The fairness of optional topic choice is not a universally accepted fact, however, and various studies have been done to assess this question. We examine…
Descriptors: Test Theory, Test Items, Student Evaluation, Evaluation Methods
Lucke, Joseph F. – Applied Psychological Measurement, 2005
Psychometric theory focuses primarily on tests that are homogeneous, measuring only one attribute of a psychosocial entity. However, the complexity of psychosocial behavior often requires tests that are heterogeneous, measuring more than one attribute. In this presentation, reliability and internal consistency are extended to heterogeneous tests…
Descriptors: Psychometrics, Item Response Theory, Test Reliability, Psychological Studies
Waldron, Chad H. – Online Submission, 2008
The research study examined whether the standardized reading achievement scores of an experimental group and a control group differed. The difference was used to measure the effect of systematic oral reading fluency instruction with repeated readings. Data from the 4Sight Pennsylvania Benchmark Reading Assessments…
Descriptors: Experimental Groups, Control Groups, Reading Fluency, Reading Achievement
Blanton, Hart; Jaccard, James – Psychological Review, 2006
Theories that posit multiplicative relationships between variables are common in psychology. A. G. Greenwald et al. recently presented a theory that explicated relationships between group identification, group attitudes, and self-esteem. Their theory posits a multiplicative relationship between concepts when predicting a criterion variable.…
Descriptors: Testing, Models, Psychology, Case Studies
Arnold, Margery E. – 1996
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Descriptors: Estimation (Mathematics), Generalizability Theory, Heuristics, Interrater Reliability
Ammeraal, Brenda – 1997
A study examined the correlation between students' placement test scores on a multiple-choice test and their passing rate on the Advanced Placement (AP) language exam. Statistics show that the number of students taking advanced placement tests is increasing, and a review of the literature supports the need for further research in the area of…
Descriptors: Advanced Placement, Catholic Schools, Correlation, English
Berger, Martijn P. F.; Veerkamp, Wim J. J. – 1994
The designing of tests has been a source of concern for test developers over the past decade. Various kinds of test forms have been applied. Among these are the fixed-form test, the adaptive test, and the testlet. Each of these forms has its own design. In this paper, the construction of test forms is placed within the general framework of optimal…
Descriptors: Adaptive Testing, Foreign Countries, Research Design, Selection
Mislevy, Robert J. – 1995
Educational test theory consists of statistical and methodological tools to support inferences about examinees' knowledge, skills, and accomplishments. The evolution of test theory has been shaped by the nature of users' inferences which, until recently, have been framed almost exclusively in terms of trait and behavioral psychology. Progress in…
Descriptors: Cognitive Psychology, Developmental Psychology, Educational Testing, Inferences
Schumacker, Randall E. – 1998
In comparing measurement theories, it is evident that the awareness of the concept of measurement error during the time of Galileo has led to the formulation of observed scores comprising a true score and error (classical theory), universe score and various random error components (generalizability theory), or individual latent ability and error…
Descriptors: Comparative Analysis, Computer Software, Error of Measurement, Generalizability Theory

