Powell, J. C. – 1980
A multi-faceted model for the selection of answers for multiple-choice tests was developed from the findings of a series of exploratory studies. This model implies that answer selection should be curvilinear. A series of models were tested for fit using the chi-square procedure. Data were collected from 359 elementary school students ages 9-12.…
Descriptors: Elementary Education, Foreign Countries, Goodness of Fit, Guessing (Tests)
Powell, J. C. – 1980
Current scoring practices for multiple-choice tests are rooted in early Associationist Theory and are based on a two-step procedure: (1) right answers are counted as ones and wrong answers as zeros, and (2) the number of right answers forms a total-correct score. The author contends that if either step is invalid, the use of the general linear model (GLM)…
Descriptors: Elementary Secondary Education, Higher Education, Logical Thinking, Multiple Choice Tests
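The two-step scoring procedure named in the abstract above can be illustrated with a minimal sketch. The function names, the answer key, and the responses are hypothetical and are not taken from Powell's paper; the sketch shows only the conventional number-right scoring that the abstract describes, not the author's proposed alternative.

```python
def score_items(responses, key):
    """Step 1: score each response as 1 if it matches the key, else 0."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return [1 if r == k else 0 for r, k in zip(responses, key)]


def total_correct(item_scores):
    """Step 2: sum the 0/1 item scores into a total-correct score."""
    return sum(item_scores)


# Hypothetical four-item test: the key and one examinee's answers.
key = ["B", "D", "A", "C"]
responses = ["B", "A", "A", "C"]

item_scores = score_items(responses, key)
print(item_scores, total_correct(item_scores))
```

Note that every wrong answer receives the same score of zero regardless of which distractor was chosen, which is the property the abstract calls into question.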
Engelhard, George, Jr. – 1980
The Rasch model is described as a latent trait model which meets the five criteria that characterize reasonable and objective measurements of an individual's ability independent of the test items used. The criteria are: (1) calibration of test items must be independent of particular norming groups; (2) measurement of individuals must be…
Descriptors: Achievement Tests, Difficulty Level, Elementary Secondary Education, Equated Scores
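As a rough illustration of the item-independence property mentioned in the abstract above, the following sketch uses the standard dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty (both in logits). The function names and the numeric values are assumptions chosen for illustration and do not come from Engelhard's paper.

```python
import math


def rasch_probability(ability, difficulty):
    """P(correct) under the dichotomous Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p):
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))


# Two hypothetical examinees and three hypothetical item difficulties.
ability_a, ability_b = 1.0, 0.5
for difficulty in (-1.0, 0.0, 2.0):
    gap = (log_odds(rasch_probability(ability_a, difficulty))
           - log_odds(rasch_probability(ability_b, difficulty)))
    # The log-odds gap between the two examinees equals their ability
    # difference (0.5 logits) whichever item is used.
    print(difficulty, round(gap, 6))
```

Under these assumptions the comparison between the two examinees does not depend on which item is administered, which is one way to read the "independent of the test items used" claim.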
Bernknopf, Stanley; Bashaw, W. L. – 1976
The present study was designed to examine whether or not traditional procedures concerning item selection and reliability are both applicable and appropriate for criterion-referenced (CR) tests. It was also designed to examine traditional procedures and those designed especially for CR testing in relation to test variance and item homogeneity.…
Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Item Analysis
Ryan, Joseph P.; Hamm, Debra W. – 1976
A procedure is described for increasing the reliability of tests after they have been given and for developing shorter but more reliable tests. Eight tests administered to 200 graduate students studying educational research are analyzed. The analysis considers the original tests, the items loading on the first factor of the test, and the items…
Descriptors: Career Development, Factor Analysis, Factor Structure, Item Analysis
[Peer reviewed] Blanchard, Jay; Johns, Jerry – Reading Psychology, 1986
Concludes that IRIs can be useful, flexible assessment and instruction tools in the hands of knowledgeable teachers. Offers suggestions for their use. (FL)
Descriptors: Elementary Secondary Education, Evaluation Criteria, Informal Reading Inventories, Reading Diagnosis
[Peer reviewed] Jansen, Margo G. H. – Journal of Educational Statistics, 1986
In this paper a Bayesian procedure is developed for the simultaneous estimation of the reading ability and difficulty parameters which are assumed to be factors in reading errors by the multiplicative Poisson Model. According to several criteria, the Bayesian estimates are better than comparable maximum likelihood estimates. (Author/JAZ)
Descriptors: Achievement Tests, Bayesian Statistics, Comparative Analysis, Difficulty Level
Choppin, Bruce – Evaluation in Education: An International Review Series, 1985
Using the analogy of temperature measurement, the Rasch model is presented with arguments for its adoption as the basic scaling technique for achievement measures. Three extensions of the Rasch model for more complex testing are developed. Test development for the British national assessment program and the promise of item banking are also…
Descriptors: Academic Achievement, Achievement Tests, Educational Assessment, Item Banks
Choppin, Bruce – Evaluation in Education: An International Review Series, 1985
During 1969 the International Association for the Evaluation of Educational Achievement began a series of cross-cultural studies to investigate the workings of multiple-choice achievement tests and student guessing behaviors. Empirical models to correct for guessing are discussed in terms of test item difficulty, number of response choices,…
Descriptors: Achievement Tests, Cross Cultural Studies, Educational Testing, Guessing (Tests)
[Peer reviewed] Leary, Linda F.; Dorans, Neil J. – Review of Educational Research, 1985
Research on the potential effects of different item arrangement schemes on item statistics is reviewed for three separate periods. Earliest studies investigated the simple main effect of item order on test performance. The late 1960s emphasized interactions between item order and examinees' characteristics. Current concern focuses on item…
Descriptors: Achievement Tests, Aptitude Tests, Item Analysis, Latent Trait Theory
[Peer reviewed] Tyler, Ralph W. – International Journal of Educational Research, 1986
This monograph examines the shortcomings of conceptualization and practice in the evaluation of student learning and the uses of evaluation for teachers, principals, district supervisors, and state and national authorities. A new paradigm is proposed for assessing educational potential, effective learning, educational achievements of large…
Descriptors: Economic Change, Educational Assessment, Educational Change, Educational Opportunities
[Peer reviewed] Bhaskar, R.; Dillard, Jesse F. – Instructional Science, 1983
Description of an objective method for assigning weights to questions on examinations includes discussions of classical test theory, knowledge organization, and how task analysis can be used to identify knowledge elements required to solve specific problems, rank them, and assign objective weights to exam questions using a Pareto distribution (7…
Descriptors: Accounting, Epistemology, Evaluation Methods, Item Analysis
[Peer reviewed] Hale, Gordon A.; And Others – Modern Language Journal, 1984
Provides a bibliography of published research papers that either describe the history of the TOEFL, offer a critical review of the test, or interpret TOEFL research findings. Some topics include: the correlation of TOEFL with other standardized tests of English language proficiency, TOEFL's role as a predictor of academic performance, the…
Descriptors: Cultural Awareness, English (Second Language), Ethnic Groups, Language Proficiency
[Peer reviewed] Boehm, Ann E. – Teachers College Record, 1973
The 1970s have become the time for reevaluating the intended purposes of tests, and for considering meaningful alternatives to the test score. The author considers the criterion-referenced versus the norm-referenced test. (Author/RK)
Descriptors: Behavioral Objectives, Criterion Referenced Tests, Educational Objectives, Evaluative Thinking
[Peer reviewed] Airasian, Peter W. – International Journal of Educational Research, 1997
This issue presents examinations of educational testing, large-scale alternative assessment, small-scale alternative assessment, and educational measurement. These discussions go beyond technical issues to provide a conceptual perspective and a view of underlying histories, theories, applications, and the uncertainties associated with these…
Descriptors: Alternative Assessment, Educational Assessment, Educational Change, Educational Testing


