Publication Date
| Period | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 8 |
| Since 2022 (last 5 years) | 36 |
| Since 2017 (last 10 years) | 115 |
| Since 2007 (last 20 years) | 378 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Test Theory | 1166 |
| Test Items | 262 |
| Test Reliability | 252 |
| Test Construction | 246 |
| Test Validity | 245 |
| Psychometrics | 183 |
| Scores | 176 |
| Item Response Theory | 168 |
| Foreign Countries | 160 |
| Item Analysis | 141 |
| Statistical Analysis | 134 |
Location
| Location | Records |
| --- | --- |
| United States | 17 |
| United Kingdom (England) | 15 |
| Canada | 14 |
| Australia | 13 |
| Turkey | 12 |
| Sweden | 8 |
| United Kingdom | 8 |
| Netherlands | 7 |
| Texas | 7 |
| New York | 6 |
| Taiwan | 6 |
Laws, Policies, & Programs
| Law / Program | Records |
| --- | --- |
| No Child Left Behind Act 2001 | 4 |
| Elementary and Secondary… | 3 |
| Individuals with Disabilities… | 3 |
DeWine, Sue; And Others – 1985
Based on a review of the literature, this paper examines criticisms leveled against the communication audit developed by the International Communication Association (ICA) and then offers a modified version of the audit designed to meet those criticisms. Following a brief introduction, the first section of the paper reviews criticisms of the audit,…
Descriptors: Communication Research, Organizational Communication, Research Methodology, Speech Communication
Kane, Michael; Wilson, Jennifer – 1982
This paper evaluates the magnitude of the total error in estimates of the difference between an examinee's domain score and the cutoff score. An observed score based on a random sample of items from the domain, and an estimated cutoff score derived from a judgmental standard setting procedure are assumed. The work of Brennan and Lockwood (1980) is…
Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Mastery Tests
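The Kane and Wilson abstract treats two independent error sources: item-sampling error in the observed score and estimation error in the judgmentally set cutoff. Under that independence assumption the error variances add, so the standard error of the difference is the root sum of squares (a minimal sketch; the variable names are illustrative, not from the paper):

```python
import math

def total_error_sd(se_observed: float, se_cutoff: float) -> float:
    """Standard error of (observed domain score - estimated cutoff score),
    assuming the two error sources are statistically independent."""
    return math.sqrt(se_observed**2 + se_cutoff**2)

# Example: an item-sampling SE of 3.0 and a cutoff-estimation SE of 4.0
# combine to a total SE of 5.0 for the difference score.
print(total_error_sd(3.0, 4.0))  # 5.0
```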
Petersen, Nancy S.; And Others – 1982
In January 1982, the College Board and Educational Testing Service implemented a technical change in the procedures used to equate scores on the Scholastic Aptitude Test (SAT). For previous editions of the SAT, a linear equating procedure was used to establish the comparability of scores on different editions. Beginning in January 1982, this…
Descriptors: College Entrance Examinations, Equated Scores, Latent Trait Theory, Research Methodology
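The linear equating procedure mentioned in the Petersen abstract establishes score comparability by matching standardized deviates across forms, y = μ_Y + (σ_Y/σ_X)(x − μ_X) (the standard textbook form; the SAT's exact implementation is not given in the abstract):

```python
def linear_equate(x: float, mean_x: float, sd_x: float,
                  mean_y: float, sd_y: float) -> float:
    """Linear equating: map a score x on form X to the form-Y scale by
    requiring the two forms' z-scores to be equal."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# A score one SD above the mean on form X maps to one SD above
# the mean on form Y.
print(linear_equate(60.0, 50.0, 10.0, 500.0, 100.0))  # 600.0
```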
Hutchinson, T. P. – 1984
One means of learning about the processes operating in a multiple choice test is to include some test items, called nonsense items, which have no correct answer. This paper compares two versions of a mathematical model of test performance to interpret test data that includes both genuine and nonsense items. One formula is based on the usual…
Descriptors: Foreign Countries, Guessing (Tests), Mathematical Models, Multiple Choice Tests
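One way to see what nonsense items buy, under a simple knowledge-or-random-guessing model (an illustration of the idea, not Hutchinson's exact formulation): a genuine item is answered correctly either by knowledge or by a lucky guess, while a nonsense item, having no correct answer, should attract its arbitrarily keyed option only at the chance rate.

```python
def p_keyed_genuine(p_know: float, k: int) -> float:
    """Knowledge-or-random-guessing model for a genuine k-option item:
    the examinee either knows the answer or guesses uniformly."""
    return p_know + (1.0 - p_know) / k

def p_keyed_nonsense(k: int) -> float:
    """A nonsense item has no correct answer, so under uniform guessing
    the arbitrarily keyed option is chosen with probability 1/k."""
    return 1.0 / k
```

A departure of the observed nonsense-item rate from 1/k signals that responses are not pure uniform guessing, which is what motivates comparing competing models on such data.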
Lovett, Hubert T. – 1975
The reliability of a criterion referenced test was defined as a measure of the degree to which the test discriminates between an individual's level of performance and a predetermined criterion level. The variances of observed and true scores were defined as the squared deviation of the score from the criterion. Based on these definitions and the…
Descriptors: Career Development, Comparative Analysis, Criterion Referenced Tests, Mathematical Models
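Redefining variance as squared deviation from the criterion rather than the mean, as the Lovett abstract describes, leads to a Livingston-style criterion-referenced reliability coefficient (a sketch of that well-known coefficient, offered as an illustration; Lovett's exact derivation is not reproduced here):

```python
def criterion_referenced_reliability(rel: float, var: float,
                                     mean: float, cutoff: float) -> float:
    """Livingston-style coefficient: conventional reliability adjusted by
    the squared distance between the score mean and the criterion C.
    rel: norm-referenced reliability; var: score variance."""
    d2 = (mean - cutoff) ** 2
    return (rel * var + d2) / (var + d2)
```

Note the limiting behavior: when the group mean sits exactly at the cutoff, the coefficient reduces to the conventional reliability, and it rises toward 1 as the mean moves away from the cutoff.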
Stability of the Kaufman Assessment Battery for Children for a Sample of At-Risk Preschool Children.
Lyon, Mark A.; Smith, Douglas K. – Psychology in the Schools, 1987 (peer reviewed)
Examined stability of the Kaufman Assessment Battery for 53 at-risk preschool children. Over 9 months the stability coefficients for the global scales ranged from .78 to .88, and for the subtests from .65 to .79. Concluded that scores display adequate stability, but the Simultaneous scale is less stable than the Sequential or Achievement scales.…
Descriptors: Cognitive Measurement, High Risk Students, Preschool Children, Preschool Education
Harris, Deborah J.; Subkoviak, Michael J. – Educational and Psychological Measurement, 1986 (peer reviewed)
This study examined three statistical methods for selecting items for mastery tests: (1) pretest-posttest; (2) latent trait; and (3) agreement statistics. The correlation between the latent trait method and agreement statistics, proposed here as an alternative, was substantial. Results for the pretest-posttest method confirmed its reputed…
Descriptors: Computer Simulation, Correlation, Item Analysis, Latent Trait Theory
Blixt, Sonya L.; Shama, Deborah D. – Educational and Psychological Measurement, 1986 (peer reviewed)
Methods of estimating the standard error at different ability levels were compared. Overall, it was found that at a given ability level the standard errors calculated using different formulas are not appreciably different. Further, for most situations the traditional method of calculating a standard error probably provides sufficient precision.…
Descriptors: College Freshmen, Error of Measurement, Higher Education, Mathematics Achievement
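The "traditional method" of calculating a standard error is presumably the classical standard error of measurement, SEM = SD·√(1 − reliability), applied uniformly across ability levels (a standard formula, assumed rather than quoted from the Blixt and Shama abstract):

```python
import math

def classical_sem(sd: float, reliability: float) -> float:
    """Classical standard error of measurement: a single value applied
    at every ability level, unlike IRT's conditional standard errors."""
    return sd * math.sqrt(1.0 - reliability)

# A test with SD 15 and reliability .91 has an SEM of 4.5 points.
print(round(classical_sem(15.0, 0.91), 2))  # 4.5
```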
Fagley, N. S. – Journal of Educational Psychology, 1987 (peer reviewed)
This article investigates positional response bias, testwiseness, and guessing strategy as components of variance in test responses on multiple-choice tests. University students responded to two content exams, a testwiseness measure, and a guessing strategy measure. The proportion of variance in test scores accounted for by positional response…
Descriptors: Achievement Tests, Guessing (Tests), Higher Education, Multiple Choice Tests
Lord, Frederic M. – Psychometrika, 1983 (peer reviewed)
Asymptotic formulas are derived for the bias in the maximum likelihood estimators of the item parameters in the logistic item response model when examinee abilities are known. Numerical results are given for a typical verbal test for college admission. (Author)
Descriptors: College Entrance Examinations, Estimation (Mathematics), Item Analysis, Latent Trait Theory
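Lord's bias formulas concern maximum likelihood estimates of the parameters of the logistic item response model. For reference, the two-parameter logistic item response function in standard notation (not reproduced from the article) is P(θ) = 1 / (1 + exp(−a(θ − b))):

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic IRT model: probability of a correct response
    given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the probability is exactly 0.5, regardless of a.
print(p_2pl(0.0, 1.2, 0.0))  # 0.5
```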
Fricke, Reiner; Luhmann, Reinhold – Studies in Educational Evaluation, 1983 (peer reviewed)
On the basis of the characteristics of criterion-referenced tests, the contribution of German research to the development and application of criterion-referenced tests is discussed. (PN)
Descriptors: Criterion Referenced Tests, Item Analysis, Measurement Techniques, Models
Mislevy, Robert J.; Almond, Russell G.; Yan, Duanli; Steinberg, Linda S. – 2000
Educational assessments that exploit advances in technology and cognitive psychology can produce observations and pose student models that outstrip familiar test-theoretic models and analytic methods. Bayesian inference networks (BINs), which include familiar models and techniques as special cases, can be used to manage belief about students'…
Descriptors: Bayesian Statistics, Educational Assessment, Educational Technology, Educational Testing
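The simplest special case of the belief management the Mislevy et al. abstract describes is a single Bayes-rule update of the probability of mastery after one observed response (a minimal sketch with hypothetical conditional probabilities; a full Bayesian inference network chains many such updates over a graph of variables):

```python
def posterior_mastery(prior: float, p_correct_given_mastery: float,
                      p_correct_given_nonmastery: float) -> float:
    """Bayes-rule update of belief in mastery after a correct response:
    P(M | correct) = P(correct | M) P(M) / P(correct)."""
    num = prior * p_correct_given_mastery
    den = num + (1.0 - prior) * p_correct_given_nonmastery
    return num / den

# A correct answer raises a 50% prior belief in mastery to about 82%
# when masters answer correctly 90% of the time and non-masters 20%.
print(round(posterior_mastery(0.5, 0.9, 0.2), 2))
```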
Gillmore, Gerald M. – New Directions for Testing and Measurement, 1983
The unique conceptual framework and language of generalizability theory are presented. While this chapter is relevant to any area in which generalizability theory is applicable, it emphasizes evaluation research, and most examples come from that area. (Author/PN)
Descriptors: Achievement Tests, Analysis of Variance, Decision Making, Error of Measurement
Mercer, Walter – Educational Leadership, 1983 (peer reviewed)
The high failure rate of Black prospective teachers in the southern states raises the challenge to change the quality of education for all Americans. (MLF)
Descriptors: Black Teachers, Educational Quality, Elementary Secondary Education, Minimum Competency Testing
Palmer, Chester – Measurement and Evaluation in Guidance, 1983
Discusses use of item banks in preparing local tests, using as an example the construction of a college mathematics placement test. Compares recommended placements with course results at the end of the quarter. Students who placed themselves above recommendation did better than expected, perhaps because of strong self-selection factors. (PAS)
Descriptors: College Students, Higher Education, Item Banks, Mathematics Achievement