Zimmerman, Donald W. – Journal of Experimental Education, 1977
Derives formulas for the validity of predictor-criterion tests that hold for all test scores constructed according to the expected-value concept of true score. These more general formulas disclose some paradoxical properties of test validity under conditions where errors are correlated and have some implications for practical testing situations…
Descriptors: Correlation, Criterion Referenced Tests, Scoring Formulas, Tables (Data)

Dundon, William D.; And Others – Learning Disability Quarterly, 1986
Results of recategorizing the Wechsler Intelligence Scale for Children (Revised) subtest scores of 159 black learning disabled primary grade children into spatial, conceptual, and sequential scales as recommended by A. Bannatyne led to the conclusion that the diagnostic utility of the Bannatyne recategorization is questionable. (Author/DB)
Descriptors: Black Youth, Disability Identification, Learning Disabilities, Primary Education
Impara, James C.; Plake, Barbara S. – 2000
This paper reports the results of using several alternative methods of setting cut scores. The methods used were: (1) a variation of the Angoff method (1971); (2) a variation of the borderline group method; and (3) an advanced impact method (G. Dillon, 1996). The results discussed are from studies undertaken to set the cut scores for fourth grade…
Descriptors: Cutting Scores, Intermediate Grades, Mathematics Tests, Scoring Formulas

Clausing, Gerhard; Senko, Donna – Unterrichtspraxis, 1978
Cloze testing and language performance are discussed, as are two techniques for awarding partial credit: the quick performance measurement and feedback technique and the three-stage scoring hierarchy for partial credit. A figure and tables are included. (EJS)
Descriptors: Cloze Procedure, Language Instruction, Language Tests, Scoring Formulas

Koch, William R.; Dodd, Barbara G. – Applied Measurement in Education, 1989
Various aspects of the computerized adaptive testing (CAT) procedure for partial credit scoring were manipulated, focusing on the effects of the manipulations on operational characteristics of the CAT. The effects of item-pool size, item-pool information, and stepsizes used along the trait continuum were assessed. (TJH)
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Maximum Likelihood Statistics

Oltman, Phillip K.; Stricker, Lawrence J. – Language Testing, 1990
A recent multidimensional scaling analysis of the Test of English-as-a-Foreign-Language (TOEFL) item response data identified clusters of items in the test sections that, being more homogeneous than their parent sections, might be better for diagnostic use. The analysis was repeated using different scoring techniques. Results diverged only for…
Descriptors: English (Second Language), Item Analysis, Language Tests, Scaling

Achenbach, Thomas M.; McConaughy, Stephanie H. – School Psychology Review, 1996
Presents similarities and differences between the DSM-IV and empirically based approaches to behavioral and emotional problems. A case example illustrates the applications of the two approaches to school-based assessment. (Author/JDM)
Descriptors: Behavior Problems, Case Studies, Elementary Secondary Education, Emotional Problems
Attali, Yigal – ETS Research Report Series, 2007
Because there is no commonly accepted view of what makes for good writing, automated essay scoring (AES) ideally should be able to accommodate different theoretical positions, certainly at the level of state standards but also perhaps among teachers at the classroom level. This paper presents a practical approach and an interactive computer…
Descriptors: Computer Assisted Testing, Automation, Essay Tests, Scoring
Love, Gayle A. – 1987
In a review of relevant literature, it is argued that correction for guessing formulas should not be used. It is contended that such formulas correct for guessing that does not really exist in a noticeable amount, penalize those students who have low self-esteem and self-confidence, correct for errors that are not necessarily errors, benefit risk…
Descriptors: Guessing (Tests), Scoring Formulas, Self Esteem, Teacher Made Tests
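For context, correction-for-guessing formulas of the kind this review argues against usually take the classical form S = R - W/(k - 1), where R is the number right, W the number wrong (omitted items excluded), and k the number of options per item. The abstract does not say which variants the review covers, so the Python sketch below assumes this standard form; the function name and sample numbers are illustrative only.

```python
def formula_score(num_right: int, num_wrong: int, num_options: int) -> float:
    """Classical correction for guessing: R - W / (k - 1).

    Omitted items count as neither right nor wrong.
    """
    if num_options < 2:
        raise ValueError("an item needs at least two options")
    return num_right - num_wrong / (num_options - 1)


# Hypothetical examinee: 30 right, 8 wrong, 2 omitted on a 40-item, 5-option test.
score = formula_score(30, 8, 5)  # 30 - 8/4 = 28.0
```

Under this formula an examinee who guesses blindly on every item expects a score of zero, which is the property the review disputes as a reason to apply it.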

Olejnik, Stephen; Porter, Andrew C. – Educational and Psychological Measurement, 1975
The four scoring strategies compared were lambda coefficients, chi-square weights, and two applications of multiple discriminant analysis. No significant differences were found when the strategies were applied to the Kuder Occupational Interest Survey. (RC)
Descriptors: Analysis of Variance, Comparative Analysis, Discriminant Analysis, Interest Inventories
Livingston, Samuel A.; Kastrinos, William – 1982
Leo Nedelsky developed a method for determining absolute grading standards for multiple choice tests. His method required a group of judges to examine each test question and eliminate those responses which the lowest D- student should be able to reject as incorrect. The correct answer probabilities remaining were used in computing an expected test…
Descriptors: Cutting Scores, Judges, Multiple Choice Tests, Real Estate
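The expected-score computation described in this abstract can be sketched as follows. For each item, a judge counts the options that remain after eliminating those a borderline examinee should reject; the reciprocal of that count is the chance of a correct answer, and the sum over items is the expected score. The function name and the judge's counts below are hypothetical, and the averaging across judges that a full standard-setting study would involve is left out.

```python
def nedelsky_expected_score(remaining_options: list[int]) -> float:
    """Sum of 1/m over items, where m is the number of options left
    (including the correct one) after a judge's eliminations."""
    if any(m < 1 for m in remaining_options):
        raise ValueError("each item must keep at least the correct option")
    return sum(1.0 / m for m in remaining_options)


# Hypothetical judge: options remaining on four items.
expected = nedelsky_expected_score([2, 4, 1, 2])  # 0.5 + 0.25 + 1.0 + 0.5 = 2.25
```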
Budescu, David V. – 1979
This paper outlines a technique for differentially weighting options of a multiple choice test in a fashion that maximizes the item predictive validity. The rule can be applied with different numbers of categories, and the "optimal" number of categories can be determined by significance tests and/or through the R² criterion. Our theoretical analysis…
Descriptors: Multiple Choice Tests, Predictive Validity, Scoring Formulas, Test Items
Berk, Ronald A. – 1980
Seventeen statistics for measuring the reliability of criterion-referenced tests were critically reviewed. The review was organized into two sections: (1) a discussion of preliminary considerations to provide a foundation for choosing the appropriate category of "reliability" (threshold loss function, squared-error loss function, or…
Descriptors: Criterion Referenced Tests, Cutting Scores, Scoring Formulas, Statistical Analysis
Marco, Gary L. – 1975
A method of interpolation has been derived that should be superior to linear interpolation in computing the percentile ranks of test scores for unimodal score distributions. The superiority of the logistic interpolation over the linear interpolation is most noticeable for distributions consisting of only a small number of score intervals (say…
Descriptors: Comparative Analysis, Intervals, Mathematical Models, Percentage
Arneklev, Bruce; And Others – 1976
One of the most important contentions of the Rasch model of item analysis is that two tests of the same trait, having some items in common, can be linked together using a "linking constant" derived from the common items. This would be accomplished by administering both tests to a sample of testees, calibrating the items of the tests…
Descriptors: Elementary School Mathematics, Goodness of Fit, Item Analysis, Measurement Techniques
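A minimal sketch of common-item linking under the Rasch model, assuming the usual approach in which the linking constant is the mean difference between the two separate calibrations of the shared items. The abstract is truncated before it describes the study's actual procedure, so the function names and difficulty values here are hypothetical.

```python
from statistics import mean


def linking_constant(common_on_a: list[float], common_on_b: list[float]) -> float:
    """Mean difference in difficulty of the common items (test A minus test B)."""
    if len(common_on_a) != len(common_on_b) or not common_on_a:
        raise ValueError("need the same non-empty set of common items on both tests")
    return mean(a - b for a, b in zip(common_on_a, common_on_b))


def to_scale_a(difficulties_b: list[float], constant: float) -> list[float]:
    """Shift test B difficulties onto test A's scale."""
    return [b + constant for b in difficulties_b]


# Hypothetical logit difficulties for three common items, calibrated separately.
shift = linking_constant([0.5, 1.0, -0.5], [0.0, 0.5, -1.0])  # 0.5
```

If the model fits, the item-by-item differences should scatter closely around the constant; large deviations for individual common items are one sign of misfit.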