Wagemaker, Hans, Ed. – International Association for the Evaluation of Educational Achievement, 2020
Although International Association for the Evaluation of Educational Achievement-pioneered international large-scale assessment (ILSA) of education is now a well-established science, non-practitioners and many users often substantially misunderstand how large-scale assessments are conducted, what questions and challenges they are designed to…
Descriptors: International Assessment, Achievement Tests, Educational Assessment, Comparative Analysis
Palacio, Marcela; Gaviria, Sandra; Brown, James Dean – PROFILE: Issues in Teachers' Professional Development, 2016
Frustrations with traditional testing led a group of teachers at the English for adults program at Universidad EAFIT (Colombia) to design tests aligned with the institutional teaching philosophy and classroom practices. This article reports on a study of an item-by-item evaluation of a series of English exams for validity and reliability in an…
Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Second Language Instruction
Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010
The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…
Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity

Claudy, John G. – Applied Psychological Measurement, 1978
Option weighting is an alternative to increasing test length as a means of improving the reliability of a test. The effects on test reliability of option weighting procedures were compared in two empirical studies using four independent sets of items. Biserial weights were found to be superior. (Author/CTM)
Descriptors: Higher Education, Item Analysis, Scoring Formulas, Test Items

Mislevy, Robert J.; Bock, R. Darrell – Educational and Psychological Measurement, 1982
An alternative biweight estimator based on Tukey's is examined in which (1) test disturbances are not assumed to be the same for all subjects, (2) each response is utilized proportional to its value, and (3) the biweight and maximum likelihood estimate agree when no disturbances are present. Smaller mean-squared errors are shown. (Author/CM)
Descriptors: Error of Measurement, Estimation (Mathematics), Guessing (Tests), Latent Trait Theory
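For orientation, the classical Tukey biweight that the abstract above takes as its starting point can be sketched as a weight function. This is a minimal illustration, not the alternative estimator the authors examine. The function name `tukey_biweight` and the tuning constant `c` (default 4.685, a value commonly used in robust regression) are assumptions that do not appear in the entry.

```python
def tukey_biweight(residual, c=4.685):
    """Classical Tukey biweight: weight applied to one standardized residual.

    Residuals near zero get a weight close to 1; residuals at or beyond
    the tuning constant c (an assumed default here) get a weight of 0.
    """
    u = residual / c
    if abs(u) >= 1.0:
        return 0.0
    return (1.0 - u * u) ** 2


# A response that fits the model perfectly keeps full weight.
print(tukey_biweight(0.0))
# A response far from expectation is ignored entirely.
print(tukey_biweight(10.0))
```

The weight falls smoothly from 1 to 0, so aberrant responses are down-weighted rather than dropped abruptly.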

Kane, Michael; Moloney, James – Applied Psychological Measurement, 1978
The answer-until-correct (AUC) procedure requires that examinees respond to a multiple-choice item until they answer it correctly. Using a modified version of Horst's model for examinee behavior, this paper compares the effect of guessing on item reliability for the AUC procedure and the zero-one scoring procedure. (Author/CTM)
Descriptors: Guessing (Tests), Item Analysis, Mathematical Models, Multiple Choice Tests
Sympson, J. Bradford; Haladyna, Thomas M. – 1988
A new approach to polychotomous scoring of test items, similar to "max-alpha" scaling (MAS) and known as polyweighting, has been developed. Unlike MAS, this new method of polychotomous scoring provides scoring weights for a given item that are independent of the difficulty of other items in the analysis. Moreover, the scoring weights are…
Descriptors: Computer Software, Difficulty Level, Item Analysis, Latent Trait Theory
Sykes, Robert C.; Hou, Liling – Applied Measurement in Education, 2003
Weighting responses to Constructed-Response (CR) items has been proposed as a way to increase the contribution these items make to the test score when there is insufficient testing time to administer additional CR items. The effect of various types of weighting items of an IRT-based mixed-format writing examination was investigated.…
Descriptors: Item Response Theory, Weighted Scores, Responses, Scores

Harasym, P. H.; And Others – Evaluation and the Health Professions, 1980
Coded items, as opposed to free-response items, in a multiple-choice physiology test had a cueing effect that raised students' scores, especially for lower achievers. Reliability of coded items was also lower. Item format and scoring method had an effect on test results. (GDC)
Descriptors: Achievement Tests, Comparative Testing, Cues, Higher Education
Downey, Ronald G.
Previous research has studied the effects of different methods of item option weighting on the reliability and concurrent and predictive validity of achievement tests. Increases in reliability are generally found, but with mixed results for validity. Several methods of producing option weights (i.e., Guttman internal and external weights and…
Descriptors: Achievement Tests, Comparative Analysis, Correlation, Grade Point Average
Hendrickson, Amy; Patterson, Brian; Melican, Gerald – College Board, 2008
Presented at the annual meeting of the National Council on Measurement in Education (NCME) in New York in March 2008. This presentation explores how different item weighting can affect the effective weights, validity coefficients, and test reliability of composite scores among test takers.
Descriptors: Multiple Choice Tests, Test Format, Test Validity, Test Reliability
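To illustrate what "effective weights" means in the entry above, here is a minimal sketch. It assumes the common definition in which a section's effective weight is the covariance of its weighted score with the composite, divided by the composite variance. The presentation's own computation is not given in the abstract, and the function names and sample scores below are hypothetical.

```python
def pcov(x, y):
    """Population covariance of two equal-length score lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def effective_weights(components, nominal_weights):
    """Share of composite variance attributable to each weighted section.

    components: one list of examinee scores per section.
    nominal_weights: the multiplier applied to each section.
    """
    weighted = [[w * v for v in scores]
                for scores, w in zip(components, nominal_weights)]
    composite = [sum(col) for col in zip(*weighted)]
    var_c = pcov(composite, composite)
    return [pcov(section, composite) / var_c for section in weighted]


# Hypothetical scores for four examinees on two sections.
mc_scores = [10, 12, 14, 16]   # multiple-choice section
cr_scores = [2, 3, 5, 4]       # constructed-response section
print(effective_weights([mc_scores, cr_scores], [1, 2]))
```

Under this definition the effective weights sum to 1, and they generally differ from the nominal weights because they depend on each section's variance and on the covariance between sections.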

Downey, Ronald G. – Applied Psychological Measurement, 1979
This research attempted to interrelate several methods of producing option weights (i.e., Guttman internal and external weights and judges' weights) and examined their effects on reliability and on concurrent, predictive, and face validity. It was concluded that option weighting offered limited, if any, improvement over unit weighting. (Author/CTM)
Descriptors: Achievement Tests, Answer Keys, Comparative Testing, High Schools
Gonzalez-Tamayo, Eulogio – 1987
The concepts of universe of admissible observation and universe of generalization from the generalizability theory were applied to calculate the intraclass correlation coefficient of a licensure test. The internal consistency coefficient of a dichotomously scored test is identical to the intraclass correlation coefficient of a two-facet design.…
Descriptors: Adults, Analysis of Variance, Content Validity, Criterion Referenced Tests
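As a concrete reference point for the internal consistency coefficient of a dichotomously scored test mentioned in the entry above, the sketch below computes KR-20. That the report uses KR-20 specifically is an assumption; the abstract does not name the coefficient. The function name `kr20` and the response data are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: one row per examinee, each row a list of 0/1 item scores.
    Uses population variances throughout so that the item variances
    (p * q) and the total-score variance share the same denominator.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    total_var = pvariance([sum(row) for row in responses])
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        pq_sum += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - pq_sum / total_var)


# Hypothetical responses: five examinees, four items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(data))
```

For dichotomous items KR-20 coincides with Cronbach's alpha, which is why it is often read as the general internal consistency coefficient for tests scored right or wrong.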
Rippey, Robert M. – 1971
Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)
Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level