Showing 1 to 15 of 26 results
Peer reviewed
Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021
Background/Context: Testing programs for pass/fail standardized exams frequently rescore failing exams selectively and retest failing examinees. This practice distorts the test score distribution and can mislead analysts who work with these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regents test score…
Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas
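The distortion described in the abstract is easy to reproduce. The toy simulation below (all parameters hypothetical, not taken from the paper) selectively rescores near-failing exams and produces exactly the gap-and-spike pattern around the pass cutoff that such discontinuity analyses detect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, cutoff = 100, 65

# Raw number-correct scores for 10,000 examinees (parameters made up).
raw = rng.binomial(n_items, 0.66, size=10_000)

# Selective rescoring: every exam that failed by one or two points is
# rescored, and in this toy model rescoring always finds two extra points.
rescored = raw.copy()
near_miss = (raw >= cutoff - 2) & (raw < cutoff)
rescored[near_miss] += 2

gap = int((rescored == cutoff - 1).sum())    # scores just below the cutoff
spike = int((rescored == cutoff).sum())      # scores exactly at the cutoff
print(gap, spike)  # the distribution now has a hole below the cutoff and a spike at it
```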
Peer reviewed
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity
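As a concrete reference point for the reliability side of this kind of analysis, here is a minimal Cronbach's alpha computation on a simulated five-item subscore (the item count, signal, and noise levels are hypothetical, chosen only to illustrate the estimate):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (examinees x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
ability = rng.normal(size=2000)
# Five roughly parallel items: common ability plus independent noise.
items = ability[:, None] + rng.normal(scale=1.0, size=(2000, 5))
alpha = cronbach_alpha(items)
print(round(alpha, 2))  # with unit signal and unit noise, alpha is near 0.83
```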
Hixson, Nate; Rhudy, Vaughn – West Virginia Department of Education, 2013
Student responses to the West Virginia Educational Standards Test (WESTEST) 2 Online Writing Assessment are scored by a computer-scoring engine. The scoring method is not widely understood among educators, and there exists a misperception that it is not comparable to hand scoring. To address these issues, the West Virginia Department of Education…
Descriptors: Scoring Formulas, Scoring Rubrics, Interrater Reliability, Test Scoring Machines
Peer reviewed
Haberman, Shelby J. – ETS Research Report Series, 2008
In educational testing, subscores may be provided based on a portion of the items from a larger test. One consideration in evaluation of such subscores is their ability to predict a criterion score. Two limitations on prediction exist. The first, which is well known, is that the coefficient of determination for linear prediction of the criterion…
Descriptors: Scores, Validity, Educational Testing, Correlation
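Haberman's criterion is often operationalized as a PRMSE (proportional reduction in mean squared error) comparison: a subscore is worth reporting only if it predicts its own true score better than the total score does. A simulation sketch of that comparison (the trait correlation and error variances below are invented for illustration, not from the report):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# True subscore trait and a moderately correlated trait for the rest of the test.
tau_s = rng.normal(size=n)
tau_r = 0.3 * tau_s + np.sqrt(1 - 0.09) * rng.normal(size=n)

sub = tau_s + rng.normal(scale=0.8, size=n)     # observed subscore
rest = tau_r + rng.normal(scale=0.4, size=n)    # observed rest-of-test score
total = sub + rest

# PRMSE = squared correlation between a predictor and the true subscore.
prmse_sub = np.corrcoef(sub, tau_s)[0, 1] ** 2
prmse_total = np.corrcoef(total, tau_s)[0, 1] ** 2
print(round(prmse_sub, 3), round(prmse_total, 3))
```

With a low trait correlation, the subscore outpredicts the total and so has added value; raising the correlation toward 1 reverses the comparison.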
Budescu, David V. – 1979
This paper outlines a technique for differentially weighting the options of a multiple choice test so as to maximize item predictive validity. The rule can be applied with different numbers of categories, and the "optimal" number of categories can be determined by significance tests and/or the R2 criterion. Our theoretical analysis…
Descriptors: Multiple Choice Tests, Predictive Validity, Scoring Formulas, Test Items
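The core idea of regression-based option weighting can be sketched directly: code each option as an indicator variable and take least-squares weights against the criterion, which makes each option's weight its conditional criterion mean. The response model below is hypothetical, used only to generate data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
criterion = rng.normal(size=n)

# One 4-option item whose option choice depends weakly on the criterion
# (hypothetical response model: the keyed option 0 attracts high scorers).
probs = np.stack([
    0.25 + 0.10 * np.tanh(criterion),
    0.25 - 0.05 * np.tanh(criterion),
    0.25 - 0.03 * np.tanh(criterion),
    0.25 - 0.02 * np.tanh(criterion),
], axis=1)
choice = np.array([rng.choice(4, p=p) for p in probs])

# Option weights = least-squares coefficients of the criterion on
# option-indicator (dummy) variables.
dummies = np.eye(4)[choice]
weights, *_ = np.linalg.lstsq(dummies, criterion, rcond=None)
scored = dummies @ weights               # option-weighted item score
r2 = np.corrcoef(scored, criterion)[0, 1] ** 2
print(np.round(weights, 2), round(r2, 3))
```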
Peer reviewed
MacCann, Robert G. – Psychometrika, 2004
For (0, 1) scored multiple-choice tests, a formula giving test reliability as a function of the number of item options is derived, assuming the "knowledge or random guessing model," the parallelism of the new and old tests (apart from the guessing probability), and the assumptions of classical test theory. It is shown that the formula is a more…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Reliability, Test Theory
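The "knowledge or random guessing" model assumes an examinee either knows an item or guesses uniformly among its m options, so P(correct) = k + (1 - k)/m. A parallel-forms simulation under that model (population and test length are hypothetical; this reproduces the qualitative effect the formula captures, not the formula itself) shows why fewer options mean more guessing noise and lower reliability:

```python
import numpy as np

def sim_reliability(m, n_items=40, n=4000, seed=4):
    """Parallel-forms reliability of a number-right score under the
    knowledge-or-random-guessing model with m options per item."""
    rng = np.random.default_rng(seed)
    know = rng.uniform(0.2, 0.9, size=n)[:, None]   # P(examinee knows an item)

    def form():
        known = rng.random((n, n_items)) < know
        guessed = rng.random((n, n_items)) < 1.0 / m   # lucky guess if unknown
        return (known | guessed).sum(axis=1)

    return float(np.corrcoef(form(), form())[0, 1])

rels = {m: round(sim_reliability(m), 3) for m in (2, 3, 5)}
print(rels)  # reliability rises with the number of options
```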
Wilcox, Rand R. – 1979
In the past, several latent structure models have been proposed for handling problems associated with measuring the achievement of examinees. Typically, however, these models describe a specific examinee in terms of an item domain or they describe a few items in terms of a population of examinees. In this paper, a model is proposed which allows a…
Descriptors: Achievement Tests, Guessing (Tests), Mathematical Models, Multiple Choice Tests
Peer reviewed
Drasgow, Fritz; And Others – Applied Psychological Measurement, 1989
Multilinear formula scoring (MFS) is reviewed, with emphasis on estimating option characteristic curves (OCCs). MFS was used to estimate OCCs for the arithmetic reasoning subtest of the Armed Services Vocational Aptitude Battery for 2,978 examinees. A second analysis obtained OCCs for simulated data. The use of MFS is discussed. (SLD)
Descriptors: Estimation (Mathematics), Mathematical Models, Multiple Choice Tests, Scores
Angoff, William H. – 1985
This paper points out that there are certain generalizations about directions for guessing and methods of scoring that require that data be derived from random groups design. It supports the viewpoint that it is neither sufficient nor appropriate to make such generalizations on the basis of an analysis of scores obtained from the answer sheets of…
Descriptors: Correlation, Guessing (Tests), Research Design, Scoring Formulas
van den Brink, Wulfert – Evaluation in Education: International Progress, 1982
Binomial models for domain-referenced testing are compared, emphasizing the assumptions underlying the beta-binomial model. Advantages and disadvantages are discussed. A proposed item sampling model is presented which takes the effect of guessing into account. (Author/CM)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Sampling, Measurement Techniques
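For reference, the beta-binomial model places a Beta(a, b) distribution on examinee ability p and a binomial distribution on the number-correct score given p; its marginal pmf can be evaluated directly. The parameter values below are arbitrary, chosen only to check the distribution's basic properties:

```python
import math

def beta_binomial_pmf(x, n, a, b):
    """P(X = x) for a beta-binomial with n items and Beta(a, b) ability."""
    log_p = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + math.lgamma(x + a) + math.lgamma(n - x + b)
             - math.lgamma(n + a + b))
    return math.comb(n, x) * math.exp(log_p)

n = 20
pmf = [beta_binomial_pmf(x, n, 2.0, 1.5) for x in range(n + 1)]
mean = sum(x * p for x, p in zip(range(n + 1), pmf))
print(round(sum(pmf), 6))  # a proper distribution sums to 1
print(round(mean, 2))      # matches the closed form n * a / (a + b)
```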
Yen, Wendy M. – 1982
Test scores that are not perfectly reliable cannot be strictly equated unless they are strictly parallel. This fact implies that tau equivalence can be lost if an equipercentile equating is applied to observed scores that are not strictly parallel. Thirty-six simulated data sets are produced to simulate equating tests with different difficulties…
Descriptors: Difficulty Level, Equated Scores, Latent Trait Theory, Methods
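Equipercentile equating itself is compact to state: a score on one form is mapped to the score on the other form with the same percentile rank. A single-group sketch with two simulated 30-item tests of different difficulty (the logistic item model and difficulty shift are hypothetical, not from the study's 36 data sets):

```python
import numpy as np

rng = np.random.default_rng(5)
ability = rng.normal(size=10_000)

# Two 30-item tests of different difficulty taken by the same group.
easy = rng.binomial(30, 1 / (1 + np.exp(-(ability + 0.5))))
hard = rng.binomial(30, 1 / (1 + np.exp(-(ability - 0.5))))

def equate(score, from_scores, to_scores):
    """Map a score to the same percentile rank on the other form."""
    pct = np.mean(from_scores <= score)
    return float(np.quantile(to_scores, pct))

equated_15 = equate(15, hard, easy)
print(equated_15)  # a hard-test 15 corresponds to a higher easy-test score
```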
Peer reviewed
Huynh, Huynh – Journal of Educational Statistics, 1986
Under the assumptions of classical measurement theory and the condition of normality, a formula is derived for the reliability of composite scores. The formula represents an extension of the Spearman-Brown formula to the case of truncated data. (Author/JAZ)
Descriptors: Computer Simulation, Error of Measurement, Expectancy Tables, Scoring Formulas
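The base case being extended here is the classical Spearman-Brown formula, rho_k = k * rho / (1 + (k - 1) * rho), for a test lengthened by factor k; the paper's truncated-data extension is not reproduced in this sketch:

```python
def spearman_brown(rho, k):
    """Reliability of a test lengthened by factor k, given reliability rho
    of the original test (classical test theory, no truncation)."""
    return k * rho / (1 + (k - 1) * rho)

print(spearman_brown(0.6, 2))    # doubling a 0.6-reliable test gives 0.75
print(spearman_brown(0.8, 0.5))  # halving a test lowers reliability
```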
Hutchinson, T. P. – 1984
One means of learning about the processes operating in a multiple choice test is to include some test items, called nonsense items, which have no correct answer. This paper compares two versions of a mathematical model of test performance to interpret test data that includes both genuine and nonsense items. One formula is based on the usual…
Descriptors: Foreign Countries, Guessing (Tests), Mathematical Models, Multiple Choice Tests
Divgi, D. R. – 1980
A method is proposed for providing an absolute, rather than comparative, evaluation of how well two tests are equated by transforming their raw scores onto a particular common scale. The method is direct, requiring no external standard for comparison; it expresses its results in scaled rather than raw scores, and it allows examination of the…
Descriptors: Equated Scores, Evaluation Criteria, Item Analysis, Latent Trait Theory
Peer reviewed
Lord, Frederic M. – Journal of Educational Measurement, 1984
Four methods are outlined for estimating or approximating from a single test administration the standard error of measurement of number-right test score at specified ability levels or cutting scores. The methods are illustrated and compared on one set of real test data. (Author)
Descriptors: Academic Ability, Cutting Scores, Error of Measurement, Scoring Formulas
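One well-known binomial-error approximation in this literature (often associated with Lord; whether it matches each of the four methods in the paper is not claimed here) gives the conditional standard error of a number-right score x on an n-item test as sqrt(x(n - x)/(n - 1)):

```python
import math

def conditional_sem(x, n):
    """Binomial-error approximation to the standard error of measurement
    of a number-right score x on an n-item test."""
    return math.sqrt(x * (n - x) / (n - 1))

# SEM is largest for middling scores and shrinks toward the extremes.
for x in (5, 20, 35):
    print(x, round(conditional_sem(x, 40), 2))
```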