Showing all 12 results
Peer reviewed
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Peer reviewed
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity
Peer reviewed
Runco, Mark A.; Acar, Selcuk – Creativity Research Journal, 2012
Divergent thinking (DT) tests are very often used in creativity studies. Certainly DT does not guarantee actual creative achievement, but tests of DT are reliable and reasonably valid predictors of certain performance criteria. The validity of DT is described as reasonable because validity is not an all-or-nothing attribute, but is, instead, a…
Descriptors: Creativity, Creative Activities, Creative Thinking, Test Validity
Peer reviewed
Stewart, Jeffrey; White, David A. – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2011
Multiple-choice tests such as the Vocabulary Levels Test (VLT) are often viewed as a preferable estimator of vocabulary knowledge when compared to yes/no checklists, because self-reporting tests introduce the possibility of students overreporting or underreporting scores. However, multiple-choice tests have their own unique disadvantages. It has…
Descriptors: Guessing (Tests), Scoring Formulas, Multiple Choice Tests, Test Reliability
Peer reviewed
Hansen, Richard – Journal of Educational Measurement, 1971
The relationship between certain personality variables and the degree to which examinees display certainty in their responses was investigated. (Author)
Descriptors: Guessing (Tests), Individual Characteristics, Multiple Choice Tests, Personality Assessment
Knapp, Thomas R. – Measurement and Evaluation in Guidance, 1980
Supports arguments against general use of change scores and recommends the Lord/McNemar estimates of true change. Provides a numerical example illustrating the reliability problem and the problem of the prediction of true change from various linear composites of initial and final measures. (Author)
Descriptors: Counseling Techniques, Literature Reviews, Pretests Posttests, Research Methodology
Peer reviewed
Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…
Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items
Foegen, Anne – Diagnostique, 2000
A study involving 105 sixth-graders examined three aspects of technical adequacy with respect to two general outcome measures in mathematics: the effects of aggregating scores and correcting for random guessing on reliability and validity and the extent to which the measures were sensitive to changes in performance. (Contains references.)…
Descriptors: Curriculum Based Assessment, Disabilities, Grade 6, Mathematics
Lenel, Julia C.; Gilmer, Jerry S. – 1986
In some testing programs an early item analysis is performed before final scoring in order to validate the intended keys. As a result, some items which are flawed and do not discriminate well may be keyed so as to give credit to examinees no matter which answer was chosen. This is referred to as allkeying. This research examined how varying the…
Descriptors: Equated Scores, Item Analysis, Latent Trait Theory, Licensing Examinations (Professions)
Roudabush, Glenn E. – 1975
The objective of this study was to show that standardized reading scores could be adequately estimated from scores on a criterion-referenced test in reading. This would reduce classroom test time, while, at the same time, provide the kinds of information teachers need to guide instruction, and the kinds of information administrators require for…
Descriptors: Achievement Tests, Correlation, Criterion Referenced Tests, Equated Scores
Rippey, Robert M. – 1971
Technical improvements, which may be made in the reliability and validity of tests through confidence scores, are discussed. However, studies indicate that subjects do not handle their confidence uniformly. (MS)
Descriptors: Computer Programs, Confidence Testing, Correlation, Difficulty Level
Bormuth, John R. – 1979
A procedure is demonstrated for constructing tables showing, for each score on a commercial reading achievement test, the percentage of real-world materials that the testee is likely to comprehend with at least a criterion level of proficiency, the percentages of students in a local or national sample who can competently comprehend a given…
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Equivalency Tests, Expectancy Tables