Wise, Steven L.; Kingsbury, G. Gage; Langi, Meredith L. – Applied Measurement in Education, 2023
Recent research has provided evidence that performance change during a student's test event can indicate the presence of test-taking disengagement. Meaningful performance change implies that some portions of the test event reflect assumed maximum performance better than others and, because disengagement tends to diminish performance,…
Descriptors: Tests, Weighted Scores, Test Wiseness, Scoring
Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020
Recent research has suggested that re-setting the standard for each administration of a small sample examination, in addition to the high cost, does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…
Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores
Clauser, Jerome C.; Clauser, Brian E.; Hambleton, Ronald K. – Applied Measurement in Education, 2014
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a…
Descriptors: Standard Setting (Scoring), Validity, Reliability, Correlation
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015
Equating test forms is an essential activity in standardized testing, with increased importance with the accountability systems in existence through the mandate of Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…
Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Kane, Michael; Case, Susan M. – Applied Measurement in Education, 2004
The scores on 2 distinct tests (e.g., essay and objective) are often combined to create a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to an external criterion. However, in cases where no criterion is available, the observed composite has generally been evaluated in…
Descriptors: Validity, Weighted Scores, Reliability, Student Evaluation
Sykes, Robert C.; Hou, Liling – Applied Measurement in Education, 2003
Weighting responses to Constructed-Response (CR) items has been proposed as a way to increase the contribution these items make to the test score when there is insufficient testing time to administer additional CR items. The effect of various types of weighting items of an IRT-based mixed-format writing examination was investigated.…
Descriptors: Item Response Theory, Weighted Scores, Responses, Scores

Haladyna, Thomas M. – Applied Measurement in Education, 1990
Number-correct scoring was compared to empirical option weighting for estimating domain scores and making pass/fail decisions that are typical of certification, licensing, competency, and proficiency testing. The usefulness of the option-total correlation option-weighting method was illustrated with a sample of 1,000 high school students. (SLD)
Descriptors: Achievement Tests, Certification, Correlation, Decision Making