Publication Date | Count
In 2025 | 0
Since 2024 | 0
Since 2021 (last 5 years) | 1
Since 2016 (last 10 years) | 2
Since 2006 (last 20 years) | 5
Descriptor | Count
Student Evaluation | 9
Validity | 7
Evaluation Methods | 5
Scores | 5
Reliability | 3
Scoring | 3
Educational Assessment | 2
Interrater Reliability | 2
Item Response Theory | 2
Academic Achievement | 1
Algebra | 1
Source | Count
Applied Measurement in… | 9
Author | Count
Ames, Allison J. | 1
Calfee, Robert | 1
Carney, Michele | 1
Case, Susan M. | 1
Champion, Joe | 1
Elliott, Stephen N. | 1
Fisher, Steve | 1
Holzman, Madison A. | 1
Johnson, Robert L. | 1
Kane, Michael | 1
Kuhs, Therese | 1
Publication Type | Count
Journal Articles | 9
Reports - Research | 5
Reports - Evaluative | 2
Reports - Descriptive | 1
Reports - General | 1
Education Level | Count
Junior High Schools | 1
Middle Schools | 1
Secondary Education | 1
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to efficiently assess students' cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
Myers, Aaron J.; Ames, Allison J.; Leventhal, Brian C.; Holzman, Madison A. – Applied Measurement in Education, 2020
When rating performance assessments, raters may ascribe different scores for the same performance when rubric application does not align with the intended application of the scoring criteria. Given performance assessment score interpretation assumes raters apply rubrics as rubric developers intended, misalignment between raters' scoring processes…
Descriptors: Scoring Rubrics, Validity, Item Response Theory, Interrater Reliability
Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010
Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…
Descriptors: Item Response Theory, Case Studies, Reliability, Scores
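One of the simpler ideas behind the subscore-augmentation methods the abstract names is Kelley's classical regressed estimate, which shrinks an observed subscale score toward the group mean in proportion to the subscale's reliability. A minimal sketch with illustrative values (not figures from the study):

```python
def kelley_estimate(observed, group_mean, reliability):
    """Classical regressed true-score estimate for a subscore.

    T_hat = rho * X + (1 - rho) * mu, where rho is the subscale
    reliability: low-reliability subscores are pulled toward the mean.
    """
    return reliability * observed + (1 - reliability) * group_mean

# A noisy subscale (rho = 0.5) is shrunk halfway to the group mean;
# a reliable one (rho = 0.9) stays close to the observed score.
print(kelley_estimate(observed=30, group_mean=20, reliability=0.5))  # 25.0
print(kelley_estimate(observed=30, group_mean=20, reliability=0.9))  # 29.0
```

Wainer et al.'s augmented scores extend this idea by also borrowing strength from the other subscales via their covariances; the univariate version above only illustrates the shrinkage principle.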
Zhang, Bo; Ohland, Matthew W. – Applied Measurement in Education, 2009
One major challenge in using group projects to assess student learning is accounting for the differences of contribution among group members so that the mark assigned to each individual actually reflects their performance. This research addresses the validity of grading group projects by evaluating different methods that derive individualized…
Descriptors: Monte Carlo Methods, Validity, Student Evaluation, Evaluation Methods
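A common way to derive individualized marks from a single group grade (a generic sketch of this family of methods, not necessarily one of the specific procedures the study evaluates) scales the group mark by each member's peer-assessment factor, i.e. their mean peer rating divided by the team's mean rating:

```python
def individual_scores(group_mark, peer_ratings):
    """Scale a group mark by each member's relative peer rating.

    peer_ratings: dict mapping member -> mean rating received from
    teammates. A member rated exactly at the team average receives
    exactly the group mark; others are scaled up or down.
    """
    team_mean = sum(peer_ratings.values()) / len(peer_ratings)
    return {m: group_mark * r / team_mean for m, r in peer_ratings.items()}

# Illustrative team (names and ratings are invented):
ratings = {"ana": 4.5, "ben": 3.0, "cho": 4.5}
print(individual_scores(80, ratings))  # ana: 90.0, ben: 60.0, cho: 90.0
```

Validity questions of the kind the study raises include whether such multiplicative weighting can push scores above the scale maximum and how sensitive it is to a single harsh or lenient peer rater.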
Elliott, Stephen N.; Roach, Andrew T. – Applied Measurement in Education, 2007
This article examines three typical approaches to alternate assessment for students with significant cognitive disabilities--portfolios, performance assessments, and rating scales. A detailed analysis of common and unique design features of these approaches is provided, including features of each approach that influence the psychometric quality of…
Descriptors: Psychometrics, Validity, Rating Scales, Alternative Assessment
Kane, Michael; Case, Susan M. – Applied Measurement in Education, 2004
The scores on 2 distinct tests (e.g., essay and objective) are often combined to create a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to an external criterion. However, in cases where no criterion is available, the observed composite has generally been evaluated in…
Descriptors: Validity, Weighted Scores, Reliability, Student Evaluation
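For context, classical test theory gives a standard formula (general CTT, not specific to this article) for the reliability of a two-part weighted composite C = w1*X1 + w2*X2, built from the component reliabilities, standard deviations, and their correlation:

```python
def composite_reliability(w1, w2, sd1, sd2, rho1, rho2, r12):
    """Reliability of C = w1*X1 + w2*X2 under classical test theory.

    Assumes the measurement errors of the two parts are uncorrelated,
    so the observed covariance between parts is all true-score
    covariance. rho1, rho2 are component reliabilities; r12 is the
    observed correlation between the parts.
    """
    cov = r12 * sd1 * sd2
    obs_var = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * cov
    true_var = (w1 ** 2) * rho1 * sd1 ** 2 + (w2 ** 2) * rho2 * sd2 ** 2 \
        + 2 * w1 * w2 * cov
    return true_var / obs_var

# Illustrative values: an essay (rho = 0.8) and an objective test
# (rho = 0.9) weighted equally; the composite can exceed both.
print(composite_reliability(0.5, 0.5, 1.0, 1.0, 0.8, 0.9, 0.6))
```

This shows why, absent an external criterion, composites are often evaluated through internal quantities like these; the article's point is about what that internal evaluation can and cannot establish.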
Penfield, Randall D.; Miller, Jeffrey M. – Applied Measurement in Education, 2004
As automated scoring of complex constructed-response examinations reaches operational status, the process of evaluating the quality of resultant scores, particularly in contrast to scores of expert human graders, becomes as complex as the data itself. Using a vignette from the Architectural Registration Examination (ARE), this article explores the…
Descriptors: Student Evaluation, Evaluation Methods, Content Validity, Scoring
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability
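Typical resolution methods when two operational raters disagree (generic examples of the kind of method such studies compare; the function names here are my own labels, not the article's) include averaging the two ratings, replacing both with an expert third rating, or averaging an adjudicating rating with the closer operational rating:

```python
def rater_mean(r1, r2):
    """Report the average of the two operational ratings."""
    return (r1 + r2) / 2

def expert_resolution(r1, r2, expert):
    """An expert third rating replaces the discrepant pair."""
    return expert

def parity_resolution(r1, r2, r3):
    """Average the adjudicating rating r3 with the operational rating
    it is closer to (ties favor the first rater)."""
    closer = r1 if abs(r3 - r1) <= abs(r3 - r2) else r2
    return (r3 + closer) / 2

# Two raters split 3 vs 5; the adjudicator scores 5.
print(rater_mean(3, 5))            # 4.0
print(parity_resolution(3, 5, 5))  # 5.0
```

As the abstract notes, the method chosen can flip a pass/fail decision near a cut score: here the same performance resolves to 4.0 or 5.0 depending on the rule.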
Valencia, Sheila W.; Calfee, Robert – Applied Measurement in Education, 1991
Using portfolios in assessing literacy is explored, considering student portfolios and the teacher's class portfolio. Portfolio assessment is a valuable complement to externally mandated tests, but technical issues must be addressed if the portfolio movement is to survive. Portfolios must be linked to the broader task of instructional improvement.…
Descriptors: Academic Achievement, Educational Assessment, Educational Improvement, Elementary School Teachers