NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 4 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Bejar, Isaac I.; Li, Chen; McCaffrey, Daniel – Applied Measurement in Education, 2020
We evaluate the feasibility of developing predictive models of rater behavior, that is, "rater-specific" models for predicting the scores produced by a rater under operational conditions. In the present study, the dependent variable is the score assigned to essays by a rater, and the predictors are linguistic attributes of the essays…
Descriptors: Scoring, Essays, Behavior, Predictive Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Peer reviewed Peer reviewed
Direct linkDirect link
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Peer reviewed Peer reviewed
Direct linkDirect link
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests