Showing all 10 results
Peer reviewed
Wendler, Cathy; Glazer, Nancy; Bridgeman, Brent – Applied Measurement in Education, 2020
Efficient constructed response (CR) scoring requires both accuracy and speed from human raters. This study was designed to determine whether setting scoring rate expectations would encourage raters to score at a faster pace and, if so, whether there would be differential effects on scoring accuracy for raters who score at different rates. Three rater groups…
Descriptors: Scoring, Expectation, Accuracy, Time
Peer reviewed
Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020
This introductory article describes how constructed response scoring is carried out, particularly the rater monitoring processes, and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…
Descriptors: Scoring, Test Format, Responses, Predictor Variables
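The monitoring designs themselves are not spelled out in this excerpt. As a loose illustration of the general idea, a minimal sketch of one common monitoring check, flagging raters whose exact agreement with pre-scored validity responses falls below a threshold, is given below; the data layout, function name, and 0.70 threshold are illustrative assumptions, not the article's designs.

    # Minimal sketch (assumed data layout): flag raters whose exact-agreement
    # rate on pre-scored "validity" responses falls below a chosen threshold.
    from collections import defaultdict

    def flag_raters(validity_scores, true_scores, threshold=0.70):
        """validity_scores: list of (rater_id, response_id, score);
        true_scores: dict mapping response_id -> consensus score.
        The 0.70 threshold is an arbitrary illustrative value."""
        hits = defaultdict(int)
        totals = defaultdict(int)
        for rater_id, response_id, score in validity_scores:
            totals[rater_id] += 1
            hits[rater_id] += int(score == true_scores[response_id])
        return {r: hits[r] / totals[r]
                for r in totals if hits[r] / totals[r] < threshold}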
Peer reviewed
Bejar, Isaac I.; Li, Chen; McCaffrey, Daniel – Applied Measurement in Education, 2020
We evaluate the feasibility of developing predictive models of rater behavior, that is, "rater-specific" models for predicting the scores produced by a rater under operational conditions. In the present study, the dependent variable is the score assigned to essays by a rater, and the predictors are linguistic attributes of the essays…
Descriptors: Scoring, Essays, Behavior, Predictive Measurement
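The abstract specifies the modeling setup (the rater's assigned score as the dependent variable, linguistic attributes of the essay as predictors) but not the estimator. A minimal sketch of fitting one such rater-specific model, using ordinary least squares as an assumed stand-in for whatever model the authors actually used, is shown below.

    # Sketch of a rater-specific predictive model: for each rater, regress the
    # scores that rater assigned on linguistic attributes of the essays.
    # OLS is an assumed estimator; the paper's actual model may differ.
    import numpy as np

    def fit_rater_model(features, scores):
        """features: (n_essays, n_attributes) array of linguistic attributes;
        scores: (n_essays,) array of scores this rater assigned."""
        X = np.column_stack([np.ones(len(scores)), features])  # add intercept
        coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
        return coef

    def predict(coef, features):
        X = np.column_stack([np.ones(features.shape[0]), features])
        return X @ coef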
Peer reviewed
Choi, Ikkyu; Wolfe, Edward W. – Applied Measurement in Education, 2020
Rater training is essential in ensuring the quality of constructed response scoring. Most of the current knowledge about rater training comes from experimental contexts with an emphasis on short-term effects. Little empirical evidence is available on whether and how raters become more accurate as they gain scoring experience or what…
Descriptors: Scoring, Accuracy, Training, Evaluators
Peer reviewed
Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020
To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and scoring criteria. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…
Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes
Peer reviewed
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
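As a rough illustration of how pairwise better/worse judgments can be aggregated into essay scale values, the sketch below fits a simple Bradley-Terry model with iterative minorization-maximization updates. Bradley-Terry is a common choice for comparative judgment, but the abstract does not state the article's aggregation method, so treat the model and the function names as assumptions.

    # Sketch: aggregate pairwise comparisons into essay scale values with a
    # simple Bradley-Terry fit (iterative MM updates). An assumed approach,
    # not the article's specific method.
    from collections import defaultdict

    def bradley_terry(comparisons, n_iters=100):
        """comparisons: list of (winner_id, loser_id) judgments."""
        wins = defaultdict(int)
        pairs = defaultdict(int)
        items = set()
        for winner, loser in comparisons:
            wins[winner] += 1
            pairs[frozenset((winner, loser))] += 1
            items.update((winner, loser))
        strength = {i: 1.0 for i in items}
        for _ in range(n_iters):
            new = {}
            for i in items:
                denom = sum(pairs[frozenset((i, j))] / (strength[i] + strength[j])
                            for j in items
                            if j != i and frozenset((i, j)) in pairs)
                new[i] = wins[i] / denom if denom > 0 else strength[i]
            total = sum(new.values())
            strength = {i: v * len(items) / total for i, v in new.items()}  # normalize
        return strength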
Peer reviewed
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Peer reviewed
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
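Human-machine agreement of this kind is commonly summarized with statistics such as quadratic weighted kappa, Pearson correlation, and exact agreement. The sketch below shows one way to compute them on illustrative data; the specific statistics and refinements used in the article are not given in this excerpt.

    # Sketch: common human-machine agreement summaries for essay scores,
    # computed on made-up data.
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    human = np.array([3, 4, 2, 5, 3, 4])    # illustrative human scores
    machine = np.array([3, 4, 3, 5, 2, 4])  # illustrative automated scores

    qwk = cohen_kappa_score(human, machine, weights="quadratic")
    r = np.corrcoef(human, machine)[0, 1]
    exact = np.mean(human == machine)
    print(f"QWK={qwk:.3f}  r={r:.3f}  exact agreement={exact:.3f}")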
Peer reviewed
Clauser, Brian E.; Ross, Linette P.; Clyman, Stephen G.; Rose, Kathie M.; Margolis, Melissa J.; Nungester, Ronald J.; Piemme, Thomas E.; Chang, Lucy; El-Bayoumi, Gigi; Malakoff, Gary L.; Pincetl, Pierre S. – Applied Measurement in Education, 1997
Describes an automated scoring algorithm for a computer-based simulation examination of physicians' patient-management skills. Results with 280 medical students show that scores produced using this algorithm are highly correlated with actual clinician ratings. Scores were also effective in discriminating between case performance judged passing or…
Descriptors: Algorithms, Computer Assisted Testing, Computer Simulation, Evaluators
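The excerpt mentions two kinds of evidence: correlation of algorithm scores with clinician ratings, and discrimination between passing and failing case performances. The sketch below computes both on made-up data, summarizing discrimination with an AUC, which is an assumed summary rather than the study's reported statistic.

    # Sketch: two validity checks mentioned in the abstract, computed on
    # illustrative (made-up) data: correlation with clinician ratings and
    # discrimination between pass/fail judgments (summarized here as an AUC).
    import numpy as np
    from sklearn.metrics import roc_auc_score

    algorithm_scores = np.array([0.62, 0.81, 0.45, 0.90, 0.55, 0.73])
    clinician_ratings = np.array([3.0, 4.5, 2.5, 5.0, 3.0, 4.0])
    passed = np.array([0, 1, 0, 1, 0, 1])  # clinician pass/fail judgment

    r = np.corrcoef(algorithm_scores, clinician_ratings)[0, 1]
    auc = roc_auc_score(passed, algorithm_scores)
    print(f"correlation with ratings={r:.3f}  pass/fail AUC={auc:.3f}")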
Peer reviewed
Linn, Robert L.; And Others – Applied Measurement in Education, 1992
Ten states participated in a cross-state scoring workshop in 1991, evaluating writing from elementary school, middle school, and high school students. Correlations of scores assigned by readers from one state with those assigned by readers from another state were generally quite high. Implications for defining common standards are discussed. (SLD)
Descriptors: Comparative Analysis, Correlation, Elementary School Students, Elementary Secondary Education
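As a rough illustration of the cross-state comparison described above, the sketch below correlates scores assigned to the same papers by readers from different states; the data layout and values are illustrative assumptions, not the workshop's data.

    # Sketch: correlate scores assigned by readers from different states to
    # the same papers. Data layout and values are illustrative assumptions.
    import numpy as np

    # rows = papers, columns = states; each cell is the score a reader from
    # that state assigned to the paper
    scores_by_state = np.array([
        [4, 4, 5],
        [2, 3, 2],
        [5, 5, 5],
        [3, 3, 4],
        [1, 2, 1],
    ])

    # pairwise correlations between states' readers
    corr = np.corrcoef(scores_by_state, rowvar=False)
    print(np.round(corr, 2))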