Publication Date
In 2025: 0
Since 2024: 1
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 13
Since 2006 (last 20 years): 17
Descriptor
Essay Tests: 19
Scoring: 14
Writing Tests: 10
Evaluators: 9
Essays: 8
Interrater Reliability: 8
Writing Skills: 6
Comparative Analysis: 5
Correlation: 5
Scores: 5
Statistical Analysis: 5
Source
Applied Measurement in Education: 26
Author
Powers, Donald E.: 4
Attali, Yigal: 2
Bridgeman, Brent: 2
Fowles, Mary E.: 2
Angoff, William H.: 1
Arslan, Burcu: 1
Bejar, Isaac I.: 1
Ben-Simon, Anat: 1
Boyer, Michelle: 1
Busch, John Christian: 1
Buzick, Heather: 1
Publication Type
Journal Articles: 26
Reports - Research: 21
Reports - Evaluative: 6
Speeches/Meeting Papers: 3
Tests/Questionnaires: 2
Information Analyses: 1
Reports - Descriptive: 1
Education Level
Higher Education: 4
Postsecondary Education: 4
Grade 7: 1
Location
Europe: 1
Georgia: 1
Iran (Tehran): 1
Israel: 1
New York: 1
Assessments and Surveys
Graduate Record Examinations: 4
Test of English as a Foreign Language: 2
Bar Examinations: 1
College Level Examination Program: 1
National Teacher Examinations: 1
SAT (College Admission Test): 1
Bejar, Isaac I.; Li, Chen; McCaffrey, Daniel – Applied Measurement in Education, 2020
We evaluate the feasibility of developing predictive models of rater behavior, that is, "rater-specific" models for predicting the scores produced by a rater under operational conditions. In the present study, the dependent variable is the score assigned to essays by a rater, and the predictors are linguistic attributes of the essays…
Descriptors: Scoring, Essays, Behavior, Predictive Measurement
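The rater-specific modeling idea above can be illustrated with a minimal sketch: fit one regression per rater, predicting that rater's scores from essay-level linguistic attributes. The features, simulated data, and model choice below are hypothetical stand-ins, not details from the study.

```python
# A minimal sketch of a "rater-specific" scoring model, assuming essay-level
# linguistic features have already been extracted. All values are simulated.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_essays = 500
# Hypothetical linguistic attributes: e.g., length, error rate, vocab diversity
X = rng.normal(size=(n_essays, 3))
# Simulated scores from one rater: a noisy function of the attributes
y = np.clip(np.round(3 + 1.2 * X[:, 0] - 0.8 * X[:, 1]
                     + rng.normal(scale=0.7, size=n_essays)), 1, 6)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)  # one model per rater
print("held-out R^2 for this rater's model:", model.score(X_test, y_test))
```

In an operational setting, a separate model of this kind would be fit for each rater, and systematic departures between predicted and assigned scores could flag drift.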
Wendler, Cathy; Glazer, Nancy; Bridgeman, Brent – Applied Measurement in Education, 2020
Efficient constructed response (CR) scoring requires both accuracy and speed from human raters. This study was designed to determine whether setting scoring rate expectations would encourage raters to score at a faster pace and, if so, whether there would be differential effects on scoring accuracy for raters who score at different rates. Three rater groups…
Descriptors: Scoring, Expectation, Accuracy, Time
Hamdollah Ravand; Farshad Effatpanah; Wenchao Ma; Jimmy de la Torre; Purya Baghaei; Olga Kunina-Habenicht – Applied Measurement in Education, 2024
The purpose of this study was to explore the nature of interactions among second/foreign language (L2) writing subskills. Two types of relationships were investigated: subskill-item and subskill-subskill relationships. To achieve the first purpose, using data obtained from the writing essays of 500 English as a foreign language (EFL)…
Descriptors: Second Language Learning, Writing Instruction, Writing Skills, Writing Tests
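Subskill-item relationships of the kind investigated above are conventionally encoded in a Q-matrix (items by subskills) in cognitive diagnosis modeling. Below is a minimal sketch of that data structure; the subskill labels and entries are hypothetical, not taken from the study.

```python
# A minimal sketch of a Q-matrix linking items to writing subskills.
# Labels and entries are illustrative assumptions only.
import numpy as np

subskills = ["content", "organization", "vocabulary", "grammar"]
# Q[i, k] = 1 if item i is assumed to measure subskill k
Q = np.array([
    [1, 0, 0, 1],  # item 1: content + grammar
    [0, 1, 0, 0],  # item 2: organization only
    [1, 1, 1, 0],  # item 3: content + organization + vocabulary
])
for i, row in enumerate(Q, start=1):
    needed = [s for s, q in zip(subskills, row) if q]
    print(f"item {i} requires: {', '.join(needed)}")
```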
Choi, Ikkyu; Wolfe, Edward W. – Applied Measurement in Education, 2020
Rater training is essential in ensuring the quality of constructed response scoring. Most of the current knowledge about rater training comes from experimental contexts with an emphasis on short-term effects. Few sources provide empirical evidence on whether and how raters become more accurate as they gain scoring experience or what…
Descriptors: Scoring, Accuracy, Training, Evaluators
Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019
Analysis of keystroke logging data is of increasing interest, as is evident from a substantial amount of recent research on the topic. Some of this research has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research…
Descriptors: Scores, Prediction, Writing Processes, Data Analysis
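To illustrate the move beyond linear regression that this abstract points to, here is a minimal sketch comparing linear regression with a non-linear alternative (a random forest) for predicting scores from keystroke-style features. The features and data are simulated stand-ins, not the study's.

```python
# A minimal sketch: linear vs. non-linear prediction of essay scores from
# hypothetical keystroke-logging features (pauses, bursts, revisions, rate).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 4))
# Simulated scores with an interaction a linear model cannot capture
y = 2 + X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n)

for name, est in [("linear", LinearRegression()),
                  ("forest", RandomForestRegressor(random_state=0))]:
    r2 = cross_val_score(est, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.2f}")
```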
Finn, Bridgid; Arslan, Burcu; Walsh, Matthew – Applied Measurement in Education, 2020
To score an essay response, raters draw on previously trained skills and knowledge about the underlying rubric and score criterion. Cognitive processes such as remembering, forgetting, and skill decay likely influence rater performance. To investigate how forgetting influences scoring, we evaluated raters' scoring accuracy on TOEFL and GRE essays.…
Descriptors: Epistemology, Essay Tests, Evaluators, Cognitive Processes
Shermis, Mark D. – Applied Measurement in Education, 2018
This article employs the Common European Framework of Reference for Languages (CEFR) as a basis for evaluating writing in the context of machine scoring. The CEFR was designed as a framework for evaluating speaking proficiency levels across the 49 languages of the European Union. The intent was to impact language instruction so…
Descriptors: Scoring, Automation, Essays, Language Proficiency
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
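One common way to compare multiple automated raters to humans item by item is quadratically weighted kappa. Below is a minimal sketch of that comparison with simulated data; the engine names and error rates are hypothetical, not the study's procedure.

```python
# A minimal sketch of item-by-item rater comparison via quadratically
# weighted kappa (QWK). All scores are simulated.
import numpy as np
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(2)
n_items, n_resp = 3, 200
human = rng.integers(0, 4, size=(n_items, n_resp))  # scores 0-3

def noisy_rater(scores, flip_prob):
    """Simulate an automated rater that disagrees with humans at flip_prob."""
    noise = rng.integers(0, 4, size=scores.shape)
    mask = rng.random(scores.shape) < flip_prob
    return np.where(mask, noise, scores)

for name, p in [("engine_A", 0.1), ("engine_B", 0.3)]:
    for item in range(n_items):
        k = cohen_kappa_score(human[item], noisy_rater(human[item], p),
                              weights="quadratic")
        print(f"{name}, item {item + 1}: QWK = {k:.2f}")
```

Whether engine_A outranks engine_B can then be checked across all items rather than on a single item, which is the comparison problem the abstract raises.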
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written in response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to each essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
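The core design, approximating an essay's true score by averaging many raters and then checking how well engine scores track that average, can be sketched as follows. All scores below are simulated; the noise levels are arbitrary assumptions.

```python
# A minimal sketch of the multi-rater "true score" approximation.
import numpy as np

rng = np.random.default_rng(3)
n_essays, n_raters = 250, 15
true = rng.normal(4, 1, size=n_essays)
raters = true[:, None] + rng.normal(scale=0.8, size=(n_essays, n_raters))
approx_true = raters.mean(axis=1)            # multi-rater approximation
engine = true + rng.normal(scale=0.6, size=n_essays)  # simulated AES scores

print("engine vs. single rater: ", np.corrcoef(engine, raters[:, 0])[0, 1])
print("engine vs. rater average:", np.corrcoef(engine, approx_true)[0, 1])
```

Averaging raters attenuates individual rater error, so the correlation against the average is a cleaner validity signal than agreement with any one rater.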
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
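Comparative judgment aggregates pairwise "which essay is better" decisions into scale values, commonly with a Bradley-Terry model. Here is a minimal sketch that simulates judge decisions and recovers a ranking with a simple minorization-maximization fit; nothing in it comes from the study itself.

```python
# A minimal sketch of comparative judgment scored via Bradley-Terry.
import numpy as np

rng = np.random.default_rng(4)
n = 8                                   # essays
quality = rng.normal(size=n)            # latent essay quality
wins = np.zeros((n, n))                 # wins[i, j] = times i beat j
for _ in range(600):                    # simulated pairwise judgments
    i, j = rng.choice(n, size=2, replace=False)
    p_win = 1 / (1 + np.exp(quality[j] - quality[i]))
    if rng.random() < p_win:
        wins[i, j] += 1
    else:
        wins[j, i] += 1

p = np.ones(n)                          # Bradley-Terry strengths
comparisons = wins + wins.T
for _ in range(200):                    # minorization-maximization updates
    denom = (comparisons / (p[:, None] + p[None, :])).sum(axis=1)
    p = wins.sum(axis=1) / denom
    p = np.maximum(p, 1e-9)             # guard against all-loss essays
    p /= p.sum()

print("recovered ranking:", np.argsort(-p))
print("true ranking:     ", np.argsort(-quality))
```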
Rupp, André A. – Applied Measurement in Education, 2018
This article discusses critical methodological design decisions for collecting, interpreting, and synthesizing empirical evidence during the design, deployment, and operational quality-control phases for automated scoring systems. The discussion is inspired by work on operational large-scale systems for automated essay scoring but many of the…
Descriptors: Design, Automation, Scoring, Test Scoring Machines
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
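When human scores serve as the criterion, agreement with the machine is typically summarized with statistics such as exact and adjacent agreement. A minimal sketch, on simulated scores rather than the study's data:

```python
# A minimal sketch of exact and adjacent human-machine agreement rates.
import numpy as np

rng = np.random.default_rng(5)
human = rng.integers(1, 7, size=300)    # human scores on a 1-6 scale
# Simulated machine scores that usually match, sometimes drift by one point
machine = np.clip(human + rng.choice([-1, 0, 0, 0, 1], size=300), 1, 6)

exact = np.mean(machine == human)
adjacent = np.mean(np.abs(machine - human) <= 1)
print(f"exact agreement:    {exact:.2%}")
print(f"adjacent agreement: {adjacent:.2%}")
```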
Bridgeman, Brent; Trapani, Catherine; Attali, Yigal – Applied Measurement in Education, 2012
Essay scores generated by machine and by human raters are generally comparable; that is, they can produce scores with similar means and standard deviations, and machine scores generally correlate as highly with human scores as scores from one human correlate with scores from another human. Although human and machine essay scores are highly related…
Descriptors: Scoring, Essay Tests, College Entrance Examinations, High Stakes Tests
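The comparison described above, whether a machine matches human scores in mean, spread, and correlation as well as a second human does, can be sketched directly. All scores below are simulated under an assumed common true-score model.

```python
# A minimal sketch: human-human vs. human-machine score comparability.
import numpy as np

rng = np.random.default_rng(6)
true = rng.normal(4, 1, size=500)
human1 = true + rng.normal(scale=0.7, size=500)
human2 = true + rng.normal(scale=0.7, size=500)
machine = true + rng.normal(scale=0.7, size=500)

for name, s in [("human2", human2), ("machine", machine)]:
    r = np.corrcoef(human1, s)[0, 1]
    print(f"human1 vs {name}: r = {r:.2f}, "
          f"mean = {s.mean():.2f}, sd = {s.std():.2f}")
```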