ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	4

Descriptor

Interrater Reliability	26
Testing Programs	26
Scoring	15
Writing Evaluation	12
State Programs	11
Evaluation Methods	10
Educational Assessment	9
Essay Tests	7
Performance Based Assessment	6
Test Construction	6
Elementary Secondary Education	5
Grade 8	5
High Schools	5
Standardized Tests	5
Student Evaluation	5
Academic Achievement	4
Essays	4
Evaluators	4
Grade 4	4
Grade 6	4
Holistic Evaluation	4
Scores	4
Test Reliability	4
Writing (Composition)	4
Writing Tests	4

Source

College Teaching	2
American Journal of Business…	1
ETS Research Report Series	1
Educational Assessment	1
Journal of Educational…	1
Journal of Experimental…	1
Journal of Industrial Teacher…	1
Journal of Personnel…	1
Journal of Professional…	1
National Center for Research…	1
New York State Education…	1
Online Submission	1

Publication Type

Reports - Research	15
Journal Articles	10
Reports - Evaluative	9
Speeches/Meeting Papers	8
Numerical/Quantitative Data	3
Opinion Papers	2
Reports - Descriptive	2
Tests/Questionnaires	2

Education Level

Grade 4	3
Elementary Education	2
Grade 5	2
Grade 6	2
Grade 8	2
Early Childhood Education	1
Grade 3	1
Grade 7	1
High Schools	1
Higher Education	1
Intermediate Grades	1
Junior High Schools	1
Middle Schools	1
Postsecondary Education	1
Primary Education	1
Secondary Education	1

Audience

Practitioners	2
Teachers	2
Researchers	1

Location

Pennsylvania	2
California	1
New York	1
Texas	1

Laws, Policies, & Programs

No Child Left Behind Act 2001	2
Individuals with Disabilities…	1

Assessments and Surveys

Advanced Placement…	3
National Assessment of…	2
General Educational…	1

Showing 1 to 15 of 26 results

New York State Alternate Assessment Technical Report, 2013-14

New York State Education Department, 2014

This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…

Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Business Education Innovation: How Common Exams Can Improve University Teaching

Peer reviewed

Unger, Darian – American Journal of Business Education, 2010

Although there is significant research on improving college-level teaching practices, most literature in the field assumes an incentive for improvement. The research presented in this paper addresses the issue of poor incentives for improving university-level teaching. Specifically, it proposes instructor-designed common examinations as an…

Descriptors: Educational Innovation, Educational Improvement, Instructional Improvement, Business Administration Education

Evaluating Nurse Competency: Evidence of Validity for a Skills Recredentialing Program.

Peer reviewed

Jones, Terry; Cason, Carolyn L.; Mancini, Mary E. – Journal of Professional Nursing, 2002

Registered nurses (n=368) participated in a skills recredentialing program in which competencies were assessed by a knowledge test and performance test under simulated conditions and evaluator ratings in actual patient-care situations. No significant differences in results between the simulated and actual conditions support the validity of the…

Descriptors: Competence, Credentials, Interrater Reliability, Nurses

What's Wrong at E.T.S.? Insider's View of Grading A.P. Government Essays.

Peer reviewed

Miller, Jeff – College Teaching, 1999

A college faculty member who has graded Advanced Placement exam essays on U.S. government and politics, taken mostly by high school juniors and seniors, suggests that high school teachers and college faculty who assess the essays are not the best qualified persons to do so and that despite efforts to ensure consistency, the resulting scores are…

Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

Educational Testing Service Responds.

Peer reviewed

McLauchlan, William – College Teaching, 1999

A faculty consultant to the Educational Testing Service for advanced placement (AP) test reading in U.S. government and politics responds to an article criticizing essay evaluation methods and criteria, finding in it a fundamental misunderstanding of the AP reading process and explaining why the essays are subject to less scrutiny for style,…

Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

Reliability and Decision Consistency: An Analysis of Writing Mode at Two Times on a Statewide Test.

Peer reviewed

Hollenbeck, Keith; Tindal, Gerald; Almond, Patricia – Educational Assessment, 1999

Studied the amount of measurement error in a state's performance-based writing task as it relates to high-stakes decision reproducibility. Using 175 eighth-grade writing samples, the study finds moderate correlations between the two raters' scores, with significant differences for the rates for the handwritten, but not the typed, essays. (SLD)

Descriptors: Decision Making, Error of Measurement, Essay Tests, Grade 8

Development of a Procedure for Establishing Occupational Examination Cut Scores: A NOCTI Example.

Peer reviewed

Walter, Richard A.; Kapes, Jerome T. – Journal of Industrial Teacher Education, 2003

To identify a procedure for establishing cut scores for National Occupational Competency Testing Institute examinations in Pennsylvania, an expert panel assessed written and performance test items for minimally competent workers. Recommendations about the number, type, and training of judges used were made. (Contains 18 references.) (SK)

Descriptors: Cutting Scores, Interrater Reliability, Occupational Tests, Teacher Competency Testing

Issues in Portfolio Assessment: The Scorability of Narrative Collections. Project 3.1: Studies in Improving Classroom and Local Assessments.

Gearhart, Maryl; Novak, John R.; Herman, Joan L. – 1994

Technical questions regarding the reliability and validity of large-scale portfolio assessment were studied which focused on: (1) whether raters can score collections of writing reliably with rubrics designed for single samples; (2) whether ratings derived from different frameworks differ in their capacities to support technically sound…

Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Essay Tests

The Stability of Rater Severity in Large-Scale Assessment Programs.

Peer reviewed

Congdon, Peter J.; McQueen, Joy – Journal of Educational Measurement, 2000

Studied the stability of rater severity over an extended rating period by applying multifaceted Rasch analysis to ratings of 16 raters of writing performances of 8,285 elementary school students. Findings cast doubt on the practice of using a single calibration of rater severity as the basis for adjustment of person measures. (SLD)

Descriptors: Educational Assessment, Elementary Education, Elementary School Students, Interrater Reliability

Computer Grading of Student Prose, Using Modern Concepts and Software.

Peer reviewed

Page, Ellis Batten – Journal of Experimental Education, 1994

National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)

Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods

Creating Accurate Science Benchmark Assessments to Inform Instruction. CSE Technical Report 730

Vendlinski, Terry P.; Nagashima, Sam; Herman, Joan L. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2007

Current educational policy highlights the important role that assessment can play in improving education. State standards and the assessments that are aligned with them establish targets for learning and promote school accountability for helping all students succeed; at the same time, feedback from assessment results is expected to provide…

Descriptors: Elementary School Science, Federal Legislation, State Standards, Educational Improvement

"Can Johnny Write?" Florida Minimal Writing Skills Assessment. Technical Report, 1978-79.

University of South Florida, Tampa. Coll. of Education. – 1980

This report describes the procedures followed in scoring the October 1978 Florida Minimal Writing Production Skills Assessment and reports the results of that assessment. The assessment was conducted on a sample of Florida public school students in grades 3, 5, 8, and 11. Sections include descriptions of the rating scale and scorer's guide as well…

Descriptors: Educational Assessment, Elementary Secondary Education, Interrater Reliability, Minimum Competency Testing

A Detailed Analysis of Statewide Teacher Appraisal Scores.

Peer reviewed

Tyson, LeaAnn; Silverman, Stephen – Journal of Personnel Evaluation in Education, 1994

Differences in the Texas Teacher Appraisal System scores of teacher subgroups over 2 years were examined for 2,366 teachers for scores on individual domains, sums of scores of the 1st 4 domains, and overall summary performance scores, as well as appraiser differences. Implications for teacher evaluation are discussed. (SLD)

Descriptors: Educational Assessment, Elementary Secondary Education, Evaluation Methods, Evaluators

Effects of Essay Order on Raters' Score Assignments in a Large-Scale Writing Assessment.

Ferrara, Steven F. – 1987

The necessity of controlling the order in which trained essay raters for a statewide writing assessment program receive student essays was studied. The underlying theoretical question concerns possible rater bias caused by raters reading long strings of essays of homogeneous quality; this problem is usually referred to as context effect or…

Descriptors: Context Effect, Essay Tests, Evaluators, Graduation Requirements

Author

Ferrara, Steven F.	2
Herman, Joan L.	2
Almond, Patricia	1
Auchter, Joan Chikos	1
Brauchle, Paul E.	1
Braungart-Bloom, Diane S.	1
Breyer, F. Jay	1
Cason, Carolyn L.	1
Congdon, Peter J.	1
Friedman, Greg	1
Gearhart, Maryl	1
Hollenbeck, Keith	1
Howard, Edward H.	1
Jones, Terry	1
Kapes, Jerome T.	1
Linn, Robert L.	1
Lorenz, Florian	1
Mancini, Mary E.	1
Masters, James R.	1
McLauchlan, William	1
McQueen, Joy	1
Michaels, Hillary	1
Miller, Jeff	1
Mislevy, Robert J.	1