ERIC - Search Results

Showing 1 to 15 of 21 results

Using Rasch Measurement to Score, Evaluate, and Improve Examinations in an Anatomy Course

Peer reviewed

Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014

Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…

Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy

The Machine Scoring of Writing

Peer reviewed

McCurry, Doug – English in Australia, 2010

This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…

Descriptors: Testing Programs, High Stakes Tests, Computer Software, Scoring

Detecting and Correcting Scale Drift in Test Equating: An Illustration from a Large Scale Testing Program

Peer reviewed

Puhan, Gautam – Applied Measurement in Education, 2009

The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…

Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory

The 2008-2009 Pennsylvania System of School Assessment Handbook for Assessment Coordinators: Writing, Reading and Mathematics, Science

Pennsylvania Department of Education, 2010

This handbook describes the responsibilities of district and school assessment coordinators in the administration of the Pennsylvania System of School Assessment (PSSA). This updated guidebook contains the following sections: (1) General Assessment Guidelines for All Assessments; (2) Writing Specific Guidelines; (3) Reading and Mathematics…

Descriptors: Guidelines, Guides, Educational Assessment, Writing Tests

Automated Scoring of Spontaneous Speech Using SpeechRater℠ v1.0. Research Report. ETS RR-08-62

Peer reviewed

Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David M. – ETS Research Report Series, 2008

This report presents the results of a research and development effort for SpeechRater℠ Version 1.0 (v1.0), an automated scoring system for the spontaneous speech of English language learners used operationally in the Test of English as a Foreign Language™ (TOEFL®) Practice Online assessment (TPO). The report includes a summary of the validity…

Descriptors: Speech, Scoring, Scoring Rubrics, Scoring Formulas

Computer Grading of Student Prose, Using Modern Concepts and Software.

Peer reviewed

Page, Ellis Batten – Journal of Experimental Education, 1994

National Assessment of Educational Progress writing sample essays from 1988 and 1990 (495 and 599 essays) were subjected to computerized grading and human ratings. Cross-validation suggests that computer scoring is superior to a two-judge panel, a finding encouraging for large programs of essay evaluation. (SLD)

Descriptors: Computer Assisted Testing, Computer Software, Essays, Evaluation Methods

Guidelines for the Management of Performance Assessments in Large-Scale Assessment Programs.

Roeber, Edward D. – 1996

This paper is based on guidelines developed in 1989 for training workshops for state and local educators to demonstrate the processes by which performance assessments could be created, validated, and used in statewide assessment programs. These guidelines are based on work with the National Assessment of Educational Progress and several statewide…

Descriptors: Evaluation Methods, Performance Based Assessment, Sampling, Scoring

Examining the Costs of Performance Assessment.

Peer reviewed

Hardy, Roy A. – Applied Measurement in Education, 1995

Cost factors associated with the development, administration, and scoring of performance assessment tasks are examined in the context of a statewide or other large-scale assessment program. Resources of money, time, and expertise are discussed. (SLD)

Descriptors: Cost Estimates, Costs, Educational Assessment, Estimation (Mathematics)

Alaska Writing Assessment - 1997: Preliminary Technical Report.

Fenton, Ray; Straugh, Tom; Stofflet, Fred – 1997

Writing assessment began in Alaska in the 1970s, and the Alaska Writing Assessment (AWA) that was piloted in 1997 built on previous efforts. The 1997 AWA involved more than 20,000 students in grades 5, 7, and 10 from 43 school districts, and the mandatory assessment planned for 1998 will include approximately 28,000 students. This review of the…

Descriptors: Elementary Secondary Education, Evaluation Methods, Program Implementation, Resource Allocation

10 Steps to District Performance Assessment.

Driscoll, Lydia Abell – 1996

In the 1995-96 school year, the Memphis (Tennessee) City Schools released standards for student performance in seven content areas and began laying the foundation for a standards-based curriculum and assessment system. The steps taken to develop and implement this project are outlined as follows: (1) defining the objectives and the project scope;…

Descriptors: Educational Finance, Educational Planning, Educational Testing, Elementary Secondary Education

The Effect of Year-to-Year Rater Variation on IRT Linking

Yen, Shu Jing; Ochieng, Charles; Michaels, Hillary; Friedman, Greg – Online Submission, 2005

Year-to-year rater variation may result in constructed response (CR) parameter changes, making CR items inappropriate to use in anchor sets for linking or equating. This study demonstrates how rater severity affected the writing and reading scores. Rater adjustments were made to statewide results using an item response theory (IRT) methodology…

Descriptors: Test Items, Writing Tests, Reading Tests, Measures (Individuals)

The Reliability of Vermont Portfolio Scores in the 1992-93 School Year. Interim Report. RAND Reprints Series.

Koretz, Daniel; And Others – 1994

The 1992-93 school year saw the second statewide implementation of the Vermont portfolio-assessment program, and RAND continued its ongoing evaluation of the program's implementation, effects, and data quality. While the first year's study found evidence of the impact of the assessment program and low reliability of portfolio scoring, this year's…

Descriptors: Educational Assessment, Elementary Secondary Education, Evaluation Methods, Mathematics

What's Wrong at E.T.S.? Insider's View of Grading A.P. Government Essays.

Peer reviewed

Miller, Jeff – College Teaching, 1999

A college faculty member who has graded Advanced Placement exam essays on U.S. government and politics, taken mostly by high school juniors and seniors, suggests that high school teachers and college faculty who assess the essays are not the best qualified persons to do so and that despite efforts to ensure consistency, the resulting scores are…

Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

Educational Testing Service Responds.

Peer reviewed

McLauchlan, William – College Teaching, 1999

A faculty consultant to the Educational Testing Service for advanced placement (AP) test reading in U.S. government and politics responds to an article criticizing essay evaluation methods and criteria, finding in it a fundamental misunderstanding of the AP reading process and explaining why the essays are subject to less scrutiny for style,…

Descriptors: Advanced Placement, College Instruction, Essays, Evaluation Criteria

Writing, Grades Six and Eight. Report of Student Performance, 1985-86. North Carolina Annual Testing Program, Basic Skills.

North Carolina State Dept. of Public Instruction, Raleigh. Div. of Research. – 1986

This report describes the North Carolina Annual Testing Program's writing task, which was administered in 1985-86. Grade 6 students were tested on their ability to write a clarification composition, while grade 8 students were evaluated on their skills in writing a persuasive composition. The timed composition (50 minutes) was scored by two…

Descriptors: Basic Skills, Coherence, Cohesion (Written Composition), Elementary Education
