ERIC - Search Results

Showing all 11 results

Accumulative Equating Error after a Chain of Linear Equatings

Peer reviewed

Guo, Hongwen – Psychometrika, 2010

After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…

Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores

Nonparametric Item Response Curve Estimation with Correction for Measurement Error

Peer reviewed

Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011

Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…

Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement

Limits on the Accuracy of Linking. Research Report. ETS RR-10-22

Haberman, Shelby J. – Educational Testing Service, 2010

Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…

Descriptors: Testing Programs, Equated Scores, Sampling, Accuracy

Progress and Proficiency: Redesigning Grading for Competency Education. CompetencyWorks Issue Brief

Sturgis, Chris – International Association for K-12 Online Learning, 2014

This paper is part of a series investigating the implementation of competency education. The purpose of the paper is to explore how districts and schools can redesign grading systems to best help students to excel in academics and to gain the skills that are needed to be successful in college, the community, and the workplace. In order to make the…

Descriptors: Grading, Competency Based Education, Evaluation Methods, Evaluation Research

Analyzing the Reliability of the easyCBM Reading Comprehension Measures: Grade 7. Technical Report #1206

Irvin, P. Shawn; Alonzo, Julie; Lai, Cheng-Fei; Park, Bitnara Jasmine; Tindal, Gerald – Behavioral Research and Teaching, 2012

In this technical report, we present the results of a reliability study of the seventh-grade multiple choice reading comprehension measures available on the easyCBM learning system conducted in the spring of 2011. Analyses include split-half reliability, alternate form reliability, person and item reliability as derived from Rasch analysis,…

Descriptors: Reading Comprehension, Testing Programs, Statistical Analysis, Grade 7

When Can Subscores Have Value?

Peer reviewed

Haberman, Shelby J. – Journal of Educational and Behavioral Statistics, 2008

In educational tests, subscores are often generated from a portion of the items in a larger test. Guidelines based on mean squared error are proposed to indicate whether subscores are worth reporting. Alternatives considered are direct reports of subscores, estimates of subscores based on total score, combined estimates based on subscores and…

Descriptors: Testing Programs, Regression (Statistics), Scores, Student Evaluation

The Reliability of Linearly Equated Tests.

Peer reviewed

Segall, Daniel O. – Psychometrika, 1994

An asymptotic expression for the reliability of a linearly equated test is developed using normal theory. Reliability is expressed as the product of test reliability before equating and an adjustment term that is a function of the sample sizes used to estimate the linear equating transformation. The approach is illustrated. (SLD)

Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size

Vertically Articulated Performance Standards: Logic, Procedures, and Likely Classification Accuracy

Peer reviewed

Ferrara, Steve; Johnson, Eugene; Chen, Wen-Hung – Applied Measurement in Education, 2005

Psychometricians continue to develop and evaluate methods for linking test scores, both horizontally and vertically. This article describes a social moderation process for articulating (i.e., linking) performance standards across grade levels for an operational state assessment program. The researchers used generated data to evaluate the likely…

Descriptors: Grade 2, Grade 3, Scores, Error of Measurement

The Determination of Empirical Standard Errors of Equating the Scores on SAT-Verbal and SAT-Mathematical.

Angoff, William H. – 1991

An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…

Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students

Conditional Standard Errors of Measurement for Scale Scores.

Peer reviewed

Kolen, Michael J.; And Others – Journal of Educational Measurement, 1992

A procedure is described for estimating the reliability and conditional standard errors of measurement of scale scores incorporating the discrete transformation of raw scores to scale scores. The method is illustrated using a strong true score model, and practical applications are described. (SLD)

Descriptors: College Entrance Examinations, Equations (Mathematics), Error of Measurement, Estimation (Mathematics)

Review of the Scoring Procedures for the Occupational Competency Assessment Program in Pennsylvania. Final Report. Vocational-Technical Education Research Report.

Kapes, Jerome T.; Welch, Frederick G. – 1985

Procedures for establishing the cutoff scores for the occupational competency exams administered to individuals graduating from Pennsylvania vocational teacher education programs were reviewed. A total of 595 National Occupational Competency Testing Institute (NOCTI) Exams administered in 43 occupations or trades at three colleges between the…

Descriptors: Cutting Scores, Educational Policy, Error of Measurement, Evaluation Criteria