ERIC - Search Results

Showing all 15 results

Limits on the Accuracy of Linking. Research Report. ETS RR-10-22

Haberman, Shelby J. – Educational Testing Service, 2010

Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…

Descriptors: Testing Programs, Equated Scores, Sampling, Accuracy

Progress and Proficiency: Redesigning Grading for Competency Education. CompetencyWorks Issue Brief

Sturgis, Chris – International Association for K-12 Online Learning, 2014

This paper is part of a series investigating the implementation of competency education. The purpose of the paper is to explore how districts and schools can redesign grading systems to best help students to excel in academics and to gain the skills that are needed to be successful in college, the community, and the workplace. In order to make the…

Descriptors: Grading, Competency Based Education, Evaluation Methods, Evaluation Research

Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates across Outcome Measures

Peer reviewed

Papay, John P. – American Educational Research Journal, 2011

Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I…

Descriptors: Urban Schools, Teacher Effectiveness, Reading Achievement, Achievement Tests

Five Common Misuses of Tests. ERIC Digest No. 108.

Gardner, Eric – 1989

Five of the common misuses of tests are reviewed: (1) acceptance of the test title as an accurate and complete description of the variable being measured (failure to examine the manual and the items carefully to know the specific aspects to be tested can result in misuse through selection of an inappropriate test for a particular purpose or…

Descriptors: Error of Measurement, Evaluation Problems, Examiners, Scoring

Self-Correction of Wrong Answers as an Alternative to the Arbitrary Setting of Observed-Score Standards in Competency Testing.

Peer reviewed

Cahan, Sorel; Cohen, Nora – Educational and Psychological Measurement, 1990

A solution is offered to problems associated with the inequality in the manipulability of probabilities of classification errors of masters versus nonmasters, based on competency test results. Eschewing the typical arbitrary establishment of observed-score standards below 100 percent, the solution incorporates a self-correction of wrong answers.…

Descriptors: Classification, Error of Measurement, Mastery Tests, Minimum Competency Testing

Influence of Item Parameter Errors in Test Development.

Hambleton, Ronald K.; And Others – 1990

Item response theory (IRT) model parameter estimates have considerable merit and open up new directions for test development, but misleading results are often obtained because of errors in the item parameter estimates. The problem of the effects of item parameter estimation errors on the test development process is discussed, and the seriousness…

Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Sampling

Stepping Up Test Score Conditional Variances.

Peer reviewed

Woodruff, David – Journal of Educational Measurement, 1991

Improvements are made on previous estimates for the conditional standard error of measurement in prediction, the conditional standard error of estimation (CSEE), and the conditional standard error of prediction (CSEP). Better estimates of how test length affects CSEE and CSEP are derived. (SLD)

Descriptors: Equations (Mathematics), Error of Measurement, Estimation (Mathematics), Mathematical Models

Item Parameter Estimation Errors and Their Influence on Test Information Functions.

Hambleton, Ronald K.; Jones, Russell W. – 1993

Errors in item parameter estimates have a negative impact on the accuracy of item and test information functions. The estimation errors may be random, but because items with higher levels of discriminating power are more likely to be selected for a test, and these items are most apt to contain positive errors, the result is that item information…

Descriptors: Computer Simulation, Error of Measurement, Estimation (Mathematics), Item Banks

An Alternative Interpretation of Three Stability Models. Measurement and Methodology, Work Unit 2: Technical Adequacy of Tests.

Wilcox, Rand R. – 1978

Two fundamental problems in mental test theory are to estimate true score and to estimate the amount of error when testing an examinee. In this report, three probability models which characterize a single test item in terms of a population of examinees are described. How these models may be modified to characterize a single examinee in terms of an…

Descriptors: Achievement Tests, Comparative Analysis, Error of Measurement, Mathematical Models

The Mismeasurement of Educational Quality.

Popham, W. James – School Administrator, 2000

American educators are participating in a harmful, unwinnable contest--the score-boosting game. Standardized tests yield an inaccurate picture of school staff's instructional practices. Administrators should provide assessment-literacy training, brief educational policymakers, encourage autonomous parent groups, analyze test items, and create…

Descriptors: Accountability, Educational Quality, Elementary Secondary Education, Error of Measurement

NCME Instructional Module: Standard Error of Measurement.

Peer reviewed

Harvill, Leo M. – Educational Measurement: Issues and Practice, 1991

This paper discusses standard error of measurement (SEM), the amount of variation or spread in the measurement errors for a test, and gives information needed to interpret test scores using SEMs. SEMs at various score levels should be used in calculating score bands rather than a single SEM value. (SLD)

Descriptors: Definitions, Equations (Mathematics), Error of Measurement, Estimation (Mathematics)
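
As an illustrative aside (not taken from the module itself), the classical test theory relation SEM = SD * sqrt(1 - reliability) and a score band around an observed score can be sketched as follows. The function names and numeric values are hypothetical. The band function uses whichever SEM it is given, so a score-level (conditional) SEM can be passed in where one is available, in line with the abstract's recommendation.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical SEM: SD of observed scores times sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

def score_band(observed, sem, z=1.0):
    """Band of +/- z SEMs around an observed score.

    `sem` may be a single overall SEM or a SEM specific to this score level.
    """
    return (observed - z * sem, observed + z * sem)

# Hypothetical test: score SD of 15 and reliability of 0.91.
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = score_band(observed=100.0, sem=sem)
print(sem, low, high)
```

With these assumed values the SEM is about 4.5 score points, so a one-SEM band around an observed score of 100 runs from roughly 95.5 to 104.5.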

A Discussion of the Expressive One-Word Picture Vocabulary Test.

Peer reviewed

Altepeter, Tom – School Psychology Review, 1983

A critical review of the Expressive One-Word Picture Vocabulary Test (Gardner) is offered. The reviewer feels that the instrument cannot be recommended in its present form. Further research concerning the manual and theoretical issues (particularly test-retest stability) is strongly recommended. (Author/PN)

Descriptors: Error of Measurement, Intelligence Tests, Item Analysis, Pictorial Stimuli

Essay Reliability: Form and Meaning.

Shale, Doug – 1986

This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…

Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests

Maintaining Scoring Standards over a Rubric Transition Process.

Goldberg, Gail Lynn; Walker-Bartnick, Leslie – 1988

A scoring rubric transition study is described. It was designed to evaluate possible drift in scoring the Maryland Writing Test from year to year (when using a modified holistic scoring method), to evaluate strategies for revising scoring rubrics for narrative and explanatory writing while maintaining original scoring standards, and to establish…

Descriptors: Educational Assessment, Elementary Secondary Education, Error of Measurement, Grading

Assessing the Psychometric Quality of Performance Rating Scales: Comparisons among Evaluative Criteria.

Lance, Charles E.; Moomaw, Michael E. – 1983

Direct assessments of the accuracy with which raters can use a rating instrument are presented. This study demonstrated how surplus behavioral incidents scaled during the development of Behaviorally Anchored Rating Scales (BARS) can be used effectively in the evaluation of the newly developed scales. Construction of scenarios of hypothetical…

Descriptors: Behavior Rating Scales, Comparative Analysis, Error of Measurement, Evaluation Criteria