ERIC - Search Results

Showing all 9 results

Can Survey Item Characteristics Relevant to Measurement Error Be Coded Reliably? A Case Study on 11 Dutch General Population Surveys

Peer reviewed

Bais, Frank; Schouten, Barry; Lugtig, Peter; Toepoel, Vera; Arends-Tòth, Judit; Douhou, Salima; Kieruj, Natalia; Morren, Mattijn; Vis, Corrie – Sociological Methods & Research, 2019

Item characteristics can have a significant effect on survey data quality and may be associated with measurement error. Literature on data quality and measurement error is often inconclusive. This could be because item characteristics used for detecting measurement error are not coded unambiguously. In our study, we use a systematic coding…

Descriptors: Foreign Countries, National Surveys, Error of Measurement, Test Items

High-Dimensional Explanatory Random Item Effects Models for Rater-Mediated Assessments

Peer reviewed

Kelcey, Ben; Wang, Shanshan; Cox, Kyle – Society for Research on Educational Effectiveness, 2016

Valid and reliable measurement of unobserved latent variables is essential to understanding and improving education. A common and persistent approach to assessing latent constructs in education is the use of rater inferential judgment. The purpose of this study is to develop high-dimensional explanatory random item effects models designed for…

Descriptors: Test Items, Models, Evaluators, Longitudinal Studies

Effects of Assigning Raters to Items

Peer reviewed

Sykes, Robert C.; Ito, Kyoko; Wang, Zhen – Educational Measurement: Issues and Practice, 2008

Student responses to a large number of constructed response items in three Math and three Reading tests were scored on two occasions using three ways of assigning raters: single reader scoring, a different reader for each response (item-specific), and three readers each scoring a rater item block (RIB) containing approximately one-third of a…

Descriptors: Test Items, Mathematics Tests, Reading Tests, Scoring

A Generalizability Study of the Angoff Method Applied to Setting Cutoff Scores of Professional Certification Tests.

Cope, Ronald T. – 1987

This study used generalizability theory and other statistical concepts to assess the application of the Angoff method to setting cutoff scores on two professional certification tests. A panel of ten judges gave pre- and post-feedback Angoff probability ratings of items of two forms of a professional certification test, and another panel of nine…

Descriptors: Certification, Correlation, Cutting Scores, Error of Measurement

New Mexico Standards-Based Assessment Technical Report: Spring 2007 Administration

New Mexico Public Education Department, 2007

The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…

Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring

Assessing Inconsistencies in Standard Setting with the Angoff or Nedelsky Technique.

van der Linden, Wim J. – 1982

A latent trait method is presented to investigate the possibility that Angoff or Nedelsky judges specify inconsistent probabilities in standard setting techniques for objectives-based instructional programs. It is suggested that judges frequently specify a low probability of success for an easy item but a large probability for a hard item. The…

Descriptors: Criterion Referenced Tests, Cutting Scores, Error of Measurement, Interrater Reliability

Generalizability Theory in Program Evaluation.

Rothman, M. L.; And Others – 1982

A practical application of generalizability theory, demonstrating how the variance components contribute to understanding and interpreting the data collected to evaluate a program, is described. The evaluation concerned 120 learning modules developed for the Dental Auxiliary Education Project. The goals of the project were to design, implement,…

Descriptors: Correlation, Data Collection, Dental Schools, Educational Research

The Generalizability of Scoring TIMSS Open-Ended Items.

Smith, Teresa A. – 1997

The Third International Mathematics and Science Study (TIMSS) measured mathematics and science achievement of middle school students in more than 40 countries. About one quarter of the tests' nearly 300 items were free response items requiring students to generate their own answers. Scoring these responses used a two-digit diagnostic code rubric…

Descriptors: Comparative Education, English, Error of Measurement, Foreign Countries

New Mexico Standards Based Assessment (NMSBA) Technical Report: 2006 Spring Administration

Griph, Gerald W. – New Mexico Public Education Department, 2006

The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2006 NMSBA. The 2006 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Calibration, scaling, and equating procedures; (4) Standard setting;…

Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
