ERIC - Search Results

Showing all 8 results

The Long-Term Sustainability of Different Item Response Theory Scaling Methods

Peer reviewed

Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011

This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…

Descriptors: Item Response Theory, Scaling, Sustainability, Classification

Peer reviewed

Wyse, Adam E. – Educational and Psychological Measurement, 2011

Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on a RP criterion. This study investigates whether…

Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

A Generalizability Theory Approach to Standard Error Estimates for Bookmark Standard Settings

Peer reviewed

Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008

The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…

Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores

Construct Equivalence across Grades in a Vertical Scale for a K-12 Large-Scale Reading Assessment

Peer reviewed

Wang, Shudong; Jiao, Hong – Educational and Psychological Measurement, 2009

In practice, vertical scales have been continually used to measure students' achievement progress across several grade levels and have been considered very challenging psychometric procedures. Recently, such practices have been drawing many criticisms. The major criticisms focus on dimensionality and construct equivalence of the latent trait or…

Descriptors: Reading Comprehension, Elementary Secondary Education, Measures (Individuals), Psychometrics

Correlations of National Teacher Examination Core Battery Scores and College Grade Point Average with Teaching Effectiveness of First-Year Teachers

Peer reviewed

Moore, Don; And Others – Educational and Psychological Measurement, 1991

Correlations of National Teacher Examination (NTE) Core Battery scores and college grade point average (GPA) with a measure of teaching effectiveness for 493 first-year teachers indicate that the correlation is higher for GPA than for the Core Battery. NTE core scores do not predict effectiveness better than GPA alone. (SLD)

Descriptors: Beginning Teachers, College Graduates, Correlation, Elementary School Teachers

Gender Differences for Constructed-Response Mathematics Items

Peer reviewed

Pomplun, Mark; Capps, Lee – Educational and Psychological Measurement, 1999

Studied gender differences in answers to constructed-response mathematics items on approximately 500 papers from grades 7 and 10 from the Kansas Assessment Program. Rubric-relevant variables were highly predictive of holistic scores and accounted for some of the gender differences, especially in grade 7. (SLD)

Descriptors: Constructed Response, Grade 10, Grade 7, High School Students

The Maryland School Performance Assessment Program: Performance Assessment with Psychometric Quality Suitable for High Stakes Usage

Peer reviewed

Yen, Wendy M.; Ferrara, Steven – Educational and Psychological Measurement, 1997

The program design and psychometric characteristics of the Maryland School Performance Assessment Program (MSPAP) are described, focusing on scaling, equating, standard setting, score accuracy, and validity. The MSPAP is an innovative performance-based testing program administered annually to students in grades three, five, and eight. (SLD)

Descriptors: Academic Achievement, Achievement Tests, Elementary Education, Grade 3