Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 5 |
Descriptor
Testing Programs | 8 |
Scores | 5 |
Item Response Theory | 4 |
State Programs | 4 |
Psychometrics | 3 |
Achievement Tests | 2 |
Correlation | 2 |
Cutting Scores | 2 |
Grade 5 | 2 |
Scaling | 2 |
Standard Setting (Scoring) | 2 |
Source
Educational and Psychological… | 8 |
Author
Capps, Lee | 1 |
Carvajal, Jorge | 1 |
Ferrara, Steven | 1 |
Jiao, Hong | 1 |
Keller, Lisa A. | 1 |
Keller, Robert R. | 1 |
Lee, Guemin | 1 |
Lewis, Daniel M. | 1 |
Moore, Don | 1 |
Pomplun, Mark | 1 |
Skorupski, William P. | 1 |
Publication Type
Journal Articles | 8 |
Reports - Research | 6 |
Reports - Evaluative | 2 |
Education Level
Grade 5 | 2 |
Elementary Secondary Education | 1 |
Grade 10 | 1 |
Grade 11 | 1 |
Grade 3 | 1 |
Grade 4 | 1 |
Grade 6 | 1 |
Grade 7 | 1 |
Grade 8 | 1 |
Grade 9 | 1 |
Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011
This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…
Descriptors: Item Response Theory, Scaling, Sustainability, Classification
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on a RP criterion. This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008
The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…
Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores
Wang, Shudong; Jiao, Hong – Educational and Psychological Measurement, 2009
In practice, vertical scales have been continually used to measure students' achievement progress across several grade levels and have been considered very challenging psychometric procedures. Recently, such practices have been drawing many criticisms. The major criticisms focus on dimensionality and construct equivalence of the latent trait or…
Descriptors: Reading Comprehension, Elementary Secondary Education, Measures (Individuals), Psychometrics

Moore, Don; And Others – Educational and Psychological Measurement, 1991
Correlations of National Teacher Examination (NTE) Core Battery scores and college grade point average (GPA) with a measure of teaching effectiveness for 493 first-year teachers indicate that the correlation is higher for GPA than for the Core Battery. NTE core scores do not predict effectiveness better than GPA alone. (SLD)
Descriptors: Beginning Teachers, College Graduates, Correlation, Elementary School Teachers

Pomplun, Mark; Capps, Lee – Educational and Psychological Measurement, 1999
Studied gender differences in answers to constructed-response mathematics items on approximately 500 papers from grades 7 and 10 from the Kansas Assessment Program. Rubric-relevant variables were highly predictive of holistic scores and accounted for some of the gender differences, especially in grade 7. (SLD)
Descriptors: Constructed Response, Grade 10, Grade 7, High School Students

Yen, Wendy M.; Ferrara, Steven – Educational and Psychological Measurement, 1997
The program design and psychometric characteristics of the Maryland School Performance Assessment Program (MSPAP) are described, focusing on scaling, equating, standard setting, score accuracy, and validity. The MSPAP is an innovative performance-based testing program administered annually to students in grades three, five, and eight. (SLD)
Descriptors: Academic Achievement, Achievement Tests, Elementary Education, Grade 3