Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008
The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…
Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores

Fan, Xitao – Educational and Psychological Measurement, 1998
This study empirically examined the behavior of item and person statistics derived from item response theory and classical test theory, using data from a large-scale statewide assessment. Findings show that the person and item statistics from the two measurement frameworks are quite comparable. (SLD)
Descriptors: Item Response Theory, State Programs, Statistical Analysis, Test Items

Pomplun, Mark; Omar, Md Hafidz – Educational and Psychological Measurement, 1997
Four threats to validity of an alternative objective test item format, the multiple-mark format, were studied with data from a state-mandated assessment with about 30,000 students at each of three grade levels. Reliability and validity coefficients show that the format has promise as an objective format that can be aligned with new curriculum…
Descriptors: Curriculum Development, Elementary School Students, Elementary Secondary Education, Objective Tests

Pomplun, Mark; Capps, Lee – Educational and Psychological Measurement, 1999
Studied gender differences in answers to constructed-response mathematics items on approximately 500 papers from grades 7 and 10 from the Kansas Assessment Program. Rubric-relevant variables were highly predictive of holistic scores and accounted for some of the gender differences, especially in grade 7. (SLD)
Descriptors: Constructed Response, Grade 10, Grade 7, High School Students

Yen, Wendy M.; Ferrara, Steven – Educational and Psychological Measurement, 1997
The program design and psychometric characteristics of the Maryland School Performance Assessment Program (MSPAP) are described, focusing on scaling, equating, standard setting, score accuracy, and validity. The MSPAP is an innovative performance-based testing program administered annually to students in grades three, five, and eight. (SLD)
Descriptors: Academic Achievement, Achievement Tests, Elementary Education, Grade 3