Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 5 |
Descriptor
Classification | 7 |
Error of Measurement | 7 |
Psychometrics | 4 |
Accuracy | 3 |
Item Response Theory | 3 |
Reliability | 3 |
Computation | 2 |
Scores | 2 |
Statistical Bias | 2 |
Academic Achievement | 1 |
Accountability | 1 |
More ▼ |
Source
Journal of Educational… | 7 |
Author
Kim, Stella Y. | 2 |
Lee, Won-Chan | 2 |
Betebenner, Damian W. | 1 |
Choi, Jiwon | 1 |
Harris, Deborah J. | 1 |
Jiao, Hong | 1 |
Jin, Ying | 1 |
Kamata, Akihito | 1 |
Kang, Yujin | 1 |
Kolen, Michael J. | 1 |
Kupermintz, Haggai | 1 |
More ▼ |
Publication Type
Journal Articles | 7 |
Reports - Evaluative | 4 |
Reports - Research | 3 |
Reports - Descriptive | 1 |
Education Level
Elementary Education | 1 |
Elementary Secondary Education | 1 |
Grade 4 | 1 |
Intermediate Grades | 1 |
Audience
Location
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
Progress in International… | 1 |
Work Keys (ACT) | 1 |
What Works Clearinghouse Rating
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2020
The current study aims to evaluate the performance of three non-IRT procedures (i.e., normal approximation, Livingston-Lewis, and compound multinomial) for estimating classification indices when the observed score distribution shows atypical patterns: (a) bimodality, (b) structural (i.e., systematic) bumpiness, or (c) structural zeros (i.e., no…
Descriptors: Classification, Accuracy, Scores, Cutting Scores
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Rutkowski, Leslie; Zhou, Yan – Journal of Educational Measurement, 2015
Given the importance of large-scale assessments to educational policy conversations, it is critical that subpopulation achievement is estimated reliably and with sufficient precision. Despite this importance, biased subpopulation estimates have been found to occur when variables in the conditioning model side of a latent regression model contain…
Descriptors: Error of Measurement, Error Correction, Regression (Statistics), Computation
Jiao, Hong; Kamata, Akihito; Wang, Shudong; Jin, Ying – Journal of Educational Measurement, 2012
The applications of item response theory (IRT) models assume local item independence and that examinees are independent of each other. When a representative sample for psychometric analysis is selected using a cluster sampling method in a testlet-based assessment, both local item dependence and local person dependence are likely to be induced.…
Descriptors: Item Response Theory, Test Items, Markov Processes, Monte Carlo Methods
Betebenner, Damian W.; Shang, Yi; Xiang, Yun; Zhao, Yan; Yue, Xiaohui – Journal of Educational Measurement, 2008
No Child Left Behind (NCLB) performance mandates, embedded within state accountability systems, focus school AYP (adequate yearly progress) compliance squarely on the percentage of students at or above proficient. The singular importance of this quantity for decision-making purposes has initiated extensive research into percent proficient as a…
Descriptors: Classification, Error of Measurement, Statistics, Reliability
Kupermintz, Haggai – Journal of Educational Measurement, 2004
A decision-theoretic approach to the question of reliability in categorically scored examinations is explored. The concepts of true scores and errors are discussed as they deviate from conventional psychometric definitions and measurement error in categorical scores is cast in terms of misclassifications. A reliability measure based on…
Descriptors: Test Reliability, Error of Measurement, Psychometrics, Test Theory

Wang, Tianyou; Kolen, Michael J.; Harris, Deborah J. – Journal of Educational Measurement, 2000
Describes procedures for calculating conditional standard error of measurement (CSEM) and reliability of scale scores and classification of consistency of performance levels. Applied these procedures to data from the American College Testing Program's Work Keys Writing Assessment with sample sizes of 7,097, 1,035, and 1,793. Results show that the…
Descriptors: Adults, Classification, Error of Measurement, Item Response Theory