Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 12 |
Descriptor
Error of Measurement | 15 |
Scores | 10 |
Item Response Theory | 7 |
Comparative Analysis | 5 |
Simulation | 5 |
Accuracy | 4 |
Scaling | 4 |
Estimation (Mathematics) | 3 |
Models | 3 |
Raw Scores | 3 |
Reliability | 3 |
Source
Journal of Educational… | 7 |
Applied Measurement in… | 2 |
Applied Psychological… | 1 |
Educational Measurement:… | 1 |
Educational and Psychological… | 1 |
International Journal of… | 1 |
Journal of Educational and… | 1 |
Author
Lee, Won-Chan | 15 |
Brennan, Robert L. | 5 |
Kolen, Michael J. | 5 |
Kim, Stella Y. | 2 |
Choi, Jiwon | 1 |
Chon, Kyong Hee | 1 |
Dunbar, Stephen B. | 1 |
Huang, Feifei | 1 |
Kang, Yujin | 1 |
Kim, Hyung Jin | 1 |
Kim, Kyung Yong | 1 |
Publication Type
Journal Articles | 14 |
Reports - Research | 9 |
Reports - Descriptive | 3 |
Reports - Evaluative | 3 |
Speeches/Meeting Papers | 2 |
Assessments and Surveys
Iowa Tests of Basic Skills | 2 |
ACT Assessment | 1 |
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article examines the performance of item response theory (IRT) models when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
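For readers unfamiliar with the generalized partial credit model named in the abstract above, a minimal sketch of its category probabilities may help; the parameter names (discrimination a, step difficulties b) and the example values are generic textbook notation and illustrative assumptions, not material from the article.

```python
import numpy as np

def gpcm_probs(theta, a, b):
    """Category probabilities for a single GPCM item.

    theta : examinee proficiency
    a     : item discrimination
    b     : step difficulties; the item is scored 0..len(b)
    """
    b = np.asarray(b, dtype=float)
    # Cumulative sums of a*(theta - b_j); category 0 contributes an empty sum.
    z = np.concatenate(([0.0], np.cumsum(a * (theta - b))))
    z -= z.max()                      # numerical safeguard before exponentiating
    expz = np.exp(z)
    return expz / expz.sum()

# Example: a 4-category item scored 0-3
print(gpcm_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.5]))
```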
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
Kim, Hyung Jin; Brennan, Robert L.; Lee, Won-Chan – Journal of Educational Measurement, 2020
In equating, smoothing techniques are frequently used to diminish sampling error. There are typically two types of smoothing: presmoothing and postsmoothing. For polynomial log-linear presmoothing, an optimum smoothing degree can be determined statistically based on the Akaike information criterion or Chi-square difference criterion. For…
Descriptors: Equated Scores, Sampling, Error of Measurement, Statistical Analysis
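As a rough illustration of the presmoothing step described above, the sketch below fits polynomial log-linear (Poisson) models of increasing degree to a raw-score frequency distribution and keeps the degree with the smallest AIC. The statsmodels call and the example frequencies are illustrative assumptions, not material from the article.

```python
import numpy as np
import statsmodels.api as sm

def loglinear_presmooth(freq, max_degree=6):
    """Fit polynomial log-linear (Poisson) models of degree 1..max_degree to
    a raw-score frequency distribution and return the degree with the
    smallest AIC along with its fitted (smoothed) frequencies."""
    scores = np.arange(len(freq), dtype=float)
    s = (scores - scores.mean()) / scores.std()   # standardize for numerical stability
    best = None
    for d in range(1, max_degree + 1):
        X = np.column_stack([s ** k for k in range(d + 1)])  # intercept + powers
        fit = sm.GLM(freq, X, family=sm.families.Poisson()).fit()
        if best is None or fit.aic < best[1]:
            best = (d, fit.aic, fit.fittedvalues)
    return best[0], best[2]

# Hypothetical frequencies for a 20-item number-correct test (scores 0-20)
freq = np.array([1, 2, 4, 7, 12, 18, 25, 33, 40, 46, 48,
                 45, 40, 33, 25, 18, 12, 7, 4, 2, 1])
degree, smoothed = loglinear_presmooth(freq)
print(degree)
print(np.round(smoothed, 1))
```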
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
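The characteristic-curve linking framework that IWCC and TWCC build on can be sketched as minimizing a weighted discrepancy between item characteristic curves on two forms. The code below is a generic Haebara-style criterion for 2PL items with a simple item-information weight; the article's actual IWCC and TWCC weighting schemes are defined there and are not reproduced here, and the item parameters are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def p2pl(theta, a, b):
    """2PL item response function; rows index theta points, columns index items."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def item_info(theta, a, b):
    """Fisher information of each 2PL item at each theta point."""
    p = p2pl(theta, a, b)
    return (a ** 2) * p * (1.0 - p)

def weighted_haebara(a_old, b_old, a_new, b_new, weighted=True):
    """Estimate linking constants (A, B) placing the new-form parameters on
    the old-form scale by minimizing a (possibly information-weighted)
    characteristic-curve discrepancy."""
    theta = np.linspace(-4, 4, 41)
    w = item_info(theta, a_old, b_old) if weighted else np.ones((theta.size, a_old.size))

    def loss(x):
        A, B = x
        p_ref = p2pl(theta, a_old, b_old)
        p_tr  = p2pl(theta, a_new / A, A * b_new + B)   # transformed new-form curves
        return np.sum(w * (p_ref - p_tr) ** 2)

    return minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead").x

# Hypothetical common-item parameters on two forms
a_old = np.array([1.0, 1.2, 0.8]); b_old = np.array([-0.5, 0.0, 0.7])
a_new = np.array([0.9, 1.1, 0.75]); b_new = np.array([-0.2, 0.3, 1.0])
print(weighted_haebara(a_old, b_old, a_new, b_new))
```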
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2020
The current study aims to evaluate the performance of three non-IRT procedures (i.e., normal approximation, Livingston-Lewis, and compound multinomial) for estimating classification indices when the observed score distribution shows atypical patterns: (a) bimodality, (b) structural (i.e., systematic) bumpiness, or (c) structural zeros (i.e., no…
Descriptors: Classification, Accuracy, Scores, Cutting Scores
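For intuition about the classification indices being estimated, the sketch below brute-forces classification accuracy at a cut score by simulating observed scores with binomial error around assumed true scores. It is a generic illustration, not the normal approximation, Livingston-Lewis, or compound multinomial procedure evaluated in the study, and the bimodal true-score distribution is made up.

```python
import numpy as np

def classification_accuracy(true_p, n_items, cut, n_rep=1000, seed=0):
    """Proportion of simulated (examinee, replication) pairs whose observed
    pass/fail decision matches the decision based on the true score.

    true_p  : array of true proportion-correct scores (one per examinee)
    n_items : test length
    cut     : cut score on the number-correct scale
    """
    rng = np.random.default_rng(seed)
    true_p = np.asarray(true_p)
    true_pass = (true_p * n_items) >= cut
    # Replicate each examinee n_rep times under the binomial error model.
    obs = rng.binomial(n_items, true_p[:, None], size=(true_p.size, n_rep))
    obs_pass = obs >= cut
    return np.mean(obs_pass == true_pass[:, None])

# Hypothetical bimodal true-score distribution on a 40-item test
rng = np.random.default_rng(1)
true_p = np.concatenate([rng.beta(8, 4, 500), rng.beta(3, 7, 500)])
print(classification_accuracy(true_p, n_items=40, cut=24))
```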
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Kim, Kyung Yong; Lee, Won-Chan – Journal of Educational Measurement, 2018
Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Descriptors: Weighted Scores, Error of Measurement, Test Use, Decision Making
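As a baseline for the intervals discussed above, the sketch below computes a textbook normal-approximation (Wald) interval for the true proportion-correct score under the binomial error model. The article's procedures address weighted scores and relax the assumption that items are randomly sampled; none of that is reflected here.

```python
import math

def binomial_wald_interval(x, n, z=1.96):
    """Normal-approximation interval for the true proportion-correct score,
    given x correct responses out of n items under the binomial error model."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    lower = max(0.0, p_hat - z * se)
    upper = min(1.0, p_hat + z * se)
    return lower, upper

# Example: 32 correct on a 40-item test
lo, hi = binomial_wald_interval(32, 40)
print(f"95% interval for true proportion correct: ({lo:.3f}, {hi:.3f})")
```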
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
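A common simplification behind composite-score error analysis is that the composite is a weighted sum of component scores and that measurement errors are uncorrelated across components, so weighted error variances add. The sketch below computes a composite SEM and reliability under exactly those assumptions; the weights, covariance matrix, and reliabilities are hypothetical.

```python
import numpy as np

def composite_sem_reliability(weights, cov, reliabilities):
    """SEM and reliability of a weighted composite score.

    weights       : component weights w_i
    cov           : observed covariance matrix of component scores
    reliabilities : reliability coefficient of each component

    Assumes errors are uncorrelated across components, so the composite error
    variance is the weighted sum of component error variances.
    """
    w = np.asarray(weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)

    comp_var = w @ cov @ w                                   # observed composite variance
    err_var = np.sum(w ** 2 * np.diag(cov) * (1.0 - rel))    # summed weighted error variances
    return np.sqrt(err_var), 1.0 - err_var / comp_var

# Hypothetical two-component battery (e.g., MC and FR sections)
cov = [[25.0, 12.0], [12.0, 16.0]]
sem, rel = composite_sem_reliability([0.6, 0.4], cov, [0.90, 0.82])
print(f"composite SEM = {sem:.2f}, composite reliability = {rel:.3f}")
```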
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Chon, Kyong Hee; Lee, Won-Chan; Dunbar, Stephen B. – Journal of Educational Measurement, 2010
In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G², Orlando and Thissen's S-X² and S-G², and Stone's χ²* and G²*. To investigate the…
Descriptors: Test Length, Goodness of Fit, Item Response Theory, Simulation
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
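One way to read the multinomial error model described above: each polytomous item response is a draw from the examinee's own category probabilities, and the spread of the total score over hypothetical repeated testings is that examinee's conditional error. The simulation sketch below follows that reading with made-up category probabilities; it is not the article's analytic formulation.

```python
import numpy as np

def conditional_sem(item_cat_probs, item_cat_scores, n_rep=20000, seed=0):
    """Simulate one examinee's total score over repeated testings, treating
    each item response as a categorical draw from that examinee's category
    probabilities, and return the conditional SEM (SD of total scores).

    item_cat_probs  : list of per-item category probability vectors
    item_cat_scores : list of per-item category score vectors
    """
    rng = np.random.default_rng(seed)
    totals = np.zeros(n_rep)
    for probs, scores in zip(item_cat_probs, item_cat_scores):
        draws = rng.choice(len(probs), size=n_rep, p=probs)
        totals += np.asarray(scores)[draws]
    return totals.std(ddof=0)

# Hypothetical 3-item test with items scored 0-2
probs  = [[0.1, 0.3, 0.6], [0.2, 0.5, 0.3], [0.05, 0.25, 0.7]]
scores = [[0, 1, 2]] * 3
print(f"conditional SEM for this examinee: {conditional_sem(probs, scores):.3f}")
```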

Brennan, Robert L.; Lee, Won-Chan – Educational and Psychological Measurement, 1999
Develops two procedures for estimating individual-level conditional standard errors of measurement for scale scores, assuming tests of dichotomously scored items. Compares the two procedures to a polynomial procedure and a procedure developed by L. Feldt and A. Qualls (1998) using data from the Iowa Tests of Basic Skills. Contains 22 references.…
Descriptors: Error of Measurement, Estimation (Mathematics), Scaling, Scores
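The general idea behind binomial-based conditional standard errors for scale scores can be sketched directly: condition on a true proportion-correct p, let the binomial model give the raw-score distribution, push each possible raw score through the raw-to-scale conversion, and take the standard deviation of the resulting scale scores. The conversion table below is hypothetical, and the code illustrates that idea rather than either procedure developed in the article.

```python
import numpy as np
from scipy.stats import binom

def scale_score_csem(p, n_items, raw_to_scale):
    """Conditional SEM of scale scores at true proportion-correct p, assuming
    raw scores are binomial(n_items, p) and raw_to_scale[x] is the scale
    score assigned to raw score x."""
    raw = np.arange(n_items + 1)
    w = binom.pmf(raw, n_items, p)            # probability of each raw score
    s = np.asarray(raw_to_scale, dtype=float)
    mean = np.sum(w * s)
    return np.sqrt(np.sum(w * (s - mean) ** 2))

# Hypothetical 10-item test with a nonlinear raw-to-scale conversion
conversion = [1, 3, 6, 10, 14, 18, 21, 24, 26, 28, 30]
for p in (0.3, 0.5, 0.8):
    print(p, round(scale_score_csem(p, 10, conversion), 2))
```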
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational and Behavioral Statistics, 2006
Assuming errors of measurement are distributed binomially, this article reviews various procedures for constructing an interval for an individual's true number-correct score; presents two general interval estimation procedures for an individual's true scale score (i.e., normal approximation and endpoints conversion methods); compares various…
Descriptors: Probability, Intervals, Guidelines, Computer Simulation
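The endpoints conversion method named in this entry (and in the related ACT report below) can be sketched as: form a normal-approximation interval for the true score on the proportion-correct metric, then carry each endpoint through the raw-to-scale conversion. The conversion table and the linear interpolation in the sketch below are illustrative assumptions, not the authors' tables.

```python
import math

def true_scale_score_interval(x, n, raw_to_scale, z=1.96):
    """Normal-approximation interval for an examinee's true scale score:
    build the interval on the proportion-correct metric, then convert each
    endpoint through the raw-to-scale conversion (endpoints conversion)."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    lo_p = max(0.0, p_hat - z * se)
    hi_p = min(1.0, p_hat + z * se)

    def to_scale(p):
        # Linear interpolation over the raw-score conversion table.
        raw = p * n
        i = min(int(raw), n - 1)
        frac = raw - i
        return raw_to_scale[i] + frac * (raw_to_scale[i + 1] - raw_to_scale[i])

    return to_scale(lo_p), to_scale(hi_p)

# Hypothetical 10-item test and conversion table
conversion = [1, 3, 6, 10, 14, 18, 21, 24, 26, 28, 30]
print(true_scale_score_interval(x=7, n=10, raw_to_scale=conversion))
```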
Interval Estimation for True Scores under Various Scale Transformations. ACT Research Report Series.
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – 2002
This paper reviews various procedures for constructing an interval for an individual's true score given the assumption that errors of measurement are distributed as binomial. This paper also presents two general interval estimation procedures (i.e., normal approximation and endpoints conversion methods) for an individual's true scale score;…
Descriptors: Bayesian Statistics, Error of Measurement, Estimation (Mathematics), Scaling

Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational Measurement, 2000
Describes four procedures previously developed for estimating conditional standard errors of measurement for scale scores and compares them in a simulation study. All four procedures appear viable. Recommends that test users select a procedure based on various factors such as the type of scale score of concern, test characteristics, assumptions…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Scaling