Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 2 |
Since 2006 (last 20 years) | 7 |
Descriptor
Error of Measurement | 19 |
Estimation (Mathematics) | 7 |
Item Response Theory | 7 |
Scores | 7 |
Scaling | 6 |
Equated Scores | 5 |
Raw Scores | 5 |
Reliability | 5 |
Comparative Analysis | 4 |
Simulation | 4 |
Accuracy | 3 |
More ▼ |
Source
Author
Kolen, Michael J. | 19 |
Lee, Won-Chan | 5 |
Brennan, Robert L. | 3 |
Harris, Deborah J. | 3 |
Wang, Tianyou | 3 |
Hanson, Bradley A. | 2 |
Jarjoura, David | 2 |
Liu, Chunyan | 2 |
Colton, Dean A. | 1 |
Cui, Zhongmin | 1 |
Forsyth, Robert A. | 1 |
More ▼ |
Publication Type
Journal Articles | 16 |
Reports - Evaluative | 8 |
Reports - Research | 6 |
Reports - Descriptive | 5 |
Speeches/Meeting Papers | 2 |
Collected Works - General | 1 |
Education Level
Elementary Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
ACT Assessment | 2 |
Iowa Tests of Basic Skills | 2 |
Iowa Tests of Educational… | 1 |
National Assessment of… | 1 |
Work Keys (ACT) | 1 |
What Works Clearinghouse Rating
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2020
Smoothing is designed to yield smoother equating results that can reduce random equating error without introducing very much systematic error. The main objective of this study is to propose a new statistic and to compare its performance to the performance of the Akaike information criterion and likelihood ratio chi-square difference statistics in…
Descriptors: Equated Scores, Statistical Analysis, Error of Measurement, Criteria
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018
Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…
Descriptors: Comparative Analysis, Accuracy, Models, Sample Size
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010
"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…
Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation

Kolen, Michael J. – Journal of Educational Measurement, 1988
Linear and nonlinear methods for incorporating score precision information when the score scale is established for educational tests are compared. Examples illustrate the methods, which discourage overinterpretation of small score differences and enhance score interpretability by equalizing error variance along the score scale. Measurement error…
Descriptors: Error of Measurement, Measures (Individuals), Scaling, Scoring

Harris, Deborah J.; Kolen, Michael J. – Educational and Psychological Measurement, 1988
Three methods of estimating point-biserial correlation coefficient standard errors were compared: (1) assuming normality; (2) not assuming normality; and (3) bootstrapping. Although errors estimated assuming normality were biased, such estimates were less variable and easier to compute, suggesting that this might be the method of choice in some…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
Interval Estimation for True Scores under Various Scale Transformations. ACT Research Report Series.
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – 2002
This paper reviews various procedures for constructing an interval for an individual's true score given the assumption that errors of measurement are distributed as binomial. This paper also presents two general interval estimation procedures (i.e., normal approximation and endpoints conversion methods) for an individual's true scale score;…
Descriptors: Bayesian Statistics, Error of Measurement, Estimation (Mathematics), Scaling

Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational Measurement, 2000
Describes four procedures previously developed for estimating conditional standard errors of measurement for scale scores and compares them in a simulation study. All four procedures appear viable. Recommends that test users select a procedure based on various factors such as the type of scale score of concern, test characteristics, assumptions…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Scaling

Kolen, Michael J.; Zeng, Lingjia; Hanson, Bradley A. – Journal of Educational Measurement, 1996
Presents an Item Response Theory (IRT) method for estimating standard errors of measurement of scale scores for the situation in which scale scores are nonlinear transformations of number-correct scores. Also describes procedures for estimating the average conditional standard error of measurement for scale scores and the reliability of scale…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Reliability

Tsai, Tsung-Hsun; Hanson, Bradley A.; Kolen, Michael J.; Forsyth, Robert A. – Applied Measurement in Education, 2001
Compared bootstrap standard errors of five item response theory (IRT) equating methods for the common-item nonequivalent groups design using test results for 1,493 and 1,793 examinees taking a professional certification test. Results suggest that standard errors of equating less than 0.1 standard deviation units could be obtained with any of the…
Descriptors: Equated Scores, Error of Measurement, Item Response Theory, Licensing Examinations (Professions)
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational and Behavioral Statistics, 2006
Assuming errors of measurement are distributed binomially, this article reviews various procedures for constructing an interval for an individual's true number-correct score; presents two general interval estimation procedures for an individual's true scale score (i.e., normal approximation and endpoints conversion methods); compares various…
Descriptors: Probability, Intervals, Guidelines, Computer Simulation
Colton, Dean A.; Gao, Xiaohong; Harris, Deborah J.; Kolen, Michael J.; Martinovich-Barhite, Dara; Wang, Tianyou; Welch, Catherine J. – 1997
This collection consists of six papers, each dealing with some aspects of reliability and performance testing. Each paper has an abstract, and each contains its own references. Papers include: (1) "Using Reliabilities To Make Decisions" (Deborah J. Harris); (2) "Conditional Standard Errors, Reliability, and Decision Consistency…
Descriptors: Decision Making, Error of Measurement, Item Response Theory, Performance Based Assessment
Kolen, Michael J. – 1984
Large sample standard errors for the Tucker method of linear equating under the common item nonrandom groups design are derived under normality assumptions as well as under less restrictive assumptions. Standard errors of Tucker equating are estimated using the bootstrap method described by Efron. The results from different methods are compared…
Descriptors: Certification, Comparative Analysis, Equated Scores, Error of Measurement
Previous Page | Next Page ยป
Pages: 1 | 2