Showing 1 to 15 of 19 results
Peer reviewed
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2020
Smoothing is designed to yield smoother equating results that can reduce random equating error without introducing very much systematic error. The main objective of this study is to propose a new statistic and to compare its performance to the performance of the Akaike information criterion and likelihood ratio chi-square difference statistics in…
Descriptors: Equated Scores, Statistical Analysis, Error of Measurement, Criteria
Peer reviewed
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018
Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…
Descriptors: Comparative Analysis, Accuracy, Models, Sample Size
Peer reviewed
Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
Peer reviewed
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Peer reviewed
Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010
"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…
Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores
Peer reviewed
Cui, Zhongmin; Kolen, Michael J. – Applied Psychological Measurement, 2008
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Descriptors: Test Length, Test Content, Simulation, Computation
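The nonparametric bootstrap idea in this entry can be sketched briefly: resample examinees with replacement, re-equate on each replicate, and take the standard deviation of the equated scores at each score point. The sketch below is an illustration under simplifying assumptions, not the article's procedure — the `equipercentile_equate` helper, the percentile-rank definition, the toy binomial data, and all parameter values are hypothetical, and no presmoothing is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

def equipercentile_equate(x_scores, y_scores, grid):
    """Map each form-X score in `grid` to the form-Y score with the same
    percentile rank (a simplified, interpolation-free definition)."""
    px = np.searchsorted(np.sort(x_scores), grid, side="right") / len(x_scores)
    return np.quantile(y_scores, np.clip(px, 0.0, 1.0))

def bootstrap_se(x, y, grid, n_boot=500):
    """Nonparametric bootstrap SE: resample examinees with replacement,
    re-equate, and take the SD of equated scores at each grid point."""
    eq = np.array([
        equipercentile_equate(rng.choice(x, len(x), replace=True),
                              rng.choice(y, len(y), replace=True),
                              grid)
        for _ in range(n_boot)
    ])
    return eq.std(axis=0, ddof=1)

# toy data: two 40-item forms, 1,000 examinees each (illustrative only)
x = rng.binomial(40, 0.60, size=1000)
y = rng.binomial(40, 0.55, size=1000)
grid = np.arange(0, 41)
se = bootstrap_se(x, y, grid)
```

A parametric bootstrap would differ only in the resampling step: draw replicate samples from a fitted score distribution rather than from the observed data.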
Peer reviewed
Kolen, Michael J. – Journal of Educational Measurement, 1988
Linear and nonlinear methods for incorporating score precision information when the score scale is established for educational tests are compared. Examples illustrate the methods, which discourage overinterpretation of small score differences and enhance score interpretability by equalizing error variance along the score scale. Measurement error…
Descriptors: Error of Measurement, Measures (Individuals), Scaling, Scoring
Peer reviewed
Harris, Deborah J.; Kolen, Michael J. – Educational and Psychological Measurement, 1988
Three methods of estimating point-biserial correlation coefficient standard errors were compared: (1) assuming normality; (2) not assuming normality; and (3) bootstrapping. Although errors estimated assuming normality were biased, such estimates were less variable and easier to compute, suggesting that this might be the method of choice in some…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
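Two of the three approaches compared in this entry can be sketched in a few lines: a normal-theory approximation for the standard error of a correlation, and a bootstrap estimate obtained by resampling (item, total) pairs. This is a hedged illustration, not the study's design — the toy data, sample size, and the particular normal-theory formula used here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def point_biserial(item, total):
    """Point-biserial: Pearson correlation of a 0/1 item score with the total."""
    return np.corrcoef(item, total)[0, 1]

# toy data: one dichotomous item related to a continuous total, n = 500
n = 500
item = rng.binomial(1, 0.6, n)
total = 3.0 * item + rng.normal(20, 5, n)

r = point_biserial(item, total)

# (1) normal-theory approximation: SE(r) ~ (1 - r^2) / sqrt(n - 1)
se_normal = (1 - r**2) / np.sqrt(n - 1)

# (2) bootstrap: resample pairs with replacement, take the SD of r
boot = np.array([
    point_biserial(item[idx], total[idx])
    for idx in (rng.integers(0, n, n) for _ in range(1000))
])
se_boot = boot.std(ddof=1)
```

The trade-off the abstract describes shows up directly: `se_normal` is a one-line closed form, while `se_boot` needs no normality assumption but requires many resamples.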
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – 2002
This paper reviews various procedures for constructing an interval for an individual's true score given the assumption that errors of measurement are distributed as binomial. This paper also presents two general interval estimation procedures (i.e., normal approximation and endpoints conversion methods) for an individual's true scale score;…
Descriptors: Bayesian Statistics, Error of Measurement, Estimation (Mathematics), Scaling
Peer reviewed
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational Measurement, 2000
Describes four procedures previously developed for estimating conditional standard errors of measurement for scale scores and compares them in a simulation study. All four procedures appear viable. Recommends that test users select a procedure based on various factors such as the type of scale score of concern, test characteristics, assumptions…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Scaling
Peer reviewed
Kolen, Michael J.; Zeng, Lingjia; Hanson, Bradley A. – Journal of Educational Measurement, 1996
Presents an Item Response Theory (IRT) method for estimating standard errors of measurement of scale scores for the situation in which scale scores are nonlinear transformations of number-correct scores. Also describes procedures for estimating the average conditional standard error of measurement for scale scores and the reliability of scale…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Reliability
Peer reviewed
Tsai, Tsung-Hsun; Hanson, Bradley A.; Kolen, Michael J.; Forsyth, Robert A. – Applied Measurement in Education, 2001
Compared bootstrap standard errors of five item response theory (IRT) equating methods for the common-item nonequivalent groups design using test results for 1,493 and 1,793 examinees taking a professional certification test. Results suggest that standard errors of equating less than 0.1 standard deviation units could be obtained with any of the…
Descriptors: Equated Scores, Error of Measurement, Item Response Theory, Licensing Examinations (Professions)
Peer reviewed
Lee, Won-Chan; Brennan, Robert L.; Kolen, Michael J. – Journal of Educational and Behavioral Statistics, 2006
Assuming errors of measurement are distributed binomially, this article reviews various procedures for constructing an interval for an individual's true number-correct score; presents two general interval estimation procedures for an individual's true scale score (i.e., normal approximation and endpoints conversion methods); compares various…
Descriptors: Probability, Intervals, Guidelines, Computer Simulation
Colton, Dean A.; Gao, Xiaohong; Harris, Deborah J.; Kolen, Michael J.; Martinovich-Barhite, Dara; Wang, Tianyou; Welch, Catherine J. – 1997
This collection consists of six papers, each dealing with some aspects of reliability and performance testing. Each paper has an abstract, and each contains its own references. Papers include: (1) "Using Reliabilities To Make Decisions" (Deborah J. Harris); (2) "Conditional Standard Errors, Reliability, and Decision Consistency…
Descriptors: Decision Making, Error of Measurement, Item Response Theory, Performance Based Assessment
Kolen, Michael J. – 1984
Large sample standard errors for the Tucker method of linear equating under the common item nonrandom groups design are derived under normality assumptions as well as under less restrictive assumptions. Standard errors of Tucker equating are estimated using the bootstrap method described by Efron. The results from different methods are compared…
Descriptors: Certification, Comparative Analysis, Equated Scores, Error of Measurement
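The Tucker method referenced here admits a compact sketch: regress each form on the common anchor within its own group, form synthetic-population moments, and equate linearly; bootstrap standard errors then come from resampling each group and repeating. The moment formulas below follow the standard Tucker derivation, but the synthetic weight, the simulated data, and the single score point examined are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def tucker_equate(x, v1, y, v2, w1=0.5):
    """Tucker linear equating for the common-item nonequivalent groups
    design: regression slopes of each form on anchor V, synthetic-population
    moments, then a linear X-to-Y conversion."""
    w2 = 1.0 - w1
    gx = np.cov(x, v1)[0, 1] / np.var(v1, ddof=1)   # slope of X on V (group 1)
    gy = np.cov(y, v2)[0, 1] / np.var(v2, ddof=1)   # slope of Y on V (group 2)
    # synthetic-population anchor mean and variance
    mv = w1 * v1.mean() + w2 * v2.mean()
    sv = (w1 * np.var(v1, ddof=1) + w2 * np.var(v2, ddof=1)
          + w1 * w2 * (v1.mean() - v2.mean()) ** 2)
    # synthetic means and variances of X and Y
    mx = x.mean() - gx * (v1.mean() - mv)
    my = y.mean() - gy * (v2.mean() - mv)
    s2x = np.var(x, ddof=1) - gx**2 * (np.var(v1, ddof=1) - sv)
    s2y = np.var(y, ddof=1) - gy**2 * (np.var(v2, ddof=1) - sv)
    a = np.sqrt(s2y / s2x)
    return lambda score: a * (score - mx) + my

# toy data: two groups, each with a form score and an anchor score
rng = np.random.default_rng(2)
x = rng.normal(25, 5, 800); v1 = 0.6 * x + rng.normal(0, 2, 800)
y = rng.normal(23, 5, 900); v2 = 0.6 * y + rng.normal(0, 2, 900)

eq = tucker_equate(x, v1, y, v2)

# bootstrap SE of the equated score at one score point (illustrative)
boot = []
for _ in range(200):
    i = rng.integers(0, len(x), len(x))
    j = rng.integers(0, len(y), len(y))
    boot.append(tucker_equate(x[i], v1[i], y[j], v2[j])(25.0))
se_at_25 = np.std(boot, ddof=1)
```

Resampling examinees within each group, as above, is the nonparametric version; a normal-theory alternative would derive the standard errors analytically from the same moment expressions.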