Kolen, Michael J.; Wang, Tianyou; Lee, Won-Chan – International Journal of Testing, 2012
Composite scores are often formed from test scores on educational achievement test batteries to provide a single index of achievement over two or more content areas or two or more item types on that test. Composite scores are subject to measurement error, and as with scores on individual tests, the amount of error variability typically depends on…
Descriptors: Mathematics Tests, Achievement Tests, College Entrance Examinations, Error of Measurement
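The error variability of a composite described in this abstract follows from classical test theory: the composite's error variance is the weighted sum of the component error variances, plus covariance terms when component errors are correlated. A minimal sketch (the function name and the independence assumption are illustrative, not from the article):

```python
import math

def composite_sem(weights, sems, error_corr=0.0):
    """Standard error of measurement for a weighted composite score.

    Assumes all pairs of component errors share one correlation;
    error_corr=0.0 is the usual independent-errors simplification.
    """
    # variance contributed by each component on its own
    var = sum((w * s) ** 2 for w, s in zip(weights, sems))
    # covariance terms, nonzero only when errors are correlated
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            var += 2 * error_corr * weights[i] * weights[j] * sems[i] * sems[j]
    return math.sqrt(var)

# Equal-weight composite of two subtests with SEMs of 3 and 4 score points:
print(round(composite_sem([0.5, 0.5], [3.0, 4.0]), 2))  # 2.5
```

Note that the composite SEM (2.5) is smaller than either component SEM: averaging partially cancels independent errors, which is one reason composites are formed.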
McBee, Matthew T.; Peters, Scott J.; Waterman, Craig – Gifted Child Quarterly, 2014
Best practice in gifted and talented identification procedures involves making decisions on the basis of multiple measures. However, very little research has investigated the impact of different methods of combining multiple measures. This article examines the consequences of the conjunctive ("and"), disjunctive/complementary…
Descriptors: Best Practices, Ability Identification, Academically Gifted, Correlation
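The conjunctive ("and") and disjunctive ("or") combination rules the abstract contrasts can be sketched directly (a hypothetical illustration; the cutoffs and percentile scores are invented, not from the study):

```python
def identify(scores, cutoffs, rule="conjunctive"):
    """Combine multiple measures into one identification decision.

    rule="conjunctive": the student must meet every cutoff ("and").
    rule="disjunctive": meeting any one cutoff suffices ("or").
    """
    flags = [s >= c for s, c in zip(scores, cutoffs)]
    return all(flags) if rule == "conjunctive" else any(flags)

# A student at the 95th percentile on ability but the 88th on achievement,
# against 90th-percentile cutoffs on both measures:
print(identify([95, 88], [90, 90], rule="conjunctive"))  # False
print(identify([95, 88], [90, 90], rule="disjunctive"))  # True
```

The same scores produce opposite decisions under the two rules, which is why the choice of combination method has the consequences the article examines.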
Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010
"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…
Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores
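One common scaling method the module covers is a linear transformation that fixes the scale score's mean and standard deviation. A minimal sketch (the particular means, SDs, and clamping bounds are hypothetical):

```python
def linear_scale(raw, raw_mean, raw_sd, scale_mean, scale_sd,
                 lo=None, hi=None):
    """Map a raw score to a scale score by fixing the reported
    scale's mean and standard deviation, with optional floor/ceiling."""
    z = (raw - raw_mean) / raw_sd          # standardize the raw score
    scaled = scale_mean + scale_sd * z     # re-express on the score scale
    if lo is not None:
        scaled = max(lo, scaled)
    if hi is not None:
        scaled = min(hi, scaled)
    return round(scaled)

# Raw scores with mean 30 and SD 8, reported on a scale with mean 150, SD 15:
print(linear_scale(38, 30, 8, 150, 15))  # 165
```

Nonlinear methods (normalized scores, equal-interval scales) replace the linear map with a different function, but the raw-to-scale conversion idea is the same.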
Liu, Yuming; Schulz, E. Matthew; Yu, Lei – Journal of Educational and Behavioral Statistics, 2008
A Markov chain Monte Carlo (MCMC) method and a bootstrap method were compared in the estimation of standard errors of item response theory (IRT) true score equating. Three test form relationships were examined: parallel, tau-equivalent, and congeneric. Data were simulated based on Reading Comprehension and Vocabulary tests of the Iowa Tests of…
Descriptors: Reading Comprehension, Test Format, Markov Processes, Educational Testing
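The bootstrap side of the comparison resamples examinees, re-estimates the equating function each time, and takes the standard deviation across replications as the standard error. A minimal sketch, with simple mean equating standing in for the IRT true-score equating used in the article (the resampling logic is the same; all data are invented):

```python
import random
import statistics

def bootstrap_se_equating(form_x, form_y, n_boot=2000, seed=7):
    """Bootstrap standard error of a mean-equating constant
    (difference in form means), re-estimated on resampled examinees."""
    rng = random.Random(seed)
    constants = []
    for _ in range(n_boot):
        bx = [rng.choice(form_x) for _ in form_x]   # resample form X takers
        by = [rng.choice(form_y) for _ in form_y]   # resample form Y takers
        constants.append(statistics.mean(by) - statistics.mean(bx))
    return statistics.stdev(constants)

form_x = [12, 15, 14, 18, 20, 11, 16, 17, 13, 19]
form_y = [14, 16, 15, 19, 22, 13, 18, 18, 15, 20]
print(0 < bootstrap_se_equating(form_x, form_y) < 3)  # True
```

An MCMC alternative would instead draw equating constants from a posterior distribution and summarize its spread; the article compares how well the two approaches agree.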

Brennan, Robert L.; Lee, Won-Chan – Educational and Psychological Measurement, 1999
Develops two procedures for estimating individual-level conditional standard errors of measurement for scale scores, assuming tests of dichotomously scored items. Compares the two procedures to a polynomial procedure and a procedure developed by L. Feldt and A. Qualls (1998) using data from the Iowa Tests of Basic Skills. Contains 22 references.
Descriptors: Error of Measurement, Estimation (Mathematics), Scaling, Scores
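A useful baseline for conditional SEMs on dichotomously scored tests is Lord's binomial error model, which the procedures compared here build on: for raw score x on n items, SEM(x) = sqrt(x(n − x)/(n − 1)). A sketch of that baseline (not Brennan and Lee's scale-score procedures themselves):

```python
import math

def binomial_csem(raw_score, n_items):
    """Lord's binomial-model conditional SEM for a raw score x on
    n dichotomously scored items: sqrt(x * (n - x) / (n - 1))."""
    x, n = raw_score, n_items
    return math.sqrt(x * (n - x) / (n - 1))

# Error is largest for middle scores and zero at the extremes:
print(round(binomial_csem(20, 40), 2))  # 3.2
print(binomial_csem(40, 40))            # 0.0
```

Conditional SEMs for scale scores add a second step: the raw-score error distribution is pushed through the raw-to-scale conversion, which is where the compared procedures differ.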

Lee, Guemin – Applied Measurement in Education, 2000
Investigated incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets using five conditional SEM estimation methods. Results from 3,876 tests from the Iowa Tests of Basic Skills and 1,000 simulated responses show that item-based methods provide lower conditional…
Descriptors: Error of Measurement, Estimation (Mathematics), Simulation, Test Construction
Fan, Xitao; Yin, Ping – 2001
The literature on measurement reliability reflects a consensus that group heterogeneity on the trait being measured affects sample measurement reliability, but the size of that effect is not entirely clear. Sample performance also has the potential to affect measurement reliability because of its effect on the…
Descriptors: Error of Measurement, Measurement Techniques, Reliability, Sample Size
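The heterogeneity effect the abstract refers to follows from the classical definition of reliability as the proportion of observed variance that is not error: with the SEM held fixed, a more heterogeneous group yields higher reliability. A minimal illustration (the SDs and SEM are invented):

```python
def reliability(observed_sd, sem):
    """Classical reliability: 1 - SEM^2 / observed score variance."""
    return 1 - (sem / observed_sd) ** 2

# Same SEM (4 points) in a homogeneous group (SD 10) vs a
# heterogeneous group (SD 16):
print(round(reliability(10, 4), 2))  # 0.84
print(round(reliability(16, 4), 2))  # 0.94
```

The identical instrument looks substantially more reliable in the wider group, which is why reliability coefficients are sample-dependent while the SEM is comparatively stable.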

Williams, Richard H.; And Others – Journal of Experimental Education, 1987
Because of limitations in simple gain scores, psychometrists have proposed alternate methods for measuring change, two of which are residualized difference and base-free change. This paper provides large-sample empirical estimates of the reliability of these change measures. It checks theoretical predictions derived from inequalities involving all…
Descriptors: Error of Measurement, Estimation (Mathematics), Measurement Techniques, Pretests Posttests
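The limitation of simple gain scores that motivates these alternatives can be made concrete with the classical formula for the reliability of a difference score, which depends on the two tests' reliabilities, their correlation, and their standard deviations. A sketch (the numeric inputs are invented):

```python
def gain_score_reliability(r_pre, r_post, r_prepost, sd_pre, sd_post):
    """Classical-test-theory reliability of a simple gain (post - pre)."""
    num = (sd_pre ** 2 * r_pre + sd_post ** 2 * r_post
           - 2 * r_prepost * sd_pre * sd_post)
    den = (sd_pre ** 2 + sd_post ** 2
           - 2 * r_prepost * sd_pre * sd_post)
    return num / den

# Two quite reliable tests (0.90 each) that correlate 0.80
# yield a gain score of only middling reliability:
print(round(gain_score_reliability(0.9, 0.9, 0.8, 10, 10), 2))  # 0.5
```

The higher the pretest–posttest correlation, the less reliable the simple gain, which is exactly the weakness residualized-difference and base-free change measures try to address.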

Qualls-Payne, Audrey L. – Journal of Educational Measurement, 1992
Six methods for estimating the standard error of measurement (SEM) at specific score levels are evaluated by comparing score-level SEM estimates from a single test administration with estimates from two test administrations, using Iowa Tests of Basic Skills data for 2,138 examinees. L. S. Feldt's method is preferred. (SLD)
Descriptors: Comparative Testing, Elementary Education, Elementary School Students, Error of Measurement
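The two-administration benchmark used in such comparisons is straightforward: for examinees grouped at one score level, each difference between their two scores reflects two independent errors, so the SEM at that level is sqrt(mean(d²)/2). A sketch under those classical assumptions (the difference data are invented):

```python
import math

def score_level_sem(diffs):
    """SEM at one score level from two parallel administrations:
    sqrt(mean(d^2) / 2) over the score differences d of examinees at
    that level, assuming independent, equal-variance errors per form."""
    return math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))

# Score differences between administrations for examinees at one level:
diffs = [2, -1, 3, 0, -2, 1, -3, 2]
print(round(score_level_sem(diffs), 2))  # 1.41
```

Single-administration methods must recover this quantity without a retest, which is what the six compared methods attempt in different ways.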
Cantor, Nancy K.; Hoover, H. D. – 1986
This paper isolates and examines separately three distinct sources of error in essay scores: lack of agreement between raters; inconsistencies in performance within mode of discourse; and inconsistencies in performance between modes of discourse. Essay prompts in the Iowa Tests of Basic Skills (ITBS) Writing Supplement were designed to assess…
Descriptors: Academic Achievement, Cues, Elementary Secondary Education, Error of Measurement
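Separating error sources like rater disagreement from examinee inconsistency is the province of generalizability theory: variance components are estimated from ANOVA mean squares for a crossed design. A one-facet (persons × raters) sketch of that idea, simpler than the paper's raters-and-modes design (all scores are invented):

```python
from statistics import mean

def one_facet_g_study(scores):
    """Variance components for a persons x raters score table:
    person, rater, and residual components from ANOVA mean squares."""
    n_p, n_r = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    p_means = [mean(row) for row in scores]
    r_means = [mean(col) for col in zip(*scores)]
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_tot = sum((v - grand) ** 2 for row in scores for v in row)
    ss_res = ss_tot - ss_p - ss_r
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))
    var_res = ms_res                              # person-by-rater + error
    var_r = max(0.0, (ms_r - ms_res) / n_p)       # rater severity differences
    var_p = max(0.0, (ms_p - ms_res) / n_r)       # true essay-quality spread
    return var_p, var_r, var_res

# Four essays, each scored by the same three raters:
scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 4, 3]]
var_p, var_r, var_res = one_facet_g_study(scores)
print(var_p > var_r)  # True: essays, not raters, drive most of the variance
```

Adding mode of discourse as a second facet, as the ITBS Writing Supplement design does, splits the residual further into within-mode and between-mode inconsistency.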