Showing all 10 results
Peer reviewed
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful consideration is necessary when choosing an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
Computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. Multidimensional CAT (MCAT) designs differ in the item selection, ability estimation, and termination methods being used. This study aims to investigate the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014
In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…
Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory
Peer reviewed
PDF on ERIC
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Peer reviewed
Hughes, Gail D. – Research in the Schools, 2009
This simulation study examined the impacts of incorrect responses to reverse-coded survey items by reversing responses to traditional Likert-format items drawn from an archival dataset of 700 administrators in randomly selected schools in a 7-county region of central Arkansas. Specifically, the number of reverse-coded items…
Descriptors: Surveys, Coding, Context Effect, Measures (Individuals)
Peer reviewed
Lee, Guemin – Applied Measurement in Education, 2000
Investigated incorporating a testlet definition into the estimation of the conditional standard error of measurement (SEM) for tests composed of testlets, using five conditional SEM estimation methods. Results from 3,876 tests from the Iowa Tests of Basic Skills and 1,000 simulated responses show that item-based methods provide lower conditional…
Descriptors: Error of Measurement, Estimation (Mathematics), Simulation, Test Construction
Hambleton, Ronald K.; And Others – 1990
Item response theory (IRT) model parameter estimates have considerable merit and open up new directions for test development, but misleading results are often obtained because of errors in the item parameter estimates. The problem of the effects of item parameter estimation errors on the test development process is discussed, and the seriousness…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Response Theory, Sampling
Peer reviewed
Ban, Jae-Chun; Hanson, Bradley A.; Yi, Qing; Harris, Deborah J. – Journal of Educational Measurement, 2002
Compared three online pretest calibration scaling methods through simulation: (1) marginal maximum likelihood with one expectation maximization (EM) cycle (OEM); (2) marginal maximum likelihood with multiple EM cycles (MEM); and (3) M. Stocking's Method B. MEM produced the smallest average total error in parameter estimation; OEM yielded…
Descriptors: Computer Assisted Testing, Error of Measurement, Maximum Likelihood Statistics, Online Systems
Curry, Allen R.; And Others – 1978
The efficacy of employing subsets of items from a calibrated item pool to estimate the Rasch model person parameters was investigated. Specifically, the degree of invariance of Rasch model ability-parameter estimates was examined across differing collections of simulated items. The ability-parameter estimates were obtained from a simulation of…
Descriptors: Career Development, Difficulty Level, Equated Scores, Error of Measurement
Patience, Wayne M.; Reckase, Mark D. – 1979
Simulated tailored tests were used to investigate the relationships between characteristics of the item pool and the computer program, and the reliability and bias of the resulting ability estimates. The computer program was varied to provide for various step sizes (differences in difficulty between successive steps) and different acceptance…
Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Educational Testing