ERIC - Search Results

Showing all 14 results

A Simulation Study to Compare Nonequivalent Groups with Anchor Test Equating and Pseudo-Equivalent Group Linking. Research Report. ETS RR-18-08

Peer reviewed

Lu, Ru; Guo, Hongwen – ETS Research Report Series, 2018

In this paper we compare the newly developed pseudo-equivalent groups (PEG) linking method with the linking methods based on the traditional nonequivalent groups with anchor test (NEAT) design and illustrate how to use the PEG methods under imperfect equating conditions. To do this, we proposed a new method that combines the features of PEG…

Descriptors: Equated Scores, Comparative Analysis, Test Items, Background

Investigating Robustness of Item Response Theory Proficiency Estimators to Atypical Response Behaviors under Two-Stage Multistage Testing. ETS GRE® Board Research Report. ETS GRE®-16-03. ETS Research Report No. RR-16-22

Peer reviewed

Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2016

The purpose of this study is to evaluate the extent to which item response theory (IRT) proficiency estimation methods are robust to the presence of aberrant responses under the GRE® General Test multistage adaptive testing (MST) design. To that end, a wide range of atypical response behaviors affecting as much as 10% of the test items…

Descriptors: Item Response Theory, Computation, Robustness (Statistics), Response Style (Tests)

The Effects of Rater Severity and Rater Distribution on Examinees' Ability Estimation for Constructed-Response Items. Research Report. ETS RR-13-23

Peer reviewed

Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013

The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…

Descriptors: Test Format, Test Items, Responses, Computation

A Study of Frequency Estimation Equipercentile Equating When There Are Large Ability Differences. Research Report. ETS RR-09-45

Peer reviewed

Guo, Hongwen; Oh, Hyeonjoo J. – ETS Research Report Series, 2009

In operational equating, frequency estimation (FE) equipercentile equating is often excluded from consideration when the old and new groups have a large ability difference. This convention may, in some instances, cause the exclusion of one competitive equating method from the set of methods under consideration. In this report, we study the…

Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items

Methods of Linking with Small Samples in a Common-Item Design: An Empirical Comparison. Research Report. ETS RR-09-38

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2009

A series of resampling studies was conducted to compare the accuracy of equating in a common item design using four different methods: chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating, and the circle-arc method. Four operational test forms, each containing more than 100 items, were used for…

Descriptors: Sampling, Sample Size, Accuracy, Test Items

Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions versus Multivariate Polytomous Ability Distributions. Research Report. ETS RR-08-45

Peer reviewed

Haberman, Shelby J.; von Davier, Matthias; Lee, Yi-Hsuan – ETS Research Report Series, 2008

Multidimensional item response models can be based on multivariate normal ability distributions or on multivariate polytomous ability distributions. For the case of simple structure in which each item corresponds to a unique dimension of the ability vector, some applications of the two-parameter logistic model to empirical data are employed to…

Descriptors: Item Response Theory, Comparative Analysis, Ability, Models

Comparing Different Approaches of Bias Correction for Ability Estimation in IRT Models. Research Report. ETS RR-08-13

Peer reviewed

Lee, Yi-Hsuan; Zhang, Jinming – ETS Research Report Series, 2008

The method of maximum-likelihood is typically applied to item response theory (IRT) models when the ability parameter is estimated while conditioning on the true item parameters. In practice, the item parameters are unknown and need to be estimated first from a calibration sample. Lewis (1985) and Zhang and Lu (2007) proposed the expected response…

Descriptors: Item Response Theory, Comparative Analysis, Computation, Ability

Refinement of a Bias-Correction Procedure for the Weighted Likelihood Estimator of Ability. Research Report. ETS RR-07-23

Peer reviewed

Zhang, Jinming; Lu, Ting – ETS Research Report Series, 2007

In practical applications of item response theory (IRT), item parameters are usually estimated first from a calibration sample. After treating these estimates as fixed and known, ability parameters are then estimated. However, the statistical inferences based on the estimated abilities can be misleading if the uncertainty of the item parameter…

Descriptors: Item Response Theory, Ability, Error of Measurement, Maximum Likelihood Statistics

Adaptive Quadrature for Item Response Models. Research Report. ETS RR-06-29

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2006

Adaptive quadrature is applied to marginal maximum likelihood estimation for item response models with normal ability distributions. Even in one dimension, significant gains in speed and accuracy of computation may be achieved.

Descriptors: Item Response Theory, Maximum Likelihood Statistics, Computation, Ability

The Information a Test Provides on an Ability Parameter. Research Report. ETS RR-07-18

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2007

In item-response theory, if a latent-structure model has an ability variable, then elementary information theory may be employed to provide a criterion for evaluation of the information the test provides concerning ability. This criterion may be considered even in cases in which the latent-structure model is not valid, although interpretation of…

Descriptors: Item Response Theory, Ability, Information Theory, Computation

Reliability and the Nonequivalent Groups with Anchor Test Design. Research Report. ETS RR-07-16

Peer reviewed

Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007

This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…

Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis

Identifiability of Parameters in Item Response Models with Unconstrained Ability Distributions. Research Report. ETS RR-05-24

Peer reviewed

Haberman, Shelby J. – ETS Research Report Series, 2005

If a parametric model for the ability distribution is not assumed, then the customary two-parameter and three-parameter logistic models for item response analysis present identifiability problems not encountered with the Rasch model. These problems impose substantial restrictions on possible models for ability distributions.

Descriptors: Item Response Theory, Ability, Models, Maximum Likelihood Statistics

Evidence-Centered Assessment Design for Reasoning about Accommodations for Individuals with Disabilities in NAEP Reading and Mathematics. Research Report. ETS RR-08-38

Peer reviewed

Hansen, Eric G.; Mislevy, Robert J.; Steinberg, Linda S. – ETS Research Report Series, 2008

Accommodations play a key role in enabling individuals with disabilities to participate in the National Assessment of Educational Progress (NAEP) and other large-scale assessments. However, it can be difficult to know how accommodations affect the validity of results, thus making it difficult to determine which accommodations should be allowed.…

Descriptors: National Competency Tests, Disabilities, Reading Instruction, Mathematics Instruction

Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15

Peer reviewed

Zhang, Jinming – ETS Research Report Series, 2005

Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…

Descriptors: Statistical Bias, Maximum Likelihood Statistics, Computation, Ability