ERIC - Search Results

Publication Date

In 2025	1
Since 2024	1
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	8
Since 2006 (last 20 years)	20

Descriptor

Ability	38
Test Length	38
Item Response Theory	22
Test Items	18
Sample Size	15
Simulation	14
Estimation (Mathematics)	10
Adaptive Testing	9
Bayesian Statistics	9
Accuracy	8
Computation	8
Computer Assisted Testing	8
Statistical Bias	8
Comparative Analysis	7
Error of Measurement	7
Maximum Likelihood Statistics	7
Statistical Distributions	7
Test Construction	7
Item Banks	6
Differences	5
Foreign Countries	5
Models	5
Test Format	5
Correlation	4
Difficulty Level	4
More ▼

Source

Educational and Psychological…	5
Applied Psychological…	4
Journal of Educational…	2
Measurement:…	2
Applied Measurement in…	1
Center on Children and…	1
ETS Research Report Series	1
Educational Measurement:…	1
Educational Sciences: Theory…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Mental Health…	1
ProQuest LLC	1
Psychometrika	1
More ▼

Publication Type

Journal Articles	22
Reports - Research	20
Reports - Evaluative	17
Speeches/Meeting Papers	10
Dissertations/Theses -…	1
Reports - Descriptive	1

Education Level

Secondary Education	2
Early Childhood Education	1
Preschool Education	1

Audience

Location

Colombia	1
Indonesia	1
Jordan	1
Peru	1
Qatar	1
Turkey	1

Laws, Policies, & Programs

Assessments and Surveys

Program for International…	2
Advanced Placement…	1
COMPASS (Computer Assisted…	1
Test of English as a Foreign…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 38 results Save | Export

The Effect of Polytomous Item Ratio on Ability Estimation in Multistage Tests

Peer reviewed
PDF on ERIC

Download full text

Hasibe Yahsi Sari; Hulya Kelecioglu – International Journal of Assessment Tools in Education, 2025

The aim of the study is to examine the effect of polytomous item ratio on ability estimation in different conditions in multistage tests (MST) using mixed tests. The study is simulation-based research. In the PISA 2018 application, the ability parameters of the individuals and the item pool were created by using the item parameters estimated from…

Descriptors: Test Items, Test Format, Accuracy, Test Length

What Are the Conditions Associated with Subscore Added Value Noninvariance? Implications for Improving Subscore Interpretation Fairness

Peer reviewed

Direct link

Rios, Joseph A.; Miranda, Alejandra A. – Educational Measurement: Issues and Practice, 2021

Subscore added value analyses assume invariance across test taking populations; however, this assumption may be untenable in practice as differential subdomain relationships may be present among subgroups. The purpose of this simulation study was to understand the conditions associated with subscore added value noninvariance when manipulating: (1)…

Descriptors: Scores, Test Length, Ability, Correlation

Assessing Ability Recovery of the Sequential IRT Model with Unstructured Multiple-Attempt Data

Peer reviewed
PDF on ERIC

Download full text

Direct link

Ziying Li; A. Corinne Huggins-Manley; Walter L. Leite; M. David Miller; Eric A. Wright – Educational and Psychological Measurement, 2022

The unstructured multiple-attempt (MA) item response data in virtual learning environments (VLEs) are often from student-selected assessment data sets, which include missing data, single-attempt responses, multiple-attempt responses, and unknown growth ability across attempts, leading to a complex and complicated scenario for using this kind of…

Descriptors: Sequential Approach, Item Response Theory, Data, Simulation

Parameter Estimation Bias of Dichotomous Logistic Item Response Theory Models Using Different Variables

Peer reviewed
PDF on ERIC

Download full text

Köse, Alper; Dogan, C. Deha – International Journal of Evaluation and Research in Education, 2019

The aim of this study was to examine the precision of item parameter estimation in different sample sizes and test lengths under three parameter logistic model (3PL) item response theory (IRT) model, where the trait measured by a test was not normally distributed or had a skewed distribution. In the study, number of categories (1-0), and item…

Descriptors: Statistical Bias, Item Response Theory, Simulation, Accuracy

Estimation of Mixture Rasch Models from Skewed Latent Ability Distributions

Peer reviewed

Direct link

Karadavut, Tugba; Cohen, Allan S.; Kim, Seock-Ho – Measurement: Interdisciplinary Research and Perspectives, 2020

Mixture Rasch (MixRasch) models conventionally assume normal distributions for latent ability. Previous research has shown that the assumption of normality is often unmet in educational and psychological measurement. When normality is assumed, asymmetry in the actual latent ability distribution has been shown to result in extraction of spurious…

Descriptors: Item Response Theory, Ability, Statistical Distributions, Sample Size

The Matching Criterion Purification for Differential Item Functioning Analyses in a Large-Scale Assessment

Peer reviewed

Direct link

Lee, HyeSun; Geisinger, Kurt F. – Educational and Psychological Measurement, 2016

The current study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. The three matching approaches for DIF analyses (block-level matching, pooled booklet matching, and equated pooled booklet matching) were employed with the Mantel-Haenszel…

Descriptors: Test Bias, Measurement, Accuracy, Statistical Analysis

Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Peer reviewed

Direct link

Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

Descriptors: Test Bias, Test Reliability, Performance, Scores

Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

Direct link

Jacob, Brian A. – Center on Children and Families at Brookings, 2016

Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…

Descriptors: Scores, Common Core State Standards, Test Length, Test Content

A Nonparametric Approach to Estimate Classification Accuracy and Consistency

Peer reviewed

Direct link

Lathrop, Quinn N.; Cheng, Ying – Journal of Educational Measurement, 2014

When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

Descriptors: Cutting Scores, Classification, Computation, Nonparametric Statistics

The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

Peer reviewed

Direct link

Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi – Applied Psychological Measurement, 2013

Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Length, Ability

An Assessment of the Nonparametric Approach for Evaluating the Fit of Item Response Models

Peer reviewed

Direct link

Liang, Tie; Wells, Craig S.; Hambleton, Ronald K. – Journal of Educational Measurement, 2014

As item response theory has been more widely applied, investigating the fit of a parametric model becomes an important part of the measurement process. There is a lack of promising solutions to the detection of model misfit in IRT. Douglas and Cohen introduced a general nonparametric approach, RISE (Root Integrated Squared Error), for detecting…

Descriptors: Item Response Theory, Measurement Techniques, Nonparametric Statistics, Models

Deriving Stopping Rules for Multidimensional Computerized Adaptive Testing

Peer reviewed

Direct link

Wang, Chun; Chang, Hua-Hua; Boughton, Keith A. – Applied Psychological Measurement, 2013

Multidimensional computerized adaptive testing (MCAT) is able to provide a vector of ability estimates for each examinee, which could be used to provide a more informative profile of an examinee's performance. The current literature on MCAT focuses on the fixed-length tests, which can generate less accurate results for those examinees whose…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Length, Item Banks

Comparing Performances (Type I Error and Power) of IRT Likelihood Ratio SIBTEST and Mantel-Haenszel Methods in the Determination of Differential Item Functioning

Peer reviewed
PDF on ERIC

Download full text

Atalay Kabasakal, Kübra; Arsan, Nihan; Gök, Bilge; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2014

This simulation study compared the performances (Type I error and power) of Mantel-Haenszel (MH), SIBTEST, and item response theory-likelihood ratio (IRT-LR) methods under certain conditions. Manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and underlying…

Descriptors: Comparative Analysis, Item Response Theory, Statistical Analysis, Test Bias

Mixed-Format Test Score Equating: Effect of Item-Type Multidimensionality, Length and Composition of Common-Item Set, and Group Ability Difference

Direct link

Wang, Wei – ProQuest LLC, 2013

Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests often are considered to be superior to tests containing only MC items although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…

Descriptors: Equated Scores, Test Format, Test Items, Test Length

Item Pool Design for an Operational Variable-Length Computerized Adaptive Test

Peer reviewed

Direct link

He, Wei; Reckase, Mark D. – Educational and Psychological Measurement, 2014

For computerized adaptive tests (CATs) to work well, they must have an item pool with sufficient numbers of good quality items. Many researchers have pointed out that, in developing item pools for CATs, not only is the item pool size important but also the distribution of item parameters and practical considerations such as content distribution…

Descriptors: Item Banks, Test Length, Computer Assisted Testing, Adaptive Testing

Previous Page | Next Page »

Pages: 1 | 2 | 3

Kim, Seock-Ho	3
Cheng, Ying	2
He, Wei	2
Lee, Yi-Hsuan	2
Reckase, Mark D.	2
Zhang, Jinming	2
A. Corinne Huggins-Manley	1
Abdel-fattah, Abdel-fattah A.	1
Abel, Gene G.	1
Ang, Cheng	1
Arsan, Nihan	1
Atalay Kabasakal, Kübra	1
Ban, Jae-Chun	1
Beland, Sebastien	1
Bergstrom, Betty A.	1
Blasingame, Gerry D.	1
Boughton, Keith A.	1
Chang, Hua-Hua	1
Cohen, Allan S.	1
Diao, Qi	1
Dogan, C. Deha	1
Embretson, Susan E.	1
Eric A. Wright	1
Geisinger, Kurt F.	1
More ▼