ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	17
Since 2006 (last 20 years)	58

Descriptor

Statistical Analysis	91
Test Items	91
Simulation	78
Item Response Theory	39
Item Analysis	24
Comparative Analysis	23
Test Bias	19
Sample Size	18
Difficulty Level	15
Goodness of Fit	15
Models	15
Computer Simulation	13
Error of Measurement	13
Computation	12
Mathematical Models	12
Scores	11
Computer Assisted Testing	10
Evaluation Methods	10
Latent Trait Theory	10
Maximum Likelihood Statistics	10
Test Construction	10
Correlation	9
Equated Scores	9
Achievement Tests	8
Foreign Countries	8

Publication Type

Reports - Research	73
Journal Articles	61
Reports - Evaluative	12
Speeches/Meeting Papers	9
Dissertations/Theses -…	5
Numerical/Quantitative Data	1
Reports - Descriptive	1
Tests/Questionnaires	1

Education Level

Secondary Education	4
Elementary Secondary Education	3
Higher Education	3
Elementary Education	2
Grade 4	2
Intermediate Grades	2
Postsecondary Education	2
Grade 12	1
Grade 8	1
High Schools	1
Junior High Schools	1
Middle Schools	1

Audience

Researchers

Location

Canada	1
Florida	1
Minnesota	1
Turkey	1

Assessments and Surveys

National Assessment of…	2
Program for International…	2
Raven Advanced Progressive…	2
Test of English as a Foreign…	2
Comprehensive Tests of Basic…	1
Florida Comprehensive…	1
Stanford Binet Intelligence…	1
Trends in International…	1
United States Medical…	1

Showing 1 to 15 of 91 results

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Dimension-Corrected Somers' D for the Item Analysis Settings

Peer reviewed

Metsämuuronen, Jari – International Journal of Educational Methodology, 2020

A new index of item discrimination power (IDP), dimension-corrected Somers' D (D2), is proposed. Somers' D is one of the superior alternatives to item-total (Rit) and item-rest (Rir) correlation in reflecting the real IDP for items with scales 0/1 and 0/1/2, that is, up to three categories. D also reaches the extreme values +1 and -1 correctly…

Descriptors: Item Analysis, Correlation, Test Items, Simulation

Is the Factor Observed in Investigations on the Item-Position Effect Actually the Difficulty Factor?

Peer reviewed

Schweizer, Karl; Troche, Stefan – Educational and Psychological Measurement, 2018

In confirmatory factor analysis quite similar models of measurement serve the detection of the difficulty factor and the factor due to the item-position effect. The item-position effect refers to the increasing dependency among the responses to successively presented items of a test whereas the difficulty factor is ascribed to the wide range of…

Descriptors: Investigations, Difficulty Level, Factor Analysis, Models

Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

Peer reviewed

Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018

In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…

Descriptors: Item Response Theory, Test Reliability, Test Items, Scores

Extension of Caution Indices to Mixed-Format Tests

Peer reviewed

Sinharay, Sandip – Grantee Submission, 2018

Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns

IRT Item Parameter Scaling for Developing New Item Pools

Peer reviewed

Kang, Hyeon-Ah; Lu, Ying; Chang, Hua-Hua – Applied Measurement in Education, 2017

Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent…

Descriptors: Item Response Theory, Accuracy, Educational Assessment, Test Items

Do Adaptive Representations of the Item-Position Effect in APM Improve Model Fit? A Simulation Study

Peer reviewed

Zeller, Florian; Krampen, Dorothea; Reiß, Siegbert; Schweizer, Karl – Educational and Psychological Measurement, 2017

The item-position effect describes how an item's position within a test, that is, the number of previously completed items, affects the response to this item. Previously, this effect was represented by constraints reflecting simple courses, for example, a linear increase. Due to the inflexibility of these representations, our aim was to examine…

Descriptors: Goodness of Fit, Simulation, Factor Analysis, Intelligence Tests

On Using Simulations to Inform Decision Making during Instrument Development

Peer reviewed

Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018

Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…

Descriptors: Simulation, Decision Making, Test Construction, Validity

The Consequences of Ignoring Item Parameter Drift in Longitudinal Item Response Models

Peer reviewed

Lee, Wooyeol; Cho, Sun-Joo – Applied Measurement in Education, 2017

Utilizing a longitudinal item response model, this study investigated the effect of item parameter drift (IPD) on item parameters and person scores via a Monte Carlo study. Item parameter recovery was investigated for various IPD patterns in terms of bias and root mean-square error (RMSE), and percentage of time the 95% confidence interval covered…

Descriptors: Item Response Theory, Test Items, Bias, Computation

Type I Error Inflation in DIF Identification with Mantel-Haenszel: An Explanation and a Solution

Peer reviewed

Magis, David; De Boeck, Paul – Educational and Psychological Measurement, 2014

It is known that sum score-based methods for the identification of differential item functioning (DIF), such as the Mantel-Haenszel (MH) approach, can be affected by Type I error inflation in the absence of any DIF effect. This may happen when the items differ in discrimination and when there is item impact. On the other hand, outlier DIF methods…

Descriptors: Test Bias, Statistical Analysis, Test Items, Simulation

Effects of Various Simulation Conditions on Latent-Trait Estimates: A Simulation Study

Peer reviewed

Kogar, Hakan – International Journal of Assessment Tools in Education, 2018

The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…

Descriptors: Simulation, Context Effect, Computation, Statistical Analysis

How Does Polytomous Item Bias Affect Total-Group Survey Score Comparisons?

Peer reviewed

Hidalgo, Ma Dolores; Benítez, Isabel; Padilla, Jose-Luis; Gómez-Benito, Juana – Sociological Methods & Research, 2017

The growing use of scales in survey questionnaires makes it necessary to address how polytomous differential item functioning (DIF) affects observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the Type I error and effect size of the independent samples t-test on the observed total scale scores. A…

Descriptors: Test Items, Test Bias, Item Response Theory, Surveys

Effect Size Measures for Differential Item Functioning in a Multidimensional IRT Model

Peer reviewed

Suh, Youngsuk – Journal of Educational Measurement, 2016

This study adapted an effect size measure used for studying differential item functioning (DIF) in unidimensional tests and extended the measure to multidimensional tests. Two effect size measures were considered in a multidimensional item response theory model: signed weighted P-difference and unsigned weighted P-difference. The performance of…

Descriptors: Effect Size, Goodness of Fit, Statistical Analysis, Statistical Significance

Item Response Data Analysis Using Stata Item Response Theory Package

Peer reviewed

Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018

The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that has been available since Stata v.14 (2015). Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…

Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis

Modeling Information Accumulation in Psychological Tests Using Item Response Times

Peer reviewed

Ranger, Jochen; Kuhn, Jörg-Tobias – Journal of Educational and Behavioral Statistics, 2015

In this article, a latent trait model is proposed for the response times in psychological tests. The latent trait model is based on the linear transformation model and subsumes popular models from survival analysis, like the proportional hazards model and the proportional odds model. The core of the model is the assumption that an unspecified monotone…

Descriptors: Psychological Testing, Reaction Time, Statistical Analysis, Models


Source

ETS Research Report Series	12
Educational and Psychological…	12
Applied Psychological…	7
Journal of Educational…	7
ProQuest LLC	5
Journal of Educational and…	4
Applied Measurement in…	2
Eurasian Journal of…	2
International Journal of…	2
Psychometrika	2
American Journal of…	1
Educational Testing Service	1
Grantee Submission	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Psychoeducational…	1
Large-scale Assessments in…	1
Multivariate Behavioral…	1
Practical Assessment,…	1
Sociological Methods &…	1
Structural Equation Modeling:…	1
Studies in Second Language…	1

Author

Lu, Ying	3
Sinharay, Sandip	3
Chang, Hua-Hua	2
Cho, Sun-Joo	2
De Boeck, Paul	2
Hambleton, Ronald K.	2
Knol, Dirk L.	2
Meijer, Rob R.	2
Ranger, Jochen	2
Reckase, Mark D.	2
Rogers, H. Jane	2
Schweizer, Karl	2
Sotaridona, Leonardo S.	2
Spray, Judith A.	2
Suh, Youngsuk	2
Xu, Xueli	2
von Davier, Matthias	2
Abayeva, Nella F.	1
Ackerman, Terry A.	1
Ali, Usama S.	1
Andersson, Björn	1
Arendasy, Martin	1
Asparouhov, Tihomir	1
Benítez, Isabel	1