ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	14
Since 2016 (last 10 years)	30
Since 2006 (last 20 years)	45

Descriptor

Error of Measurement	55
Sample Size	55
Test Items	55
Item Response Theory	32
Simulation	18
Item Analysis	16
Test Length	15
Accuracy	12
Comparative Analysis	12
Models	12
Monte Carlo Methods	12
Computation	11
Equated Scores	11
Difficulty Level	10
Test Bias	10
Scores	9
Statistical Analysis	9
Correlation	8
Sampling	8
Statistical Bias	8
Factor Analysis	7
Effect Size	6
Test Reliability	6
Evaluation Methods	5
Goodness of Fit	5
More ▼

Source

Educational and Psychological…	10
Journal of Educational…	7
Applied Measurement in…	4
ETS Research Report Series	4
International Journal of…	4
Online Submission	3
ProQuest LLC	3
Applied Psychological…	2
International Journal of…	2
Educational Sciences: Theory…	1
Educational Testing Service	1
Journal of Education and…	1
Journal of Educational and…	1
Measurement and Evaluation in…	1
Pedagogical Research	1
Practical Assessment,…	1
Research Matters	1
Structural Equation Modeling:…	1
More ▼

Publication Type

Journal Articles	42
Reports - Research	41
Speeches/Meeting Papers	10
Reports - Evaluative	9
Dissertations/Theses -…	3
Reports - Descriptive	2
Numerical/Quantitative Data	1

Education Level

Grade 8	2
Secondary Education	2
Elementary Secondary Education	1
Junior High Schools	1
Middle Schools	1

Audience

Researchers

Location

Turkey

Laws, Policies, & Programs

Assessments and Surveys

Comprehensive Tests of Basic…	1
Graduate Management Admission…	1
Program for International…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 55 results Save | Export

How to Obtain the Most Error-Free Estimate of Reliability? Eight Sources of Deflation in the Estimates of Reliability to Avoid

Peer reviewed
PDF on ERIC

Download full text

Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022

The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…

Descriptors: Test Reliability, Scores, Test Items, Correlation

Comparing Accuracy of Parallel Analysis and Fit Statistics for Estimating the Number of Factors with Ordered Categorical Data in Exploratory Factor Analysis

Peer reviewed

Direct link

Hyunjung Lee; Heining Cham – Educational and Psychological Measurement, 2024

Determining the number of factors in exploratory factor analysis (EFA) is crucial because it affects the rest of the analysis and the conclusions of the study. Researchers have developed various methods for deciding the number of factors to retain in EFA, but this remains one of the most difficult decisions in the EFA. The purpose of this study is…

Descriptors: Factor Structure, Factor Analysis, Monte Carlo Methods, Goodness of Fit

Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model

Download full text

Custer, Michael; Kim, Jongpil – Online Submission, 2023

This study utilizes an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when utilizing the Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Batelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…

Descriptors: Sample Size, Item Response Theory, Test Items, Computation

Type I Error and Power Rates: A Comparative Analysis of Techniques in Differential Item Functioning

Peer reviewed
PDF on ERIC

Download full text

Ayse Bilicioglu Gunes; Bayram Bicak – International Journal of Assessment Tools in Education, 2023

The main purpose of this study is to examine the Type I error and statistical power ratios of Differential Item Functioning (DIF) techniques based on different theories under different conditions. For this purpose, a simulation study was conducted by using Mantel-Haenszel (MH), Logistic Regression (LR), Lord's [chi-squared], and Raju's Areas…

Descriptors: Test Items, Item Response Theory, Error of Measurement, Test Bias

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Direct link

Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

Investigating Confidence Intervals of Item Parameters When Some Item Parameters Take Priors in the 2PL and 3PL Models

Peer reviewed

Direct link

Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023

To reduce the chance of Heywood cases or nonconvergence in estimating the 2PL or the 3PL model in the marginal maximum likelihood with the expectation-maximization (MML-EM) estimation method, priors for the item slope parameter in the 2PL model or for the pseudo-guessing parameter in the 3PL model can be used and the marginal maximum a posteriori…

Descriptors: Models, Item Response Theory, Test Items, Intervals

An Analysis of Differential Bundle Functioning in Multidimensional Tests Using the SIBTEST Procedure

Peer reviewed
PDF on ERIC

Download full text

Özdogan, Didem; Kelecioglu, Hülya – International Journal of Assessment Tools in Education, 2022

This study aims to analyze the differential bundle functioning in multidimensional tests with a specific purpose to detect this effect through differentiating the location of the item with DIF in the test, the correlation between the dimensions, the sample size, and the ratio of reference to focal group size. The first 10 items of the test that is…

Descriptors: Correlation, Sample Size, Test Items, Item Analysis

The Study of the Effect of Item Parameter Drift on Ability Estimation Obtained from Adaptive Testing under Different Conditions

Peer reviewed
PDF on ERIC

Download full text

Sahin Kursad, Merve; Cokluk Bokeoglu, Omay; Cikrikci, Rahime Nukhet – International Journal of Assessment Tools in Education, 2022

Item parameter drift (IPD) is the systematic differentiation of parameter values of items over time due to various reasons. If it occurs in computer adaptive tests (CAT), it causes errors in the estimation of item and ability parameters. Identification of the underlying conditions of this situation in CAT is important for estimating item and…

Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Error of Measurement

Two IRT Characteristic Curve Linking Methods Weighted by Information

Peer reviewed

Direct link

Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022

Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…

Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods

A Method to Increase the Power of Monte Carlo Method: Increasing the Number of Iteration

Peer reviewed
PDF on ERIC

Download full text

Koçak, Duygu – Pedagogical Research, 2020

Iteration number in Monte Carlo simulation method used commonly in educational research has an effect on Item Response Theory test and item parameters. The related studies show that the number of iteration is at the discretion of the researcher. Similarly, there is no specific number suggested for the number of iteration in the related literature.…

Descriptors: Monte Carlo Methods, Item Response Theory, Educational Research, Test Items

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Direct link

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Impact of Item Parameter Drift on Rasch Scale Stability in Small Samples over Multiple Administrations

Peer reviewed

Direct link

Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020

Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…

Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling

Comparing Small-Sample Equating with Angoff Judgement for Linking Cut-Scores on Two Tests

Download full text

Bramley, Tom – Research Matters, 2020

The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (using the Angoff method) versus the accuracy with a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…

Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Direct link

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4

Paek, Insu	3
Custer, Michael	2
Hambleton, Ronald K.	2
Kelecioglu, Hülya	2
Lee, Yi-Hsuan	2
de la Torre, Jimmy	2
Abulela, Mohammed A. A.	1
Ackerman, Terry A.	1
Ahn, Soyeon	1
Alhija, Fadia Nasser-Abu	1
Andersson, Björn	1
Ankenmann, Robert D.	1
Arikan, Çigdem Akin	1
Ayse Bilicioglu Gunes	1
Bayram Bicak	1
Bramley, Tom	1
Cai, Li	1
Cao, Mengyang	1
Carvajal, Jorge	1
Chalmers, Robert Philip	1
Chason, Walter M.	1
Cheng, Philip E.	1
Chernyshenko, Oleksandr S.	1
Cho, Sun-Joo	1
More ▼