Showing 1 to 15 of 32 results
Peer reviewed
Weicong Lyu; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Data harmonization is an emerging approach to strategically combining data from multiple independent studies, making it possible to address new research questions that no single contributing study can answer on its own. A fundamental psychometric challenge for data harmonization is to create commensurate measures for the constructs of interest across…
Descriptors: Data Analysis, Test Items, Psychometrics, Item Response Theory
Peer reviewed
Paek, Insu; Lin, Zhongtian; Chalmers, Robert Philip – Educational and Psychological Measurement, 2023
To reduce the chance of Heywood cases or nonconvergence when estimating the 2PL or 3PL model under marginal maximum likelihood with the expectation-maximization algorithm (MML-EM), priors can be placed on the item slope parameter in the 2PL model or on the pseudo-guessing parameter in the 3PL model, and the marginal maximum a posteriori…
Descriptors: Models, Item Response Theory, Test Items, Intervals
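For readers unfamiliar with how such a prior enters the estimation, the sketch below is a minimal illustration, not the authors' MML-EM code: it places a Beta prior on the 3PL pseudo-guessing parameter and finds the posterior mode (MAP) instead of the plain maximum likelihood estimate. Abilities are treated as known and all values are simulated assumptions.

```python
# Minimal sketch (assumed setup): MAP estimation of 3PL item parameters with
# a Beta prior on the pseudo-guessing parameter c. Thetas are treated as
# known; MML-EM would instead integrate them out, but the prior's role is
# the same in both cases.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)
n = 2000
theta = rng.normal(size=n)                      # examinee abilities (assumed known)
a_true, b_true, c_true = 1.2, 0.5, 0.20         # generating 3PL parameters

def p3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

y = rng.binomial(1, p3pl(theta, a_true, b_true, c_true))

def neg_log_posterior(params, prior_alpha=5.0, prior_beta=17.0):
    a, b, c = params
    if a <= 0 or not (0.0 < c < 1.0):
        return np.inf
    p = np.clip(p3pl(theta, a, b, c), 1e-9, 1 - 1e-9)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    logprior = beta_dist.logpdf(c, prior_alpha, prior_beta)  # pulls c toward ~0.2
    return -(loglik + logprior)

fit = minimize(neg_log_posterior, x0=[1.0, 0.0, 0.15], method="Nelder-Mead")
print("MAP estimates (a, b, c):", fit.x)
```

Without the log-prior term this reduces to ordinary maximum likelihood, which is where extreme or nonconvergent estimates of the guessing parameter tend to arise.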
Peer reviewed
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
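As a point of reference for the scoring methods compared above, the sketch below computes two of them for a single simulated examinee: number-correct scoring and IRT (EAP) theta scoring under a 2PL model. The item parameters and quadrature grid are illustrative assumptions, not the study's conditions.

```python
# Minimal sketch: number-correct score vs. EAP theta score for one examinee
# under an assumed 2PL model.
import numpy as np

a = np.array([1.0, 1.4, 0.8, 1.1, 0.9])        # discriminations (assumed)
b = np.array([-1.0, -0.3, 0.2, 0.8, 1.5])      # difficulties (assumed)
resp = np.array([1, 1, 1, 0, 0])               # one examinee's responses

# Number-correct score: the sum of item scores.
number_correct = resp.sum()

# EAP theta score: posterior mean over a standard-normal prior on a grid.
grid = np.linspace(-4, 4, 81)
prior = np.exp(-0.5 * grid**2)
p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))
like = np.prod(np.where(resp == 1, p, 1 - p), axis=1)
post = prior * like
eap_theta = np.sum(grid * post) / np.sum(post)

print("number-correct:", number_correct, " EAP theta:", round(eap_theta, 3))
```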
Peer reviewed
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Peer reviewed
Rujun Xu; James Soland – International Journal of Testing, 2024
International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…
Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries
Wang, Chun; Chen, Ping; Jiang, Shengyu – Grantee Submission, 2019
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait [theta] estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence questions remain as to how to…
Descriptors: Adaptive Testing, Test Items, Item Response Theory, Maximum Likelihood Statistics
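A minimal sketch of the "incomplete by design" point above, under assumed Rasch parameters rather than the paper's MST design: items an examinee is never routed to are simply left out of the theta likelihood, rather than being scored as incorrect.

```python
# Minimal sketch (assumed setup): ML theta estimation from only the items an
# examinee was routed to in a multistage test; unadministered responses are
# missing by design and excluded from the likelihood.
import numpy as np
from scipy.optimize import minimize_scalar

b = np.array([-1.5, -0.5, 0.0, 0.5, 1.5, 2.0])                    # Rasch difficulties (assumed pool)
administered = np.array([True, True, True, False, True, False])   # routed items
resp = np.array([1, 1, 0, -1, 0, -1])                             # -1 marks "not administered"

def neg_loglik(theta):
    p = 1.0 / (1.0 + np.exp(-(theta - b[administered])))
    y = resp[administered]
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

fit = minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded")
print("ML theta from administered items only:", round(fit.x, 3))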
Peer reviewed
Park, Sung Eun; Ahn, Soyeon; Zopluoglu, Cengiz – Educational and Psychological Measurement, 2021
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across…
Descriptors: Item Analysis, Effect Size, Difficulty Level, Monte Carlo Methods
Peer reviewed
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
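To make the two treatments mentioned above concrete, the sketch below (a simplified 2PL example with assumed parameters, not the study's data) computes an EAP theta once with not-reached items scored as incorrect and once with them treated as not administered.

```python
# Minimal sketch: EAP theta under two treatments of not-reached items --
# scored as incorrect vs. excluded from the likelihood. Parameters assumed.
import numpy as np

a = np.array([1.2, 1.0, 0.9, 1.1, 1.3])
b = np.array([-0.8, -0.2, 0.3, 0.9, 1.4])
resp = np.array([1.0, 1.0, 0.0, np.nan, np.nan])   # NaN = not reached

grid = np.linspace(-4, 4, 81)
prior = np.exp(-0.5 * grid**2)
p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))

def eap(responses):
    obs = ~np.isnan(responses)
    like = np.prod(np.where(responses[obs] == 1, p[:, obs], 1 - p[:, obs]), axis=1)
    post = prior * like
    return np.sum(grid * post) / np.sum(post)

theta_incorrect = eap(np.nan_to_num(resp, nan=0.0))  # not-reached scored 0
theta_ignored = eap(resp)                            # not-reached excluded
print("treated as incorrect:", round(theta_incorrect, 3),
      " treated as not administered:", round(theta_ignored, 3))
```

The gap between the two estimates grows with the proportion of not-reached items, which is the situation the study examines.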
Peer reviewed
Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021
Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…
Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics
Peer reviewed
Manna, Venessa F.; Gu, Lixiong – ETS Research Report Series, 2019
When using the Rasch model, equating with a nonequivalent groups anchor test design is commonly achieved by adjustment of new form item difficulty using an additive equating constant. Using simulated 5-year data, this report compares 4 approaches to calculating the equating constants and the subsequent impact on equating results. The 4 approaches…
Descriptors: Item Response Theory, Test Items, Test Construction, Sample Size
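The report compares four ways of computing the constant; the sketch below illustrates only the general idea under assumed difficulty values: take the mean shift between the reference-form and new-form calibrations of the anchor items, then add that constant to every new-form difficulty.

```python
# Minimal sketch: an additive Rasch equating constant from common anchor
# items under a nonequivalent-groups design. Difficulty values are assumed.
import numpy as np

b_anchor_ref = np.array([-1.2, -0.4, 0.1, 0.7, 1.3])     # anchors on the reference form
b_anchor_new = np.array([-1.0, -0.2, 0.3, 0.9, 1.6])     # same anchors on the new form
b_new_form = np.array([-1.5, -0.6, 0.0, 0.5, 1.1, 1.8])  # all new-form items

# Additive equating constant: mean shift between the two anchor calibrations.
constant = np.mean(b_anchor_ref - b_anchor_new)

# Place new-form difficulties on the reference scale.
b_new_on_ref = b_new_form + constant
print("equating constant:", round(constant, 3))
print("new-form difficulties on reference scale:", np.round(b_new_on_ref, 3))
```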
Peer reviewed
Cetin-Berber, Dee Duygu; Sari, Halil Ibrahim; Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2019
Routing examinees to modules based on their ability level is a very important aspect in computerized adaptive multistage testing. However, the presence of missing responses may complicate estimation of examinee ability, which may result in misrouting of individuals. Therefore, missing responses should be handled carefully. This study investigated…
Descriptors: Computer Assisted Testing, Adaptive Testing, Error of Measurement, Research Problems
Peer reviewed
DiStefano, Christine; McDaniel, Heather L.; Zhang, Liyun; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2019
A simulation study was conducted to investigate the model size effect when confirmatory factor analysis (CFA) models include many ordinal items. CFA models including between 15 and 120 ordinal items were analyzed with mean- and variance-adjusted weighted least squares to determine how varying sample size, number of ordered categories, and…
Descriptors: Factor Analysis, Effect Size, Data, Sample Size
Peer reviewed
Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014
The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…
Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation
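As background for what an item-parameter SE estimate is, the sketch below uses a deliberately simplified case (a single Rasch difficulty with abilities treated as known, simulated data): the SE is the inverse square root of the observed information at the estimate. The study itself concerns SE procedures under marginal (MML) estimation, which this does not reproduce.

```python
# Minimal sketch: SE of a Rasch item difficulty from the observed information
# (negative second derivative of the log-likelihood), with thetas assumed known.
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(size=1000)
b_true = 0.4
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(theta - b_true))))

# ML estimate of b by a crude grid search.
grid = np.linspace(-3, 3, 1201)
loglik = np.array([np.sum(y * (theta - b) - np.log1p(np.exp(theta - b))) for b in grid])
b_hat = grid[np.argmax(loglik)]

# Observed information at the estimate, and the resulting standard error.
p = 1.0 / (1.0 + np.exp(-(theta - b_hat)))
info = np.sum(p * (1.0 - p))
se_b = 1.0 / np.sqrt(info)
print("b_hat:", round(b_hat, 3), " SE(b_hat):", round(se_b, 4))
```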
Topczewski, Anna Marie – ProQuest LLC, 2013
Developmental score scales represent student performance along a continuum, such that students move higher along the continuum as they learn more. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method for creating developmental score scales. Research has shown that UIRT vertical scaling methods can be…
Descriptors: Item Response Theory, Scaling, Scores, Student Development
Wang, Wei – ProQuest LLC, 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests are often considered superior to tests containing only MC items, although the use of multiple item formats leads to measurement challenges in the context of equating conducted under…
Descriptors: Equated Scores, Test Format, Test Items, Test Length