Publication Date
In 2025 | 0 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 10 |
Since 2016 (last 10 years) | 22 |
Since 2006 (last 20 years) | 30 |
Descriptor
Accuracy | 30 |
Error of Measurement | 30 |
Test Items | 30 |
Item Response Theory | 18 |
Sample Size | 12 |
Simulation | 10 |
Comparative Analysis | 8 |
Computation | 8 |
Item Analysis | 8 |
Scores | 7 |
Difficulty Level | 6 |
Author
Custer, Michael | 2 |
Abdolvahab Khademi | 1 |
Aksu Dunya, Beyza | 1 |
AlGhamdi, Hannan M. | 1 |
Bolsinova, Maria | 1 |
Bramley, Tom | 1 |
Bulut, Okan | 1 |
Chengyu Cui | 1 |
Cho, Sun-Joo | 1 |
Chun Wang | 1 |
Craig S. Wells | 1 |
Publication Type
Reports - Research | 23 |
Journal Articles | 21 |
Dissertations/Theses -… | 5 |
Speeches/Meeting Papers | 2 |
Reports - Descriptive | 1 |
Reports - Evaluative | 1 |
Education Level
Secondary Education | 2 |
Early Childhood Education | 1 |
Elementary Education | 1 |
Elementary Secondary Education | 1 |
Grade 2 | 1 |
Grade 3 | 1 |
Grade 8 | 1 |
Higher Education | 1 |
Postsecondary Education | 1 |
Primary Education | 1 |
Location
Chile | 1 |
Saudi Arabia | 1 |
Turkey | 1 |
Assessments and Surveys
Big Five Inventory | 1 |
Cognitive Abilities Test | 1 |
Program for International… | 1 |
Trends in International… | 1 |
Abdolvahab Khademi; Craig S. Wells; Maria Elena Oliveri; Ester Villalonga-Olives – SAGE Open, 2023
The most common effect sizes when using a multiple-group confirmatory factor analysis approach to measurement invariance are ΔCFI and ΔTLI, with a cutoff value of 0.01. However, this recommended cutoff value may not be ubiquitously appropriate and may be of limited application for some tests (e.g., measures using dichotomous items or…
Descriptors: Factor Analysis, Factor Structure, Error of Measurement, Test Items
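The ΔCFI rule the abstract questions is simple to state: invariance is retained when model fit worsens by less than the cutoff as constraints are added. A minimal sketch in Python, with hypothetical fit values (the function name and numbers are illustrative, not from the study):

def delta_cfi_invariant(cfi_baseline, cfi_constrained, cutoff=0.01):
    # Retain invariance if CFI drops by less than the cutoff.
    return (cfi_baseline - cfi_constrained) < cutoff

# Hypothetical values: configural CFI 0.962, metric CFI 0.955 (drop = 0.007)
print(delta_cfi_invariant(0.962, 0.955))  # True -> metric invariance retained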
Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
Custer, Michael; Kim, Jongpil – Online Submission, 2023
This study uses an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision under Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Battelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…
Descriptors: Sample Size, Item Response Theory, Test Items, Computation
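For reference, Masters' Partial Credit Model gives the probability that person j with ability \theta_j earns score x on polytomous item i with step difficulties \delta_{ik} as

P(X_{ij} = x \mid \theta_j) = \frac{\exp \sum_{k=1}^{x} (\theta_j - \delta_{ik})}{1 + \sum_{h=1}^{m_i} \exp \sum_{k=1}^{h} (\theta_j - \delta_{ik})}, \qquad x = 0, 1, \ldots, m_i,

with the empty sum for x = 0 defined as zero. How precisely the \delta_{ik} can be estimated at a given sample size is the diminishing-returns relationship the study examines.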
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Wang, Shaojie; Zhang, Minqiang; Lee, Won-Chan; Huang, Feifei; Li, Zonglong; Li, Yixing; Yu, Sufang – Journal of Educational Measurement, 2022
Traditional IRT characteristic curve linking methods ignore parameter estimation errors, which may undermine the accuracy of estimated linking constants. Two new linking methods are proposed that take into account parameter estimation errors. The item- (IWCC) and test-information-weighted characteristic curve (TWCC) methods employ weighting…
Descriptors: Item Response Theory, Error of Measurement, Accuracy, Monte Carlo Methods
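Both of the entries above concern the same family of methods. In outline (a hedged sketch; the exact weighting in the cited papers may differ), the linking constants A and B minimize a Haebara-style criterion in which each squared gap between characteristic curves is weighted by the item information (IWCC) or test information (TWCC) attached to the estimates:

F(A, B) = \sum_{j} \sum_{i} w_i(\theta_j) \left[ P_i(\theta_j;\, \hat a_i / A,\, A \hat b_i + B) - P_i(\theta_j;\, \hat a_i^{*},\, \hat b_i^{*}) \right]^2,

so that parameters estimated with more information, and hence smaller error, contribute more to the criterion.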
Zebing Wu – ProQuest LLC, 2024
Response style, a common aberrancy in non-cognitive psychological assessments, is problematic because it yields inaccurate estimates of item and person parameters, which in turn leads to serious reliability, validity, and fairness issues (Baumgartner & Steenkamp, 2001; Bolt & Johnson, 2009; Bolt & Newton, 2011). Response style refers to…
Descriptors: Response Style (Tests), Accuracy, Preferences, Psychological Testing
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
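The Rasch model at issue has a single difficulty parameter per item,

P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)},

which is why calibration samples as small as the n = 25 cited above can still yield usable estimates of b_i.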
Rachel A. Gross – ProQuest LLC, 2020
The present study was motivated by the theory-method mismatch between heterotypic continuity (aspects of development that manifest differently across the lifespan and thus cannot be measured the same way over time) and longitudinal measurement equivalence (the statistical assumption that the developmental phenomenon studied is measured on the same…
Descriptors: Robustness (Statistics), Structural Equation Models, Longitudinal Studies, Error of Measurement
Bramley, Tom – Research Matters, 2020
The aim of this study was to compare, by simulation, the accuracy of mapping a cut-score from one test to another by expert judgement (the Angoff method) versus by a small-sample equating method (chained linear equating). As expected, the standard-setting method resulted in more accurate equating when we assumed a higher level…
Descriptors: Cutting Scores, Standard Setting (Scoring), Equated Scores, Accuracy
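In its simplest form, the chained linear equating used as the comparison maps a score x on test X to test Y through an anchor test V, with one linear link estimated per group:

v(x) = \mu_{V_1} + \frac{\sigma_{V_1}}{\sigma_{X_1}} (x - \mu_{X_1}), \qquad y(x) = \mu_{Y_2} + \frac{\sigma_{Y_2}}{\sigma_{V_2}} \bigl( v(x) - \mu_{V_2} \bigr),

where subscripts 1 and 2 index the two examinee groups; the cut-score is then carried through both links.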
Chengyu Cui; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap,…
Descriptors: Item Response Theory, Accuracy, Simulation, Psychometrics
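For orientation, the dichotomous MIRT model for which efficient estimators already exist is typically the multidimensional 2PL,

P(X_{ij} = 1 \mid \boldsymbol{\theta}_j) = \frac{1}{1 + \exp[-(\mathbf{a}_i^{\top} \boldsymbol{\theta}_j + d_i)]},

where \boldsymbol{\theta}_j is a vector of latent traits; polytomous models replace this single response function with a set of category response functions, which is what complicates estimation.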
Pei-Hsuan Chiu – ProQuest LLC, 2018
Evidence of student growth is a primary outcome of interest for educational accountability systems. When three or more years of student test data are available, questions around how students grow and what their predicted growth is can be answered. Given that test scores contain measurement error, this error should be considered in growth and…
Descriptors: Bayesian Statistics, Scores, Error of Measurement, Growth Models
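One standard way to fold known score-level error into a growth model, consistent with the problem the abstract poses (a hedged sketch, not necessarily the dissertation's specification), treats each observed score as a latent true score plus an error term whose variance comes from the reported (conditional) standard error of measurement s_{it}:

y_{it} = \tau_{it} + \epsilon_{it}, \quad \epsilon_{it} \sim N(0, s_{it}^2), \qquad \tau_{it} = \beta_{0i} + \beta_{1i} t,

with priors placed on the growth coefficients \beta_{0i} and \beta_{1i} in the Bayesian formulation.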
Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2020
This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM), which is gaining attention in the field of educational and psychological measurement and large-scale assessments. In previous work on DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM…
Descriptors: Item Response Theory, Scoring, True Scores, Scaling
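The LRM baseline the abstract refers to amounts to regressing each binary item response on examinees' D-scores. A minimal sketch in Python with simulated data (all values hypothetical):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = rng.uniform(0, 1, 500)               # D-scores on [0, 1]
p_true = 1 / (1 + np.exp(-(6 * d - 3)))  # hypothetical true IRF
x = rng.binomial(1, p_true)              # binary item responses

lrm = LogisticRegression().fit(d.reshape(-1, 1), x)
print(lrm.predict_proba([[0.5]])[0, 1])  # estimated P(correct | D = 0.5)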
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and may leave some items unanswered for various reasons (e.g., lack of test-taking motivation, poor time management, or test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
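The two conventional treatments the abstract contrasts are easy to see on a single response vector (a minimal sketch; NaN marks not-reached items):

import numpy as np

resp = np.array([1.0, 0.0, 1.0, np.nan, np.nan])  # last two items not reached

as_incorrect = np.nan_to_num(resp, nan=0.0).mean()  # scored over all items
as_not_administered = np.nanmean(resp)              # scored over reached items only
print(as_incorrect, as_not_administered)            # 0.4 vs 0.666...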
Tsaousis, Ioannis; Sideridis, Georgios D.; AlGhamdi, Hannan M. – Journal of Psychoeducational Assessment, 2021
This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using a simulation study protocol put forth by Han (2018a). For the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Cognitive Tests, Cognitive Ability
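A CAT simulation of this kind reduces to a select-administer-update loop. A minimal sketch under a Rasch pool with maximum-information selection (the 165-item pool size follows the abstract; everything else, including the selection rule, is an illustrative assumption rather than the GCAT's actual design):

import numpy as np

rng = np.random.default_rng(1)
b = rng.normal(0, 1, 165)              # difficulties for a 165-item pool
theta_true = 0.5                       # simulee's true ability
grid = np.linspace(-4, 4, 161)         # grid for the posterior over theta
post = np.exp(-0.5 * grid**2)          # standard-normal prior
administered = []

for _ in range(20):                    # fixed-length 20-item adaptive test
    theta_hat = grid[np.argmax(post)]
    p = 1 / (1 + np.exp(-(theta_hat - b)))
    info = p * (1 - p)                 # Rasch item information at theta_hat
    info[administered] = -1.0          # exclude already-used items
    j = int(np.argmax(info))           # maximum-information selection
    administered.append(j)
    pj = 1 / (1 + np.exp(-(theta_true - b[j])))
    xj = rng.binomial(1, pj)           # simulated response
    pg = 1 / (1 + np.exp(-(grid - b[j])))
    post = post * (pg if xj else 1 - pg)  # posterior update over the grid

print(grid[np.argmax(post)])           # final ability estimate (MAP)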
Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021
Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…
Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics