Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 8 |
Since 2016 (last 10 years) | 17 |
Since 2006 (last 20 years) | 46 |
Descriptor
Error of Measurement | 59 |
Statistical Analysis | 59 |
Test Items | 59 |
Item Response Theory | 25 |
Test Bias | 17 |
Difficulty Level | 13 |
Simulation | 13 |
Comparative Analysis | 12 |
Equated Scores | 11 |
Goodness of Fit | 11 |
Item Analysis | 10 |
Author
Alonzo, Julie | 3 |
Tindal, Gerald | 3 |
DeMars, Christine E. | 2 |
Feigenbaum, Miriam | 2 |
Gómez-Benito, Juana | 2 |
Han, Kyung T. | 2 |
Holland, Paul W. | 2 |
Liu, Jinghua | 2 |
Livingston, Samuel A. | 2 |
Sinharay, Sandip | 2 |
Wang, Wen-Chung | 2 |
Publication Type
Journal Articles | 42 |
Reports - Research | 40 |
Reports - Evaluative | 12 |
Dissertations/Theses -… | 5 |
Speeches/Meeting Papers | 5 |
Numerical/Quantitative Data | 4 |
Reports - Descriptive | 3 |
Audience
Researchers | 2 |
Location
Germany | 2 |
Japan | 2 |
Turkey | 2 |
Austria | 1 |
Belgium | 1 |
Canada | 1 |
Kuwait | 1 |
Luxembourg | 1 |
Maryland | 1 |
Singapore | 1 |
South Africa | 1 |
Raykov, Tenko; Marcoulides, George A.; Pusic, Martin – Measurement: Interdisciplinary Research and Perspectives, 2021
An interval estimation procedure is discussed that can be used to evaluate the probability of a particular response for a binary or binary scored item at a pre-specified point along an underlying latent continuum. The item is assumed to: (a) be part of a unidimensional multi-component measuring instrument that may also contain polytomous items,…
Descriptors: Item Response Theory, Computation, Probability, Test Items
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Wang, Xi; Liu, Yang – Journal of Educational and Behavioral Statistics, 2020
In continuous testing programs, some items are repeatedly used across test administrations, and statistical methods are often used to evaluate whether items become compromised due to examinees' preknowledge. In this study, we proposed a residual method to detect compromised items when a test can be partitioned into two subsets of items: secure…
Descriptors: Test Items, Information Security, Error of Measurement, Cheating
Akin-Arikan, Çigdem; Gelbal, Selahattin – Eurasian Journal of Educational Research, 2021
Purpose: This study aims to compare the performances of Item Response Theory (IRT) equating and kernel equating (KE) methods based on equating errors (RMSD) and standard error of equating (SEE) using the anchor item nonequivalent groups design. Method: Within this scope, a set of conditions, including ability distribution, type of anchor items…
Descriptors: Equated Scores, Item Response Theory, Test Items, Statistical Analysis
Haimiao Yuan – ProQuest LLC, 2022
The application of diagnostic classification models (DCMs) in the field of educational measurement has been getting more attention in recent years. To make a valid inference from the model, it is important to ensure that the model fits the data. The purpose of the present study was to investigate the performance of the limited information…
Descriptors: Goodness of Fit, Educational Assessment, Educational Diagnosis, Models
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
Karina Mostert; Clarisse van Rensburg; Reitumetse Machaba – Journal of Applied Research in Higher Education, 2024
Purpose: This study examined the psychometric properties of intention to drop out and study satisfaction measures for first-year South African students. The factorial validity, item bias, measurement invariance and reliability were tested. Design/methodology/approach: A cross-sectional design was used. For the study on intention to drop out, 1,820…
Descriptors: Intention, Potential Dropouts, Student Satisfaction, Test Items
Altintas, Ozge; Wallin, Gabriel – International Journal of Assessment Tools in Education, 2021
Educational assessment tests are designed to measure the same psychological constructs over extended periods. This feature is important considering that test results are often used for admittance to university programs. To ensure fair assessments, especially for those whose results weigh heavily in selection decisions, it is necessary to collect…
Descriptors: College Admission, College Entrance Examinations, Test Bias, Equated Scores
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
Jinjin Huang – ProQuest LLC, 2020
Measurement invariance is crucial for an effective and valid measure of a construct. Invariance holds when the latent trait varies consistently across subgroups; in other words, the mean differences among subgroups are only due to true latent ability differences. Differential item functioning (DIF) occurs when measurement invariance is violated.…
Descriptors: Robustness (Statistics), Item Response Theory, Test Items, Item Analysis
Yanan Feng – ProQuest LLC, 2021
This dissertation aims to investigate the effect size measures of differential item functioning (DIF) detection in the context of cognitive diagnostic models (CDMs). A variety of DIF detection techniques have been developed in the context of CDMs. However, most of the DIF detection procedures focus on the null hypothesis significance test. Few…
Descriptors: Effect Size, Item Response Theory, Cognitive Measurement, Models
Cao, Mengyang; Tay, Louis; Liu, Yaowu – Educational and Psychological Measurement, 2017
This study examined the performance of a proposed iterative Wald approach for detecting differential item functioning (DIF) between two groups when preknowledge of anchor items is absent. The iterative approach utilizes the Wald-2 approach to identify anchor items and then iteratively tests for DIF items with the Wald-1 approach. Monte Carlo…
Descriptors: Monte Carlo Methods, Test Items, Test Bias, Error of Measurement
Hidalgo, Ma Dolores; Benítez, Isabel; Padilla, Jose-Luis; Gómez-Benito, Juana – Sociological Methods & Research, 2017
The growing use of scales in survey questionnaires makes it necessary to address how polytomous differential item functioning (DIF) affects observed scale score comparisons. The aim of this study is to investigate the impact of DIF on the type I error and effect size of the independent samples t-test on the observed total scale scores. A…
Descriptors: Test Items, Test Bias, Item Response Theory, Surveys
Li, Feifei – ETS Research Report Series, 2017
An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…
Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement
Chalmers, R. Philip; Counsell, Alyssa; Flora, David B. – Educational and Psychological Measurement, 2016
Differential test functioning, or DTF, occurs when one or more items in a test demonstrate differential item functioning (DIF) and the aggregate of these effects is witnessed at the test level. In many applications, DTF can be more important than DIF when the overall effects of DIF at the test level can be quantified. However, optimal statistical…
Descriptors: Test Bias, Sampling, Test Items, Statistical Analysis