ERIC - Search Results

Showing 1 to 15 of 27 results

Investigation of a Multistage Adaptive Test Based on Test Assembly Methods

Peer reviewed

Ebru Dogruöz; Hülya Kelecioglu – International Journal of Assessment Tools in Education, 2024

In this research, multistage adaptive tests (MST) were compared according to sample size, panel pattern and module length for top-down and bottom-up test assembly methods. Within the scope of the research, data from PISA 2015 were used and simulation studies were conducted according to the parameters estimated from these data. Analysis results for…

Descriptors: Adaptive Testing, Test Construction, Foreign Countries, Achievement Tests

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Detection and Treatment of Careless Responses to Improve Item Parameter Estimation

Peer reviewed

Patton, Jeffrey M.; Cheng, Ying; Hong, Maxwell; Diao, Qi – Journal of Educational and Behavioral Statistics, 2019

In psychological and survey research, the prevalence and serious consequences of careless responses from unmotivated participants are well known. In this study, we propose to iteratively detect careless responders and cleanse the data by removing their responses. The careless responders are detected using person-fit statistics. In two simulation…

Descriptors: Test Items, Response Style (Tests), Identification, Computation

Evaluating the Impact of Guessing and Its Interactions with Other Test Characteristics on Confidence Interval Procedures for Coefficient Alpha

Peer reviewed

Paek, Insu – Educational and Psychological Measurement, 2016

The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…

Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length

Evaluating the Consistency of Angoff-Based Cut Scores Using Subsets of Items within a Generalizability Theory Framework

Peer reviewed

Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015

The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…

Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items

Accuracy and Variability of Item Parameter Estimates from Marginal Maximum a Posteriori Estimation and Bayesian Inference via Gibbs Samplers

Wu, Yi-Fang – ProQuest LLC, 2015

Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…

Descriptors: Item Response Theory, Test Items, Accuracy, Computation

Minimum Sample Size Requirements for Mokken Scale Analysis

Peer reviewed

Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014

An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…

Descriptors: Sampling, Test Items, Effect Size, Scaling

Bi-Factor Multidimensional Item Response Theory Modeling for Subscores Estimation, Reliability, and Classification

Md Desa, Zairul Nor Deana – ProQuest LLC, 2012

In recent years, there has been increasing interest in estimating and improving subscore reliability. In this study, multidimensional item response theory (MIRT) and the bi-factor model were combined to estimate subscores and to obtain subscore reliability and subscore classification. Both the compensatory and partially compensatory MIRT…

Descriptors: Item Response Theory, Computation, Reliability, Classification

Conditions Affecting the Accuracy of Classical Equating Methods for Small Samples under the NEAT Design: A Simulation Study

Sunnassee, Devdass – ProQuest LLC, 2011

Small sample equating remains a largely unexplored area of research. This study attempts to fill in some of the research gaps via a large-scale, IRT-based simulation study that evaluates the performance of seven small-sample equating methods under various test characteristic and sampling conditions. The equating methods considered are typically…

Descriptors: Test Length, Test Format, Sample Size, Simulation

Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

Peer reviewed

Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011

The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability

Improving IRT Parameter Estimates with Small Sample Sizes: Evaluating the Efficacy of a New Data Augmentation Technique

Foley, Brett Patrick – ProQuest LLC, 2010

The 3PL model is a flexible and widely used tool in assessment. However, it suffers from limitations due to its need for large sample sizes. This study introduces and evaluates the efficacy of a new sample size augmentation technique called Duplicate, Erase, and Replace (DupER) Augmentation through a simulation study. Data are augmented using…

Descriptors: Test Length, Sample Size, Simulation, Item Response Theory

Comparability of Examinee Proficiency Scores on Computer Adaptive Tests Using Real and Simulated Data

Evans, Josiah Jeremiah – ProQuest LLC, 2010

In measurement research, data simulations are a commonly used analytical technique. While simulation designs have many benefits, it is unclear if these artificially generated datasets are able to accurately capture real examinee item response behaviors. This potential lack of comparability may have important implications for administration of…

Descriptors: Computer Assisted Testing, Adaptive Testing, Educational Testing, Admission (School)

Evaluation of Methods to Compute Complex Sample Standard Errors in Latent Regression Models. Research Report. ETS RR-09-49

Peer reviewed

Oranje, Andreas; Li, Deping; Kandathil, Mathew – ETS Research Report Series, 2009

Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…

Descriptors: Error of Measurement, Computation, Regression (Statistics), National Competency Tests

Sacrificing Reliability and Exalting Sampling Error at the Altar of Parsimony: Some Cautions Concerning Short-Form Test Development.

Henson, Robin K. – 2000

The purpose of this paper is to highlight some psychometric cautions that should be observed when seeking to develop short form versions of tests. Several points are made: (1) score reliability is impacted directly by the characteristics of the sample and testing conditions; (2) sampling error has a direct influence on reliability and factor…

Descriptors: Factor Structure, Psychometrics, Reliability, Sampling

Estimating the Sampling Variance of Correlation Corrected for Attenuation Using Coefficient Alpha.

Peer reviewed

Mayer, John D. – Perceptual and Motor Skills, 1983

Kelly's formula estimates sampling variance of correlation corrected for attenuation by using split-half reliabilities. In some cases, coefficient alpha estimate of reliability is preferable. A simulation study suggests a variation of Kelly's formula can be used appropriately with coefficient alpha. Kelly's formula is modified to accept…

Descriptors: Correlation, Measurement Techniques, Reliability, Sampling
