ERIC - Search Results

Showing 1 to 15 of 29 results

Identification of Differential Item Functioning in Assessment Booklet Designs with Structurally Missing Data

Peer reviewed

Goodman, Joshua T.; Willse, John T.; Allen, Nancy L.; Klaric, John S. – Educational and Psychological Measurement, 2011

The Mantel-Haenszel procedure is a popular technique for determining items that may exhibit differential item functioning (DIF). Numerous studies have focused on the strengths and weaknesses of this procedure, but few have focused on the performance of the Mantel-Haenszel method when structurally missing data are present as a result of test booklet…

Descriptors: Test Bias, Identification, Tests, Test Length

Checking Dimensionality in Item Response Models with Principal Component Analysis on Standardized Residuals

Peer reviewed

Chou, Yeh-Tai; Wang, Wen-Chung – Educational and Psychological Measurement, 2010

Dimensionality is an important assumption in item response theory (IRT). Principal component analysis on standardized residuals has been used to check dimensionality, especially under the family of Rasch models. It has been suggested that a first eigenvalue greater than 1.5 signifies a violation of unidimensionality when there…

Descriptors: Test Length, Sample Size, Correlation, Item Response Theory

Marginal Maximum A Posteriori Item Parameter Estimation for the Generalized Graded Unfolding Model

Peer reviewed

Roberts, James S.; Thompson, Vanessa M. – Applied Psychological Measurement, 2011

A marginal maximum a posteriori (MMAP) procedure was implemented to estimate item parameters in the generalized graded unfolding model (GGUM). Estimates from the MMAP method were compared with those derived from marginal maximum likelihood (MML) and Markov chain Monte Carlo (MCMC) procedures in a recovery simulation that varied sample size,…

Descriptors: Statistical Analysis, Markov Processes, Computation, Monte Carlo Methods

Formulation of a DIMTEST Effect Size Measure (DESM) and Evaluation of the DESM Estimator Bias

Peer reviewed

Seo, Minhee; Roussos, Louis A. – Journal of Educational Measurement, 2010

DIMTEST is a widely used and studied method for testing the hypothesis of test unidimensionality as represented by local item independence. However, DIMTEST does not report the amount of multidimensionality that exists in data when rejecting its null. To provide more information regarding the degree to which data depart from unidimensionality, a…

Descriptors: Effect Size, Statistical Bias, Computation, Test Length

On the Use of Nonparametric Item Characteristic Curve Estimation Techniques for Checking Parametric Model Fit

Peer reviewed

Lee, Young-Sun; Wollack, James A.; Douglas, Jeffrey – Educational and Psychological Measurement, 2009

The purpose of this study was to assess the fit of the two-parameter logistic (2PL) model through comparison with nonparametric item characteristic curve (ICC) estimation procedures. Results indicate that the three nonparametric procedures implemented produced ICCs similar to those of the 2PL for items simulated to fit the 2PL. However, for misfitting items,…

Descriptors: Nonparametric Statistics, Item Response Theory, Test Items, Simulation

Ramsay-Curve Item Response Theory for the Three-Parameter Logistic Item Response Model

Peer reviewed

Woods, Carol M. – Applied Psychological Measurement, 2008

In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…

Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics

Bias of Exploratory and Cross-Validated DETECT Index under Unidimensionality

Peer reviewed

Monahan, Patrick O.; Stump, Timothy E.; Finch, Holmes; Hambleton, Ronald K. – Applied Psychological Measurement, 2007

DETECT is a nonparametric "full" dimensionality assessment procedure that clusters dichotomously scored items into dimensions and provides a DETECT index of magnitude of multidimensionality. Four factors (test length, sample size, item response theory [IRT] model, and DETECT index) were manipulated in a Monte Carlo study of bias, standard error,…

Descriptors: Test Length, Sample Size, Monte Carlo Methods, Geometric Concepts

Assessing the Dimensionality of Item Response Matrices with Small Sample Sizes and Short Test Lengths.

Peer reviewed

De Champlain, Andre; Gessaroli, Marc E. – Applied Measurement in Education, 1998

Type I error rates and rejection rates for three dimensionality assessment procedures were studied with data sets simulated to reflect short tests and small samples. Results show that the G-squared difference test (D. Bock, R. Gibbons, and E. Muraki, 1988) suffered from a severely inflated Type I error rate under all conditions simulated. (SLD)

Descriptors: Item Response Theory, Matrices, Sample Size, Simulation

Simultaneous Use of Multiple Answer Copying Indexes to Improve Detection Rates

Peer reviewed

Wollack, James A. – Applied Measurement in Education, 2006

Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…

Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size

Assessing the Dimensionality of Polytomous Item Responses with Small Sample Sizes and Short Test Lengths: A Comparison of Procedures.

De Champlain, Andre F.; Gessaroli, Marc E.; Tang, K. Linda; De Champlain, Judy E. – 1998

The empirical Type I error rates of Poly-DIMTEST (H. Li and W. Stout, 1995) and the LISREL8 chi square fit statistic (K. Joreskog and D. Sorbom, 1993) were compared with polytomous unidimensional data sets simulated to vary as a function of test length and sample size. The rejection rates for both statistics were also studied with two-dimensional…

Descriptors: Chi Square, Goodness of Fit, Item Response Theory, Sample Size

The Influence of Multidimensionality on the Graded Response Model.

Peer reviewed

De Ayala, R. J. – Applied Psychological Measurement, 1994

Previous work on the effects of dimensionality on parameter estimation for dichotomous models is extended to the graded response model. Datasets are generated that differ in the number of latent factors as well as their interdimensional association, number of test items, and sample size. (SLD)

Descriptors: Estimation (Mathematics), Item Response Theory, Maximum Likelihood Statistics, Sample Size

The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items.

Peer reviewed

Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001

Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least 8 6-point…

Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability

A Comparison of Item Parameter Estimates and ICCs Produced with TESTGRAF and BILOG under Different Test Lengths and Sample Sizes.

Patsula, Liane N.; Gessaroli, Marc E. – 1995

Among the most popular techniques used to estimate item response theory (IRT) parameters are those used in the LOGIST and BILOG computer programs. Because of its accuracy with smaller sample sizes or differing test lengths, BILOG has become the standard to which new estimation programs are compared. However, BILOG is still complex and…

Descriptors: Comparative Analysis, Effect Size, Estimation (Mathematics), Item Response Theory

A Comparison of Logistic Regression and Analysis of Variance Differential Item Functioning Decision Methods.

Peer reviewed

Whitmore, Marjorie L.; Schumacker, Randall E. – Educational and Psychological Measurement, 1999

Compared differential item functioning detection rates for logistic regression and analysis of variance for dichotomously scored items using simulated data and varying test length, sample size, discrimination rate, and underlying ability. Explains why the logistic regression method is recommended for most applications. (SLD)

Descriptors: Ability, Analysis of Variance, Comparative Analysis, Item Bias

Assessing the Dimensionality of Item Response Matrices Using a Goodness-of-Fit Index Based on Noncentrality.

De Champlain, Andre – 1996

The usefulness of a goodness-of-fit index proposed by R. P. McDonald (1989) was investigated with regard to assessing the dimensionality of item response matrices. The m_k index, which is based on an estimate of the noncentrality parameter of the noncentral chi-square distribution, possesses several advantages over traditional tests of…

Descriptors: Chi Square, Cutting Scores, Goodness of Fit, Item Response Theory
