Showing all 13 results
Peer reviewed
Dubravka Svetina Valdivia; Shenghai Dai – Journal of Experimental Education, 2024
Applications of polytomous IRT models in applied fields (e.g., health, education, psychology) abound. However, little is known about the impact of the number of categories and the sample size requirements for precise parameter recovery. In a simulation study, we investigated the impact of the number of response categories and required sample size…
Descriptors: Item Response Theory, Sample Size, Models, Classification
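This snippet does not say which polytomous model or simulation conditions were used. As a rough sketch of the kind of design described, the Python code below generates graded response model (GRM) data while crossing sample size with the number of response categories; the model choice, item count, and parameter ranges are illustrative assumptions, not the study's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_grm(n_persons, n_items, n_categories):
    """Simulate graded response model (GRM) data; all parameter values are arbitrary."""
    theta = rng.normal(0.0, 1.0, n_persons)                # latent traits
    a = rng.uniform(0.8, 2.0, n_items)                     # discriminations
    # Ordered thresholds per item (n_categories - 1 of them)
    b = np.sort(rng.normal(0.0, 1.0, (n_items, n_categories - 1)), axis=1)

    data = np.empty((n_persons, n_items), dtype=int)
    for j in range(n_items):
        # Cumulative probabilities P(X >= k | theta) for k = 1..K-1
        p_star = 1.0 / (1.0 + np.exp(-a[j] * (theta[:, None] - b[j][None, :])))
        # Category probabilities: P(X = k) = P*(k) - P*(k+1), with P*(0)=1, P*(K)=0
        upper = np.hstack([np.ones((n_persons, 1)), p_star])
        lower = np.hstack([p_star, np.zeros((n_persons, 1))])
        probs = upper - lower
        # Draw one response per person from the category probabilities
        cum = probs.cumsum(axis=1)
        u = rng.random(n_persons)[:, None]
        data[:, j] = (u > cum).sum(axis=1)
    return data

# Crossing sample sizes with numbers of response categories, as in the design described
for n in (250, 500, 1000):
    for k in (3, 5, 7):
        responses = simulate_grm(n_persons=n, n_items=20, n_categories=k)
        print(n, k, responses.shape, responses.max())
```

In a full recovery study, each simulated data set would then be fed to an IRT estimation routine and the recovered item parameters compared with the generating values.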
Peer reviewed
Braun, Virginia; Clarke, Victoria; Boulton, Elicia; Davey, Louise; McEvoy, Charlotte – International Journal of Social Research Methodology, 2021
Fully "qualitative" surveys, which prioritise qualitative research values, and harness the rich potential of qualitative data, have much to offer qualitative researchers, especially given online delivery options. Yet the method remains underutilised, and there is little in the way of methodological discussion of qualitative surveys.…
Descriptors: Online Surveys, Qualitative Research, Social Science Research, Disclosure
Peer reviewed
PDF on ERIC
Manna, Venessa F.; Gu, Lixiong – ETS Research Report Series, 2019
When using the Rasch model, equating with a nonequivalent groups anchor test design is commonly achieved by adjustment of new form item difficulty using an additive equating constant. Using simulated 5-year data, this report compares 4 approaches to calculating the equating constants and the subsequent impact on equating results. The 4 approaches…
Descriptors: Item Response Theory, Test Items, Test Construction, Sample Size
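The report compares four ways of computing the additive constant, which are not named in this snippet. As a minimal sketch of the most familiar variant, the constant below is taken as the mean difference of anchor-item difficulty estimates between the reference and new calibrations; all item difficulty values are invented for illustration.

```python
import numpy as np

# Rasch difficulty estimates for the common (anchor) items, made-up values
b_anchor_ref = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])   # reference-form calibration
b_anchor_new = np.array([-1.0, -0.1, 0.3, 1.1, 1.7])   # new-form calibration

# Additive equating constant: the mean shift that places the new form
# on the reference-form scale (mean-difference approach)
c = (b_anchor_ref - b_anchor_new).mean()

# Adjust all new-form item difficulties by the constant
b_new_form = np.array([-2.0, -0.5, 0.0, 0.6, 1.3, 2.1])
b_new_on_ref_scale = b_new_form + c
print(round(c, 3), b_new_on_ref_scale)
```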
Peer reviewed
PDF on ERIC
Sahin, Alper; Anil, Duygu – Educational Sciences: Theory and Practice, 2017
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test composed of 50 items was administered to 6,288 students. Data from this test were used to obtain data sets of…
Descriptors: Test Length, Sample Size, Item Response Theory, Test Construction
Peer reviewed
Straat, J. Hendrik; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2014
An automated item selection procedure in Mokken scale analysis partitions a set of items into one or more Mokken scales, if the data allow. Two algorithms are available that pursue the same goal of selecting Mokken scales of maximum length: Mokken's original automated item selection procedure (AISP) and a genetic algorithm (GA). Minimum…
Descriptors: Sampling, Test Items, Effect Size, Scaling
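Both AISP and the genetic algorithm grow scales from item-pair scalability coefficients, commonly written H_ij and defined through observed versus expected Guttman errors. The sketch below only computes those pairwise coefficients for dichotomous data; it is not an implementation of either selection algorithm, and the simulated data are invented.

```python
import numpy as np

def pairwise_h(X):
    """Item-pair scalability coefficients H_ij for dichotomous data X (persons x items).

    H_ij = 1 - F_ij / E_ij, where F_ij counts observed Guttman errors
    (harder item passed while the easier item is failed) and E_ij is the
    count expected under marginal independence.
    """
    n, k = X.shape
    p = X.mean(axis=0)                      # item popularities
    H = np.full((k, k), np.nan)
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            easy, hard = (i, j) if p[i] >= p[j] else (j, i)
            f_obs = np.sum((X[:, easy] == 0) & (X[:, hard] == 1))
            f_exp = n * (1 - p[easy]) * p[hard]
            H[i, j] = 1 - f_obs / f_exp
    return H

# Tiny made-up example: Rasch-like data for four items
rng = np.random.default_rng(1)
theta = rng.normal(size=500)
b = np.array([-1.0, 0.0, 1.0, 0.5])
X = (rng.random((500, 4)) < 1 / (1 + np.exp(-(theta[:, None] - b)))).astype(int)
print(np.round(pairwise_h(X), 2))
```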
Peer reviewed
Fitzpatrick, Anne R.; Yen, Wendy M. – Applied Measurement in Education, 2001
Examined the effects of test length and sample size on the alternate forms reliability and equating of simulated mathematics tests composed of constructed-response items scaled using the two-parameter partial credit model. Results suggest that, to obtain acceptable reliabilities and accurate equated scores, tests should have at least eight 6-point…
Descriptors: Constructed Response, Equated Scores, Mathematics Tests, Reliability
Ang, Cheng; Miller, M. David – 1993
The power of the procedure of W. Stout to detect deviations from essential unidimensionality in two-dimensional data was investigated for minor, moderate, and large deviations from unidimensionality using criteria for deviations from unidimensionality based on prior research. Test lengths of 20 and 40 items and sample sizes of 700 and 1,500 were…
Descriptors: Ability, Comparative Testing, Correlation, Item Response Theory
Peer reviewed
Hambleton, Ronald K.; Jones, Russell W. – Applied Measurement in Education, 1994
The impact of capitalizing on chance in item selection on the accuracy of test information functions was studied through simulation, focusing on examinee sample size in item calibration and the ratio of item bank size to test length. (SLD)
Descriptors: Computer Simulation, Estimation (Mathematics), Item Banks, Item Response Theory
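Capitalization on chance here refers to selecting items on their estimated parameters, so that estimation error inflates the apparent information of the chosen items. The sketch below illustrates the effect under an assumed 2PL model with an invented item bank and invented calibration-error levels; the abstract does not state which model or conditions were actually simulated.

```python
import numpy as np

rng = np.random.default_rng(2)

def info_2pl(a, b, theta=0.0):
    """Item information at theta under a 2PL model: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

# A bank of "true" item parameters (invented)
n_bank, test_len = 400, 40
a_true = rng.lognormal(mean=0.0, sigma=0.3, size=n_bank)
b_true = rng.normal(0.0, 1.0, n_bank)

# Calibration error grows as the calibration sample shrinks (illustrative SEs)
for se in (0.05, 0.15, 0.30):
    a_est = np.clip(a_true + rng.normal(0.0, se, n_bank), 0.05, None)  # keep slopes positive
    b_est = b_true + rng.normal(0.0, se, n_bank)

    # Pick the items that *appear* most informative at theta = 0
    chosen = np.argsort(info_2pl(a_est, b_est))[-test_len:]

    apparent = info_2pl(a_est[chosen], b_est[chosen]).sum()  # estimated test information
    actual = info_2pl(a_true[chosen], b_true[chosen]).sum()  # true information of same items
    print(f"SE={se:.2f}  apparent={apparent:6.1f}  actual={actual:6.1f}")
```

The apparent test information exceeds the true information of the selected items, and the gap widens as calibration error grows, which is the bias the study quantifies.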
Hwang, Chi-en; Cleary, T. Anne – 1986
Results obtained from two basic types of test pre-equating were compared: item response theory (IRT) pre-equating and section pre-equating (SPE). The simulated data were generated from a modified three-parameter logistic model with a constant guessing parameter. Responses of two replication samples of 3000 examinees on two 72-item…
Descriptors: Computer Simulation, Equated Scores, Latent Trait Theory, Mathematical Models
Reckase, Mark D. – 1979
Because latent trait models require that large numbers of items be calibrated or that testing of the same large group be repeated, item parameter estimates are often obtained by administering separate tests to different groups and "linking" the results to construct an adequate item pool. Four issues were studied, based upon the analysis…
Descriptors: Achievement Tests, High Schools, Item Banks, Mathematical Models
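The "linking" step puts item parameter estimates from separately calibrated groups on a common scale. A generic mean/sigma sketch for a set of common (anchor) items is shown below; this is not necessarily the procedure examined in the paper, and the difficulty values are invented.

```python
import numpy as np

# Difficulty estimates for the anchor items shared by two separately
# calibrated forms (made-up values)
b_anchor_ref = np.array([-1.3, -0.6, 0.0, 0.7, 1.4])
b_anchor_new = np.array([-0.9, -0.2, 0.5, 1.1, 1.9])

# Mean/sigma linking: find A, B so that A * b_new + B matches the
# reference-scale anchor difficulties in mean and standard deviation
A = b_anchor_ref.std(ddof=1) / b_anchor_new.std(ddof=1)
B = b_anchor_ref.mean() - A * b_anchor_new.mean()

# Transform every new-form item onto the reference scale
b_new = np.array([-2.1, -0.4, 0.3, 0.9, 1.6, 2.4])
b_new_linked = A * b_new + B
print(round(A, 3), round(B, 3), np.round(b_new_linked, 2))
```

Under a one-parameter (Rasch) calibration the slope A is typically fixed at 1, so the link reduces to an additive shift of the difficulties.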
Peer reviewed
Benson, Jeri; Bandalos, Deborah L. – Multivariate Behavioral Research, 1992
Factor structure of the Reactions to Tests (RTT) scale measuring test anxiety was studied by testing a series of confirmatory factor models including a second-order structure with 636 college students. Results support a shorter 20-item RTT but also raise questions about the cross-validation of covariance models. (SLD)
Descriptors: College Students, Factor Analysis, Factor Structure, Higher Education
Mills, Craig N.; Simon, Robert – 1981
When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…
Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement
Millman, Jason – 1972
Two aspects of criterion referenced testing are discussed: cutting scores and test length. Several practices in determining passing scores are enumerated: (1) setting passing scores so that a predetermined percent of students pass; (2) inspecting each test item to determine how important it is that it be answered correctly; (3) determining the…
Descriptors: Achievement Tests, Criterion Referenced Tests, Cutting Scores, Educational Problems