ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	3
Since 2006 (last 20 years)	4

Descriptor

Difficulty Level	8
Statistical Analysis	8
Test Length	8
Test Items	4
Item Analysis	3
Item Response Theory	3
Sample Size	3
Adaptive Testing	2
Computer Assisted Testing	2
Correlation	2
Equated Scores	2
Matrices	2
Secondary Education	2
Simulation	2
Test Format	2
Test Reliability	2
Accuracy	1
Achievement Tests	1
Cheating	1
Computer Programs	1
Cost Effectiveness	1
Duplication	1
Efficiency	1
Elementary Secondary Education	1
Error Patterns	1

Source

Applied Measurement in…	1
Educational and Psychological…	1
International Journal of…	1
Journal of Experimental…	1

Author

Cliff, Norman	1
DeMars, Christine E.	1
Forsyth, Robert A.	1
Harris, Dickie A.	1
Lee, Won-Chan	1
Levy, Roy	1
Lim, Euijin	1
Penell, Roger J.	1
Scheetz, James P.	1
Socha, Alan	1
Sunbul, Onder	1
Svetina, Dubravka	1
Yormaz, Seha	1
de Jong, John H. A. L.	1

Publication Type

Reports - Research	7
Journal Articles	4
Reports - Evaluative	1

Location

Netherlands

Assessments and Surveys

Stanford Binet Intelligence…

Showing all 8 results

Subscore Equating and Profile Reporting

Peer reviewed

Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020

The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…

Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level

Effects of Test Level Discrimination and Difficulty on Answer-Copying Indices

Peer reviewed

Sunbul, Onder; Yormaz, Seha – International Journal of Evaluation and Research in Education, 2018

In this study, Type I error and power rates of the omega (ω) and GBT (generalized binomial test) indices were investigated for several nominal alpha levels and for 40- and 80-item test lengths with 10,000-examinee sample size under several test level restrictions. As a result, Type I error rates of both indices were found to be below the acceptable…

Descriptors: Difficulty Level, Cheating, Duplication, Test Length

Dimensionality in Compensatory MIRT When Complex Structure Exists: Evaluation of DETECT and NOHARM

Peer reviewed

Svetina, Dubravka; Levy, Roy – Journal of Experimental Education, 2016

This study investigated the effect of complex structure on dimensionality assessment in compensatory multidimensional item response models using DETECT- and NOHARM-based methods. The performance was evaluated via the accuracy of identifying the correct number of dimensions and the ability to accurately recover item groupings using a simple…

Descriptors: Item Response Theory, Accuracy, Correlation, Sample Size

An Investigation of Sample Size Splitting on ATFIND and DIMTEST

Peer reviewed

Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

Descriptors: Sample Size, Test Length, Correlation, Test Format

Simulated and Empirical Studies of Flexilevel Testing in Air Force Technical Training Courses. Final Report for Period 1 May 1975-30 April 1977.

Harris, Dickie A.; Penell, Roger J. – 1977

This study used a series of simulations to answer questions about the efficacy of adaptive testing raised by empirical studies. The first study showed that for reasonably high entry points, parameters estimated from paper-and-pencil test protocols cross-validated remarkably well to groups actually tested at a computer terminal. This suggested that…

Descriptors: Adaptive Testing, Computer Assisted Testing, Cost Effectiveness, Difficulty Level

A Comparison of Simple Random Sampling Versus Stratification for Allocating Items to Subtests in Multiple Matrix Sampling.

Scheetz, James P.; Forsyth, Robert A. – 1977

Empirical evidence is presented related to the effects of using a stratified sampling of items in multiple matrix sampling on the accuracy of estimates of the population mean. Data were obtained from a sample of 600 high school students for a 36-item mathematics test and a 40-item vocabulary test, both subtests of the Iowa Tests of Educational…

Descriptors: Achievement Tests, Difficulty Level, Item Analysis, Item Sampling

Tailoring Tests to Educational Levels.

de Jong, John H. A. L. – 1984

The Netherlands' secondary education system is highly differentiated, with four different school types for four scholastic ability levels. Final examinations must accommodate these four levels, and require a test-independent definition of the intended final ability levels as well as a sample-free evaluation of the range of ability levels at which…

Descriptors: Difficulty Level, Efficiency, Equated Scores, Foreign Countries

Evaluations of Implied Orders as a Basis for Tailored Testing Using Simulations. Technical Report No. 4.

Cliff, Norman; And Others – 1977

TAILOR is a computer program that uses the implied orders concept as the basis for computerized adaptive testing. The basic characteristics of TAILOR, which does not involve pretesting, are reviewed here and two studies of it are reported. One is a Monte Carlo simulation based on the four-parameter Birnbaum model and the other uses a matrix of…

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Difficulty Level