ERIC - Search Results

Showing all 7 results

A Simple Model to Determine the Efficient Duration of Exams

Peer reviewed

Ellis, Jules L. – Educational and Psychological Measurement, 2021

This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in terms of the student's time. Based on a classical test theory model, enriched with assumptions on the context, the costs…

Descriptors: Test Length, Models, Error of Measurement, Measurement

Asymptotic Standard Errors of Observed-Score Equating with Polytomous IRT Models

Peer reviewed

Andersson, Björn – Journal of Educational Measurement, 2016

In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…

Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests

Evaluating the Consistency of Angoff-Based Cut Scores Using Subsets of Items within a Generalizability Theory Framework

Peer reviewed

Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015

The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine if a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…

Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items

Test Length and Decision Quality in Personnel Selection: When Is Short Too Short?

Peer reviewed

Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012

Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…

Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement

Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

Peer reviewed

Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011

The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability

Relationship Among the Number of Sub-Tests; Skewness, Kurtosis, and Size of Population; And Magnitude of Errors of Estimate in Multiple Matrix Sampling. (Revised Version).

Misanchuk, Earl R. – 1978

Multiple matrix sampling of three subscales of the California Psychological Inventory was used to investigate the effects of four variables on error estimates of the mean (EEM) and variance (EEV). The four variables were examinee population size (600, 450, 300, 150, 100, and 75); number of subtests (2, 3, 4, 5, 6, and 7), hence the number of…

Descriptors: Adults, Analysis of Variance, Error of Measurement, Item Sampling

Altering the Level of Difficulty in Computer Adaptive Testing.

Peer reviewed

Bergstrom, Betty A.; And Others – Applied Measurement in Education, 1992

Effects of altering test difficulty on examinee ability measures and test length in a computer adaptive test were studied for 225 medical technology students in 3 test difficulty conditions. Results suggest that, with an item pool of sufficient depth and breadth, acceptable targeting to test difficulty is possible. (SLD)

Descriptors: Ability, Adaptive Testing, Change, College Students