ERIC - Search Results

Showing all 10 results

Differential Item Functioning Effect Size from the Multigroup Confirmatory Factor Analysis for a Meta-Analysis: A Simulation Study

Peer reviewed

Park, Sung Eun; Ahn, Soyeon; Zopluoglu, Cengiz – Educational and Psychological Measurement, 2021

This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across…

Descriptors: Item Analysis, Effect Size, Difficulty Level, Monte Carlo Methods

Position of Correct Option and Distractors Impacts Responses to Multiple-Choice Items: Evidence from a National Test

Peer reviewed

Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023

Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses

A Comparison of Item Parameter Standard Error Estimation Procedures for Unidimensional and Multidimensional Item Response Theory Modeling

Peer reviewed

Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014

The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…

Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation

Observed Score Equating Using a Mini-Version Anchor and an Anchor with Less Spread of Difficulty: A Comparison Study

Peer reviewed

Liu, Jinghua; Sinharay, Sandip; Holland, Paul; Feigenbaum, Miriam; Curley, Edward – Educational and Psychological Measurement, 2011

Two different types of anchors are investigated in this study: a mini-version anchor and an anchor that has less spread of difficulty than the tests to be equated. The latter is referred to as a midi anchor. The impact of these two different types of anchors on observed score equating is evaluated and compared with respect to systematic error…

Descriptors: Equated Scores, Test Items, Difficulty Level, Statistical Bias

Generalizability of Scaling Gradients on Direct Behavior Ratings

Peer reviewed

Chafouleas, Sandra M.; Christ, Theodore J.; Riley-Tillman, T. Chris – Educational and Psychological Measurement, 2009

Generalizability theory is used to examine the impact of scaling gradients on a single-item Direct Behavior Rating (DBR). A DBR refers to a type of rating scale used to efficiently record target behavior(s) following an observation occasion. Variance components associated with scale gradients are estimated using a random effects design for persons…

Descriptors: Generalizability Theory, Undergraduate Students, Scaling, Rating Scales

Standard Errors of Estimate in Item-Examinee Sampling as a Function of Test Reliability, Variation in Item Difficulty Indices and Degree of Skewness in the Normative Distribution

Peer reviewed

Shoemaker, David M. – Educational and Psychological Measurement, 1972

Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation

Some Relationships between the Binomial Error Model and Classical Test Theory.

Peer reviewed

Feldt, Leonard S. – Educational and Psychological Measurement, 1984

The binomial error model includes form-to-form difficulty differences as error variance and leads to Kuder-Richardson formula 21 as an estimate of reliability. If the form-to-form component is removed from the estimate of error variance, the binomial model leads to KR 20 as the reliability estimate. (Author/BW)

Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Mathematical Formulas
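
For readers unfamiliar with the two coefficients named in this abstract, the following is a minimal sketch of the standard textbook forms of KR-20 and KR-21 for dichotomously scored items. It is not taken from Feldt's article; the function names and the toy response matrix are hypothetical, and population variances (divisor n) are assumed throughout. KR-21 replaces the item-level variances with a single term based on the mean total score, so it is never larger than KR-20 and equals it only when all items are equally difficult.

```python
from statistics import mean, pvariance

def kr20(scores):
    """KR-20 for a persons-by-items matrix of 0/1 scores (hypothetical helper)."""
    k = len(scores[0])                        # number of items
    totals = [sum(row) for row in scores]     # total score per examinee
    var_total = pvariance(totals)             # population variance of totals
    # Proportion correct for each item, then the sum of item variances p*q.
    p = [mean(row[i] for row in scores) for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return k / (k - 1) * (1 - sum_pq / var_total)

def kr21(scores):
    """KR-21: same form as KR-20, but uses only the mean and variance of totals."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    var_total = pvariance(totals)
    m = mean(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * var_total))

# Made-up illustrative data: 6 examinees, 4 items of unequal difficulty.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print(kr20(scores), kr21(scores))
```

Because the items in the toy data differ in difficulty, the KR-21 value falls below the KR-20 value, which is the gap the abstract attributes to treating difficulty differences as error variance.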

A Comparison of Two, Three and Four-Choice Item Tests Given a Fixed Total Number of Choices.

Peer reviewed

Straton, Ralph G.; Catts, Ralph M. – Educational and Psychological Measurement, 1980

Multiple-choice tests composed entirely of two-, three-, or four-choice items were investigated. Results indicated that number of alternatives per item was inversely related to item difficulty, but directly related to item discrimination. Reliability and standard error of measurement of three-choice item tests were equivalent or superior.…

Descriptors: Difficulty Level, Error of Measurement, Foreign Countries, Higher Education

An Empirical Investigation of Lu's Method of Reliability Estimation.

Peer reviewed

Huck, Schuyler W.; And Others – Educational and Psychological Measurement, 1981

Believing that examinee-by-item interaction should be conceptualized as true score variability rather than as a result of errors of measurement, Lu proposed a modification of Hoyt's analysis of variance reliability procedure. Via a computer simulation study, it is shown that Lu's approach does not separate interaction from error. (Author/RL)

Descriptors: Analysis of Variance, Comparative Analysis, Computer Programs, Difficulty Level

The Influence of Dimensionality on CAT Ability Estimation.

Peer reviewed

De Ayala, R. J. – Educational and Psychological Measurement, 1992

Effects of dimensionality on ability estimation of an adaptive test were examined using generated data in Bayesian computerized adaptive testing (CAT) simulations. Generally, increasing interdimensional difficulty association produced a slight decrease in test length and an increase in accuracy of ability estimation as assessed by root mean square…

Descriptors: Adaptive Testing, Bayesian Statistics, Computer Assisted Testing, Computer Simulation
