ERIC - Search Results

Showing 1 to 15 of 19 results

Seeing the Forest and the Trees: Comparison of Two IRTree Models to Investigate the Impact of Full versus Endpoint-Only Response Option Labeling

Peer reviewed

Spratto, Elisabeth M.; Leventhal, Brian C.; Bandalos, Deborah L. – Educational and Psychological Measurement, 2021

In this study, we examined the results and interpretations produced from two different IRTree models--one using paths consisting of only dichotomous decisions, and one using paths consisting of both dichotomous and polytomous decisions. We used data from two versions of an impulsivity measure. In the first version, all the response options had…

Descriptors: Comparative Analysis, Item Response Theory, Decision Making, Data Analysis

Improvement of Norm Score Quality via Regression-Based Continuous Norming

Peer reviewed

Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021

The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…

Descriptors: Test Norms, Scores, Regression (Statistics), Test Items

Position of Correct Option and Distractors Impacts Responses to Multiple-Choice Items: Evidence from a National Test

Peer reviewed

Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023

Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses

Unidimensional IRT Item Parameter Estimates across Equivalent Test Forms with Confounding Specifications within Dimensions

Peer reviewed

Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016

When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…

Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level

Multidimensional Classification of Examinees Using the Mixture Random Weights Linear Logistic Test Model

Peer reviewed

Choi, In-Hee; Wilson, Mark – Educational and Psychological Measurement, 2015

An essential feature of the linear logistic test model (LLTM) is that item difficulties are explained using item design properties. By taking advantage of this explanatory aspect of the LLTM, in a mixture extension of the LLTM, the meaning of latent classes is specified by how item properties affect item difficulties within each class. To improve…

Descriptors: Classification, Test Items, Difficulty Level, Statistical Analysis

Rasch Mixture Models for DIF Detection: A Comparison of Old and New Score Specifications

Peer reviewed

Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015

Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…

Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores

Survey Satisficing Inflates Reliability and Validity Measures: An Experimental Comparison of College and Amazon Mechanical Turk Samples

Peer reviewed

Hamby, Tyler; Taylor, Wyn – Educational and Psychological Measurement, 2016

This study examined the predictors and psychometric outcomes of survey satisficing, wherein respondents provide quick, "good enough" answers (satisficing) rather than carefully considered answers (optimizing). We administered surveys to university students and respondents--half of whom held college degrees--from a for-pay survey website,…

Descriptors: Surveys, Test Reliability, Test Validity, Comparative Analysis

A Comparison of Item Parameter Standard Error Estimation Procedures for Unidimensional and Multidimensional Item Response Theory Modeling

Peer reviewed

Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014

The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…

Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation

An IRT Examination of the Psychometric Functioning of Negatively Worded Personality Items

Peer reviewed

Sliter, Katherine A.; Zickar, Michael J. – Educational and Psychological Measurement, 2014

This study compared the functioning of positively and negatively worded personality items using item response theory. In Study 1, word pairs from the Goldberg Adjective Checklist were analyzed using the Graded Response Model. Across subscales, negatively worded items produced comparatively higher difficulty and lower discrimination parameters than…

Descriptors: Item Response Theory, Psychometrics, Personality Measures, Test Items

A Comparison of Uniform DIF Effect Size Estimators under the MIMIC and Rasch Models

Peer reviewed

Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013

The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…

Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis

Assessing Impact, DIF, and DFF in Accommodated Item Scores: A Comparison of Multilevel Measurement Model Parameterizations

Peer reviewed

Beretvas, S. Natasha; Cawthon, Stephanie W.; Lockhart, L. Leland; Kaye, Alyssa D. – Educational and Psychological Measurement, 2012

This pedagogical article is intended to explain the similarities and differences between the parameterizations of two multilevel measurement model (MMM) frameworks. The conventional two-level MMM that includes item indicators and models item scores (Level 1) clustered within examinees (Level 2) and the two-level cross-classified MMM (in which item…

Descriptors: Test Bias, Comparative Analysis, Test Items, Difficulty Level

An Application of Explanatory Item Response Modeling for Model-Based Proficiency Scaling

Peer reviewed

Hartig, Johannes; Frey, Andreas; Nold, Gunter; Klieme, Eckhard – Educational and Psychological Measurement, 2012

The article compares three different methods to estimate effects of task characteristics and to use these estimates for model-based proficiency scaling: prediction of item difficulties from the Rasch model, the linear logistic test model (LLTM), and an LLTM including random item effects (LLTM+e). The methods are applied to empirical data from a…

Descriptors: Item Response Theory, Models, Methods, Computation

Reducing the Cognitive Complexity Associated with Standard Setting: A Comparison of the Single-Passage Bookmark and Yes/No Methods

Peer reviewed

Skaggs, Gary; Hein, Serge F. – Educational and Psychological Measurement, 2011

Judgmental standard setting methods have been criticized for the cognitive complexity of the judgment task that panelists are asked to complete. This study compared two methods designed to reduce this complexity: the yes/no method and the single-passage bookmark method. Two mock standard setting panel meetings were convened, one for each method,…

Descriptors: Standard Setting (Scoring), Methods, Cutting Scores, Experienced Teachers

A Monte Carlo Comparison of Item and Person Statistics Based on Item Response Theory versus Classical Test Theory

Peer reviewed

MacDonald, Paul; Paunonen, Sampo V. – Educational and Psychological Measurement, 2002

Examined the behavior of item and person statistics from item response theory and classical test theory frameworks through Monte Carlo methods with simulated test data. Findings suggest that item difficulty and person ability estimates are highly comparable for both approaches. (SLD)

Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Response Theory

A Comparison of Objective-Based and Modified-Bormuth Item Writing Techniques

Peer reviewed

Roid, G. H.; Haladyna, Thomas M. – Educational and Psychological Measurement, 1978

Two techniques for writing achievement test items to accompany instructional materials are contrasted: writing items from statements of instructional objectives, and writing items from semi-automated rules for transforming instructional statements. Both systems resulted in about the same number of faulty items. (Author/JKS)

Descriptors: Achievement Tests, Comparative Analysis, Criterion Referenced Tests, Difficulty Level
