Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 13 |
Descriptor
Comparative Analysis | 19 |
Difficulty Level | 19 |
Test Items | 16 |
Item Response Theory | 9 |
Computation | 5 |
Psychometrics | 5 |
Test Bias | 5 |
Monte Carlo Methods | 4 |
Correlation | 3 |
Error of Measurement | 3 |
Item Analysis | 3 |
More ▼ |
Source
Educational and Psychological… | 19 |
Author
Ahn, Soyeon | 1 |
Bandalos, Deborah L. | 1 |
Beretvas, S. Natasha | 1 |
Cai, Li | 1 |
Cawthon, Stephanie W. | 1 |
Choi, In-Hee | 1 |
Córdova, Nora | 1 |
Dartnell, Pablo | 1 |
Frey, Andreas | 1 |
Frick, Hannah | 1 |
Godoy, María Inés | 1 |
More ▼ |
Publication Type
Journal Articles | 17 |
Reports - Research | 15 |
Reports - Descriptive | 1 |
Reports - Evaluative | 1 |
Education Level
Higher Education | 4 |
Postsecondary Education | 4 |
Elementary Education | 2 |
Grade 3 | 1 |
Grade 4 | 1 |
Intermediate Grades | 1 |
Primary Education | 1 |
Secondary Education | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Graduate Record Examinations | 1 |
National Assessment of… | 1 |
Rosenberg Self Esteem Scale | 1 |
What Works Clearinghouse Rating
Spratto, Elisabeth M.; Leventhal, Brian C.; Bandalos, Deborah L. – Educational and Psychological Measurement, 2021
In this study, we examined the results and interpretations produced from two different IRTree models--one using paths consisting of only dichotomous decisions, and one using paths consisting of both dichotomous and polytomous decisions. We used data from two versions of an impulsivity measure. In the first version, all the response options had…
Descriptors: Comparative Analysis, Item Response Theory, Decision Making, Data Analysis
Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…
Descriptors: Test Norms, Scores, Regression (Statistics), Test Items
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
Choi, In-Hee; Wilson, Mark – Educational and Psychological Measurement, 2015
An essential feature of the linear logistic test model (LLTM) is that item difficulties are explained using item design properties. By taking advantage of this explanatory aspect of the LLTM, in a mixture extension of the LLTM, the meaning of latent classes is specified by how item properties affect item difficulties within each class. To improve…
Descriptors: Classification, Test Items, Difficulty Level, Statistical Analysis
Frick, Hannah; Strobl, Carolin; Zeileis, Achim – Educational and Psychological Measurement, 2015
Rasch mixture models can be a useful tool when checking the assumption of measurement invariance for a single Rasch model. They provide advantages compared to manifest differential item functioning (DIF) tests when the DIF groups are only weakly correlated with the manifest covariates available. Unlike in single Rasch models, estimation of Rasch…
Descriptors: Item Response Theory, Test Bias, Comparative Analysis, Scores
Hamby, Tyler; Taylor, Wyn – Educational and Psychological Measurement, 2016
This study examined the predictors and psychometric outcomes of survey satisficing, wherein respondents provide quick, "good enough" answers (satisficing) rather than carefully considered answers (optimizing). We administered surveys to university students and respondents--half of whom held college degrees--from a for-pay survey website,…
Descriptors: Surveys, Test Reliability, Test Validity, Comparative Analysis
Paek, Insu; Cai, Li – Educational and Psychological Measurement, 2014
The present study was motivated by the recognition that standard errors (SEs) of item response theory (IRT) model parameters are often of immediate interest to practitioners and that there is currently a lack of comparative research on different SE (or error variance-covariance matrix) estimation procedures. The present study investigated item…
Descriptors: Item Response Theory, Comparative Analysis, Error of Measurement, Computation
Sliter, Katherine A.; Zickar, Michael J. – Educational and Psychological Measurement, 2014
This study compared the functioning of positively and negatively worded personality items using item response theory. In Study 1, word pairs from the Goldberg Adjective Checklist were analyzed using the Graded Response Model. Across subscales, negatively worded items produced comparatively higher difficulty and lower discrimination parameters than…
Descriptors: Item Response Theory, Psychometrics, Personality Measures, Test Items
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon; Penfield, Randall D. – Educational and Psychological Measurement, 2013
The Rasch model, a member of a larger group of models within item response theory, is widely used in empirical studies. Detection of uniform differential item functioning (DIF) within the Rasch model typically employs null hypothesis testing with a concomitant consideration of effect size (e.g., signed area [SA]). Parametric equivalence between…
Descriptors: Test Bias, Effect Size, Item Response Theory, Comparative Analysis
Beretvas, S. Natasha; Cawthon, Stephanie W.; Lockhart, L. Leland; Kaye, Alyssa D. – Educational and Psychological Measurement, 2012
This pedagogical article is intended to explain the similarities and differences between the parameterizations of two multilevel measurement model (MMM) frameworks. The conventional two-level MMM that includes item indicators and models item scores (Level 1) clustered within examinees (Level 2) and the two-level cross-classified MMM (in which item…
Descriptors: Test Bias, Comparative Analysis, Test Items, Difficulty Level
Hartig, Johannes; Frey, Andreas; Nold, Gunter; Klieme, Eckhard – Educational and Psychological Measurement, 2012
The article compares three different methods to estimate effects of task characteristics and to use these estimates for model-based proficiency scaling: prediction of item difficulties from the Rasch model, the linear logistic test model (LLTM), and an LLTM including random item effects (LLTM+e). The methods are applied to empirical data from a…
Descriptors: Item Response Theory, Models, Methods, Computation
Skaggs, Gary; Hein, Serge F. – Educational and Psychological Measurement, 2011
Judgmental standard setting methods have been criticized for the cognitive complexity of the judgment task that panelists are asked to complete. This study compared two methods designed to reduce this complexity: the yes/no method and the single-passage bookmark method. Two mock standard setting panel meetings were convened, one for each method,…
Descriptors: Standard Setting (Scoring), Methods, Cutting Scores, Experienced Teachers

MacDonald, Paul; Paunonen, Sampo V. – Educational and Psychological Measurement, 2002
Examined the behavior of item and person statistics from item response theory and classical test theory frameworks through Monte Carlo methods with simulated test data. Findings suggest that item difficulty and person ability estimates are highly comparable for both approaches. (SLD)
Descriptors: Ability, Comparative Analysis, Difficulty Level, Item Response Theory

Roid, G. H.; Haladyna, Thomas M. – Educational and Psychological Measurement, 1978
Two techniques for writing achievement test items to accompany instructional materials are contrasted: writing items from statements of instructional objectives, and writing items from semi-automated rules for transforming instructional statements. Both systems resulted in about the same number of faulty items. (Author/JKS)
Descriptors: Achievement Tests, Comparative Analysis, Criterion Referenced Tests, Difficulty Level
Previous Page | Next Page »
Pages: 1 | 2