Showing 1 to 15 of 106 results
Peer reviewed
Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024
Equating is a statistical procedure used to adjust for differences in difficulty across test forms so that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…
Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students
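The equating idea in this abstract can be illustrated with one of the simplest approaches, linear (mean-sigma) equating under a random-groups design. This is a generic sketch, not the method evaluated in the study; the scores and sample sizes below are invented for illustration.

```python
import numpy as np

def linear_equate(scores_x, scores_y):
    """Mean-sigma linear equating: map form-X scores onto the form-Y scale
    by matching the mean and standard deviation of the two score
    distributions (random-groups design assumed)."""
    mx, sx = np.mean(scores_x), np.std(scores_x, ddof=1)
    my, sy = np.mean(scores_y), np.std(scores_y, ddof=1)
    slope = sy / sx
    intercept = my - slope * mx
    return lambda x: slope * np.asarray(x, dtype=float) + intercept

# Invented data: form X is harder, so its raw scores run lower on average.
rng = np.random.default_rng(0)
form_x = rng.normal(28, 6, size=2000).round()
form_y = rng.normal(32, 6, size=2000).round()

to_y_scale = linear_equate(form_x, form_y)
print(to_y_scale([20, 28, 36]))   # form-X scores expressed on the form-Y scale
```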
Peer reviewed
Kuan-Yu Jin; Thomas Eckes – Educational and Psychological Measurement, 2024
Insufficient effort responding (IER) refers to a lack of effort when answering survey or questionnaire items. Such items typically offer more than two ordered response categories, with Likert-type scales as the most prominent example. The underlying assumption is that the successive categories reflect increasing levels of the latent variable…
Descriptors: Item Response Theory, Test Items, Test Wiseness, Surveys
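The ordering assumption described here, that successive categories reflect increasing levels of the latent variable, can be made concrete with a divide-by-total model such as the partial credit model. The sketch below is illustrative only (the step thresholds are hypothetical) and is not the model used by the authors.

```python
import numpy as np

def pcm_category_probs(theta, thresholds):
    """Partial credit model: probabilities of the ordered categories
    0..k for a person at `theta`, given k step thresholds."""
    steps = np.concatenate(([0.0], np.cumsum(theta - np.asarray(thresholds))))
    expnum = np.exp(steps - steps.max())          # stabilised exponentials
    return expnum / expnum.sum()

thresholds = [-1.0, 0.0, 1.0]                     # hypothetical step difficulties
for theta in (-2.0, 0.0, 2.0):
    print(theta, np.round(pcm_category_probs(theta, thresholds), 3))
# Higher theta shifts probability toward higher categories; insufficient
# effort responding breaks exactly this ordering.
```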
Peer reviewed
Kam, Chester Chun Seng – Educational and Psychological Measurement, 2023
When constructing measurement scales, regular and reversed items are often used (e.g., "I am satisfied with my job"/"I am not satisfied with my job"). Some methodologists recommend excluding reversed items because they are more difficult to understand and therefore engender a second, artificial factor distinct from the…
Descriptors: Test Items, Difficulty Level, Test Construction, Construct Validity
Peer reviewed
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
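Effort-moderated scoring, as commonly described in this literature, flags responses faster than an item-level time threshold as rapid guesses and excludes them from scoring. The sketch below uses a simplified proportion-correct score rather than the IRT-based EM procedure examined in the study; the thresholds and response times are hypothetical.

```python
import numpy as np

def effort_moderated_score(responses, response_times, thresholds):
    """Flag responses faster than the item's time threshold as rapid guesses,
    drop them, and score proportion correct on the remaining (effortful)
    responses. A simplified stand-in for effort-moderated scoring."""
    responses = np.asarray(responses, dtype=float)
    rapid = np.asarray(response_times) < np.asarray(thresholds)
    effortful = ~rapid
    if not effortful.any():
        return np.nan, rapid                       # nothing effortful to score
    return responses[effortful].mean(), rapid

# Hypothetical examinee: 10 items, the last three answered in under 2 seconds.
resp = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
rt   = [14, 9, 22, 11, 18, 8, 13, 1.2, 1.5, 0.9]
thr  = [3.0] * 10                                  # illustrative thresholds
print(effort_moderated_score(resp, rt, thr))
```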
Peer reviewed
Fellinghauer, Carolina; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation…
Descriptors: True Scores, Equated Scores, Test Items, Sample Size
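One way to picture a score transformation after concurrent calibration is through the test characteristic curves of the two scales on the shared ability metric: find the ability at which the expected score on scale A equals a given raw score, then read off the expected score on scale B at that ability. The sketch below assumes partial-credit step thresholds that are already on a common metric; it is not the simulation design of the study.

```python
import numpy as np

def pcm_expected_score(theta, item_thresholds):
    """Expected raw score on a scale at ability `theta` under the partial
    credit model; thresholds are assumed to sit on a common metric."""
    total = 0.0
    for deltas in item_thresholds:
        steps = np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas))))
        probs = np.exp(steps - steps.max())
        probs /= probs.sum()
        total += np.dot(np.arange(len(probs)), probs)
    return total

def score_transform_table(thresholds_a, thresholds_b):
    """Map each raw score on scale A to an equivalent score on scale B via
    the shared ability metric (test-characteristic-curve equating)."""
    grid = np.linspace(-4, 4, 401)
    tcc_a = np.array([pcm_expected_score(t, thresholds_a) for t in grid])
    tcc_b = np.array([pcm_expected_score(t, thresholds_b) for t in grid])
    max_a = sum(len(d) for d in thresholds_a)
    return {raw: float(np.interp(np.interp(raw, tcc_a, grid), grid, tcc_b))
            for raw in range(max_a + 1)}

# Hypothetical step thresholds for two short scales on a common metric.
scale_a = [[-1.0, 0.5], [-0.5, 1.0], [0.0, 1.5]]
scale_b = [[-1.5, 0.0], [-1.0, 0.5], [-0.5, 1.0]]
print(score_transform_table(scale_a, scale_b))
```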
Peer reviewed
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Peer reviewed
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
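Several raw-score rules for multiple-response items are in common use, for example all-or-nothing scoring, credit per keyed option selected, and penalty-corrected scoring. The sketch below illustrates three such rules with an invented option key; it is not tied to the specific methods compared in the article.

```python
import numpy as np

def score_multiple_response(selected, keyed, method="plus_minus"):
    """Raw-score rules for a multiple-response item; `selected` and `keyed`
    are 0/1 vectors over the response options."""
    selected = np.asarray(selected, dtype=bool)
    keyed = np.asarray(keyed, dtype=bool)
    hits = int(np.sum(selected & keyed))           # keyed options chosen
    false_alarms = int(np.sum(selected & ~keyed))  # non-keyed options chosen
    n_keyed = int(keyed.sum())
    if method == "all_or_nothing":                 # dichotomous scoring
        return float(hits == n_keyed and false_alarms == 0)
    if method == "partial_credit":                 # credit per keyed option
        return hits / n_keyed
    if method == "plus_minus":                     # penalise wrong selections
        return max(hits - false_alarms, 0) / n_keyed
    raise ValueError(method)

keyed    = [1, 0, 1, 1, 0]                         # invented key
selected = [1, 1, 1, 0, 0]                         # invented response
for m in ("all_or_nothing", "partial_credit", "plus_minus"):
    print(m, score_multiple_response(selected, keyed, m))
```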
Peer reviewed
Sideridis, Georgios; Tsaousis, Ioannis; Al-Harbi, Khaleel – Educational and Psychological Measurement, 2022
The goal of the present study was to address the analytical complexity of incorporating responses and response times through applying the Jeon and De Boeck mixture item response theory model in Mplus 8.7. Using both simulated and real data, we attempt to identify subgroups of responders that are rapid guessers or engage in knowledge retrieval…
Descriptors: Reaction Time, Guessing (Tests), Item Response Theory, Information Retrieval
Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022
In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…
Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis
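The dependence of ability measurement on item parameter quality is easy to demonstrate with a two-parameter logistic model: the same response pattern yields a different maximum-likelihood ability estimate when the difficulty parameters are biased. The parameters below are hypothetical, and the sketch is not the authors' VLE application.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def ml_theta(responses, a, b):
    """Maximum-likelihood ability estimate given fixed item parameters."""
    responses, a, b = map(np.asarray, (responses, a, b))
    def neg_loglik(theta):
        p = p_2pl(theta, a, b)
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    return minimize_scalar(neg_loglik, bounds=(-4, 4), method="bounded").x

a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])      # hypothetical discriminations
b = np.array([-1.0, -0.3, 0.2, 0.8, 1.5])    # hypothetical difficulties
resp = np.array([1, 1, 1, 0, 0])

print(ml_theta(resp, a, b))                   # estimate with the assumed parameters
print(ml_theta(resp, a, b + 0.5))             # same responses, biased difficulties
# Bias in the item parameters carries straight through to the ability estimate.
```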
Peer reviewed
Spratto, Elisabeth M.; Leventhal, Brian C.; Bandalos, Deborah L. – Educational and Psychological Measurement, 2021
In this study, we examined the results and interpretations produced from two different IRTree models--one using paths consisting of only dichotomous decisions, and one using paths consisting of both dichotomous and polytomous decisions. We used data from two versions of an impulsivity measure. In the first version, all the response options had…
Descriptors: Comparative Analysis, Item Response Theory, Decision Making, Data Analysis
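An IRTree with dichotomous decision nodes is typically estimated by recoding each ordinal response into binary pseudo-items, one per node of the tree. The sketch below uses one common three-node tree for a 5-point scale (midpoint, direction, intensity); the specific trees compared in the article may differ.

```python
import numpy as np

def irtree_pseudo_items(likert):
    """Recode 1-5 Likert responses into three dichotomous pseudo-items for a
    common response-process tree:
      midpoint:  1 if category 3 was chosen, else 0
      direction: 1 for the agree side (4/5), 0 for the disagree side (1/2),
                 missing (NaN) when the midpoint was chosen
      intensity: 1 for an extreme category (1/5), 0 for a mild one (2/4),
                 missing (NaN) when the midpoint was chosen"""
    likert = np.asarray(likert)
    return {
        "midpoint":  (likert == 3).astype(float),
        "direction": np.where(likert == 3, np.nan, (likert >= 4).astype(float)),
        "intensity": np.where(likert == 3, np.nan,
                              np.isin(likert, (1, 5)).astype(float)),
    }

for node, values in irtree_pseudo_items([1, 2, 3, 4, 5]).items():
    print(node, values)
```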
Peer reviewed
Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…
Descriptors: Test Norms, Scores, Regression (Statistics), Test Items
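The contrast between conventional and continuous norming can be sketched as follows: conventional norming computes norm scores within discrete age brackets, while continuous norming models how the score distribution changes smoothly with age. The stand-in below uses a simple polynomial regression rather than the semiparametric (SPCN) approach studied in the article, and all data are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated norm sample: raw scores rise with age, plus noise.
age = rng.uniform(6, 12, size=3000)
raw = 2.5 * age + rng.normal(0, 4, size=3000)

def conventional_norm(raw_new, age_new, age, raw, width=1.0):
    """Conventional norming: z-score within a discrete age bracket."""
    in_group = np.abs(age - np.round(age_new)) <= width / 2
    return (raw_new - raw[in_group].mean()) / raw[in_group].std(ddof=1)

def continuous_norm(raw_new, age_new, age, raw):
    """A crude continuous-norming stand-in: model the score mean as a
    polynomial in age, so every age gets its own norm."""
    coefs = np.polyfit(age, raw, deg=2)
    resid_sd = np.std(raw - np.polyval(coefs, age), ddof=3)
    return (raw_new - np.polyval(coefs, age_new)) / resid_sd

print(conventional_norm(25.0, 8.7, age, raw))
print(continuous_norm(25.0, 8.7, age, raw))
```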
Peer reviewed
Park, Sung Eun; Ahn, Soyeon; Zopluoglu, Cengiz – Educational and Psychological Measurement, 2021
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across…
Descriptors: Item Analysis, Effect Size, Difficulty Level, Monte Carlo Methods
Peer reviewed
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Peer reviewed
Schweizer, Karl; Troche, Stefan – Educational and Psychological Measurement, 2018
In confirmatory factor analysis, quite similar measurement models serve to detect the difficulty factor and the factor due to the item-position effect. The item-position effect refers to the increasing dependency among the responses to successively presented items of a test, whereas the difficulty factor is ascribed to the wide range of…
Descriptors: Investigations, Difficulty Level, Factor Analysis, Models
Peer reviewed
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
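Matching forms on item count and average difficulty within subcontent areas can be done, for example, by sorting each area's items by difficulty and alternating them between forms. The sketch below is one such heuristic applied to an invented item bank; it is not the construction procedure used in the study.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented item bank: each item has a content area and a difficulty (b).
bank = [{"id": i,
         "area": str(rng.choice(["algebra", "geometry", "statistics"])),
         "b": float(rng.normal(0, 1))} for i in range(120)]

def build_matched_forms(bank, per_area=10):
    """Assemble two forms with the same number of items per content area and
    roughly the same average difficulty within each area, by sorting the
    area's items on difficulty and alternating them between the forms."""
    form_a, form_b = [], []
    for area in sorted({item["area"] for item in bank}):
        pool = sorted((i for i in bank if i["area"] == area),
                      key=lambda item: item["b"])[: 2 * per_area]
        form_a += pool[0::2]            # every other item by difficulty rank
        form_b += pool[1::2]
    return form_a, form_b

form_a, form_b = build_matched_forms(bank)
for name, form in (("A", form_a), ("B", form_b)):
    difficulties = np.array([item["b"] for item in form])
    print(name, len(form), round(difficulties.mean(), 3))
```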