ERIC - Search Results

Publication Date

In 2025	0
Since 2024	3
Since 2021 (last 5 years)	9
Since 2016 (last 10 years)	14
Since 2006 (last 20 years)	23

Descriptor

Test Format	98
Test Items	48
Higher Education	35
Test Construction	27
Test Validity	25
Test Reliability	22
Item Response Theory	19
Multiple Choice Tests	19
Foreign Countries	17
Item Analysis	14
Scores	14
College Students	13
Difficulty Level	13
Psychometrics	13
Response Style (Tests)	12
Adults	10
Comparative Analysis	10
Factor Structure	10
High Schools	10
Computer Assisted Testing	9
Factor Analysis	9
High School Students	9
Questionnaires	9
Test Length	9
Models	8

Source

Educational and Psychological…

Publication Type

Journal Articles	98
Reports - Research	83
Reports - Evaluative	14
Speeches/Meeting Papers	3
Information Analyses	2
Reports - Descriptive	2
Opinion Papers	1
Tests/Questionnaires	1

Education Level

Secondary Education	3
Elementary Secondary Education	2
Grade 8	2
High Schools	2
Junior High Schools	2
Middle Schools	2
Elementary Education	1
Grade 9	1
Higher Education	1
Postsecondary Education	1

Location

Australia	3
Canada	2
Germany	2
Japan	2
Hong Kong	1
Israel	1
New Zealand	1
Spain	1
Taiwan (Taipei)	1
United Kingdom	1
United Kingdom (Wales)	1
United States	1

Showing 1 to 15 of 98 results

Artificial Neural Networks for Short-Form Development of Psychometric Tests: A Study on Synthetic Populations Using Autoencoders

Peer reviewed

Monica Casella; Pasquale Dolce; Michela Ponticorvo; Nicola Milano; Davide Marocco – Educational and Psychological Measurement, 2024

Short-form development is an important topic in psychometric research, which requires researchers to face methodological choices at different steps. The statistical techniques traditionally used for shortening tests, which belong to the so-called exploratory model, make assumptions not always verified in psychological data. This article proposes a…

Descriptors: Artificial Intelligence, Test Construction, Test Format, Psychometrics

Evaluating Equating Methods for Varying Levels of Form Difference

Peer reviewed

Ting Sun; Stella Yun Kim – Educational and Psychological Measurement, 2024

Equating is a statistical procedure used to adjust for the difference in form difficulty such that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. The study aims to examine the effect of the magnitude…

Descriptors: Difficulty Level, Data Interpretation, Equated Scores, High School Students

Comparing the Psychometric Properties of a Scale across Three Likert and Three Alternative Formats: An Application to the Rosenberg Self-Esteem Scale

Peer reviewed

Zhang, Xijuan; Zhou, Linnan; Savalei, Victoria – Educational and Psychological Measurement, 2023

Zhang and Savalei proposed an alternative scale format to the Likert format, called the Expanded format. In this format, response options are presented in complete sentences, which can reduce acquiescence bias and method effects. The goal of the current study was to compare the psychometric properties of the Rosenberg Self-Esteem Scale (RSES) in…

Descriptors: Psychometrics, Self Concept Measures, Self Esteem, Comparative Analysis

On the Relationship between Item Stem Formulation and Criterion Validity of Multiple-Component Measuring Instruments

Peer reviewed

Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022

The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…

Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level

Can High-Dimensional Questionnaires Resolve the Ipsativity Issue of Forced-Choice Response Formats?

Peer reviewed

Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021

Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…

Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring

Polytomous Testlet Response Models for Technology-Enhanced Innovative Items: Implications on Model Fit and Trait Inference

Peer reviewed

Kang, Hyeon-Ah; Han, Suhwa; Kim, Doyoung; Kao, Shu-Chuan – Educational and Psychological Measurement, 2022

The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c)…

Descriptors: Goodness of Fit, Item Response Theory, Test Items, Scoring

Evaluating Different Scoring Methods for Multiple Response Items Providing Partial Credit

Peer reviewed

Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022

The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…

Descriptors: Scoring, Test Items, Test Format, Raw Scores

Fused SDT/IRT Models for Mixed-Format Exams

Peer reviewed

Lawrence T. DeCarlo – Educational and Psychological Measurement, 2024

A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…

Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models

Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests

Peer reviewed

Huang, Hung-Yu – Educational and Psychological Measurement, 2023

The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…

Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making

Efficient Standard Errors in Item Response Theory Models for Short Tests

Peer reviewed

Ippel, Lianne; Magis, David – Educational and Psychological Measurement, 2020

In dichotomous item response theory (IRT) framework, the asymptotic standard error (ASE) is the most common statistic to evaluate the precision of various ability estimators. Easy-to-use ASE formulas are readily available; however, the accuracy of some of these formulas was recently questioned and new ASE formulas were derived from a general…

Descriptors: Item Response Theory, Error of Measurement, Accuracy, Standards

A Bayesian Random Block Item Response Theory Model for Forced-Choice Formats

Peer reviewed

Lee, HyeSun; Smith, Weldon Z. – Educational and Psychological Measurement, 2020

Based on the framework of testlet models, the current study suggests the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats where an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporated a random block effect into the response…

Descriptors: Bayesian Statistics, Item Response Theory, Monte Carlo Methods, Test Format

Can Reliability of Multiple Component Measuring Instruments Depend on Response Option Presentation Mode?

Peer reviewed

Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2016

This article examines the possible dependency of composite reliability on presentation format of the elements of a multi-item measuring instrument. Using empirical data and a recent method for interval estimation of group differences in reliability, we demonstrate that the reliability of an instrument need not be the same when polarity of the…

Descriptors: Test Reliability, Test Format, Test Items, Differences

A Multilevel Bifactor Approach to Construct Validation of Mixed-Format Scales

Peer reviewed

Wang, Yan; Kim, Eun Sook; Dedrick, Robert F.; Ferron, John M.; Tan, Tony – Educational and Psychological Measurement, 2018

Wording effects associated with positively and negatively worded items have been found in many scales. Such effects may threaten construct validity and introduce systematic bias in the interpretation of results. A variety of models have been applied to address wording effects, such as the correlated uniqueness model and the correlated traits and…

Descriptors: Test Items, Test Format, Correlation, Construct Validity

Effects of Design Properties on Parameter Estimation in Large-Scale Assessments

Peer reviewed

Hecht, Martin; Weirich, Sebastian; Siegle, Thilo; Frey, Andreas – Educational and Psychological Measurement, 2015

The selection of an appropriate booklet design is an important element of large-scale assessments of student achievement. Two design properties that are typically optimized are the "balance" with respect to the positions the items are presented and with respect to the mutual occurrence of pairs of items in the same booklet. The purpose…

Descriptors: Measurement, Computation, Test Format, Test Items

Item Response Theory Models for Wording Effects in Mixed-Format Scales

Peer reviewed

Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu – Educational and Psychological Measurement, 2015

Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…

Descriptors: Item Response Theory, Test Format, Language Usage, Test Items

Author

Schriesheim, Chester A.	6
Plake, Barbara S.	4
Aiken, Lewis R.	2
Benson, Jeri	2
Dixon, Paul N.	2
Kim, Doyoung	2
Kubinger, Klaus D.	2
Menold, Natalja	2
Raykov, Tenko	2
Savalei, Victoria	2
Trevisan, Michael S.	2
Wang, Wen-Chung	2
Wilcox, Rand R.	2
Zhang, Xijuan	2
Adler, Nurit	1
Alarcon, Odette	1
Andrich, David	1
Ansorge, Charles J.	1
Arnau, Randolph C.	1
Arthur, Winfred, Jr.	1
Baldauf, Richard B., Jr.	1
Benson, Philip G.	1
Betts, Joe	1
Biswas, Dipayan	1

Assessments and Surveys

Rosenberg Self Esteem Scale	3
Trends in International…	3
Raven Advanced Progressive…	2
ACT Assessment	1
Academic Motivation Scale	1
Approaches to Studying…	1
Beck Depression Inventory	1
Bem Sex Role Inventory	1
California Test of Mental…	1
Dimensions of Self Concept	1
Frostig Developmental Test of…	1
General Educational…	1
Leader Behavior Description…	1
Marlowe Crowne Social…	1
Minnesota Multiphasic…	1
Minnesota Satisfaction…	1
Myers Briggs Type Indicator	1
Need for Cognition Scale	1
Peabody Picture Vocabulary…	1
Piers Harris Childrens Self…	1
Program for International…	1
Raven Progressive Matrices	1
Sentence Completion Test	1
Sixteen Personality Factor…	1
Test of English as a Foreign…	1