Search summary: 15 journal articles, all published in Educational and Psychological Measurement (2021-2025; 4 since 2024, 3 in 2025).
Top descriptors: Test Reliability (15), Item Response Theory (6), Test Items (4), Correlation (3), Error of Measurement (3), Foreign Countries (3), Sample Size (3), Scores (3), Test Construction (3), Test Validity (3), Artificial Intelligence (2).
Publication types: Journal Articles (15); Reports - Research (13); Reports - Evaluative (2).
Education levels: Higher Education (2), Postsecondary Education (2), Secondary Education (2), Elementary Secondary Education (1), High Schools (1).
Locations: Mexico (Mexico City) (1), Netherlands (1).
Assessments and surveys: Program for International… (1).
Yongtian Cheng; K. V. Petrides – Educational and Psychological Measurement, 2025
Psychologists are emphasizing the importance of predictive conclusions. Machine learning methods, such as supervised neural networks, have been used in psychological studies as they naturally fit prediction tasks. However, we are concerned about whether neural networks fitted with random datasets (i.e., datasets where there is no relationship…
Descriptors: Psychological Studies, Artificial Intelligence, Cognitive Processes, Predictive Validity
Foster, Robert C. – Educational and Psychological Measurement, 2021
This article presents some equivalent forms of the common Kuder-Richardson Formula 20 and Formula 21 estimators for nondichotomous data belonging to certain other exponential families, such as Poisson count data, exponential data, or geometric counts of trials until failure. Using the generalized framework of Foster (2020), an equation for the reliability…
Descriptors: Test Reliability, Data, Computation, Mathematical Formulas
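For orientation, the classical dichotomous-data forms of these estimators can be sketched as follows. This is a minimal illustration of the standard KR-20 and KR-21 formulas only, not Foster's exponential-family extension; the function names are illustrative.

```python
# KR-20 and KR-21 reliability for dichotomous (0/1) item data.
# KR-20 uses each item's proportion correct; KR-21 assumes equal item
# difficulty and needs only the total-score mean and variance.

def kr20(scores):
    """scores: list of per-person lists of 0/1 item responses."""
    k, n = len(scores[0]), len(scores)
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    # Sum of item variances p_j * (1 - p_j):
    item_var = sum(
        (p := sum(row[j] for row in scores) / n) * (1 - p) for j in range(k)
    )
    return k / (k - 1) * (1 - item_var / var_t)

def kr21(scores):
    """KR-21: a lower bound to KR-20 when item difficulties differ."""
    k, n = len(scores[0]), len(scores)
    totals = [sum(row) for row in scores]
    mu = sum(totals) / n
    var_t = sum((t - mu) ** 2 for t in totals) / n
    return k / (k - 1) * (1 - mu * (k - mu) / (k * var_t))
```

Because KR-21 replaces the per-item difficulties with their average, it never exceeds KR-20 on the same data.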
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2021
The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability…
Descriptors: Test Reliability, Computation, Measures (Individuals), Research Problems
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
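The unidimensional effort-moderated procedure examined here can be sketched in a few lines: responses whose reaction times fall below an item-level threshold are flagged as rapid guesses and excluded before computing the score. This is a hedged simplification (proportion-correct scoring rather than the IRT-based version studied in the article), and the function name and threshold scheme are illustrative.

```python
# Effort-moderated (EM) scoring sketch: drop responses whose reaction
# time falls below the item's rapid-guessing threshold, then score the rest.

def em_score(responses, times, thresholds):
    """responses: 0/1 correctness; times: reaction times in seconds;
    thresholds: per-item RT cutoffs separating rapid guesses from
    effortful responses. Returns proportion correct among effortful
    responses, or None if every response was flagged."""
    kept = [r for r, t, th in zip(responses, times, thresholds) if t >= th]
    if not kept:
        return None  # examinee rapid-guessed on every item
    return sum(kept) / len(kept)
```

With a 2-second threshold on every item, `em_score([1, 0, 1, 1], [10.0, 1.0, 8.0, 0.5], [2.0] * 4)` keeps only the two effortful responses (both correct) and returns 1.0.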
Kroc, Edward; Olvera Astivia, Oscar L. – Educational and Psychological Measurement, 2022
Setting cutoff scores is one of the most common practices when using scales to aid in classification purposes. This process is usually done univariately where each optimal cutoff value is decided sequentially, subscale by subscale. While it is widely known that this process necessarily reduces the probability of "passing" such a test,…
Descriptors: Multivariate Analysis, Cutting Scores, Classification, Measurement
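The shrinkage the abstract alludes to is easy to see in the simplest case: if subscale scores were independent, requiring a pass on every univariate cutoff multiplies the pass rates together. This toy calculation (independence is an assumption, not the article's model) shows why sequential, subscale-by-subscale cutoffs depress the overall passing probability.

```python
# With independent subscales, the probability of passing every
# univariate cutoff is the product of the per-subscale pass rates.

def joint_pass_rate(per_subscale_rates):
    rate = 1.0
    for r in per_subscale_rates:
        rate *= r
    return rate

# Four subscales, each passing 90% of examinees in isolation,
# jointly pass only about 66%:
print(joint_pass_rate([0.9] * 4))  # 0.6561
```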
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Jiang, Zhehan; Shi, Dexin; Distefano, Christine – Educational and Psychological Measurement, 2021
The costs of an objective structured clinical examination (OSCE) are of concern to health profession educators globally. As OSCEs are usually designed under generalizability theory (G-theory) framework, this article proposes a machine-learning-based approach to optimize the costs, while maintaining the minimum required generalizability…
Descriptors: Artificial Intelligence, Generalizability Theory, Objective Tests, Foreign Countries
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
Wyse, Adam E. – Educational and Psychological Measurement, 2021
An essential question when computing test-retest and alternate forms reliability coefficients is how many days there should be between tests. This article uses data from reading and math computerized adaptive tests to explore how the number of days between tests impacts alternate forms reliability coefficients. Results suggest that the highest…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Reliability, Reading Tests
Xiao, Leifeng; Hau, Kit-Tai – Educational and Psychological Measurement, 2023
We examined the performance of coefficient alpha and its potential competitors (ordinal alpha, omega total, Revelle's omega total [omega RT], omega hierarchical [omega h], greatest lower bound [GLB], and coefficient "H") with continuous and discrete data having different types of non-normality. Results showed the estimation bias was…
Descriptors: Statistical Bias, Statistical Analysis, Likert Scales, Statistical Distributions
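Coefficient alpha, the baseline estimator in this comparison, can be computed directly from the item-score matrix; the competitors (omega variants, GLB, coefficient H) require a fitted factor model and are not sketched here. A minimal, assumption-labeled version for any numeric item scores (e.g., Likert responses):

```python
# Cronbach's coefficient alpha: k/(k-1) * (1 - sum of item variances /
# variance of total scores), using population variances throughout.

def cronbach_alpha(data):
    """data: list of per-person lists of numeric item scores."""
    k, n = len(data[0]), len(data)
    totals = [sum(row) for row in data]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    item_var_sum = 0.0
    for j in range(k):
        col = [row[j] for row in data]
        m = sum(col) / n
        item_var_sum += sum((x - m) ** 2 for x in col) / n
    return k / (k - 1) * (1 - item_var_sum / var_t)
```

On dichotomous data this reduces exactly to KR-20, which is one reason alpha is the default reference point for the discrete-data conditions the study examines.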
Ellis, Jules L. – Educational and Psychological Measurement, 2021
This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in time of the student. Based on a classical test theory model, enriched with assumptions on the context, the costs…
Descriptors: Test Length, Models, Error of Measurement, Measurement
Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022
Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…
Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity
Stoevenbelt, Andrea H.; Wicherts, Jelte M.; Flore, Paulette C.; Phillips, Lorraine A. T.; Pietschnig, Jakob; Verschuere, Bruno; Voracek, Martin; Schwabe, Inga – Educational and Psychological Measurement, 2023
When cognitive and educational tests are administered under time limits, tests may become speeded and this may affect the reliability and validity of the resulting test scores. Prior research has shown that time limits may create or enlarge gender gaps in cognitive and academic testing. On average, women complete fewer items than men when a test…
Descriptors: Timed Tests, Gender Differences, Item Response Theory, Correlation