ERIC - Search Results

Showing 1 to 15 of 122 results

Optimizing Diagnostic Classification Models Application Considering Real-Life Constraints

Peer reviewed

Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023

This article provides a process for carefully evaluating whether a content domain is suitable for diagnostic classification models (DCMs), optimized steps for constructing a test blueprint for applying DCMs, and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…

Descriptors: Classification, Models, Science Tests, Physics

A Multidimensional Partially Compensatory Response Time Model on Basis of the Log-Normal Distribution

Peer reviewed

Jochen Ranger; Christoph König; Benjamin W. Domingue; Jörg-Tobias Kuhn; Andreas Frey – Journal of Educational and Behavioral Statistics, 2024

In the existing multidimensional extensions of the log-normal response time (LNRT) model, the log response times are decomposed into a linear combination of several latent traits. These models are fully compensatory as low levels on traits can be counterbalanced by high levels on other traits. We propose an alternative multidimensional extension…

Descriptors: Models, Statistical Distributions, Item Response Theory, Response Rates (Questionnaires)

What Is Actually Equated in "Test Equating"? A Didactic Note

Peer reviewed

van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2022

The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. The definition is in contrast with Lord's foundational paper which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions…

Descriptors: Equated Scores, Test Items, Scores, Probability

Deep Learning Imputation for Asymmetric and Incomplete Likert-Type Items

Peer reviewed

Zachary K. Collier; Minji Kong; Olushola Soyoye; Kamal Chawla; Ann M. Aviles; Yasser Payne – Journal of Educational and Behavioral Statistics, 2024

Asymmetric Likert-type items in research studies can present several challenges in data analysis, particularly concerning missing data. These items are often characterized by a skewed scaling, where either there is no neutral response option or an unequal number of possible positive and negative responses. The use of conventional techniques, such…

Descriptors: Likert Scales, Test Items, Item Analysis, Evaluation Methods

A Psychometric Framework for Evaluating Fairness in Algorithmic Decision Making: Differential Algorithmic Functioning

Peer reviewed

Youmi Suk; Kyung T. Han – Journal of Educational and Behavioral Statistics, 2024

As algorithmic decision making is increasingly deployed in every walk of life, many researchers have raised concerns about fairness-related bias from such algorithms. But there is little research on harnessing psychometric methods to uncover potential discriminatory bias inside decision-making algorithms. The main goal of this article is to…

Descriptors: Psychometrics, Ethics, Decision Making, Algorithms

Analyzing Polytomous Test Data: A Comparison between an Information-Based IRT Model and the Generalized Partial Credit Model

Peer reviewed

Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024

Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…

Descriptors: Item Response Theory, Test Items, Models, Scoring

Artificial Intelligence and Educational Measurement: Opportunities and Threats

Peer reviewed

Andrew D. Ho – Journal of Educational and Behavioral Statistics, 2024

I review opportunities and threats that widely accessible Artificial Intelligence (AI)-powered services present for educational statistics and measurement. Algorithmic and computational advances continue to improve approaches to item generation, scale maintenance, test security, test scoring, and score reporting. Predictable misuses of AI for…

Descriptors: Artificial Intelligence, Measurement, Educational Assessment, Technology Uses in Education

Utilizing Real-Time Test Data to Solve Attenuation Paradox in Computerized Adaptive Testing to Enhance Optimal Design

Peer reviewed

Jyun-Hong Chen; Hsiu-Yi Chao – Journal of Educational and Behavioral Statistics, 2024

To solve the attenuation paradox in computerized adaptive testing (CAT), this study proposes an item selection method, the integer programming approach based on real-time test data (IPRD), to improve test efficiency. The IPRD method turns information regarding the ability distribution of the population from real-time test data into feasible test…

Descriptors: Data Use, Computer Assisted Testing, Adaptive Testing, Design

Extending an Identified Four-Parameter IRT Model: The Confirmatory Set-4PNO Model

Peer reviewed

Justin L. Kern – Journal of Educational and Behavioral Statistics, 2024

Given the frequent presence of slipping and guessing in item responses, models for the inclusion of their effects are highly important. Unfortunately, the most common model for their inclusion, the four-parameter item response theory model, potentially has severe deficiencies related to its possible unidentifiability. With this issue in mind, the…

Descriptors: Item Response Theory, Models, Bayesian Statistics, Generalization

Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index

Peer reviewed

Tan, Qingrong; Cai, Yan; Luo, Fen; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2023

To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item…

Descriptors: Cognitive Tests, Computer Assisted Testing, Adaptive Testing, Accuracy

Generalizing beyond the Test: Permutation-Based Profile Analysis for Explaining DIF Using Item Features

Peer reviewed

Maria Bolsinova; Jesper Tijmstra; Leslie Rutkowski; David Rutkowski – Journal of Educational and Behavioral Statistics, 2024

Profile analysis is one of the main tools for studying whether differential item functioning can be related to specific features of test items. While relevant, profile analysis in its current form has two restrictions that limit its usefulness in practice: It assumes that all test items have equal discrimination parameters, and it does not test…

Descriptors: Test Items, Item Analysis, Generalizability Theory, Achievement Tests

On the Generalized S-X²-Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization

Peer reviewed

Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022

The generalized S-X²-test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X²-test…

Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory

Testing Differential Item Functioning without Predefined Anchor Items Using Robust Regression

Peer reviewed

Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022

Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…

Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)

A Randomization P-Value Test for Detecting Copying on Multiple-Choice Exams

Peer reviewed

Lang, Joseph B. – Journal of Educational and Behavioral Statistics, 2023

This article is concerned with the statistical detection of copying on multiple-choice exams. As an alternative to existing permutation- and model-based copy-detection approaches, a simple randomization p-value (RP) test is proposed. The RP test, which is based on an intuitive match-score statistic, makes no assumptions about the distribution of…

Descriptors: Identification, Cheating, Multiple Choice Tests, Item Response Theory

Detecting Item Preknowledge Using Revisits with Speed and Accuracy

Peer reviewed

Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023

Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as…

Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time
