ERIC - Search Results

Showing all 8 results

Estimating Difference-Score Reliability in Pretest-Posttest Settings

Peer reviewed

Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021

Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…

Descriptors: Test Reliability, Scores, Pretests Posttests, Computation

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models

Peer reviewed

Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020

One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…

Descriptors: Reliability, Probability, Skill Development, Classification

Speed-Accuracy Trade-Off? Not so Fast: Marginal Changes in Speed Have Inconsistent Relationships with Accuracy in Real-World Settings

Peer reviewed

Domingue, Benjamin W.; Kanopka, Klint; Stenhaug, Ben; Sulik, Michael J.; Beverly, Tanesia; Brinkhuis, Matthieu; Circi, Ruhan; Faul, Jessica; Liao, Dandan; McCandliss, Bruce; Obradovic, Jelena; Piech, Chris; Porter, Tenelle; Soland, James; Weeks, Jon; Wise, Steven L.; Yeatman, Jason – Journal of Educational and Behavioral Statistics, 2022

The speed-accuracy trade-off (SAT) suggests that time constraints reduce response accuracy. Its relevance in observational settings--where response time (RT) may not be constrained but respondent speed may still vary--is unclear. Using 29 data sets containing data from cognitive tasks, we use a flexible method for identification of the SAT (which…

Descriptors: Accuracy, Reaction Time, Task Analysis, College Entrance Examinations

Disentangling Person-Dependent and Item-Dependent Causal Effects: Applications of Item Response Theory to the Estimation of Treatment Effect Heterogeneity

Peer reviewed

Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Journal of Educational and Behavioral Statistics, 2025

Analyzing heterogeneous treatment effects (HTEs) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and preintervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…

Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics

A Comparative Study of Online Item Calibration Methods in Multidimensional Computerized Adaptive Testing

Peer reviewed

Chen, Ping – Journal of Educational and Behavioral Statistics, 2017

Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…

Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

Improving Measurement Precision of Hierarchical Latent Traits Using Adaptive Testing

Peer reviewed

Wang, Chun – Journal of Educational and Behavioral Statistics, 2014

Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…

Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing

Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests

Peer reviewed

van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P. – Journal of Educational and Behavioral Statistics, 2006

Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Item Banks
