ERIC - Search Results

Showing all 14 results

An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models

Peer reviewed

Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…

Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
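As a rough illustration of four of the indices named in the abstract above, the sketch below computes AIC, AICc, BIC, and CAIC from a model's log-likelihood using their standard textbook definitions. It is not the study's code: the function name `information_criteria`, the two candidate models, and all numeric values are hypothetical, chosen only to show that the indices can disagree about the number of latent classes.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, AICc, BIC, and CAIC (standard definitions; lower is better)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc requires n_obs > n_params + 1")
    deviance = -2.0 * log_likelihood
    aic = deviance + 2 * n_params
    # Small-sample correction added to AIC.
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = deviance + n_params * math.log(n_obs)
    # Consistent AIC: BIC with an extra penalty of one unit per parameter.
    caic = deviance + n_params * (math.log(n_obs) + 1)
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "CAIC": caic}


# Hypothetical fits of a one-class and a two-class mixture model (made-up numbers).
candidates = {
    "one_class": information_criteria(log_likelihood=-5200.0, n_params=20, n_obs=1000),
    "two_class": information_criteria(log_likelihood=-5150.0, n_params=41, n_obs=1000),
}

# For each index, pick the candidate model with the smallest value.
selected = {
    index: min(candidates, key=lambda name: candidates[name][index])
    for index in ("AIC", "AICc", "BIC", "CAIC")
}
print(selected)
```

With these made-up inputs, the more heavily penalized indices favor the smaller model while AIC favors the larger one, which is the kind of disagreement a simulation study of fit indices is designed to examine.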

Addressing Uncodable Behaviors: A Bayesian Ordinal Mixture Model Applied to a Mathematics Learning Trajectory Teaching Experiment

Peer reviewed

Pavel Chernyavskiy; Traci S. Kutaka; Carson Keeter; Julie Sarama; Douglas Clements – Grantee Submission, 2024

When researchers code behavior that is undetectable or falls outside of the validated ordinal scale, the resultant outcomes often suffer from informative missingness. Incorrect analysis of such data can lead to biased arguments around efficacy and effectiveness in the context of experimental and intervention research. Here, we detail a new…

Descriptors: Bayesian Statistics, Mathematics Instruction, Learning Trajectories, Item Response Theory

A Short Note on Obtaining Point Estimates of the IRT Ability Parameter with MCMC Estimation in Mplus: How Many Plausible Values Are Needed?

Peer reviewed

Luo, Yong; Dimitrov, Dimiter M. – Educational and Psychological Measurement, 2019

Plausible values can be used to either estimate population-level statistics or compute point estimates of latent variables. While it is well known that five plausible values are usually sufficient for accurate estimation of population-level statistics in large-scale surveys, the minimum number of plausible values needed to obtain accurate latent…

Descriptors: Item Response Theory, Monte Carlo Methods, Markov Processes, Outcome Measures

Person-Fit Statistics for Joint Models for Accuracy and Speed

Peer reviewed

Fox, Jean-Paul; Marianti, Sukaesi – Journal of Educational Measurement, 2017

Response accuracy and response time data can be analyzed with a joint model to measure ability and speed of working, while accounting for relationships between item and person characteristics. In this study, person-fit statistics are proposed for joint models to detect aberrant response accuracy and/or response time patterns. The person-fit tests…

Descriptors: Accuracy, Reaction Time, Statistics, Test Items

Detecting Differential Item Discrimination (DID) and the Consequences of Ignoring DID in Multilevel Item Response Models

Peer reviewed

Lee, Woo-yeol; Cho, Sun-Joo – Journal of Educational Measurement, 2017

Cross-level invariance in a multilevel item response model can be investigated by testing whether the within-level item discriminations are equal to the between-level item discriminations. Testing the cross-level invariance assumption is important to understand constructs in multilevel data. However, in most multilevel item response model…

Descriptors: Test Items, Item Response Theory, Item Analysis, Simulation

Application of the IRT and TRT Models to a Reading Comprehension Test

Kim, Weon H. – ProQuest LLC, 2017

The purpose of the present study is to apply the item response theory (IRT) and testlet response theory (TRT) models to a reading comprehension test. This study applied the TRT models and the traditional IRT model to a seventh-grade reading comprehension test (n = 8,815) with eight testlets. These three models were compared to determine the best…

Descriptors: Item Response Theory, Test Items, Correlation, Reading Tests

A Comparative Study of Online Item Calibration Methods in Multidimensional Computerized Adaptive Testing

Peer reviewed

Chen, Ping – Journal of Educational and Behavioral Statistics, 2017

Calibration of new items online has been an important topic in item replenishment for multidimensional computerized adaptive testing (MCAT). Several online calibration methods have been proposed for MCAT, such as multidimensional "one expectation-maximization (EM) cycle" (M-OEM) and multidimensional "multiple EM cycles"…

Descriptors: Test Items, Item Response Theory, Test Construction, Adaptive Testing

The Performance of the Linear Logistic Test Model When the Q-Matrix Is Misspecified: A Simulation Study

MacDonald, George T. – ProQuest LLC, 2014

A simulation study was conducted to explore the performance of the linear logistic test model (LLTM) when the relationships between items and cognitive components were misspecified. Factors manipulated included percent of misspecification (0%, 1%, 5%, 10%, and 15%), form of misspecification (under-specification, balanced misspecification, and…

Descriptors: Simulation, Item Response Theory, Models, Test Items
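To make the idea of Q-matrix misspecification concrete, the sketch below uses the usual LLTM decomposition, in which each item difficulty is a weighted sum of the cognitive components the Q-matrix assigns to that item, and then drops one entry to mimic under-specification. This is a minimal illustration, not the dissertation's simulation design: the Q-matrix, the component weights `eta`, and the helper names are all hypothetical.

```python
import math


def lltm_difficulties(q_matrix, eta):
    """Item difficulties implied by the LLTM: beta_i = sum_k q_ik * eta_k."""
    return [sum(q * e for q, e in zip(row, eta)) for row in q_matrix]


def rasch_probability(theta, beta):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical Q-matrix: 3 items (rows) by 3 cognitive components (columns).
true_q = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]
eta = [0.5, 1.0, -0.25]  # hypothetical component weights

# Under-specification: item 2 wrongly omits its second component.
misspecified_q = [row[:] for row in true_q]
misspecified_q[1][1] = 0

true_beta = lltm_difficulties(true_q, eta)
mis_beta = lltm_difficulties(misspecified_q, eta)

# Compare the implied success probability for an examinee of average ability.
for item, (b_true, b_mis) in enumerate(zip(true_beta, mis_beta), start=1):
    print(item, b_true, b_mis,
          round(rasch_probability(0.0, b_true), 3),
          round(rasch_probability(0.0, b_mis), 3))
```

Only the item whose Q-matrix row was altered changes its implied difficulty here, because the weights are held fixed. In an actual estimation the weights would be re-estimated under the wrong Q-matrix, so the distortion could spread to other items.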

Dealing with Omitted and Not-Reached Items in Competence Tests: Evaluating Approaches Accounting for Missing Responses in Item Response Theory Models

Peer reviewed

Pohl, Steffi; Gräfe, Linda; Rose, Norman – Educational and Psychological Measurement, 2014

Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…

Descriptors: Test Items, Achievement Tests, Item Response Theory, Models

Interpretation of the Three-Parameter Testlet Response Model and Information Function

Peer reviewed

Ip, Edward H. – Applied Psychological Measurement, 2010

The testlet response model is designed for handling items that are clustered, such as those embedded within the same reading passage. Although the testlet is a powerful tool for handling item clusters in educational and psychological testing, the interpretations of its item parameters, the conditional correlation between item pairs, and the…

Descriptors: Item Response Theory, Models, Test Items, Correlation

Diagnosing Examinees' Attributes-Mastery Using the Bayesian Inference for Binomial Proportion: A New Method for Cognitive Diagnostic Assessment

Kim, Hyun Seok John – ProQuest LLC, 2011

Cognitive diagnostic assessment (CDA) is a new theoretical framework for psychological and educational testing that is designed to provide detailed information about examinees' strengths and weaknesses in specific knowledge structures and processing skills. During the last three decades, more than a dozen psychometric models have been developed…

Descriptors: Cognitive Measurement, Diagnostic Tests, Bayesian Statistics, Statistical Inference

Comparing Future Teachers' Beliefs across Countries: Approximate Measurement Invariance with Bayesian Elastic Constraints for Local Item Dependence and Differential Item Functioning

Peer reviewed

Braeken, Johan; Blömeke, Sigrid – Assessment & Evaluation in Higher Education, 2016

Using data from the international Teacher Education and Development Study: Learning to Teach Mathematics (TEDS-M), the measurement equivalence of teachers' beliefs across countries is investigated for the case of "mathematics-as-a-fixed-ability". Measurement equivalence is a crucial topic in all international large-scale assessments and…

Descriptors: Comparative Analysis, Bayesian Statistics, Test Bias, Teacher Education

The Respective Advantages and Disadvantages of Different Ways of Measuring the Instructional Sensitivity of Reading Comprehension Test Items.

Perkins, Kyle – 1987

In this paper four classes of procedures for measuring the instructional sensitivity of reading comprehension test items are reviewed. True experimental designs are not recommended because some of the most important reading comprehension variables do not lend themselves to experimental manipulation. "Ex post facto" factorial designs are…

Descriptors: Bayesian Statistics, Correlation, Elementary Secondary Education, Evaluation Methods

Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17

Peer reviewed

Sinharay, Sandip – ETS Research Report Series, 2004

Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…

Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit
