Ellis, Jules L. – Educational and Psychological Measurement, 2021
This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in units of student time. Based on a classical test theory model, enriched with assumptions on the context, the costs…
Descriptors: Test Length, Models, Error of Measurement, Measurement
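The duration trade-off sketched in this abstract can be made concrete. Below is a minimal numerical sketch, not the article's model: it assumes reliability grows with duration via the Spearman-Brown formula, that error cost is proportional to error variance, and that both costs are in student hours; the weight and reliability constants are illustrative.

```python
import numpy as np

def reliability(t, rho1=0.6):
    """Spearman-Brown: reliability of a test lengthened by factor t,
    where rho1 is the reliability at unit duration."""
    return t * rho1 / (1 + (t - 1) * rho1)

def total_cost(t, error_weight=5.0, rho1=0.6):
    """Illustrative total cost in student hours: error cost (proportional
    to error variance 1 - reliability) plus time spent taking the exam."""
    return error_weight * (1 - reliability(t, rho1)) + t

durations = np.linspace(0.25, 6, 200)      # candidate durations in hours
costs = [total_cost(t) for t in durations]
best = durations[int(np.argmin(costs))]
print(f"cost-minimizing duration under these toy assumptions: {best:.2f} h")
```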
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2018
This note extends the results in the 2016 article by Raykov, Marcoulides, and Li to the case of correlated errors in a set of observed measures subjected to principal component analysis. It is shown that when at least two measures are fallible, the probability is zero for any principal component--and in particular for the first principal…
Descriptors: Factor Analysis, Error of Measurement, Correlation, Reliability
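A quick simulation illustrates the claim that no principal component of fallible measures is error-free. A minimal sketch under assumed parameters (two measures, one common true score, correlated normal errors; all constants illustrative): the first principal component correlates with both error terms.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true = rng.normal(size=n)                  # common true score
# correlated errors on two fallible measures (illustrative covariance)
err = rng.multivariate_normal([0, 0], [[1, 0.4], [0.4, 1]], size=n)
X = np.column_stack([true + err[:, 0], true + err[:, 1]])

# first principal component of the two observed measures
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]

for j in range(2):
    r = np.corrcoef(pc1, err[:, j])[0, 1]
    print(f"corr(PC1, error {j + 1}) = {r:.3f}")  # nonzero: PC1 carries error
```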
Conger, Anthony J. – Educational and Psychological Measurement, 2017
Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…
Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis
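The marginal-frequency effect the abstract demonstrates is easy to reproduce. A minimal sketch with toy contingency tables (counts are invented): both tables show 90% observed agreement, but skewed category marginals raise chance agreement and depress kappa.

```python
import numpy as np

def cohens_kappa(table):
    """Cohen's kappa from a rater-by-rater contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_obs = np.trace(table) / n                       # observed agreement
    p_exp = (table.sum(axis=1) @ table.sum(axis=0)) / n**2  # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

# same 90% observed agreement, different category marginals (toy data)
balanced = [[45, 5], [5, 45]]   # marginals near 50/50
skewed   = [[85, 5], [5, 5]]    # one category dominates
print(f"balanced marginals: kappa = {cohens_kappa(balanced):.3f}")
print(f"skewed marginals:   kappa = {cohens_kappa(skewed):.3f}")
```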
Trafimow, David – Educational and Psychological Measurement, 2017
There has been much controversy over the null hypothesis significance testing procedure, with much of the criticism centered on the problem of inverse inference. Specifically, p gives the probability of the finding (or one more extreme) given the null hypothesis, whereas the null hypothesis significance testing procedure involves drawing a…
Descriptors: Statistical Inference, Hypothesis Testing, Probability, Intervals
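The inverse-inference problem can be shown with a one-line application of Bayes' theorem. In this toy computation (alpha, power, and prior are illustrative, not from the article), the probability of the null given a significant result is far from the significance level:

```python
# Inverse-inference gap: the p-value conditions on H0, but researchers want
# P(H0 | significant result). A toy Bayes computation (numbers illustrative).
alpha = 0.05        # P(significant | H0)
power = 0.50        # P(significant | H1)
prior_h0 = 0.80     # prior probability that the null is true

p_sig = alpha * prior_h0 + power * (1 - prior_h0)
p_h0_given_sig = alpha * prior_h0 / p_sig
print(f"P(H0 | significant) = {p_h0_given_sig:.3f}  (not 0.05)")
```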
Raykov, Tenko – Educational and Psychological Measurement, 2012
A latent variable modeling approach that permits estimation of propensity scores in observational studies containing fallible independent variables is outlined, with subsequent examination of the treatment effect. When at least one covariate is measured with error, it is indicated that the conventional propensity score need not possess the desirable…
Descriptors: Computation, Probability, Error of Measurement, Observation
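The core point, that conditioning on a fallible covariate leaves residual confounding, can be sketched by simulation. Below, simple stratification stands in for propensity-score adjustment, and all generating parameters are illustrative; the true treatment effect is zero, yet adjusting on the error-laden covariate yields a biased estimate.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
x = rng.normal(size=n)                      # true confounder
w = x + rng.normal(scale=1.0, size=n)       # fallible measurement of x
treat = (rng.logistic(size=n) < x).astype(float)  # treatment depends on true x
y = 2.0 * x + 0.0 * treat + rng.normal(size=n)    # true treatment effect is 0

def stratified_effect(covariate, bins=20):
    """Naive effect estimate after stratifying on a covariate:
    average within-stratum difference in outcome means."""
    edges = np.quantile(covariate, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.digitize(covariate, edges[1:-1]), 0, bins - 1)
    diffs = []
    for b in range(bins):
        m = idx == b
        if treat[m].var() > 0:              # both groups present in stratum
            diffs.append(y[m][treat[m] == 1].mean() - y[m][treat[m] == 0].mean())
    return np.mean(diffs)

print(f"stratifying on true x:     {stratified_effect(x):+.3f}  (close to 0)")
print(f"stratifying on fallible w: {stratified_effect(w):+.3f}  (biased)")
```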
Köhler, Carmen; Pohl, Steffi; Carstensen, Claus H. – Educational and Psychological Measurement, 2015
When competence tests are administered, subjects frequently omit items. These missing responses pose a threat to correctly estimating the proficiency level. Newer model-based approaches aim to take nonignorable missing data processes into account by incorporating a latent missing propensity into the measurement model. Two assumptions are typically…
Descriptors: Competence, Tests, Evaluation Methods, Adults
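The threat from nonignorable omissions is easy to demonstrate. In this illustrative simulation (the omission mechanism is assumed, not the article's model), items that are hard for a given person are omitted more often, so ignoring omissions inflates the proficiency estimate while scoring them as wrong deflates it.

```python
import numpy as np

rng = np.random.default_rng(3)
n_persons, n_items = 50_000, 40
theta = rng.normal(size=n_persons)          # proficiency
b = rng.normal(size=n_items)                # item difficulty

logit = theta[:, None] - b[None, :]         # Rasch-type response model
resp = (rng.random((n_persons, n_items)) < 1 / (1 + np.exp(-logit))).astype(float)

# nonignorable omission: items that are hard *for that person* get omitted
# more often (an illustrative link, not the article's model)
p_omit = 1 / (1 + np.exp(logit + 1.0))
omit = rng.random((n_persons, n_items)) < p_omit

ignored = np.where(omit, np.nan, resp)
print(f"true mean proportion correct: {resp.mean():.3f}")
print(f"omissions ignored:            {np.nanmean(ignored):.3f}  (too high)")
print(f"omissions scored as wrong:    {np.where(omit, 0.0, resp).mean():.3f}  (too low)")
```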
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
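For contrast with the latent-variable approach, the conventional observed-group method the abstract mentions can be sketched directly. A minimal Mantel-Haenszel common odds ratio across matching strata, with invented counts:

```python
import numpy as np

def mantel_haenszel_or(tables):
    """MH common odds-ratio estimate across 2x2 tables, one per matching
    stratum: rows = reference/focal group, cols = correct/incorrect."""
    num = den = 0.0
    for t in tables:
        (a, b), (c, d) = t
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# toy strata (e.g., total-score levels); counts are illustrative
strata = [
    [[40, 10], [30, 20]],
    [[60, 20], [45, 35]],
    [[80, 40], [60, 60]],
]
or_mh = mantel_haenszel_or(strata)
print(f"MH odds ratio = {or_mh:.2f}  (1.0 = no DIF; here the item "
      f"favors the reference group)")
```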
Le, Huy; Marcus, Justin – Educational and Psychological Measurement, 2012
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
Descriptors: Monte Carlo Methods, Probability, Mathematical Concepts, Effect Size
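The simulation design can be sketched without reproducing the OOR formula itself (it is defined in the article). The toy Monte Carlo below fits a logistic regression by Newton-Raphson and shows the base-rate dependence of two common R-square analogs, the property the OOR is meant to avoid; all generating values are illustrative.

```python
import numpy as np

def fit_logistic(x, y, iters=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns coefs, loglik."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        beta += np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    return beta, np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))

rng = np.random.default_rng(11)
n = 20_000
x = rng.normal(size=n)
for intercept in (0.0, -2.5):               # same slope, shifted base rate
    y = (rng.logistic(size=n) < intercept + 1.0 * x).astype(float)
    _, ll1 = fit_logistic(x, y)
    p0 = y.mean()
    ll0 = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))  # null model
    mcfadden = 1 - ll1 / ll0
    cox_snell = 1 - np.exp(2 * (ll0 - ll1) / n)
    print(f"base rate {p0:.2f}: McFadden R2 = {mcfadden:.3f}, "
          f"Cox-Snell R2 = {cox_snell:.3f}")
```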
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
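A root integrated squared error (RISE) index can be sketched in simplified form. In this toy version, a kernel-smoothed empirical curve stands in for the nonparametric ICC, the marginal latent density replaces the article's posterior weighting, and the parametric 2PL is deliberately misspecified against 3PL-generated data; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
theta = rng.normal(size=n)

def icc_2pl(th, a=1.2, b=0.0):
    return 1 / (1 + np.exp(-a * (th - b)))

# responses generated from a 3PL (guessing c = 0.2); the 2PL is misspecified
p_true = 0.2 + 0.8 * icc_2pl(theta)
resp = (rng.random(n) < p_true).astype(float)

grid = np.linspace(-3, 3, 61)
h = 0.3                                           # kernel bandwidth (illustrative)
K = np.exp(-0.5 * ((grid[:, None] - theta[None, :]) / h) ** 2)
icc_np = (K @ resp) / K.sum(axis=1)               # kernel-smoothed empirical ICC

w = np.exp(-0.5 * grid**2) / np.sqrt(2 * np.pi)   # standard normal weight
dx = grid[1] - grid[0]
rise = np.sqrt(np.sum(w * (icc_np - icc_2pl(grid)) ** 2) * dx)
print(f"RISE between nonparametric and misspecified 2PL curves: {rise:.3f}")
```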
Wang, Wen-Chung; Liu, Chih-Yu – Educational and Psychological Measurement, 2007
In this study, the authors develop a generalized multilevel facets model, which is not only a multilevel and two-parameter generalization of the facets model, but also a multilevel and facet generalization of the generalized partial credit model. Because the new model is formulated within a framework of nonlinear mixed models, no efforts are…
Descriptors: Generalization, Item Response Theory, Models, Equipment
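The response function at the heart of this model family can be sketched in a single-level, simplified form: a generalized partial credit model whose step logits include a rater (facet) parameter. The parameters below are invented for illustration.

```python
import numpy as np

def gpcm_facets_probs(theta, a, b_steps, rater_severity):
    """Category probabilities for a generalized partial credit model with an
    added rater facet: step logits a * (theta - b_k - rater_severity).
    A simplified, single-level sketch of the facets-model family."""
    steps = a * (theta - np.asarray(b_steps) - rater_severity)
    cum = np.concatenate([[0.0], np.cumsum(steps)])   # category 0 baseline
    expc = np.exp(cum - cum.max())                    # stable softmax
    return expc / expc.sum()

# probabilities for one examinee, one item, two raters (toy parameters)
for severity in (0.0, 0.8):
    p = gpcm_facets_probs(theta=0.5, a=1.0, b_steps=[-1.0, 0.0, 1.0],
                          rater_severity=severity)
    print(f"rater severity {severity:+.1f}: P(categories 0..3) = {np.round(p, 3)}")
```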

Zimmerman, Donald W. – Educational and Psychological Measurement, 1985
A computer program simulated guessing on multiple-choice test items and calculated deviation IQs from observed scores that contained a guessing component. Extensive variability in deviation IQs due entirely to chance was found. (Author/LMO)
Descriptors: Computer Simulation, Error of Measurement, Guessing (Tests), Intelligence Quotient
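A sketch in the spirit of that simulation: examinees answer every item by pure chance on four-option items, and observed scores are converted to deviation IQs (mean 100, SD 15). Standardizing within the simulated group is a simplification (real norms come from a reference sample); even so, chance alone produces a wide spread of "IQs".

```python
import numpy as np

rng = np.random.default_rng(8)
n_examinees, n_items, n_options = 10_000, 60, 4

# pure-chance responding: every item answered by guessing (illustrative
# worst case; the article mixes guessing with partial knowledge)
scores = rng.binomial(n_items, 1 / n_options, size=n_examinees)

# deviation IQ: standardize observed scores, rescale to mean 100, SD 15
z = (scores - scores.mean()) / scores.std()
iq = 100 + 15 * z
print(f"IQ range produced by chance alone: {iq.min():.0f} to {iq.max():.0f}")
print(f"share of 'IQs' above 115 from guessing only: {(iq > 115).mean():.1%}")
```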