ERIC - Search Results

Showing 1 to 15 of 32 results

Estimation of Heterogeneity Variance Based on a Generalized "Q" Statistic in Meta-Analysis of Log-Odds-Ratio

Peer reviewed

Kulinskaya, Elena; Hoaglin, David C. – Research Synthesis Methods, 2023

For estimation of heterogeneity variance T[superscript 2] in meta-analysis of log-odds-ratio, we derive new mean- and median-unbiased point estimators and new interval estimators based on a generalized Q statistic, Q[subscript F], in which the weights depend on only the studies' effective sample sizes. We compare them with familiar estimators…

Descriptors: Q Methodology, Statistical Analysis, Meta Analysis, Intervals

Model Misspecification and Robustness of Observed-Score Test Equating Using Propensity Scores

Peer reviewed

Wallin, Gabriel; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2023

This study explores the usefulness of covariates on equating test scores from nonequivalent test groups. The covariates are captured by an estimated propensity score, which is used as a proxy for latent ability to balance the test groups. The objective is to assess the sensitivity of the equated scores to various misspecifications in the…

Descriptors: Models, Error of Measurement, Robustness (Statistics), Equated Scores

Interval Estimation of Item Response Probabilities along Studied Latent Dimensions

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Pusic, Martin – Measurement: Interdisciplinary Research and Perspectives, 2021

An interval estimation procedure is discussed that can be used to evaluate the probability of a particular response for a binary or binary scored item at a pre-specified point along an underlying latent continuum. The item is assumed to: (a) be part of a unidimensional multi-component measuring instrument that may contain also polytomous items,…

Descriptors: Item Response Theory, Computation, Probability, Test Items

BIC Extensions for Order-Constrained Model Selection

Peer reviewed

Mulder, J.; Raftery, A. E. – Sociological Methods & Research, 2022

The Schwarz or Bayesian information criterion (BIC) is one of the most widely used tools for model comparison in social science research. The BIC, however, is not suitable for evaluating models with order constraints on the parameters of interest. This article explores two extensions of the BIC for evaluating order-constrained models, one where a…

Descriptors: Models, Social Science Research, Programming Languages, Bayesian Statistics

Examining the Precision of Cut Scores within a Generalizability Theory Framework: A Closer Look at the Item Effect

Peer reviewed

Clauser, Brian E.; Kane, Michael; Clauser, Jerome C. – Journal of Educational Measurement, 2020

An Angoff standard setting study generally yields judgments on a number of items by a number of judges (who may or may not be nested in panels). Variability associated with judges (and possibly panels) contributes error to the resulting cut score. The variability associated with items plays a more complicated role. To the extent that the mean item…

Descriptors: Cutting Scores, Generalization, Decision Making, Standard Setting

A Comparison of Propensity Score Weighting Methods for Evaluating the Effects of Programs with Multiple Versions

Peer reviewed

Leite, Walter L.; Aydin, Burak; Gurel, Sungur – Journal of Experimental Education, 2019

This Monte Carlo simulation study compares methods to estimate the effects of programs with multiple versions when assignment of individuals to program version is not random. These methods use generalized propensity scores, which are predicted probabilities of receiving a particular level of the treatment conditional on covariates, to remove…

Descriptors: Probability, Weighted Scores, Monte Carlo Methods, Statistical Bias

Estimation of Expected Fisher Information for IRT Models

Peer reviewed

Monroe, Scott – Journal of Educational and Behavioral Statistics, 2019

In item response theory (IRT) modeling, the Fisher information matrix is used for numerous inferential procedures such as estimating parameter standard errors, constructing test statistics, and facilitating test scoring. In principle, these procedures may be carried out using either the expected information or the observed information. However, in…

Descriptors: Item Response Theory, Error of Measurement, Scoring, Inferences

Standard Error of Linear Observed-Score Equating for the NEAT Design with Nonnormally Distributed Data

Peer reviewed

Zu, Jiyun; Yuan, Ke-Hai – Journal of Educational Measurement, 2012

In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…

Descriptors: Sample Size, Equated Scores, Test Items, Error of Measurement

Two Studies of Specification Error in Models for Categorical Latent Variables

Peer reviewed

Kaplan, David; Depaoli, Sarah – Structural Equation Modeling: A Multidisciplinary Journal, 2011

This article examines the problem of specification error in 2 models for categorical latent variables; the latent class model and the latent Markov model. Specification error in the latent class model focuses on the impact of incorrectly specifying the number of latent classes of the categorical latent variable on measures of model adequacy as…

Descriptors: Markov Processes, Longitudinal Studies, Probability, Item Response Theory

Testing Mixture Models of Transitive Preference: Comment on Regenwetter, Dana, and Davis-Stober (2011)

Peer reviewed

Birnbaum, Michael H. – Psychological Review, 2011

This article contrasts 2 approaches to analyzing transitivity of preference and other behavioral properties in choice data. The approach of Regenwetter, Dana, and Davis-Stober (2011) assumes that on each choice, a decision maker samples randomly from a mixture of preference orders to determine whether "A" is preferred to "B." In contrast, Birnbaum…

Descriptors: Evidence, Testing, Computation, Probability

Sample Size Determination for Rasch Model Tests

Peer reviewed

Draxler, Clemens – Psychometrika, 2010

This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…

Descriptors: Statistical Analysis, Probability, Sample Size, Error of Measurement

A New Statistic for Evaluating Item Response Theory Models for Ordinal Data. CRESST Report 839

Cai, Li; Monroe, Scott – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014

We propose a new limited-information goodness of fit test statistic C[subscript 2] for ordinal IRT models. The construction of the new statistic lies formally between the M[subscript 2] statistic of Maydeu-Olivares and Joe (2006), which utilizes first and second order marginal probabilities, and the M*[subscript 2] statistic of Cai and Hansen…

Descriptors: Item Response Theory, Models, Goodness of Fit, Probability

Using Ensemble-Based Methods for Directly Estimating Causal Effects: An Investigation of Tree-Based G-Computation

Peer reviewed

Austin, Peter C. – Multivariate Behavioral Research, 2012

Researchers are increasingly using observational or nonrandomized data to estimate causal treatment effects. Essential to the production of high-quality evidence is the ability to reduce or minimize the confounding that frequently occurs in observational studies. When using the potential outcome framework to define causal treatment effects, one…

Descriptors: Computation, Regression (Statistics), Statistical Bias, Error of Measurement

Setting Meaningful Criterion-Reference Cut Scores as an Effective Professional Development

Munyofu, Paul – Performance Improvement Quarterly, 2010

The state of Pennsylvania, like many organizations interested in performance improvement, routinely engages in professional development activities. Educators in this hands-on activity engaged in setting meaningful criterion-referenced cut scores for career and technical education assessments using two methods. The main purposes of this study were…

Descriptors: Standard Setting, Cutting Scores, Professional Development, Vocational Education

A Meta-Meta-Analysis: Empirical Review of Statistical Power, Type I Error Rates, Effect Sizes, and Model Selection of Meta-Analyses Published in Psychology

Peer reviewed

Cafri, Guy; Kromrey, Jeffrey D.; Brannick, Michael T. – Multivariate Behavioral Research, 2010

This article uses meta-analyses published in "Psychological Bulletin" from 1995 to 2005 to describe meta-analyses in psychology, including examination of statistical power, Type I errors resulting from multiple comparisons, and model choice. Retrospective power estimates indicated that univariate categorical and continuous moderators, individual…

Descriptors: Periodicals, Effect Size, Sampling, Psychology
