Showing 1 to 15 of 24 results
Peer reviewed
Roy Levy; Daniel McNeish – Journal of Educational and Behavioral Statistics, 2025
Research in education and behavioral sciences often involves the use of latent variable models in which latent variables are related to indicators as well as to covariates or outcomes. Such models are subject to interpretational confounding, which occurs when fitting the model with covariates or outcomes alters the results for the measurement model. This has…
Descriptors: Models, Statistical Analysis, Measurement, Data Interpretation
Sinharay, Sandip; Johnson, Matthew S. – Journal of Educational and Behavioral Statistics, 2021
Score differencing is one of the six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves testing the null hypothesis that an examinee's performance is similar across two item sets against the alternative hypothesis that performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
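The comparison the abstract describes can be illustrated with a plain frequentist two-proportion test (a minimal sketch, not the Bayesian approach the authors develop; the function name and toy counts are hypothetical):

```python
import math

def score_difference_z(correct_a, n_a, correct_b, n_b):
    """Two-proportion z-test: is performance on item set B better than on set A?

    Returns the z statistic and the one-sided (upper-tail) p-value.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail p-value from the standard normal survival function
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value
```

A significant upper-tail z flags an examinee whose performance on one item set is implausibly better than on the other under the null of similar performance.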
Peer reviewed
Varas, Inés M.; González, Jorge; Quintana, Fernando A. – Journal of Educational and Behavioral Statistics, 2020
Equating is a family of statistical models and methods used to adjust scores on different test forms so that the scores are comparable and can be used interchangeably. Equated scores are obtained by estimating the equating transformation function, which maps the scores on the scale of one test form into their equivalents on the scale of the other. All the…
Descriptors: Bayesian Statistics, Nonparametric Statistics, Equated Scores, Statistical Analysis
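The equipercentile idea behind equating can be sketched with empirical CDFs: find a score's percentile rank on form X, then return the form-Y quantile at that rank (an illustrative sketch only; the function name is made up, and practical equating uses smoothed score distributions and careful tie handling):

```python
import numpy as np

def equipercentile_equate(scores_x, scores_y, x):
    """Map score x on form X to the form-Y score with the same percentile rank."""
    scores_x = np.sort(np.asarray(scores_x))
    # Percentile rank of x on form X: proportion of scores at or below x
    rank = np.searchsorted(scores_x, x, side="right") / len(scores_x)
    # The form-Y quantile at that rank is the equated score
    return float(np.quantile(np.asarray(scores_y), rank))
```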
Peer reviewed
Casabianca, Jodi M.; Lewis, Charles – Journal of Educational and Behavioral Statistics, 2018
The null hypothesis test used in differential item functioning (DIF) detection tests for a subgroup difference in item-level performance--if the null hypothesis of "no DIF" is rejected, the item is flagged for DIF. Conversely, an item is kept in the test form if there is insufficient evidence of DIF. We present frequentist and empirical…
Descriptors: Test Bias, Hypothesis Testing, Bayesian Statistics, Statistical Analysis
Peer reviewed
Liu, Yang; Yang, Ji Seung – Journal of Educational and Behavioral Statistics, 2018
The uncertainty arising from item parameter estimation is often not negligible and must be accounted for when calculating latent variable (LV) scores in item response theory (IRT). This is particularly so when the calibration sample size is limited and/or the calibration IRT model is complex. In the current work, we treat two-stage IRT scoring as a…
Descriptors: Intervals, Scores, Item Response Theory, Bayesian Statistics
Peer reviewed
Leckie, George – Journal of Educational and Behavioral Statistics, 2018
The traditional approach to estimating the consistency of school effects across subject areas and the stability of school effects across time is to fit separate value-added multilevel models to each subject or cohort and to correlate the resulting empirical Bayes predictions. We show that this gives biased correlations and these biases cannot be…
Descriptors: Value Added Models, Reliability, Statistical Bias, Computation
Peer reviewed
Kaplan, David; Su, Dan – Journal of Educational and Behavioral Statistics, 2016
This article presents findings on the consequences of matrix sampling of context questionnaires for the generation of plausible values in large-scale assessments. Three studies are conducted. Study 1 uses data from PISA 2012 to examine several different forms of missing data imputation within the chained equations framework: predictive mean…
Descriptors: Sampling, Questionnaires, Measurement, International Assessment
Peer reviewed
Magis, David; Tuerlinckx, Francis; De Boeck, Paul – Journal of Educational and Behavioral Statistics, 2015
This article proposes a novel approach to detect differential item functioning (DIF) among dichotomously scored items. Unlike standard DIF methods that perform an item-by-item analysis, we propose the "LR lasso DIF method": a logistic regression (LR) model is formulated for all item responses. The model contains item-specific intercepts,…
Descriptors: Test Bias, Test Items, Regression (Statistics), Scores
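The core idea, penalizing group-by-item interaction terms so that only items with real DIF keep nonzero coefficients, can be sketched with a generic L1-penalized logistic regression fit by proximal gradient descent (a from-scratch illustration under assumed settings, not the authors' estimator; the function name and tuning values are hypothetical):

```python
import numpy as np

def lasso_logistic(X, y, penalty_mask, lam=0.1, step=1.0, iters=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    penalty_mask is 1 for coefficients subject to the lasso penalty (the
    group-by-item DIF effects) and 0 for unpenalized terms (item
    intercepts and the ability proxy). Coefficients soft-thresholded to
    exactly zero correspond to items showing no evidence of DIF.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        # Gradient of the average logistic log-loss
        grad = X.T @ (1.0 / (1.0 + np.exp(-(X @ w))) - y) / n
        w = w - step * grad
        # Proximal step: soft-threshold only the penalized coefficients
        shrink = step * lam * penalty_mask
        w = np.sign(w) * np.maximum(np.abs(w) - shrink, 0.0)
    return w
```

With a moderate penalty, a truly null DIF coefficient lands at exactly zero while a real effect survives shrinkage, which is what makes the lasso usable as a flagging device.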
Peer reviewed
Liang, Longjuan; Browne, Michael W. – Journal of Educational and Behavioral Statistics, 2015
If standard two-parameter item response functions are employed in the analysis of a test with some newly constructed items, it can be expected that, for some items, the item response function (IRF) will not fit the data well. This lack of fit can also occur when standard IRFs are fitted to personality or psychopathology items. When investigating…
Descriptors: Item Response Theory, Statistical Analysis, Goodness of Fit, Bayesian Statistics
Peer reviewed
Chung, Yeojin; Gelman, Andrew; Rabe-Hesketh, Sophia; Liu, Jingchen; Dorie, Vincent – Journal of Educational and Behavioral Statistics, 2015
When fitting hierarchical regression models, maximum likelihood (ML) estimation has computational (and, for some users, philosophical) advantages compared to full Bayesian inference, but when the number of groups is small, estimates of the covariance matrix (S) of group-level varying coefficients are often degenerate. One can do better, even from…
Descriptors: Regression (Statistics), Hierarchical Linear Modeling, Bayesian Statistics, Statistical Inference
Peer reviewed
McNeish, Daniel M. – Journal of Educational and Behavioral Statistics, 2016
Mixed-effects models (MEMs) and latent growth models (LGMs) are often considered interchangeable save the discipline-specific nomenclature. Software implementations of these models, however, are not interchangeable, particularly with small sample sizes. Restricted maximum likelihood estimation that mitigates small sample bias in MEMs has not been…
Descriptors: Models, Statistical Analysis, Hierarchical Linear Modeling, Sample Size
Peer reviewed
Lockwood, J. R.; McCaffrey, Daniel F. – Journal of Educational and Behavioral Statistics, 2014
A common strategy for estimating treatment effects in observational studies using individual student-level data is analysis of covariance (ANCOVA) or hierarchical variants of it, in which outcomes (often standardized test scores) are regressed on pretreatment test scores, other student characteristics, and treatment group indicators. Measurement…
Descriptors: Error of Measurement, Scores, Statistical Analysis, Computation
Peer reviewed
Zwick, Rebecca; Ye, Lei; Isham, Steven – Journal of Educational and Behavioral Statistics, 2012
This study demonstrates how the stability of Mantel-Haenszel (MH) DIF (differential item functioning) methods can be improved by integrating information across multiple test administrations using Bayesian updating (BU). The authors conducted a simulation that showed that this approach, which is based on earlier work by Zwick, Thayer, and Lewis,…
Descriptors: Test Bias, Computation, Statistical Analysis, Bayesian Statistics
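The Bayesian-updating step the abstract describes can be sketched as a conjugate normal update of a DIF-size parameter, folding in each administration's MH D-DIF estimate together with its sampling variance (a schematic sketch; the numbers are made up and the authors' procedure involves additional machinery):

```python
def bayes_update(prior_mean, prior_var, obs, obs_var):
    """Conjugate normal-normal update of a DIF-size parameter with one
    administration's MH D-DIF estimate (obs) and its sampling variance."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

# Fold in several administrations sequentially,
# starting from a diffuse prior centered at "no DIF"
mean, var = 0.0, 1.0
for d, v in [(0.8, 0.25), (0.6, 0.25), (0.9, 0.25)]:
    mean, var = bayes_update(mean, var, d, v)
```

Each update shrinks the posterior variance, which is the source of the improved stability over single-administration MH DIF statistics.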
Peer reviewed
Wang, Chun; Fan, Zhewen; Chang, Hua-Hua; Douglas, Jeffrey A. – Journal of Educational and Behavioral Statistics, 2013
The item response times (RTs) collected from computerized testing represent an underutilized type of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. Current models for RTs mainly focus on parametric models, which have the…
Descriptors: Reaction Time, Computer Assisted Testing, Test Items, Accuracy
Peer reviewed
Sinharay, Sandip; Dorans, Neil J. – Journal of Educational and Behavioral Statistics, 2010
The Mantel-Haenszel (MH) procedure (Mantel and Haenszel) is a popular method for estimating and testing a common two-factor association parameter in a 2 x 2 x K table. Holland, and Holland and Thayer, described how to use the procedure to detect differential item functioning (DIF) for tests with dichotomously scored items. Wang, Bradlow, Wainer, and…
Descriptors: Test Bias, Statistical Analysis, Computation, Bayesian Statistics
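For reference, the standard Mantel-Haenszel common odds-ratio estimator for K stratified 2 x 2 tables, which the work above builds on, is straightforward to compute (a textbook sketch; the function name and example counts are illustrative):

```python
def mh_common_odds_ratio(tables):
    """Mantel-Haenszel estimate of the common odds ratio across K 2x2 tables.

    Each table is (a, b, c, d):
        a = reference group correct,  b = reference group incorrect,
        c = focal group correct,      d = focal group incorrect.
    An estimate far from 1 indicates DIF favoring one group.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den
```

For a single stratum this reduces to the ordinary odds ratio ad/bc; pooling over strata (typically ability levels) is what controls for overall proficiency differences between groups.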