ERIC - Search Results

Showing all 10 results

Kappa and Rater Accuracy: Paradigms and Parameters

Peer reviewed

Conger, Anthony J. – Educational and Psychological Measurement, 2017

Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…

Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis
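
As an illustration of the point in the abstract above, the following is a minimal sketch of the standard Cohen's kappa computation, showing how two tables with the same observed agreement can yield different kappa values when the category marginals differ. The function name cohen_kappa and both example tables are hypothetical and are not taken from the article.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """Cohen's kappa from a square contingency table of counts.

    Rows are rater A's categories, columns are rater B's categories.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each rater.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance under independent marginals.
    p_chance = sum(r * c for r, c in zip(row_marg, col_marg))
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: both tables have 80% observed agreement.
balanced = [[40, 10], [10, 40]]  # marginals 50/50
skewed = [[75, 10], [10, 5]]     # marginals 85/15

print(round(cohen_kappa(balanced), 3))
print(round(cohen_kappa(skewed), 3))
```

With equal observed agreement, the table with skewed marginals has a higher chance-agreement term and therefore a lower kappa.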

An Unbiased Estimate of Global Interrater Agreement

Peer reviewed

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017

Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

Fitting Large Factor Analysis Models with Ordinal Data

Peer reviewed

DiStefano, Christine; McDaniel, Heather L.; Zhang, Liyun; Shi, Dexin; Jiang, Zhehan – Educational and Psychological Measurement, 2019

A simulation study was conducted to investigate the model size effect when confirmatory factor analysis (CFA) models include many ordinal items. CFA models including between 15 and 120 ordinal items were analyzed with mean- and variance-adjusted weighted least squares to determine how varying sample size, number of ordered categories, and…

Descriptors: Factor Analysis, Effect Size, Data, Sample Size

The Development of MST Test Information for the Prediction of Test Performances

Peer reviewed

Park, Ryoungsun; Kim, Jiseon; Chung, Hyewon; Dodd, Barbara G. – Educational and Psychological Measurement, 2017

The current study proposes novel methods to predict multistage testing (MST) performance without conducting simulations. This method, called MST test information, is based on analytic derivation of standard errors of ability estimates across theta levels. We compared standard errors derived analytically to the simulation results to demonstrate the…

Descriptors: Testing, Performance, Prediction, Error of Measurement

Development and Monte Carlo Study of a Procedure for Correcting the Standardized Mean Difference for Measurement Error in the Independent Variable

Peer reviewed

Nugent, William Robert; Moore, Matthew; Story, Erin – Educational and Psychological Measurement, 2015

The standardized mean difference (SMD) is perhaps the most important meta-analytic effect size. It is typically used to represent the difference between treatment and control population means in treatment efficacy research. It is also used to represent differences between populations with different characteristics, such as persons who are…

Descriptors: Error of Measurement, Error Correction, Predictor Variables, Monte Carlo Methods

The Impact of Ignoring the Level of Nesting Structure in Nonparametric Multilevel Latent Class Models

Peer reviewed

Park, Jungkyu; Yu, Hsiu-Ting – Educational and Psychological Measurement, 2016

The multilevel latent class model (MLCM) is a multilevel extension of a latent class model (LCM) that is used to analyze data with a nested structure. The nonparametric version of an MLCM assumes a discrete latent variable at a higher-level nesting structure to account for the dependency among observations nested within a higher-level unit. In…

Descriptors: Hierarchical Linear Modeling, Nonparametric Statistics, Data Analysis, Simulation

Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

Peer reviewed

Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D. – Educational and Psychological Measurement, 2013

The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R² and…

Descriptors: Item Analysis, Test Items, Effect Size, Statistical Analysis

DIF Trees: Using Classification Trees to Detect Differential Item Functioning

Peer reviewed

Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010

A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)

Coefficient Kappa: Some Uses, Misuses, and Alternatives.

Peer reviewed

Brennan, Robert L.; Prediger, Dale J. – Educational and Psychological Measurement, 1981

This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these statistics for measuring agreement with categorical data in studies of reliability and validity. (Author)

Descriptors: Classification, Error of Measurement, Mathematical Models, Test Reliability

Self-Correction of Wrong Answers as an Alternative to the Arbitrary Setting of Observed-Score Standards in Competency Testing.

Peer reviewed

Cahan, Sorel; Cohen, Nora – Educational and Psychological Measurement, 1990

A solution is offered to problems associated with the inequality in the manipulability of probabilities of classification errors of masters versus nonmasters, based on competency test results. Eschewing the typical arbitrary establishment of observed-score standards below 100 percent, the solution incorporates a self-correction of wrong answers.…

Descriptors: Classification, Error of Measurement, Mastery Tests, Minimum Competency Testing
