ERIC - Search Results

Showing 1 to 15 of 33 results

Investigating Latent Interaction Effects in Multiple-Group Analysis in the Structural Equation Modeling Framework

Peer reviewed

Suyoung Kim; Sooyong Lee; Jiwon Kim; Tiffany A. Whittaker – Structural Equation Modeling: A Multidisciplinary Journal, 2024

This study aims to address a gap in the social and behavioral sciences literature concerning interaction effects between latent factors in multiple-group analysis. By comparing two approaches for estimating latent interactions within multiple-group analysis frameworks using simulation studies and empirical data, we assess their relative merits.…

Descriptors: Social Science Research, Behavioral Sciences, Structural Equation Models, Statistical Analysis

Estimation of Heterogeneity Variance Based on a Generalized "Q" Statistic in Meta-Analysis of Log-Odds-Ratio

Peer reviewed

Kulinskaya, Elena; Hoaglin, David C. – Research Synthesis Methods, 2023

For estimation of heterogeneity variance τ² in meta-analysis of log-odds-ratio, we derive new mean- and median-unbiased point estimators and new interval estimators based on a generalized Q statistic, Q_F, in which the weights depend on only the studies' effective sample sizes. We compare them with familiar estimators…

Descriptors: Q Methodology, Statistical Analysis, Meta Analysis, Intervals

A New Statistic for Selecting the Smoothing Parameter for Polynomial Loglinear Equating under the Random Groups Design

Peer reviewed

Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2020

Smoothing is designed to yield smoother equating results that can reduce random equating error without introducing very much systematic error. The main objective of this study is to propose a new statistic and to compare its performance to the performance of the Akaike information criterion and likelihood ratio chi-square difference statistics in…

Descriptors: Equated Scores, Statistical Analysis, Error of Measurement, Criteria

Adaptive Pairwise Comparison for Educational Measurement

Peer reviewed

Crompvoets, Elise A. V.; Béguin, Anton A.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2020

Pairwise comparison is becoming increasingly popular as a holistic measurement method in education. Unfortunately, many comparisons are required for reliable measurement. To reduce the number of required comparisons, we developed an adaptive selection algorithm (ASA) that selects the most informative comparisons while taking the uncertainty of the…

Descriptors: Comparative Analysis, Statistical Analysis, Mathematics, Measurement

Statistical Power When Adjusting for Multiple Hypothesis Tests: Methodology Expansions and Software Tools

Peer reviewed

Kristin Porter; Luke Miratrix; Kristen Hunter – Society for Research on Educational Effectiveness, 2021

Background: Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs)…

Descriptors: Statistical Analysis, Hypothesis Testing, Computer Software, Randomized Controlled Trials

Asymptotic Standard Errors of Equating Coefficients Using the Characteristic Curve Methods for the Graded Response Model

Peer reviewed

Zhang, Zhonghua – Applied Measurement in Education, 2020

The characteristic curve methods have been applied to estimate the equating coefficients in test equating under the graded response model (GRM). However, the approaches for obtaining the standard errors for the estimates of these coefficients have not been developed and examined. In this study, the delta method was applied to derive the…

Descriptors: Error of Measurement, Computation, Equated Scores, True Scores

Guidance for Deriving and Presenting Percentage Study Weights in Meta-Analysis of Test Accuracy Studies

Peer reviewed

Burke, Danielle L.; Ensor, Joie; Snell, Kym I. E.; van der Windt, Danielle; Riley, Richard D. – Research Synthesis Methods, 2018

Percentage study weights in meta-analysis reveal the contribution of each study toward the overall summary results and are especially important when some studies are considered outliers or at high risk of bias. In meta-analyses of test accuracy reviews, such as a bivariate meta-analysis of sensitivity and specificity, the percentage study weights…

Descriptors: Meta Analysis, Research Reports, Statistical Analysis, Sample Size

Partial-Interval Estimation of Count: Uncorrected and Poisson-Corrected Error Levels

Peer reviewed

Yoder, Paul J.; Ledford, Jennifer R.; Harbison, Amy L.; Tapp, Jon T. – Journal of Early Intervention, 2018

A simulation study that used 3,000 computer-generated event streams with known behavior rates, interval durations, and session durations was conducted to test whether the main and interaction effects of true rate and interval duration affect the error level of uncorrected and Poisson-transformed (i.e., "corrected") count as estimated by…

Descriptors: Computation, Child Behavior, Early Childhood Education, Early Intervention

Kappa and Rater Accuracy: Paradigms and Parameters

Peer reviewed

Conger, Anthony J. – Educational and Psychological Measurement, 2017

Drawing parallels to classical test theory, this article clarifies the difference between rater accuracy and reliability and demonstrates how category marginal frequencies affect rater agreement and Cohen's kappa. Category assignment paradigms are developed: comparing raters to a standard (index) versus comparing two raters to one another…

Descriptors: Interrater Reliability, Evaluators, Accuracy, Statistical Analysis

An Unbiased Estimate of Global Interrater Agreement

Peer reviewed

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017

Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

A Comparison of Methods for Estimating Relationships in the Change between Two Time Points for Latent Variables

Peer reviewed

Finch, W. Holmes; Shim, Sungok Serena – Educational and Psychological Measurement, 2018

Collection and analysis of longitudinal data is an important tool in understanding growth and development over time in a whole range of human endeavors. Ideally, researchers working in the longitudinal framework are able to collect data at more than two points in time, as this will provide them with the potential for a deeper understanding of the…

Descriptors: Comparative Analysis, Computation, Time, Change

Estimating Hazard Ratios from Published Kaplan-Meier Survival Curves: A Methods Validation Study

Peer reviewed

Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019

Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…

Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials

Accuracy of a Classical Test Theory-Based Procedure for Estimating the Reliability of a Multistage Test. Research Report. ETS RR-17-02

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017

The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…

Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing

Methods to Estimate the Variance of Some Indices of the Signal Detection Theory: A Simulation Study

Peer reviewed

Suero, Manuel; Privado, Jesús; Botella, Juan – Psicologica: International Journal of Methodology and Experimental Psychology, 2017

A simulation study is presented to evaluate and compare three methods to estimate the variance of the estimates of the parameters d and "C" of the signal detection theory (SDT). Several methods have been proposed to calculate the variance of their estimators, "d'" and "c." Those methods have been mostly assessed by…

Descriptors: Evaluation Methods, Theories, Simulation, Statistical Analysis

Inter-Rater and Test-Retest (Between-Sessions) Reliability of the 4-Skills Scan for Dutch Elementary School Children

Peer reviewed

van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M. – Measurement in Physical Education and Exercise Science, 2018

In the Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability were determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…

Descriptors: Foreign Countries, Interrater Reliability, Pretests Posttests, Psychomotor Skills
