Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 18 |
Descriptor
Error of Measurement | 19 |
Probability | 19 |
Sample Size | 19 |
Statistical Analysis | 10 |
Sampling | 7 |
Effect Size | 6 |
Comparative Analysis | 4 |
Equated Scores | 4 |
Item Response Theory | 4 |
Simulation | 4 |
Educational Research | 3 |
More ▼ |
Source
Author
Phillips, Gary W. | 2 |
Abad, Francisco J. | 1 |
Andersson, Björn | 1 |
Axelson, Erika D. | 1 |
Barr, James | 1 |
Beretvas, S. Natasha | 1 |
Bonnett, Laura J. | 1 |
Draxler, Clemens | 1 |
Fan, Xitao | 1 |
Haberman, Shelby J. | 1 |
Henson, Robin K. | 1 |
More ▼ |
Publication Type
Journal Articles | 17 |
Reports - Research | 12 |
Reports - Evaluative | 4 |
Reports - Descriptive | 3 |
Information Analyses | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Elementary Education | 1 |
Grade 8 | 1 |
Higher Education | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Audience
Teachers | 2 |
Location
Laws, Policies, & Programs
Assessments and Surveys
National Assessment of… | 1 |
What Works Clearinghouse Rating
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Kulinskaya, Elena; Hoaglin, David C. – Research Synthesis Methods, 2023
For estimation of heterogeneity variance T[superscript 2] in meta-analysis of log-odds-ratio, we derive new mean- and median-unbiased point estimators and new interval estimators based on a generalized Q statistic, Q[subscript F], in which the weights depend on only the studies' effective sample sizes. We compare them with familiar estimators…
Descriptors: Q Methodology, Statistical Analysis, Meta Analysis, Intervals
Qian, Jiahe – ETS Research Report Series, 2020
The finite population correction (FPC) factor is often used to adjust variance estimators for survey data sampled from a finite population without replacement. As a replicated resampling approach, the jackknife approach is usually implemented without the FPC factor incorporated in its variance estimates. A paradigm is proposed to compare the…
Descriptors: Computation, Sampling, Data, Statistical Analysis
White, Simon R.; Bonnett, Laura J. – Teaching Statistics: An International Journal for Teachers, 2019
The statistical concept of sampling is often given little direct attention, typically reduced to the mantra "take a random sample". This low resource and adaptable activity demonstrates sampling and explores issues that arise due to biased sampling.
Descriptors: Statistical Bias, Sampling, Statistical Analysis, Learning Activities
Phillips, Gary W.; Jiang, Tao – Practical Assessment, Research & Evaluation, 2016
Power analysis is a fundamental prerequisite for conducting scientific research. Without power analysis the researcher has no way of knowing whether the sample size is large enough to detect the effect he or she is looking for. This paper demonstrates how psychometric factors such as measurement error and equating error affect the power of…
Descriptors: Error of Measurement, Statistical Analysis, Equated Scores, Sample Size
Andersson, Björn – Journal of Educational Measurement, 2016
In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response…
Descriptors: Equated Scores, Item Response Theory, Error of Measurement, Tests
Henson, Robin K.; Natesan, Prathiba; Axelson, Erika D. – Journal of Experimental Education, 2014
The authors examined the distributional properties of 3 improvement-over-chance, I, effect sizes each derived from linear and quadratic predictive discriminant analysis and from logistic regression analysis for the 2-group univariate classification. These 3 classification methods (3 levels) were studied under varying levels of data conditions,…
Descriptors: Effect Size, Probability, Comparative Analysis, Classification
Tipton, Elizabeth – Society for Research on Educational Effectiveness, 2014
Replication studies allow for making comparisons and generalizations regarding the effectiveness of an intervention across different populations, versions of a treatment, settings and contexts, and outcomes. One method for making these comparisons across many replication studies is through the use of meta-analysis. A recent innovation in…
Descriptors: Replication (Evaluation), Robustness (Statistics), Meta Analysis, Regression (Statistics)
Lindstromberg, Seth – Language Teaching Research, 2016
This article reviews all (quasi)experimental studies appearing in the first 19 volumes (1997-2015) of "Language Teaching Research" (LTR). Specifically, it provides an overview of how statistical analyses were conducted in these studies and of how the analyses were reported. The overall conclusion is that there has been a tight adherence…
Descriptors: Meta Analysis, Second Language Learning, Second Language Instruction, Guidelines
Zu, Jiyun; Yuan, Ke-Hai – Journal of Educational Measurement, 2012
In the nonequivalent groups with anchor test (NEAT) design, the standard error of linear observed-score equating is commonly estimated by an estimator derived assuming multivariate normality. However, real data are seldom normally distributed, causing this normal estimator to be inconsistent. A general estimator, which does not rely on the…
Descriptors: Sample Size, Equated Scores, Test Items, Error of Measurement
Lee, HwaYoung; Beretvas, S. Natasha – Educational and Psychological Measurement, 2014
Conventional differential item functioning (DIF) detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable. True sources of DIF may include unobserved, latent variables, such as…
Descriptors: Item Analysis, Factor Structure, Bayesian Statistics, Goodness of Fit
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Draxler, Clemens – Psychometrika, 2010
This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…
Descriptors: Statistical Analysis, Probability, Sample Size, Error of Measurement
Menil, Violeta C.; Ye, Ruili – MathAMATYC Educator, 2012
This study serves as a teaching aid for teachers of introductory statistics. The aim of this study was limited to determining various sample sizes when estimating population proportion. Tables on sample sizes were generated using a C[superscript ++] program, which depends on population size, degree of precision or error level, and confidence…
Descriptors: Sample Size, Probability, Statistics, Sampling
Sueiro, Manuel J.; Abad, Francisco J. – Educational and Psychological Measurement, 2011
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Descriptors: Goodness of Fit, Item Response Theory, Nonparametric Statistics, Probability
Previous Page | Next Page »
Pages: 1 | 2