NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Researchers1
Location
Taiwan1
Turkey1
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing 1 to 15 of 18 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Duane Knudson – Measurement in Physical Education and Exercise Science, 2025
Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…
Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Yu Wang – ProQuest LLC, 2024
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…
Descriptors: Multiple Choice Tests, Cognitive Tests, Cognitive Measurement, Educational Diagnosis
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Koçak, Duygu – International Journal of Progressive Education, 2020
The aim of this study was to determine the effect of chance success on test equalization. For this purpose, artificially generated 500 and 1000 sample size data sets were synchronized using linear equalization and equal percentage equalization methods. In the data which were produced as a simulative, a total of four cases were created with no…
Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size
Peer reviewed Peer reviewed
Direct linkDirect link
Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017
Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Descriptors: Test Bias, Test Reliability, Performance, Scores
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sengul Avsar, Asiye; Tavsancil, Ezel – Educational Sciences: Theory and Practice, 2017
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
Descriptors: Test Items, Psychometrics, Nonparametric Statistics, Item Response Theory
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Kogar, Esin Yilmaz; Kelecioglu, Hülya – Journal of Education and Learning, 2017
The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…
Descriptors: Item Response Theory, Models, Mathematics Tests, Test Items
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Temel, Gülhan Orekici; Erdogan, Semra; Selvi, Hüseyin; Kaya, Irem Ersöz – Educational Sciences: Theory and Practice, 2016
Studies based on longitudinal data focus on the change and development of the situation being investigated and allow for examining cases regarding education, individual development, cultural change, and socioeconomic improvement in time. However, as these studies require taking repeated measures in different time periods, they may include various…
Descriptors: Investigations, Sample Size, Longitudinal Studies, Interrater Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Tourangeau, Karen; Nord, Christine; Lê, Thanh; Wallner-Allen, Kathleen; Vaden-Kiernan, Nancy; Blaker, Lisa; Najarian, Michelle – National Center for Education Statistics, 2018
This manual provides guidance and documentation for users of the longitudinal kindergarten-fourth grade (K-4) public-use data file of the Early Childhood Longitudinal Study, Kindergarten Class of 2010-11 (ECLS-K:2011), which includes the first release of the public version of the third-grade data. This manual mainly provides information specific…
Descriptors: Longitudinal Studies, Children, Surveys, Kindergarten
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Oh, Hyeonjoo J.; Guo, Hongwen; Walker, Michael E. – ETS Research Report Series, 2009
Issues of equity and fairness across subgroups of the population (e.g., gender or ethnicity) must be seriously considered in any standardized testing program. For this reason, many testing programs require some means for assessing test characteristics, such as reliability, for subgroups of the population. However, often only small sample sizes are…
Descriptors: Standardized Tests, Test Reliability, Sample Size, Bayesian Statistics
Peer reviewed Peer reviewed
Segall, Daniel O. – Psychometrika, 1994
An asymptotic expression for the reliability of a linearly equated test is developed using normal theory. Reliability is expressed as the product of test reliability before equating and an adjustment term that is a function of the sample sizes used to estimate the linear equating transformation. The approach is illustrated. (SLD)
Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size
Saunders, Joseph C.; Huynh, Huynh – 1980
In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…
Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004
As item response theory (IRT) becomes popular in educational and psychological testing, there is a need of reporting IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…
Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement
Mills, Craig N.; Simon, Robert – 1981
When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…
Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement
Previous Page | Next Page »
Pages: 1  |  2