ERIC - Search Results

Publication Date

In 2026	0
Since 2025	1
Since 2022 (last 5 years)	3
Since 2017 (last 10 years)	8
Since 2007 (last 20 years)	11

Descriptor

Error of Measurement	18
Sample Size	18
Test Reliability	18
Item Response Theory	6
Test Items	6
Test Length	5
Test Validity	5
Scores	4
Correlation	3
Cutting Scores	3
Equated Scores	3
Research Design	3
Sampling	3
Accuracy	2
Cognitive Tests	2
Comparative Analysis	2
Computer Assisted Testing	2
Criterion Referenced Tests	2
Data Analysis	2
Difficulty Level	2
Equations (Mathematics)	2
Estimation (Mathematics)	2
Evaluation Methods	2
Experimenter Characteristics	2
Foreign Countries	2

Source

Educational Sciences: Theory…	2
Applied Measurement in…	1
ETS Research Report Series	1
Educational and Psychological…	1
International Journal of…	1
International Journal of…	1
Journal of Education and…	1
Measurement in Physical…	1
National Center for Education…	1
Practical Assessment,…	1
ProQuest LLC	1
Psychometrika	1

Publication Type

Reports - Research	14
Journal Articles	11
Speeches/Meeting Papers	4
Reports - Evaluative	2
Dissertations/Theses -…	1
Guides - Non-Classroom	1
Numerical/Quantitative Data	1
Reports - Descriptive	1

Education Level

Elementary Secondary Education	1
High Schools	1
Higher Education	1
Kindergarten	1
Postsecondary Education	1
Secondary Education	1

Audience

Researchers

Location

Taiwan	1
Turkey	1

Assessments and Surveys

Comprehensive Tests of Basic…	1
Early Childhood Longitudinal…	1
National Merit Scholarship…	1
Preliminary Scholastic…	1
Student Teacher Relationship…	1

Showing 1 to 15 of 18 results

Confirming Increased Statistical Errors in Testing Correlations from Small Sample Sizes

Peer reviewed

Duane Knudson – Measurement in Physical Education and Exercise Science, 2025

Small sample sizes contribute to several problems in research and knowledge advancement. This conceptual replication study confirmed and extended the inflation of type II errors and confidence intervals in correlation analyses of small sample sizes common in kinesiology/exercise science. Current population data (N = 18, 230, & 464) on four…

Descriptors: Kinesiology, Exercise, Biomechanics, Movement Education

How to Obtain the Most Error-Free Estimate of Reliability? Eight Sources of Deflation in the Estimates of Reliability to Avoid

Peer reviewed
PDF on ERIC

Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022

The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…

Descriptors: Test Reliability, Scores, Test Items, Correlation

Cognitive Diagnosis for Multiple-Choice Responses: Nonparametric Classification Method, Q-Matrix Theory, and Computerized Adaptive Testing

Yu Wang – ProQuest LLC, 2024

The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…

Descriptors: Multiple Choice Tests, Cognitive Tests, Cognitive Measurement, Educational Diagnosis

The Effect of Chance Success on Equalization Error in Test Equation Based on Classical Test Theory

Peer reviewed
PDF on ERIC

Koçak, Duygu – International Journal of Progressive Education, 2020

The aim of this study was to determine the effect of chance success on test equalization. For this purpose, artificially generated 500 and 1000 sample size data sets were synchronized using linear equalization and equal percentage equalization methods. In the data which were produced as a simulative, a total of four cases were created with no…

Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size

Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Peer reviewed

Lee, Yi-Hsuan; Zhang, Jinming – International Journal of Testing, 2017

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…

Descriptors: Test Bias, Test Reliability, Performance, Scores

Examination of Polytomous Items' Psychometric Properties According to Nonparametric Item Response Theory Models in Different Test Conditions

Peer reviewed
PDF on ERIC

Sengul Avsar, Asiye; Tavsancil, Ezel – Educational Sciences: Theory and Practice, 2017

This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…

Descriptors: Test Items, Psychometrics, Nonparametric Statistics, Item Response Theory

Examination of Different Item Response Theory Models on Tests Composed of Testlets

Peer reviewed
PDF on ERIC

Kogar, Esin Yilmaz; Kelecioglu, Hülya – Journal of Education and Learning, 2017

The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…

Descriptors: Item Response Theory, Models, Mathematics Tests, Test Items

Investigation of Coefficient of Individual Agreement in Terms of Sample Size, Random and Monotone Missing Ratio, and Number of Repeated Measures

Peer reviewed
PDF on ERIC

Temel, Gülhan Orekici; Erdogan, Semra; Selvi, Hüseyin; Kaya, Irem Ersöz – Educational Sciences: Theory and Practice, 2016

Studies based on longitudinal data focus on the change and development of the situation being investigated and allow for examining cases regarding education, individual development, cultural change, and socioeconomic improvement in time. However, as these studies require taking repeated measures in different time periods, they may include various…

Descriptors: Investigations, Sample Size, Longitudinal Studies, Interrater Reliability

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Early Childhood Longitudinal Study, Kindergarten Class of 2010-11 (ECLS-K:2011): User's Manual for the ECLS-K:2011 Kindergarten-Third Grade Data File and Electronic Codebook, Public Version. NCES 2018-034

Peer reviewed
PDF on ERIC

Tourangeau, Karen; Nord, Christine; Lê, Thanh; Wallner-Allen, Kathleen; Vaden-Kiernan, Nancy; Blaker, Lisa; Najarian, Michelle – National Center for Education Statistics, 2018

This manual provides guidance and documentation for users of the longitudinal kindergarten-fourth grade (K-4) public-use data file of the Early Childhood Longitudinal Study, Kindergarten Class of 2010-11 (ECLS-K:2011), which includes the first release of the public version of the third-grade data. This manual mainly provides information specific…

Descriptors: Longitudinal Studies, Children, Surveys, Kindergarten

Improved Reliability Estimates for Small Samples Using Empirical Bayes Techniques. Research Report. ETS RR-09-46

Peer reviewed
PDF on ERIC

Oh, Hyeonjoo J.; Guo, Hongwen; Walker, Michael E. – ETS Research Report Series, 2009

Issues of equity and fairness across subgroups of the population (e.g., gender or ethnicity) must be seriously considered in any standardized testing program. For this reason, many testing programs require some means for assessing test characteristics, such as reliability, for subgroups of the population. However, often only small sample sizes are…

Descriptors: Standardized Tests, Test Reliability, Sample Size, Bayesian Statistics

The Reliability of Linearly Equated Tests.

Peer reviewed

Segall, Daniel O. – Psychometrika, 1994

An asymptotic expression for the reliability of a linearly equated test is developed using normal theory. Reliability is expressed as the product of test reliability before equating and an adjustment term that is a function of the sample sizes used to estimate the linear equating transformation. The approach is illustrated. (SLD)

Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size

Consideration for Sample Size in Reliability Studies for Mastery Tests. Publication Series in Mastery Testing.

Saunders, Joseph C.; Huynh, Huynh – 1980

In most reliability studies, the precision of a reliability estimate varies inversely with the number of examinees (sample size). Thus, to achieve a given level of accuracy, some minimum sample size is required. An approximation for this minimum size may be made if some reasonable assumptions regarding the mean and standard deviation of the test…

Descriptors: Cutting Scores, Difficulty Level, Error of Measurement, Mastery Tests

The Standardized Mean Difference within the Framework of Item Response Theory

Peer reviewed

Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004

As item response theory (IRT) becomes popular in educational and psychological testing, there is a need of reporting IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…

Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement

A Method for Determining the Length of Criterion-Referenced Tests Using Reliability and Validity Indices.

Mills, Craig N.; Simon, Robert – 1981

When criterion-referenced tests are used to assign examinees to states reflecting their performance level on a test, the better known methods for determining test length, which consider relationships among domain scores and errors of measurement, have their limitations. The purpose of this paper is to present a computer system named TESTLEN, which…

Descriptors: Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores, Error of Measurement

Author

Ackerman, Terry A.	1
Blaker, Lisa	1
Chen, Hsueh-Chu	1
Duane Knudson	1
Erdogan, Semra	1
Evans, John A.	1
Guo, Hongwen	1
Huynh, Huynh	1
Kaya, Irem Ersöz	1
Kelecioglu, Hülya	1
Kogar, Esin Yilmaz	1
Koçak, Duygu	1
Lee, Yi-Hsuan	1
Lê, Thanh	1
Macpherson, Colin R.	1
Metsämuuronen, Jari	1
Mills, Craig N.	1
Najarian, Michelle	1
Nord, Christine	1
Oh, Hyeonjoo J.	1
Olejnik, Stephen F.	1
Phillips, Gary W.	1
Porter, Andrew C.	1
Rowley, Glenn L.	1
Saunders, Joseph C.	1