Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Descriptors: Item Response Theory, Test Reliability, Test Items, Scores
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
Jin, Ying; Eason, Hershel – Journal of Educational Issues, 2016
The effects of mean ability difference (MAD) and short tests on the performance of various DIF methods have been studied extensively in previous simulation studies. Their effects, however, have not been studied under a multilevel data structure. MAD was frequently observed in large-scale cross-country comparison studies where the primary sampling…
Descriptors: Test Bias, Simulation, Hierarchical Linear Modeling, Comparative Analysis
Van Norman, Ethan R. – School Psychology Quarterly, 2016
Curriculum-based measurement of oral reading (CBM-R) progress monitoring data is used to measure student response to instruction. Federal legislation permits educators to use CBM-R progress monitoring data as a basis for determining the presence of specific learning disabilities. However, decision-making frameworks originally developed for CBM-R…
Descriptors: Oral Reading, Curriculum Based Assessment, Investigations, Progress Monitoring
Kern, Justin L.; McBride, Brent A.; Laxman, Daniel J.; Dyer, W. Justin; Santos, Rosa M.; Jeans, Laurie M. – Grantee Submission, 2016
Measurement invariance (MI) is a property of measurement that is often implicitly assumed, but in many cases, not tested. When the assumption of MI is tested, it generally involves determining if the measurement holds longitudinally or cross-culturally. A growing literature shows that other groupings can, and should, be considered as well.…
Descriptors: Psychology, Measurement, Error of Measurement, Measurement Objectives
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving the equity property in mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under the common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
Justice, Lenora Jean – ProQuest LLC, 2012
The purpose of this study was to create a valid and reliable instrument to measure teacher-perceived barriers to the adoption of games and simulations in instruction. Previous research, interviews with educators, a focus group, an expert review, and a think-aloud protocol were used to design a survey instrument. After finalization, the survey was…
Descriptors: Barriers, Games, Simulation, Test Validity
Doolen, Jessica – ProQuest LLC, 2012
High-fidelity simulation has become a widespread and costly learning strategy in nursing education because it can fill the gap left by a shortage of clinical sites. In addition, high-fidelity simulation is an active learning strategy that is thought to increase higher-order thinking such as clinical reasoning and judgment skills in nursing…
Descriptors: Simulation, Nursing Education, Simulated Environment, Psychometrics
May, Henry; Cole, Russell; Haimson, Josh; Perez-Johnson, Irma – Society for Research on Educational Effectiveness, 2010
The purpose of this study is to provide empirical benchmarks of the conditional reliabilities of state tests for samples of the student population defined by ability level. Given that many educational interventions are targeted at samples of low-performing students, schools, or districts, the primary goal of this research is to determine how…
Descriptors: Intervention, Statistical Analysis, Academic Achievement, Test Reliability
Bulut, Okan; Kan, Adnan – Eurasian Journal of Educational Research, 2012
Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…
Descriptors: Adaptive Testing, Computer Assisted Testing, College Entrance Examinations, Graduate Students
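The item-selection loop described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the Rasch response model, the maximum-information selection rule, the grid-based posterior-mean (EAP) ability update, and all function names (`p_correct`, `item_information`, `eap_estimate`, `run_cat`) are assumptions introduced here.

```python
import math
import random


def p_correct(theta, b):
    """Rasch model (assumed here): probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = p_correct(theta, b)
    return p * (1.0 - p)


def eap_estimate(responses, grid):
    """Posterior mean of theta on a grid, using a standard normal prior.

    responses is a list of (item_difficulty, score) pairs, score in {0, 1}.
    """
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior density
        for b, x in responses:
            p = p_correct(t, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


def run_cat(bank, true_theta, test_length, rng):
    """Administer test_length items adaptively to one simulated examinee."""
    grid = [g / 10.0 for g in range(-40, 41)]
    remaining = list(bank)
    responses = []
    theta = 0.0  # starting ability estimate (prior mean)
    for _ in range(test_length):
        # Pick the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda b: item_information(theta, b))
        remaining.remove(item)
        # Simulate the examinee's response from the assumed true ability.
        x = 1 if rng.random() < p_correct(true_theta, item) else 0
        responses.append((item, x))
        # Re-estimate ability from all responses so far.
        theta = eap_estimate(responses, grid)
    return theta, responses
```

For example, `run_cat([i / 10 for i in range(-20, 21)], 1.0, 10, random.Random(0))` administers ten items from a hypothetical 41-item bank. Under the Rasch model the most informative item is the one whose difficulty is closest to the current ability estimate, which is how the test difficulty tracks the examinee's ability level.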
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Mendro, Robert – 1971
A major problem in the research concerning distributional and other properties of reliability coefficients has been the non-existence or inaccessibility of adequate test data for use in empirical verification of hypothetical conclusions. The purpose of this paper is to develop a technique for the simulation of test item scores through the use of…
Descriptors: Computer Programs, Factor Analysis, Models, Reliability

Shoemaker, David M. – Educational and Psychological Measurement, 1972
Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction
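As a rough illustration of the kind of simulation the abstract above describes, the sketch below generates item scores with guessing and then selects items by a single discrimination index. The abstract does not name the ten indices or the generating program, so everything here is an assumption: the logistic response model with a guessing floor, the corrected point-biserial correlation as the index, and the function names (`simulate_scores`, `point_biserial`, `select_items`).

```python
import math
import random


def simulate_scores(n_examinees, items, guess, rng):
    """Generate 0/1 item scores with guessing.

    items is a list of (discrimination, difficulty) pairs; guess is the
    probability of answering correctly without knowing the answer.
    """
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)  # examinee ability
        row = []
        for a, b in items:
            p_know = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            p = guess + (1.0 - guess) * p_know
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data


def point_biserial(item_scores, totals):
    """Pearson correlation between a 0/1 item and a total score."""
    n = len(totals)
    mean_x = sum(item_scores) / n
    mean_t = sum(totals) / n
    cov = sum((x - mean_x) * (t - mean_t)
              for x, t in zip(item_scores, totals)) / n
    var_x = sum((x - mean_x) ** 2 for x in item_scores) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    if var_x == 0 or var_t == 0:
        return 0.0  # constant item or constant total: no discrimination
    return cov / math.sqrt(var_x * var_t)


def select_items(data, k):
    """Return the indices of the k items with the highest index value."""
    n_items = len(data[0])
    totals = [sum(row) for row in data]
    scored = []
    for j in range(n_items):
        col = [row[j] for row in data]
        # Corrected total: exclude the item itself from the criterion score.
        rest = [t - x for t, x in zip(totals, col)]
        scored.append((point_biserial(col, rest), j))
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:k])
```

Calling `select_items(data, 40)` on a simulated 100-item data set mirrors the 40-from-100 selection in the abstract; comparing indices would then mean swapping `point_biserial` for another index and evaluating the validity of each resulting test.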
Epstein, Kenneth I.; Knerr, Claramae S. – 1976
The literature on criterion-referenced testing is full of discussions concerning whether classical measurement techniques are appropriate, whether variance is necessary, whether new indices of reliability are needed, and the like. What appears to be lacking, however, is a clear and simple discussion of why the problems occur. This paper suggests…
Descriptors: Career Development, Criterion Referenced Tests, Item Analysis, Item Sampling