ERIC - Search Results

Showing 1 to 15 of 21 results

Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

Peer reviewed

Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018

In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…

Descriptors: Item Response Theory, Test Reliability, Test Items, Scores

Accuracy of a Classical Test Theory-Based Procedure for Estimating the Reliability of a Multistage Test. Research Report. ETS RR-17-02

Peer reviewed

Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017

The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…

Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing

DIF Analysis with Multilevel Data: A Simulation Study Using the Latent Variable Approach

Peer reviewed

Jin, Ying; Eason, Hershel – Journal of Educational Issues, 2016

The effects of mean ability difference (MAD) and short tests on the performance of various DIF methods have been studied extensively in previous simulation studies. Their effects, however, have not been studied under multilevel data structure. MAD was frequently observed in large-scale cross-country comparison studies where the primary sampling…

Descriptors: Test Bias, Simulation, Hierarchical Linear Modeling, Comparative Analysis

Curriculum-Based Measurement of Oral Reading: A Preliminary Investigation of Confidence Interval Overlap to Detect Reliable Growth

Peer reviewed

Van Norman, Ethan R. – School Psychology Quarterly, 2016

Curriculum-based measurement of oral reading (CBM-R) progress monitoring data is used to measure student response to instruction. Federal legislation permits educators to use CBM-R progress monitoring data as a basis for determining the presence of specific learning disabilities. However, decision making frameworks originally developed for CBM-R…

Descriptors: Oral Reading, Curriculum Based Assessment, Investigations, Progress Monitoring

The Role of Multiple-Group Measurement Invariance in Family Psychology Research

Peer reviewed

Kern, Justin L.; McBride, Brent A.; Laxman, Daniel J.; Dyer, W. Justin; Santos, Rosa M.; Jeans, Laurie M. – Grantee Submission, 2016

Measurement invariance (MI) is a property of measurement that is often implicitly assumed, but in many cases, not tested. When the assumption of MI is tested, it generally involves determining if the measurement holds longitudinally or cross-culturally. A growing literature shows that other groupings can, and should, be considered as well.…

Descriptors: Psychology, Measurement, Error of Measurement, Measurement Objectives

The Impact of Test Dimensionality, Common-Item Set Format, and Scale Linking Methods on Mixed-Format Test Equating

Peer reviewed

Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016

The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…

Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores

Identifying the Barriers to Using Games and Simulations in Education: Creating a Valid and Reliable Survey Instrument

Justice, Lenora Jean – ProQuest LLC, 2012

The purpose of this study was to create a valid and reliable instrument to measure teacher perceived barriers to the adoption of games and simulations in instruction. Previous research, interviews with educators, a focus group, an expert review, and a think aloud protocol were used to design a survey instrument. After finalization, the survey was…

Descriptors: Barriers, Games, Simulation, Test Validity

The Development of the Simulation Thinking Rubric

Doolen, Jessica – ProQuest LLC, 2012

High fidelity simulation has become a widespread and costly learning strategy in nursing education because it can fill the gap left by a shortage of clinical sites. In addition, high fidelity simulation is an active learning strategy that is thought to increase higher order thinking such as clinical reasoning and judgment skills in nursing…

Descriptors: Simulation, Nursing Education, Simulated Environment, Psychometrics

Assessing the Conditional Reliability of State Assessments

May, Henry; Cole, Russell; Haimson, Josh; Perez-Johnson, Irma – Society for Research on Educational Effectiveness, 2010

The purpose of this study is to provide empirical benchmarks of the conditional reliabilities of state tests for samples of the student population defined by ability level. Given that many educational interventions are targeted for samples of low performing students, schools, or districts, the primary goal of this research is to determine how…

Descriptors: Intervention, Statistical Analysis, Academic Achievement, Test Reliability

Application of Computerized Adaptive Testing to Entrance Examination for Graduate Studies in Turkey

Peer reviewed

Bulut, Okan; Kan, Adnan – Eurasian Journal of Educational Research, 2012

Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee's responses to the items. In this way, the difficulty level of the test is adjusted based on the examinee's ability level. Instead of…

Descriptors: Adaptive Testing, Computer Assisted Testing, College Entrance Examinations, Graduate Students

Comparison of Multistage Tests with Computerized Adaptive and Paper-and-Pencil Tests. Research Report. ETS RR-07-04

Peer reviewed

Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007

Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…

Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models

A Procedure for the Simulation of Test Item Score Distributions.

Mendro, Robert – 1971

A major problem in the research concerning distributional and other properties of reliability coefficients has been the non-existence or inaccessibility of adequate test data for use in empirical verification of hypothetical conclusions. The purpose of this paper is to develop a technique for the simulation of test item scores through the use of…

Descriptors: Computer Programs, Factor Analysis, Models, Reliability

Standard Errors of Estimate in Item-Examinee Sampling as a Function of Test Reliability, Variation in Item Difficulty Indices and Degree of Skewness in the Normative Distribution

Peer reviewed

Shoemaker, David M. – Educational and Psychological Measurement, 1972

Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation

A Monte Carlo Comparison of Ten Item Discrimination Indices.

Peer reviewed

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979

Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)

Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction

Criterion-Referenced Test Interpretations of "Classical" Measurement Theory.

Epstein, Kenneth I.; Knerr, Claramae S. – 1976

The literature on criterion referenced testing is full of discussions concerning whether classical measurement techniques are appropriate, whether variance is necessary, whether new indices of reliability are needed, and the like. What appears to be lacking, however, is a clear and simple discussion of why the problems occur. This paper suggests…

Descriptors: Career Development, Criterion Referenced Tests, Item Analysis, Item Sampling
