Showing 1 to 15 of 22 results
Peer reviewed
Direct link
Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021
Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorders, or addiction. Reliability of difference scores is important for interpreting observed…
Descriptors: Test Reliability, Scores, Pretests Posttests, Computation
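The reliability of a pretest-posttest difference score has a standard classical-test-theory form. A minimal sketch of that textbook formula (illustrative code, not from the article; the example values are hypothetical):

```python
def difference_score_reliability(var_pre, var_post, rel_pre, rel_post, corr):
    """Classical test theory reliability of the difference D = posttest - pretest.

    var_pre, var_post -- observed-score variances of pretest and posttest
    rel_pre, rel_post -- reliabilities of pretest and posttest
    corr              -- correlation between pretest and posttest observed scores
    """
    cov = corr * (var_pre ** 0.5) * (var_post ** 0.5)
    true_variance = rel_pre * var_pre + rel_post * var_post - 2 * cov
    observed_variance = var_pre + var_post - 2 * cov
    return true_variance / observed_variance

# Two reliable tests (0.81 each) that correlate 0.70 yield a far less
# reliable difference score: 22 / 60, about 0.37.
r_diff = difference_score_reliability(100, 100, 0.81, 0.81, 0.70)
```

The example illustrates the point the abstract raises: even when each test is acceptably reliable, a strong pre-post correlation removes much of the true-score variance from the difference, so the difference score itself can be much less reliable.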
Peer reviewed
PDF on ERIC
Gurdil Ege, Hatice; Demir, Ergul – Eurasian Journal of Educational Research, 2020
Purpose: The present study aims to evaluate how the reliabilities computed using α, Stratified α, Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and the ratio of dichotomous to polytomous items (2:1, 1:1, 1:2) included in the scale are varied. Research Methods: In this study, Cronbach's α,…
Descriptors: Test Format, Simulation, Test Reliability, Sample Size
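For reference, Cronbach's α, the baseline estimator in the comparison above, is straightforward to compute from item scores. A minimal sketch with hypothetical data (three items, five respondents):

```python
def cronbach_alpha(items):
    """Cronbach's alpha from a list of item-score columns (one list per item)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance, as in the usual alpha formula
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(var(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - sum_item_vars / var(totals))

# Hypothetical item scores; alpha here comes out to about 0.945.
items = [[2, 4, 3, 5, 1],
         [3, 5, 3, 4, 2],
         [2, 5, 4, 5, 1]]
alpha = cronbach_alpha(items)
```

Stratified α and the Angoff-Feldt and Feldt-Raju estimators adjust this basic recipe for non-parallel parts (e.g., mixed dichotomous and polytomous items), which is exactly the design factor the study varies.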
Peer reviewed
Direct link
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
PDF on ERIC
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018
For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…
Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction
Peer reviewed
Direct link
Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
Descriptors: Item Response Theory, Test Reliability, Test Items, Scores
Peer reviewed
Direct link
van der Palm, Daniël W.; van der Ark, L. Andries; Sijtsma, Klaas – Journal of Educational Measurement, 2014
The latent class reliability coefficient (LCRC) is improved by using the divisive latent class model instead of the unrestricted latent class model. This results in the divisive latent class reliability coefficient (DLCRC), which unlike LCRC avoids making subjective decisions about the best solution and thus avoids judgment error. A computational…
Descriptors: Test Reliability, Scores, Computation, Simulation
Peer reviewed
PDF on ERIC
Dorans, Neil J. – ETS Research Report Series, 2014
Simulations are widely used. Simulations produce numbers that are deductive demonstrations of what a model says will happen. They produce numerical results that are consistent with the premises of the model used to generate the numbers. These simulated numerical results are not empirical data that address aspects of the world that lies outside the…
Descriptors: Simulation, Equated Scores, Scores, Scientific Methodology
Peer reviewed
Direct link
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
Peer reviewed
Direct link
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
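The two stopping rules the abstract names, a standard-error threshold and a fixed maximum test length, can be sketched schematically. This is not the study's MCAT machinery; the error function below is a toy stand-in for real IRT ability updating, and all names are illustrative:

```python
def administer_cat(se_for, max_items=30, se_target=0.3):
    """Schematic CAT loop with two stopping rules: stop once the standard
    error of the ability estimate falls below se_target, or once max_items
    have been administered, whichever comes first. Returns items used.

    se_for -- function mapping the count of administered items to the
              current standard error (a stand-in for real IRT updating).
    """
    n = 0
    while n < max_items:
        n += 1                      # "administer" the next tailored item
        if se_for(n) < se_target:   # fixed-precision stopping rule
            break
    return n

# Toy error model: SE shrinks like 1/sqrt(n). With se_target = 0.3 the
# precision rule fires at 12 items (1/sqrt(12) ~= 0.289 < 0.3).
items_used = administer_cat(lambda n: n ** -0.5)
```

The trade-off the study examines falls out of this structure: a tighter `se_target` lengthens tests for examinees whose responses are less informative, while the `max_items` cap bounds test length at the cost of precision.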
Topczewski, Anna Marie – ProQuest LLC, 2013
Developmental score scales represent the performance of students along a continuum, such that as students learn more, they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…
Descriptors: Item Response Theory, Scaling, Scores, Student Development
Peer reviewed
Direct link
Kuo, Bor-Chen; Daud, Muslem; Yang, Chih-Wei – EURASIA Journal of Mathematics, Science & Technology Education, 2015
This paper describes a curriculum-based multidimensional computerized adaptive test that was developed for Indonesian junior high school Biology. In adherence to the Indonesian curriculum's different Biology dimensions, 300 items were constructed and then administered to 2,238 students. A multidimensional random coefficients multinomial logit model was…
Descriptors: Secondary School Science, Science Education, Science Tests, Computer Assisted Testing
Doolen, Jessica – ProQuest LLC, 2012
High fidelity simulation has become a widespread and costly learning strategy in nursing education because it can fill the gap left by a shortage of clinical sites. In addition, high fidelity simulation is an active learning strategy that is thought to increase higher order thinking such as clinical reasoning and judgment skills in nursing…
Descriptors: Simulation, Nursing Education, Simulated Environment, Psychometrics
Peer reviewed
Direct link
Maydeu-Olivares, Alberto; Coffman, Donna L.; Hartmann, Wolfgang M. – Psychological Methods, 2007
The point estimate of sample coefficient alpha may provide a misleading impression of the reliability of the test score. Because sample coefficient alpha is consistently biased downward, it is more likely to yield a misleading impression of poor reliability. The magnitude of the bias is greatest precisely when the variability of sample alpha is…
Descriptors: Intervals, Scores, Sample Size, Simulation
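The abstract's point is that a point estimate of sample α can mislead, so an interval estimate is more informative. As a simple illustration (a nonparametric percentile bootstrap, not necessarily the interval estimator the article evaluates; the data below are hypothetical):

```python
import random

def cronbach_alpha(rows):
    """Cronbach's alpha from respondent-by-item rows (population variances)."""
    k = len(rows[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(var(col) for col in zip(*rows))
    total_var = var([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_vars / total_var)

def bootstrap_alpha_ci(rows, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample respondents with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    stats = sorted(
        cronbach_alpha([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]

# Hypothetical data: 30 respondents, 4 parallel items = true score + noise.
gen = random.Random(1)
rows = [[t + gen.gauss(0, 1) for _ in range(4)]
        for t in (gen.gauss(0, 2) for _ in range(30))]
lo, hi = bootstrap_alpha_ci(rows)
```

With samples this small the interval is wide, which is the practical message: reporting only the point estimate hides exactly the variability (and downward bias) the article warns about.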
Peer reviewed
Direct link
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
Peer reviewed
Direct link
Borsboom, Denny; Romeijn, Jan-Willem; Wicherts, Jelte M. – Psychological Methods, 2008
This article shows that measurement invariance (defined in terms of an invariant measurement model in different groups) is generally inconsistent with selection invariance (defined in terms of equal sensitivity and specificity across groups). In particular, when a unidimensional measurement instrument is used and group differences are present in…
Descriptors: Test Items, Minority Groups, Measurement, Scores