ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	6
Since 2016 (last 10 years)	22
Since 2006 (last 20 years)	48

Descriptor

Scores	71
Simulation	71
Test Items	71
Item Response Theory	44
Comparative Analysis	21
Error of Measurement	14
Test Bias	14
Models	13
Computer Assisted Testing	12
Item Analysis	12
Test Reliability	12
Psychometrics	11
Statistical Analysis	11
Evaluation Methods	10
Sample Size	10
Adaptive Testing	9
Correlation	9
Foreign Countries	9
Test Length	9
Ability	8
Computation	8
Difficulty Level	8
Estimation (Mathematics)	8
Goodness of Fit	8
Test Format	7

Publication Type

Journal Articles	49
Reports - Research	46
Reports - Evaluative	17
Speeches/Meeting Papers	8
Dissertations/Theses -…	5
Reports - Descriptive	3
Tests/Questionnaires	1

Education Level

Secondary Education	5
High Schools	4
Grade 12	2
Elementary Secondary Education	1
Grade 9	1
Higher Education	1
Junior High Schools	1
Middle Schools	1
Postsecondary Education	1

Audience

Researchers

Location

Indonesia	1
Netherlands	1
Saudi Arabia	1
Turkey	1
United Kingdom (England)	1

Assessments and Surveys

Armed Forces Qualification…	1
Cognitive Abilities Test	1
NEO Personality Inventory	1
National Assessment of…	1
Trends in International…	1

Showing 1 to 15 of 71 results

Using Item Scores and Distractors to Detect Item Compromise and Preknowledge

Peer reviewed

Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023

Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…

Descriptors: Scores, Test Validity, Test Items, Prior Learning

Estimating Difference-Score Reliability in Pretest-Posttest Settings

Peer reviewed

Gu, Zhengguo; Emons, Wilco H. M.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2021

Clinical, medical, and health psychologists use difference scores obtained from pretest-posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorder, or addiction. Reliability of difference scores is important for interpreting observed…

Descriptors: Test Reliability, Scores, Pretests Posttests, Computation

Classical Item Analysis from a Signal Detection Perspective

Peer reviewed

DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023

A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness

Examining of Internal Consistency Coefficients in Mixed-Format Tests in Different Simulation Conditions

Peer reviewed
PDF on ERIC

Gurdil Ege, Hatice; Demir, Ergul – Eurasian Journal of Educational Research, 2020

Purpose: The present study aims to evaluate how the reliabilities computed using α, Stratified α, Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and item type ratio of dichotomous to polytomous items (2:1, 1:1, 1:2) included in the scale are varied. Research Methods: In this study, Cronbach's α,…

Descriptors: Test Format, Simulation, Test Reliability, Sample Size

Digital Module 13: Monte Carlo Simulation Studies in Item Response Theory

Peer reviewed

Leventhal, Brian; Ames, Allison – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of "Monte Carlo simulation studies" (MCSS) in "item response theory" (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because…

Descriptors: Item Response Theory, Monte Carlo Methods, Simulation, Test Items

The Effect of Repeat Exposure to Simulation Based Items

Peer reviewed
PDF on ERIC

Tang, Xiaodan; Schultz, Matthew – Practical Assessment, Research & Evaluation, 2020

This study aims to examine the potential impacts on repeat examinees' performance by reusing simulation-based items in a high-stakes standardized assessment. We examined change patterns of item scores, ability estimate, score pattern change, response time and compared the performance of repeat examinees who have received repeat items and those who…

Descriptors: Test Items, Repetition, Simulation, Standardized Tests

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Bias and Bias Correction Method for Nonproportional Abilities Requirement (NPAR) Tests

Peer reviewed

Ip, Edward H.; Strachan, Tyler; Fu, Yanyan; Lay, Alexandra; Willse, John T.; Chen, Shyh-Huei; Rutkowski, Leslie; Ackerman, Terry – Journal of Educational Measurement, 2019

Test items must often be broad in scope to be ecologically valid. It is therefore almost inevitable that secondary dimensions are introduced into a test during test development. A cognitive test may require one or more abilities besides the primary ability to correctly respond to an item, in which case a unidimensional test score overestimates the…

Descriptors: Test Items, Test Bias, Test Construction, Scores

Impact of Item Parameter Drift on Rasch Scale Stability in Small Samples over Multiple Administrations

Peer reviewed

Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020

Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…

Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling

A Log-Linear Modeling Approach for Differential Item Functioning Detection in Polytomously Scored Items

Peer reviewed

Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020

A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…

Descriptors: Simulation, Sample Size, Item Analysis, Scores

Evaluating a Computerized Adaptive Testing Version of a Cognitive Ability Test Using a Simulation Study

Peer reviewed

Tsaousis, Ioannis; Sideridis, Georgios D.; AlGhamdi, Hannan M. – Journal of Psychoeducational Assessment, 2021

This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using a simulation study protocol put forth by Han, K. T. (2018a). For the needs of the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the…

Descriptors: Computer Assisted Testing, Adaptive Testing, Cognitive Tests, Cognitive Ability

A Simulation-Based Method for Finding the Optimal Number of Options for Multiple-Choice Items on a Test. Research Report. ETS RR-18-22

Peer reviewed
PDF on ERIC

Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018

For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…

Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction

Testing Latent Variable Distribution Fit in IRT Using Posterior Residuals

Peer reviewed

Monroe, Scott – Journal of Educational and Behavioral Statistics, 2021

This research proposes a new statistic for testing latent variable distribution fit for unidimensional item response theory (IRT) models. If the typical assumption of normality is violated, then item parameter estimates will be biased, and dependent quantities such as IRT score estimates will be adversely affected. The proposed statistic compares…

Descriptors: Item Response Theory, Simulation, Scores, Comparative Analysis

A Comparison of Score Aggregation Methods for Unidimensional Tests on Different Dimensions. Research Report. ETS RR-18-01

Peer reviewed
PDF on ERIC

Fu, Jianbin; Feng, Yuling – ETS Research Report Series, 2018

In this study, we propose aggregating test scores with unidimensional within-test structure and multidimensional across-test structure based on a 2-level, 1-factor model. In particular, we compare 6 score aggregation methods: average of standardized test raw scores (M1), regression factor score estimate of the 1-factor model based on the…

Descriptors: Comparative Analysis, Scores, Correlation, Standardized Tests

Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

Peer reviewed

Andersson, Björn; Xin, Tao – Educational and Psychological Measurement, 2018

In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…

Descriptors: Item Response Theory, Test Reliability, Test Items, Scores

Source

Applied Psychological…	8
Applied Measurement in…	5
ETS Research Report Series	5
Journal of Educational…	5
ProQuest LLC	5
Journal of Educational and…	4
Educational and Psychological…	3
Multivariate Behavioral…	2
EURASIA Journal of…	1
Education and Information…	1
Educational Measurement:…	1
Educational Sciences: Theory…	1
Eurasian Journal of…	1
Hacettepe University Journal…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Psychoeducational…	1
Practical Assessment,…	1
Psychological Methods	1
Psychometrika	1
Research in Mathematics…	1
Sociological Methods &…	1
Structural Equation Modeling:…	1
Studies in Second Language…	1

Author

Emons, Wilco H. M.	4
Meijer, Rob R.	4
Sijtsma, Klaas	4
Pommerich, Mary	3
Capar, Nilufer K.	2
Clauser, Brian	2
Kamata, Akihito	2
Nicewander, W. Alan	2
Ackerman, Terry	1
Ackerman, Terry A.	1
AlGhamdi, Hannan M.	1
Ames, Allison	1
Andersson, Björn	1
Atar, Burcu	1
Benítez, Isabel	1
Berberoglu, Giray	1
Borsman, Denny	1
Bramley, Tom	1
Breyer, F. Jay	1
Brusco, Michael J.	1
Béland, Sébastien	1
Chen, Shyh-Huei	1
Cho, Sun-Joo	1
Cui, Zhongmin	1