ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	2
Since 2016 (last 10 years)	10
Since 2006 (last 20 years)	23

Descriptor

Comparative Analysis	28
Difficulty Level	28
Simulation	28
Test Items	18
Item Response Theory	12
Sample Size	8
Computer Assisted Testing	7
Correlation	7
Adaptive Testing	6
Equated Scores	6
Computation	5
Monte Carlo Methods	5
Ability	4
Guessing (Tests)	4
Psychometrics	4
Scores	4
Test Bias	4
Accuracy	3
Classification	3
Cognitive Processes	3
Item Analysis	3
Models	3
Probability	3
Test Format	3
Test Length	3
More ▼

Source

Journal of Educational…	4
ProQuest LLC	4
Applied Measurement in…	3
ETS Research Report Series	2
Educational Sciences: Theory…	2
Educational and Psychological…	2
Applied Psychological…	1
Educational Research and…	1
Hacettepe University Journal…	1
Journal of Autism and…	1
Perspectives in Education	1
Practical Assessment,…	1
Psicologica: International…	1
Quality Assurance in…	1
More ▼

Publication Type

Journal Articles	21
Reports - Research	19
Dissertations/Theses -…	4
Reports - Evaluative	4
Speeches/Meeting Papers	3
Reports - Descriptive	1

Education Level

Higher Education	3
Postsecondary Education	3
Secondary Education	1

Audience

Location

Turkey

Laws, Policies, & Programs

Assessments and Surveys

Program for International…	1
SAT (College Admission Test)	1
Trends in International…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 28 results Save | Export

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Direct link

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

Comparing the Robustness of Three Nonparametric DIF Procedures to Differential Rapid Guessing

Peer reviewed

Direct link

Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022

When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…

Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis

Investigating Test Equating Methods in Small Samples through Various Factors

Peer reviewed
PDF on ERIC

Download full text

Asiret, Semih; Sünbül, Seçil Ömür – Educational Sciences: Theory and Practice, 2016

In this study, equating methods for random group design using small samples through factors such as sample size, difference in difficulty between forms, and guessing parameter was aimed for comparison. Moreover, which method gives better results under which conditions was also investigated. In this study, 5,000 dichotomous simulated data…

Descriptors: Equated Scores, Sample Size, Difficulty Level, Guessing (Tests)

Equating with Miditests Using IRT

Peer reviewed

Direct link

Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016

The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…

Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction

Unidimensional IRT Item Parameter Estimates across Equivalent Test Forms with Confounding Specifications within Dimensions

Peer reviewed

Direct link

Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016

When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…

Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level

Bayesian Estimation of Multidimensional Item Response Models. A Comparison of Analytic and Simulation Algorithms

Peer reviewed
PDF on ERIC

Download full text

Martin-Fernandez, Manuel; Revuelta, Javier – Psicologica: International Journal of Methodology and Experimental Psychology, 2017

This study compares the performance of two estimation algorithms of new usage, the Metropolis-Hastings Robins-Monro (MHRM) and the Hamiltonian MCMC (HMC), with two consolidated algorithms in the psychometric literature, the marginal likelihood via EM algorithm (MML-EM) and the Markov chain Monte Carlo (MCMC), in the estimation of multidimensional…

Descriptors: Bayesian Statistics, Item Response Theory, Models, Comparative Analysis

Investigating Item Exposure Control Methods in Computerized Adaptive Testing

Peer reviewed
PDF on ERIC

Download full text

Ozturk, Nagihan Boztunc; Dogan, Nuri – Educational Sciences: Theory and Practice, 2015

This study aims to investigate the effects of item exposure control methods on measurement precision and on test security under various item selection methods and item pool characteristics. In this study, the Randomesque (with item group sizes of 5 and 10), Sympson-Hetter, and Fade-Away methods were used as item exposure control methods. Moreover,…

Descriptors: Computer Assisted Testing, Item Analysis, Statistical Analysis, Comparative Analysis

A Maximum Likelihood Based Offline Estimation of Student Capabilities and Question Difficulties with Guessing

Peer reviewed

Direct link

Moothedath, Shana; Chaporkar, Prasanna; Belur, Madhu N. – Perspectives in Education, 2016

In recent years, the computerised adaptive test (CAT) has gained popularity over conventional exams in evaluating student capabilities with desired accuracy. However, the key limitation of CAT is that it requires a large pool of pre-calibrated questions. In the absence of such a pre-calibrated question bank, offline exams with uncalibrated…

Descriptors: Guessing (Tests), Computer Assisted Testing, Adaptive Testing, Maximum Likelihood Statistics

Estimating Item Difficulty with Comparative Judgments. Research Report. ETS RR-14-39

Peer reviewed
PDF on ERIC

Download full text

Attali, Yigal; Saldivia, Luis; Jackson, Carol; Schuppan, Fred; Wanamaker, Wilbur – ETS Research Report Series, 2014

Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for themost part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of…

Descriptors: Test Items, Difficulty Level, Comparative Analysis, College Entrance Examinations

Centering, Scale Indeterminacy, and Differential Item Functioning Detection in Hierarchical Generalized Linear and Generalized Linear Mixed Models

Peer reviewed

Direct link

Cheong, Yuk Fai; Kamata, Akihito – Applied Measurement in Education, 2013

In this article, we discuss and illustrate two centering and anchoring options available in differential item functioning (DIF) detection studies based on the hierarchical generalized linear and generalized linear mixed modeling frameworks. We compared and contrasted the assumptions of the two options, and examined the properties of their DIF…

Descriptors: Test Bias, Hierarchical Linear Modeling, Comparative Analysis, Test Items

Longitudinal Multistage Testing

Peer reviewed

Direct link

Pohl, Steffi – Journal of Educational Measurement, 2013

This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large-scale studies. In lMST designs, test forms of different difficulty levels are used, whereas the values on a pretest determine the routing to these test forms. Since lMST allows for…

Descriptors: Adaptive Testing, Longitudinal Studies, Difficulty Level, Comparative Analysis

Monitoring Items in Real Time to Enhance CAT Security

Peer reviewed

Direct link

Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016

An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…

Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory

Parameter Recovery and Classification Accuracy under Conditions of Testlet Dependency: A Comparison of the Traditional 2PL, Testlet, and Bi-Factor Models

Peer reviewed

Direct link

Koziol, Natalie A. – Applied Measurement in Education, 2016

Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…

Descriptors: Classification, Accuracy, Comparative Analysis, Models

A Comparison of Different Psychometric Approaches to Modeling Testlet Structures: An Example with C-Tests

Peer reviewed

Direct link

Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014

C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…

Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests

An Investigation of the Impact of Misrouting under Two-Stage Multistage Testing: A Simulation Study. Research Report. ETS RR-14-01

Peer reviewed
PDF on ERIC

Download full text

Kim, Sooyeon; Moses, Tim – ETS Research Report Series, 2014

The purpose of this study was to investigate the potential impact of misrouting under a 2-stage multistage test (MST) design, which includes 1 routing and 3 second-stage modules. Simulations were used to create a situation in which a large group of examinees took each of the 3 possible MST paths (high, middle, and low). We compared differences in…

Descriptors: Comparative Analysis, Difficulty Level, Scores, Test Wiseness

Previous Page | Next Page »

Pages: 1 | 2

Kamata, Akihito	2
Abulela, Mohammed A. A.	1
Asiret, Semih	1
Atar, Burcu	1
Attali, Yigal	1
Belur, Madhu N.	1
Biederman, Joseph	1
Bolfek, Anela	1
Carvajal-Espinoza, Jorge E.	1
Chaporkar, Prasanna	1
Cheong, Yuk Fai	1
Cohen, Allan S.	1
Dogan, Nuri	1
Efendioglu, Akin	1
Fitzpatrick, Joseph	1
Fried, Ronna	1
Godfrey, Kathryn M.	1
Goldin, Rachel	1
Hsu, Tse-Chi	1
Jackson, Carol	1
Joshi, Gagan	1
Kim, Seock-Ho	1
Kim, Sooyeon	1
Kirisci, Levent	1
Koziol, Natalie A.	1
More ▼