ERIC - Search Results

Showing all 14 results

Curvilinearity in the Reference Composite and Practical Implications for Measurement

Peer reviewed

Liao, Xiangyi; Bolt, Daniel M.; Kim, Jee-Seon – Journal of Educational Measurement, 2024

Item difficulty and dimensionality often correlate, implying that unidimensional IRT approximations to multidimensional data (i.e., reference composites) can take a curvilinear form in the multidimensional space. Although this issue has been previously discussed in the context of vertical scaling applications, we illustrate how such a phenomenon…

Descriptors: Difficulty Level, Simulation, Multidimensional Scaling, Graphs

A Framework for Measuring the Amount of Adaptation of Rasch-Based Computerized Adaptive Tests

Peer reviewed

Wyse, Adam E.; McBride, James R. – Journal of Educational Measurement, 2021

A key consideration when giving any computerized adaptive test (CAT) is how much adaptation is present when the test is used in practice. This study introduces a new framework to measure the amount of adaptation of Rasch-based CATs based on looking at the differences between the selected item locations (Rasch item difficulty parameters) of the…

Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Test Items

Classical Item Analysis from a Signal Detection Perspective

Peer reviewed

DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023

A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…

Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness

Efficiency of Targeted Multistage Calibration Designs under Practical Constraints: A Simulation Study

Peer reviewed

Berger, Stéphanie; Verschoor, Angela J.; Eggen, Theo J. H. M.; Moser, Urs – Journal of Educational Measurement, 2019

Calibration of an item bank for computer adaptive testing requires substantial resources. In this study, we investigated whether the efficiency of calibration under the Rasch model could be enhanced by improving the match between item difficulty and student ability. We introduced targeted multistage calibration designs, a design type that…

Descriptors: Simulation, Computer Assisted Testing, Test Items, Difficulty Level

Computerized Adaptive Testing in Early Education: Exploring the Impact of Item Position Effects on Ability Estimation

Peer reviewed

Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019

Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…

Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level

Equating with Miditests Using IRT

Peer reviewed

Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016

The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…

Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction

Longitudinal Multistage Testing

Peer reviewed

Pohl, Steffi – Journal of Educational Measurement, 2013

This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large-scale studies. In lMST designs, test forms of different difficulty levels are used, whereas the values on a pretest determine the routing to these test forms. Since lMST allows for…

Descriptors: Adaptive Testing, Longitudinal Studies, Difficulty Level, Comparative Analysis

Monitoring Items in Real Time to Enhance CAT Security

Peer reviewed

Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016

An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…

Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory

A Comparison of Different Psychometric Approaches to Modeling Testlet Structures: An Example with C-Tests

Peer reviewed

Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014

C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…

Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests

Item Response Theory Models for Performance Decline during Testing

Peer reviewed

Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014

Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

Descriptors: Student Evaluation, Item Response Theory, Models, Simulation

The Relationship between Item Parameters and Item Fit

Peer reviewed

Dodeen, Hamzeh – Journal of Educational Measurement, 2004

The effect of item parameters (discrimination, difficulty, and level of guessing) on the item-fit statistic was investigated using simulated dichotomous data. Nine tests were simulated using 1,000 persons, 50 items, three levels of item discrimination, three levels of item difficulty, and three levels of guessing. The item fit was estimated using…

Descriptors: Item Response Theory, Difficulty Level, Test Items, Guessing (Tests)

Impact of Changing Difficulty on Inferences from the National Assessment of Educational Progress

Peer reviewed

Cohen, Jon; Snow, Stephanie – Journal of Educational Measurement, 2002

Studied the impact of changes in item difficulty on National Assessment of Educational Progress (NAEP) estimates over time through a Monte Carlo study that simulated the responses of 1990 NAEP mathematics respondents to 1990 and 1996 NAEP mathematics items. Results support the idea that these changes have not affected the NAEP trend line.…

Descriptors: Change, Difficulty Level, Estimation (Mathematics), Mathematics Tests

Estimation of Classification Consistency When the Probability of a Correct Response Varies

Peer reviewed

Spray, Judith A.; Welch, Catherine J. – Journal of Educational Measurement, 1990

The effect of large, within-examinee item difficulty variability on estimates of the proportion of consistent classification of examinees into mastery categories was studied over 2 test administrations for 100 simulated examinees. The proportion of consistent classifications was adequately estimated using the technique proposed by M. Subkoviak…

Descriptors: Classification, Difficulty Level, Estimation (Mathematics), Item Response Theory

Applications of the Analytically Derived Asymptotic Standard Errors of Item Response Theory Item Parameter Estimates

Peer reviewed

Li, Yuan H.; Lissitz, Robert W. – Journal of Educational Measurement, 2004

The analytically derived asymptotic standard errors (SEs) of maximum likelihood (ML) item estimates can be approximated by a mathematical function without examinees' responses to test items, and the empirically determined SEs of marginal maximum likelihood estimation (MMLE)/Bayesian item estimates can be obtained when the same set of items is…

Descriptors: Test Items, Computation, Item Response Theory, Error of Measurement
