Showing 1 to 15 of 61 results
Peer reviewed
Bolt, Daniel M.; Liao, Xiangyi – Journal of Educational Measurement, 2021
We revisit the empirically observed positive correlation between DIF and difficulty studied by Freedle and commonly seen in tests of verbal proficiency when comparing populations of different mean latent proficiency levels. It is shown that a positive correlation between DIF and difficulty estimates is actually an expected result (absent any true…
Descriptors: Test Bias, Difficulty Level, Correlation, Verbal Tests
Peer reviewed
Frank Goldhammer; Ulf Kroehne; Carolin Hahnel; Johannes Naumann; Paul De Boeck – Journal of Educational Measurement, 2024
The efficiency of cognitive component skills is typically assessed with speeded performance tests. Interpreting only effective ability or effective speed as efficiency may be challenging because of the within-person dependency between both variables (speed-ability tradeoff, SAT). The present study measures efficiency as effective ability…
Descriptors: Timed Tests, Efficiency, Scores, Test Interpretation
Peer reviewed
Liu, Jinghua; Becker, Kirk – Journal of Educational Measurement, 2022
For any testing programs that administer multiple forms across multiple years, maintaining score comparability via equating is essential. With continuous testing and high-stakes results, especially with less secure online administrations, testing programs must consider the potential for cheating on their exams. This study used empirical and…
Descriptors: Cheating, Item Response Theory, Scores, High Stakes Tests
Peer reviewed
Wyse, Adam E.; McBride, James R. – Journal of Educational Measurement, 2021
A key consideration when giving any computerized adaptive test (CAT) is how much adaptation is present when the test is used in practice. This study introduces a new framework to measure the amount of adaptation of Rasch-based CATs based on looking at the differences between the selected item locations (Rasch item difficulty parameters) of the…
Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Test Items
Peer reviewed
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Peer reviewed
Berger, Stéphanie; Verschoor, Angela J.; Eggen, Theo J. H. M.; Moser, Urs – Journal of Educational Measurement, 2019
Calibration of an item bank for computer adaptive testing requires substantial resources. In this study, we investigated whether the efficiency of calibration under the Rasch model could be enhanced by improving the match between item difficulty and student ability. We introduced targeted multistage calibration designs, a design type that…
Descriptors: Simulation, Computer Assisted Testing, Test Items, Difficulty Level
Peer reviewed
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2021
In a signal detection theory (SDT) approach to multiple choice exams, examinees are viewed as choosing, for each item, the alternative that is perceived as being the most plausible, with perceived plausibility depending in part on whether or not an item is known. The SDT model is a process model and provides measures of item difficulty, item…
Descriptors: Perception, Bias, Theories, Test Items
Peer reviewed
Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…
Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level
Peer reviewed
Andrich, David; Marais, Ida – Journal of Educational Measurement, 2018
Even though guessing biases difficulty estimates as a function of item difficulty in the dichotomous Rasch model, assessment programs with tests which include multiple-choice items often construct scales using this model. Research has shown that when all items are multiple-choice, this bias can largely be eliminated. However, many assessments have…
Descriptors: Multiple Choice Tests, Test Items, Guessing (Tests), Test Bias
Peer reviewed
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Peer reviewed
Guo, Hongwen; Oh, Hyeonjoo J.; Eignor, Daniel – Journal of Educational Measurement, 2013
In operational equating situations, frequency estimation equipercentile equating is considered only when the old and new groups have similar abilities. The frequency estimation assumptions are investigated in this study under various situations from both the levels of theoretical interest and practical use. It shows that frequency estimation…
Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items
Peer reviewed
Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016
An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…
Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory
Peer reviewed
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Peer reviewed
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Descriptors: Student Evaluation, Item Response Theory, Models, Simulation
Peer reviewed
Li, Feiming; Cohen, Allan; Shen, Linjun – Journal of Educational Measurement, 2012
Computer-based tests (CBTs) often use random ordering of items in order to minimize item exposure and reduce the potential for answer copying. Little research has been done, however, to examine item position effects for these tests. In this study, different versions of a Rasch model and different response time models were examined and applied to…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Models