Showing 1 to 15 of 61 results
Peer reviewed
Bolt, Daniel M.; Liao, Xiangyi – Journal of Educational Measurement, 2021
We revisit the empirically observed positive correlation between DIF and difficulty studied by Freedle and commonly seen in tests of verbal proficiency when comparing populations of different mean latent proficiency levels. It is shown that a positive correlation between DIF and difficulty estimates is actually an expected result (absent any true…
Descriptors: Test Bias, Difficulty Level, Correlation, Verbal Tests
Peer reviewed
Frank Goldhammer; Ulf Kroehne; Carolin Hahnel; Johannes Naumann; Paul De Boeck – Journal of Educational Measurement, 2024
The efficiency of cognitive component skills is typically assessed with speeded performance tests. Interpreting only effective ability or effective speed as efficiency may be challenging because of the within-person dependency between both variables (speed-ability tradeoff, SAT). The present study measures efficiency as effective ability…
Descriptors: Timed Tests, Efficiency, Scores, Test Interpretation
Peer reviewed
Liu, Jinghua; Becker, Kirk – Journal of Educational Measurement, 2022
For any testing programs that administer multiple forms across multiple years, maintaining score comparability via equating is essential. With continuous testing and high-stakes results, especially with less secure online administrations, testing programs must consider the potential for cheating on their exams. This study used empirical and…
Descriptors: Cheating, Item Response Theory, Scores, High Stakes Tests
Peer reviewed
Wyse, Adam E.; McBride, James R. – Journal of Educational Measurement, 2021
A key consideration when giving any computerized adaptive test (CAT) is how much adaptation is present when the test is used in practice. This study introduces a new framework to measure the amount of adaptation of Rasch-based CATs based on looking at the differences between the selected item locations (Rasch item difficulty parameters) of the…
Descriptors: Item Response Theory, Computer Assisted Testing, Adaptive Testing, Test Items
Peer reviewed
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Peer reviewed
Berger, Stéphanie; Verschoor, Angela J.; Eggen, Theo J. H. M.; Moser, Urs – Journal of Educational Measurement, 2019
Calibration of an item bank for computer adaptive testing requires substantial resources. In this study, we investigated whether the efficiency of calibration under the Rasch model could be enhanced by improving the match between item difficulty and student ability. We introduced targeted multistage calibration designs, a design type that…
Descriptors: Simulation, Computer Assisted Testing, Test Items, Difficulty Level
Peer reviewed
DeCarlo, Lawrence T. – Journal of Educational Measurement, 2021
In a signal detection theory (SDT) approach to multiple choice exams, examinees are viewed as choosing, for each item, the alternative that is perceived as being the most plausible, with perceived plausibility depending in part on whether or not an item is known. The SDT model is a process model and provides measures of item difficulty, item…
Descriptors: Perception, Bias, Theories, Test Items
Peer reviewed
Albano, Anthony D.; Cai, Liuhan; Lease, Erin M.; McConnell, Scott R. – Journal of Educational Measurement, 2019
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in…
Descriptors: Test Items, Computer Assisted Testing, Item Analysis, Difficulty Level
Peer reviewed
Andrich, David; Marais, Ida – Journal of Educational Measurement, 2018
Even though guessing biases difficulty estimates as a function of item difficulty in the dichotomous Rasch model, assessment programs with tests which include multiple-choice items often construct scales using this model. Research has shown that when all items are multiple-choice, this bias can largely be eliminated. However, many assessments have…
Descriptors: Multiple Choice Tests, Test Items, Guessing (Tests), Test Bias
Peer reviewed
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Peer reviewed
Guo, Hongwen; Oh, Hyeonjoo J.; Eignor, Daniel – Journal of Educational Measurement, 2013
In operational equating situations, frequency estimation equipercentile equating is considered only when the old and new groups have similar abilities. The frequency estimation assumptions are investigated in this study under various situations from both the levels of theoretical interest and practical use. It shows that frequency estimation…
Descriptors: Equated Scores, Computation, Statistical Analysis, Test Items
Peer reviewed
Zhang, Jinming; Li, Jie – Journal of Educational Measurement, 2016
An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed…
Descriptors: Computer Assisted Testing, Test Items, Difficulty Level, Item Response Theory
Peer reviewed
Schroeders, Ulrich; Robitzsch, Alexander; Schipolowski, Stefan – Journal of Educational Measurement, 2014
C-tests are a specific variant of cloze tests that are considered time-efficient, valid indicators of general language proficiency. They are commonly analyzed with models of item response theory assuming local item independence. In this article we estimated local interdependencies for 12 C-tests and compared the changes in item difficulties,…
Descriptors: Comparative Analysis, Psychometrics, Cloze Procedure, Language Tests
Peer reviewed
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Descriptors: Student Evaluation, Item Response Theory, Models, Simulation
Peer reviewed
Li, Feiming; Cohen, Allan; Shen, Linjun – Journal of Educational Measurement, 2012
Computer-based tests (CBTs) often use random ordering of items in order to minimize item exposure and reduce the potential for answer copying. Little research has been done, however, to examine item position effects for these tests. In this study, different versions of a Rasch model and different response time models were examined and applied to…
Descriptors: Computer Assisted Testing, Test Items, Item Response Theory, Models