Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Tang, Xiaodan; Karabatsos, George; Chen, Haiqin – Applied Measurement in Education, 2020
In applications of item response theory (IRT) models, it is known that empirical violations of the local independence (LI) assumption can significantly bias parameter estimates. To address this issue, we propose a threshold-autoregressive item response theory (TAR-IRT) model that additionally accounts for order dependence among the item responses…
Descriptors: Item Response Theory, Test Items, Models, Computation
Akin-Arikan, Çigdem; Gelbal, Selahattin – Eurasian Journal of Educational Research, 2021
Purpose: This study aims to compare the performances of Item Response Theory (IRT) equating and kernel equating (KE) methods based on equating errors (RMSD) and standard error of equating (SEE) using the anchor item nonequivalent groups design. Method: Within this scope, a set of conditions, including ability distribution, type of anchor items…
Descriptors: Equated Scores, Item Response Theory, Test Items, Statistical Analysis
Benton, Tom – Research Matters, 2020
This article reviews the evidence on the extent to which experts' perceptions of item difficulties, captured using comparative judgement, can predict empirical item difficulties. This evidence is drawn from existing published studies on this topic and also from statistical analysis of data held by Cambridge Assessment. Having reviewed the…
Descriptors: Test Items, Difficulty Level, Expertise, Comparative Analysis
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
The Pearson product-moment correlation coefficient between item g and test score X, known as the item-test or item-total correlation ("Rit"), and the item-rest correlation ("Rir") are two of the most widely used classical estimators of item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
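The two estimators named in the abstract above can be illustrated with a minimal sketch. It assumes dichotomously scored items and a tiny made-up response matrix; the function names `pearson` and `item_discrimination` are hypothetical and are not taken from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_discrimination(responses, g):
    """Return (Rit, Rir) for item g, given one row of item scores per examinee."""
    item = [row[g] for row in responses]
    total = [sum(row) for row in responses]      # test score X, item g included
    rest = [t - i for t, i in zip(total, item)]  # test score with item g removed
    return pearson(item, total), pearson(item, rest)


# Hypothetical scored responses: 6 examinees (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

rit, rir = item_discrimination(responses, 0)
print(f"Rit = {rit:.3f}, Rir = {rir:.3f}")
```

In this toy data Rit exceeds Rir because the item's own score is part of the total it is correlated with; the sketch only illustrates the two definitions and does not reproduce the article's analysis.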
Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021
The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…
Descriptors: Bayesian Statistics, Computation, Learning, Testing
Luke G. Eglington; Philip I. Pavlik – Grantee Submission, 2020
Decades of research have shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing
Luke G. Eglington; Philip I. Pavlik Jr. – npj Science of Learning, 2020
Decades of research have shown that spacing practice trials over time can improve later memory, but there are few concrete recommendations concerning how to optimally space practice. We show that existing recommendations are inherently suboptimal due to their insensitivity to time costs and individual- and item-level differences. We introduce an…
Descriptors: Scheduling, Drills (Practice), Memory, Testing
Benton, Tom; Leech, Tony; Hughes, Sarah – Cambridge Assessment, 2020
In the context of examinations, the phrase "maintaining standards" usually refers to any activity designed to ensure that it is no easier (or harder) to achieve a given grade in one year than in another. Specifically, it tends to mean activities associated with setting examination grade boundaries. Benton et al. (2020) describe a method…
Descriptors: Mathematics Tests, Equated Scores, Comparative Analysis, Difficulty Level
Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021
The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…
Descriptors: Test Norms, Scores, Regression (Statistics), Test Items
Sunbul, Onder; Yormaz, Seha – International Journal of Evaluation and Research in Education, 2018
In this study, Type I error and power rates of the omega (ω) and GBT (generalized binomial test) indices were investigated for several nominal alpha levels and for 40- and 80-item test lengths with a 10,000-examinee sample size under several test level restrictions. As a result, Type I error rates of both indices were found to be below the acceptable…
Descriptors: Difficulty Level, Cheating, Duplication, Test Length
Sunbul, Onder; Yormaz, Seha – Eurasian Journal of Educational Research, 2018
Purpose: Several studies in the literature investigate the performance of ω under various conditions. However, no study of the effects of item difficulty, item discrimination, and ability restrictions on the performance of ω could be found. The current study aims to investigate the performance of ω for the conditions given below.…
Descriptors: Test Items, Difficulty Level, Ability, Cheating
Ilhan, Mustafa – International Journal of Assessment Tools in Education, 2019
This study investigated the effectiveness of statistical adjustments applied to rater bias in many-facet Rasch analysis. A dataset that did not include "rater × examinee" bias was first modified to introduce such bias. Later, bias adjustment was applied to rater bias included in the data file,…
Descriptors: Statistical Analysis, Item Response Theory, Evaluators, Bias
Arikan, Çigdem Akin – International Journal of Progressive Education, 2018
The main purpose of this study is to compare the performance of the midi anchor test and the mini anchor test in equating test forms based on item response theory. The research was conducted using simulated data generated from the Rasch model. In order to equate two test forms the anchor item nonequivalent groups (internal anchor test) was…
Descriptors: Equated Scores, Comparative Analysis, Item Response Theory, Tests
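As a rough illustration of what generating data from the Rasch model involves, the sketch below draws dichotomous responses with success probability 1 / (1 + exp(-(θ - b))), where θ is examinee ability and b is item difficulty. The ability and difficulty values, the seed, and the function names are assumptions made for the example; the abstract does not report the study's actual simulation settings.

```python
import math
import random


def rasch_probability(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def simulate_responses(abilities, difficulties, seed=0):
    """Simulate a 0/1 response matrix: one row per examinee, one column per item."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return [
        [1 if rng.random() < rasch_probability(theta, b) else 0 for b in difficulties]
        for theta in abilities
    ]


# Hypothetical examinee abilities and item difficulties on the logit scale.
abilities = [-1.0, 0.0, 1.0]
difficulties = [-0.5, 0.0, 0.5, 1.0]

data = simulate_responses(abilities, difficulties)
for row in data:
    print(row)
```

An examinee whose ability equals an item's difficulty has a 0.5 probability of answering correctly, which is the defining property of the Rasch difficulty parameter.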
Is the Factor Observed in Investigations on the Item-Position Effect Actually the Difficulty Factor?
Schweizer, Karl; Troche, Stefan – Educational and Psychological Measurement, 2018
In confirmatory factor analysis, quite similar measurement models serve to detect the difficulty factor and the factor due to the item-position effect. The item-position effect refers to the increasing dependency among the responses to successively presented items of a test whereas the difficulty factor is ascribed to the wide range of…
Descriptors: Investigations, Difficulty Level, Factor Analysis, Models