Publication Date
| In 2026 | 0 |
| Since 2025 | 15 |
| Since 2022 (last 5 years) | 63 |
| Since 2017 (last 10 years) | 162 |
| Since 2007 (last 20 years) | 321 |
Author
| Hambleton, Ronald K. | 15 |
| Wang, Wen-Chung | 9 |
| Livingston, Samuel A. | 6 |
| Sijtsma, Klaas | 6 |
| Wainer, Howard | 6 |
| Weiss, David J. | 6 |
| Wilcox, Rand R. | 6 |
| Cheng, Ying | 5 |
| Gessaroli, Marc E. | 5 |
| Lee, Won-Chan | 5 |
| Lewis, Charles | 5 |
Location
| Turkey | 8 |
| Australia | 7 |
| Canada | 7 |
| China | 5 |
| Netherlands | 5 |
| Japan | 4 |
| Taiwan | 4 |
| United Kingdom | 4 |
| Germany | 3 |
| Michigan | 3 |
| Singapore | 3 |
Laws, Policies, & Programs
| Americans with Disabilities… | 1 |
| Equal Access | 1 |
| Job Training Partnership Act… | 1 |
| Race to the Top | 1 |
| Rehabilitation Act 1973… | 1 |
Cheng, Ying; Shao, Can – Educational and Psychological Measurement, 2022
Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to the conventional item response data that are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an…
Descriptors: Reaction Time, Test Wiseness, Computer Assisted Testing, Simulation
Erdem-Kara, Basak; Dogan, Nuri – International Journal of Assessment Tools in Education, 2022
Recently, adaptive test approaches have become a viable alternative to traditional fixed-item tests. The main advantage of adaptive tests is that they reach desired measurement precision with fewer items. However, fewer items mean that each item has a more significant effect on ability estimation and therefore those tests are open to more…
Descriptors: Item Analysis, Computer Assisted Testing, Test Items, Test Construction
Xue Zhang; Chun Wang – Grantee Submission, 2022
Item-level fit analysis not only serves as a complementary check to global fit analysis, it is also essential in scale development because the fit results will guide item revision and/or deletion (Liu & Maydeu-Olivares, 2014). During data collection, missing response data are likely to occur for various reasons. Chi-square-based item fit…
Descriptors: Goodness of Fit, Item Response Theory, Scores, Test Length
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Nikola Ebenbeck; Markus Gebhardt – Journal of Special Education Technology, 2024
Technologies that enable individualization for students have significant potential in special education. Computerized Adaptive Testing (CAT) refers to digital assessments that automatically adjust their difficulty level based on students' abilities, allowing for personalized, efficient, and accurate measurement. This article examines whether CAT…
Descriptors: Computer Assisted Testing, Students with Disabilities, Special Education, Grade 3
Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022
The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…
Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms
Haimiao Yuan – ProQuest LLC, 2022
The application of diagnostic classification models (DCMs) in the field of educational measurement has received increasing attention in recent years. To make a valid inference from the model, it is important to ensure that the model fits the data. The purpose of the present study was to investigate the performance of the limited information…
Descriptors: Goodness of Fit, Educational Assessment, Educational Diagnosis, Models
Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
Yu, Albert; Douglas, Jeffrey A. – Journal of Educational and Behavioral Statistics, 2023
We propose a new item response theory growth model with item-specific learning parameters, or ISLP, and two variations of this model. In the ISLP model, either items or blocks of items have their own learning parameters. This model may be used to improve the efficiency of learning in a formative assessment. We show ways that the ISLP model's…
Descriptors: Item Response Theory, Learning, Markov Processes, Monte Carlo Methods
Fu, Yanyan; Strachan, Tyler; Ip, Edward H.; Willse, John T.; Chen, Shyh-Huei; Ackerman, Terry – International Journal of Testing, 2020
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% of simple structure items for all models and…
Descriptors: Item Response Theory, Models, Test Items, Simulation
Su, Shiyang; Wang, Chun; Weiss, David J. – Educational and Psychological Measurement, 2021
S-X² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-X² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was…
Descriptors: Statistics, Goodness of Fit, Test Items, Models
Timothy S. Faith – Teaching and Learning Excellence through Scholarship, 2024
This study compared traditional methods of college-level instruction, including lecture and class discussion followed by assessment via course content exams, with a variety of other instructional techniques. The intent was to evaluate whether more contemporary instructional techniques are significantly correlated with improved average exam scores…
Descriptors: Community College Students, Business Administration Education, Teaching Methods, Alternative Assessment
Derek Sauder – ProQuest LLC, 2020
The Rasch model is commonly used to calibrate multiple choice items. However, the sample sizes needed to estimate the Rasch model can be difficult to attain (e.g., consider a small testing company trying to pretest new items). With small sample sizes, auxiliary information besides the item responses may improve estimation of the item parameters.…
Descriptors: Item Response Theory, Sample Size, Computation, Test Length
Jingwen Wang; Ying Zheng; Yi Zou – Language Testing in Asia, 2024
Pearson Test of English Academic (PTE Academic), a high-stakes English language proficiency test, underwent substantial revisions in 2021. The test duration was reduced from 3 h to 2 h by reducing specific task numbers and sections. This study investigates the impact of these changes on teachers' perceptions and teaching practices, areas…
Descriptors: Foreign Countries, High Stakes Tests, Language Proficiency, Language Tests
Ellis, Jules L. – Educational and Psychological Measurement, 2021
This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in time of the student. Based on a classical test theory model, enriched with assumptions on the context, the costs…
Descriptors: Test Length, Models, Error of Measurement, Measurement

