Publication Date
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Descriptor
Source
Author
Publication Type
Education Level
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
| More ▼ | |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
| More ▼ | |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index
Tan, Qingrong; Cai, Yan; Luo, Fen; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2023
To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item…
Descriptors: Cognitive Tests, Computer Assisted Testing, Adaptive Testing, Accuracy
Jackson, Kayla – ProQuest LLC, 2023
Prior research highlights the benefits of multimode surveys and best practices for item-by-item (IBI) and matrix-type survey items. Some researchers have explored whether mode differences for online and paper surveys persist for these survey item types. However, no studies discuss measurement invariance when both item types and online modes are…
Descriptors: Test Items, Surveys, Error of Measurement, Item Response Theory
Fellinghauer, Carolina; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation…
Descriptors: True Scores, Equated Scores, Test Items, Sample Size
Semih Asiret; Seçil Ömür Sünbül – International Journal of Psychology and Educational Studies, 2023
In this study, it was aimed to examine the effect of missing data in different patterns and sizes on test equating methods under the NEAT design for different factors. For this purpose, as part of this study, factors such as sample size, average difficulty level difference between the test forms, difference between the ability distribution,…
Descriptors: Research Problems, Data, Test Items, Equated Scores
Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing
TsungHan Ho – Applied Measurement in Education, 2023
An operational multistage adaptive test (MST) requires the development of a large item bank and the effort to continuously replenish the item bank due to concerns about test security and validity over the long term. New items should be pretested and linked to the item bank before being used operationally. The linking item volume fluctuations in…
Descriptors: Bayesian Statistics, Regression (Statistics), Test Items, Pretesting
Yasuda, Jun-ichiro; Hull, Michael M.; Mae, Naohiro – Physical Review Physics Education Research, 2023
We aim to graphically analyze the depth of conceptual understanding behind the Force Concept Inventory (FCI) responses of students, focusing on three questions (questions 1, 15, and 28). In our study, we created and implemented subquestions to clarify and quantify the students' reasoning steps in reaching their responses to the original FCI…
Descriptors: Scientific Concepts, Concept Formation, Misconceptions, Visual Aids
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunction and power outage occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items
Godoy-Giménez, M.; González-Rodríguez, A.; Cañadas, F.; Estévez, A. F.; Sayans-Jiménez, P. – Journal of Autism and Developmental Disorders, 2022
Although, the operationalization of the autism spectrum disorder has been updated around two domains, the broad autism phenotype (BAP) one has not. Additionally, the items of the three common BAP measures, the Broad Autism Phenotype Questionnaire (BAPQ), the Autism Quotient, and the Social Responsiveness Scale (SRS), remain organized around a…
Descriptors: Autism, Pervasive Developmental Disorders, Measurement, Screening Tests
Lim, Hwanggyu; Choe, Edison M.; Han, Kyung T. – Journal of Educational Measurement, 2022
Differential item functioning (DIF) of test items should be evaluated using practical methods that can produce accurate and useful results. Among a plethora of DIF detection techniques, we introduce the new "Residual DIF" (RDIF) framework, which stands out for its accessibility without sacrificing efficacy. This framework consists of…
Descriptors: Test Items, Item Response Theory, Identification, Robustness (Statistics)
Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022
The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…
Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level
Guastadisegni, Lucia; Cagnone, Silvia; Moustaki, Irini; Vasdekis, Vassilis – Educational and Psychological Measurement, 2022
This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test to detect measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approach,…
Descriptors: Measurement, Statistical Analysis, Item Response Theory, Test Items
Li, Dongmei; Kapoor, Shalini – Educational Measurement: Issues and Practice, 2022
Population invariance is a desirable property of test equating which might not hold when significant changes occur in the test population, such as those brought about by the COVID-19 pandemic. This research aims to investigate whether equating functions are reasonably invariant when the test population is impacted by the pandemic. Based on…
Descriptors: Test Items, Equated Scores, COVID-19, Pandemics
Olney, Andrew M. – Grantee Submission, 2022
Multi-angle question answering models have recently been proposed that promise to perform related tasks like question generation. However, performance on related tasks has not been thoroughly studied. We investigate a leading model called Macaw on the task of multiple choice question generation and evaluate its performance on three angles that…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Models
Kim, Sooyeon; Walker, Michael E. – Educational Measurement: Issues and Practice, 2022
Test equating requires collecting data to link the scores from different forms of a test. Problems arise when equating samples are not equivalent and the test forms to be linked share no common items by which to measure or adjust for the group nonequivalence. Using data from five operational test forms, we created five pairs of research forms for…
Descriptors: Ability, Tests, Equated Scores, Testing Problems

Peer reviewed
Direct link
