Publication Date
In 2025: 2
Since 2024: 7
Since 2021 (last 5 years): 22
Since 2016 (last 10 years): 36
Since 2006 (last 20 years): 57
Descriptor
Accuracy: 57
Test Length: 57
Test Items: 36
Item Response Theory: 35
Sample Size: 23
Computation: 20
Computer Assisted Testing: 20
Adaptive Testing: 17
Simulation: 15
Correlation: 14
Classification: 13
Author
Cheng, Ying: 4
Wang, Chun: 3
Baris Pekmezci, Fulya: 2
Bradshaw, Laine: 2
He, Wei: 2
Huggins-Manley, Anne Corinne: 2
Hull, Michael M.: 2
Lathrop, Quinn N.: 2
Mae, Naohiro: 2
Svetina, Dubravka: 2
Wolfe, Edward W.: 2
Publication Type
Reports - Research: 45
Journal Articles: 43
Dissertations/Theses -…: 10
Reports - Evaluative: 2
Speeches/Meeting Papers: 2
Numerical/Quantitative Data: 1
Education Level
Higher Education: 4
Postsecondary Education: 4
Early Childhood Education: 2
Elementary Education: 2
Secondary Education: 2
Elementary Secondary Education: 1
Grade 3: 1
High Schools: 1
Preschool Education: 1
Primary Education: 1
Assessments and Surveys
Force Concept Inventory: 1
MacArthur Communicative…: 1
National Assessment of…: 1
Program for International…: 1
Trends in International…: 1
Hasibe Yahsi Sari; Hulya Kelecioglu – International Journal of Assessment Tools in Education, 2025
The aim of the study is to examine the effect of the polytomous item ratio on ability estimation under different conditions in multistage tests (MST) built from mixed-format tests. The study is simulation-based. Drawing on the PISA 2018 administration, the ability parameters of the individuals and the item pool were created using the item parameters estimated from…
Descriptors: Test Items, Test Format, Accuracy, Test Length
Sun, Ting; Kim, Stella Yun – Measurement: Interdisciplinary Research and Perspectives, 2021
In many large testing programs, equipercentile equating has been widely used under a random groups design to adjust test difficulty between forms. However, one thorny issue occurs with equipercentile equating when a particular score has no observed frequency. The purpose of this study is to suggest and evaluate six potential methods in…
Descriptors: Equated Scores, Test Length, Sample Size, Methods
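As context for the equating issue described above, the standard equipercentile equating function under a random groups design maps a Form X score to the Form Y score with the same percentile rank; a minimal LaTeX sketch with assumed notation (not taken from the article) is:

```latex
% Equipercentile equating of Form X scores onto the Form Y scale:
% P_X is the percentile-rank function for Form X, and Q_Y the inverse
% percentile-rank (quantile) function for Form Y.
\[
e_Y(x) = Q_Y\bigl(P_X(x)\bigr)
\]
```

A Form Y score with no observed frequency leaves the empirical Q_Y poorly defined around that score, which is the gap the six candidate methods are meant to handle.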
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, test lengths, and numbers and locations of polytomous items. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
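The shadow-test approach mentioned in this entry is commonly formalized as repeated constrained test assembly: before each administration, a full-length test that satisfies all constraints and contains the items already given is assembled, and the next item is chosen from its free part. The LaTeX sketch below uses illustrative symbols and is a generic statement of that formulation, not the dissertation's notation:

```latex
% Shadow test assembled at step k, for an item pool of size I, test length n,
% interim ability estimate \hat\theta_k, and S_k the set of items already administered.
\begin{align*}
\max_{x}\;    & \sum_{i=1}^{I} I_i(\hat\theta_k)\, x_i   && \text{maximize information at } \hat\theta_k \\
\text{s.t.}\; & \sum_{i=1}^{I} x_i = n,                   && \text{full test length} \\
              & x_i = 1 \quad \forall i \in S_k,          && \text{keep administered items} \\
              & \text{content and exposure constraints},  && x_i \in \{0,1\}.
\end{align*}
```

The item actually delivered is then the most informative free item with x_i = 1.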
Edwards, Ashley A.; Joyner, Keanan J.; Schatschneider, Christopher – Educational and Psychological Measurement, 2021
The accuracy of certain internal consistency estimators has been questioned in recent years. The present study tests the accuracy of six reliability estimators (Cronbach's alpha, omega, omega hierarchical, Revelle's omega, and greatest lower bound) in 140 simulated conditions of unidimensional continuous data with uncorrelated errors and varying…
Descriptors: Reliability, Computation, Accuracy, Sample Size
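For reference, the two most familiar estimators in the list above have the following standard forms (textbook definitions, not quoted from the article):

```latex
% Cronbach's alpha for k items with item variances \sigma_i^2 and total-score
% variance \sigma_X^2; McDonald's omega total from a one-factor model with
% loadings \lambda_i and unique variances \psi_i.
\[
\alpha = \frac{k}{k-1}\Bigl(1 - \frac{\sum_{i=1}^{k}\sigma_i^{2}}{\sigma_X^{2}}\Bigr),
\qquad
\omega_{\mathrm{total}} = \frac{\bigl(\sum_{i=1}^{k}\lambda_i\bigr)^{2}}
                               {\bigl(\sum_{i=1}^{k}\lambda_i\bigr)^{2} + \sum_{i=1}^{k}\psi_i}
\]
```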
Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023
We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed that, for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; and (c) GLB and coefficient H overestimated reliability with small…
Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length
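The coefficient H named in this entry is usually defined from the standardized loadings of a one-factor model; the standard expression (again not quoted from the article) is:

```latex
% Hancock and Mueller's coefficient H (maximal reliability) for a one-factor
% model with standardized loadings \lambda_i.
\[
H = \Biggl(1 + \Bigl(\sum_{i=1}^{k}\frac{\lambda_i^{2}}{1-\lambda_i^{2}}\Bigr)^{-1}\Biggr)^{-1}
\]
```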
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
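The characteristic curve linking methods this entry builds on (e.g., Stocking-Lord) estimate a linear rescaling θ_X = Aθ_Y + B by matching characteristic curves of the common items; information weighting modifies the weights in that criterion. A hedged sketch of the unweighted Stocking-Lord loss, in illustrative notation that may differ from the article's, is:

```latex
% Stocking-Lord criterion over common items C and quadrature points \theta_q with
% weights w_q; Form Y parameters map to the Form X scale as a_j^Y/A and A b_j^Y + B.
\[
F(A,B) = \sum_{q} w_q \Bigl[ \sum_{j \in C} P_j\bigl(\theta_q;\, a_j^{X}, b_j^{X}\bigr)
         - \sum_{j \in C} P_j\bigl(\theta_q;\, a_j^{Y}/A,\; A\,b_j^{Y} + B\bigr) \Bigr]^{2}
\]
```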
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
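As background, a common MIRT formulation is the multidimensional two-parameter logistic model, in which each item loads on a vector of latent traits; this is a standard form and not necessarily the parameterization used in the submission:

```latex
% Multidimensional 2PL: discrimination vector a_j, intercept d_j, and latent
% trait vector \theta_i for person i on item j.
\[
P\bigl(X_{ij}=1 \mid \boldsymbol{\theta}_i\bigr)
  = \frac{\exp\bigl(\mathbf{a}_j^{\top}\boldsymbol{\theta}_i + d_j\bigr)}
         {1 + \exp\bigl(\mathbf{a}_j^{\top}\boldsymbol{\theta}_i + d_j\bigr)}
\]
```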
Fatih Orçan – International Journal of Assessment Tools in Education, 2025
Factor analysis is a statistical method to explore the relationships among observed variables and identify latent structures. It is crucial in scale development and validity analysis. Key factors affecting the accuracy of factor analysis results include the type of data, sample size, and the number of response categories. While some studies…
Descriptors: Factor Analysis, Factor Structure, Item Response Theory, Sample Size
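The common factor model underlying the analyses described above can be summarized in standard notation (assumed here, not taken from the article):

```latex
% Common factor model and its implied covariance structure: loadings \Lambda,
% factor covariances \Phi, diagonal unique-variance matrix \Theta.
\[
\mathbf{x} = \boldsymbol{\Lambda}\boldsymbol{\xi} + \boldsymbol{\delta},
\qquad
\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Theta}
\]
```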
Matayoshi, Jeffrey; Cosyn, Eric; Uzun, Hasan – International Journal of Artificial Intelligence in Education, 2021
Many recent studies have looked at the viability of applying recurrent neural networks (RNNs) to educational data. In most cases, this is done by comparing their performance to existing models in the artificial intelligence in education (AIED) and educational data mining (EDM) fields. While there is increasing evidence that, in many situations,…
Descriptors: Artificial Intelligence, Data Analysis, Student Evaluation, Adaptive Testing
Lin, Yin; Brown, Anna; Williams, Paul – Educational and Psychological Measurement, 2023
Several forced-choice (FC) computerized adaptive tests (CATs) have emerged in the field of organizational psychology, all of them employing ideal-point items. However, although most items developed historically follow dominance response models, research on FC CAT using dominance items is limited. Existing research is heavily dominated by…
Descriptors: Measurement Techniques, Computer Assisted Testing, Adaptive Testing, Industrial Psychology
Cheng, Ying; Shao, Can – Educational and Psychological Measurement, 2022
Computer-based and web-based testing have become increasingly popular in recent years. Their popularity has dramatically expanded the availability of response time data. Compared to conventional item response data, which are often dichotomous or polytomous, response time has the advantage of being continuous and can be collected in an…
Descriptors: Reaction Time, Test Wiseness, Computer Assisted Testing, Simulation
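A widely used model for continuous response-time data of the kind described here is van der Linden's lognormal response-time model; it is shown below as general background, not as the model necessarily adopted in the article:

```latex
% Lognormal response-time model: t_{ij} is person i's response time on item j,
% \tau_i the person's speed, \beta_j the item's time intensity, and \alpha_j the
% item's time discrimination.
\[
f(t_{ij}) = \frac{\alpha_j}{t_{ij}\sqrt{2\pi}}
  \exp\!\Bigl\{-\tfrac{1}{2}\bigl[\alpha_j\bigl(\ln t_{ij} - (\beta_j - \tau_i)\bigr)\bigr]^{2}\Bigr\}
\]
```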
Nikola Ebenbeck; Markus Gebhardt – Journal of Special Education Technology, 2024
Technologies that enable individualization for students have significant potential in special education. Computerized Adaptive Testing (CAT) refers to digital assessments that automatically adjust their difficulty level based on students' abilities, allowing for personalized, efficient, and accurate measurement. This article examines whether CAT…
Descriptors: Computer Assisted Testing, Students with Disabilities, Special Education, Grade 3
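The core CAT mechanic described in this entry, repeatedly picking the item that is most informative at the interim ability estimate and then updating that estimate, can be sketched in a few lines of Python. Everything below (2PL model, maximum-information selection, grid-based EAP update, fixed test length) is an illustrative assumption, not the procedure used in the article:

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

def eap_estimate(responses, grid=np.linspace(-4, 4, 81)):
    """Grid-based EAP ability estimate with a standard-normal prior.
    `responses` holds (score, a, b) triples for the items taken so far."""
    prior = np.exp(-0.5 * grid**2)
    like = np.ones_like(grid)
    for x, a, b in responses:
        p = p_2pl(grid, a, b)
        like *= p if x == 1 else 1.0 - p
    post = prior * like
    return float(np.sum(grid * post) / np.sum(post))

def run_cat(a, b, answer_fn, test_length=20):
    """Administer `test_length` items, always choosing the most informative
    unused item at the current ability estimate."""
    theta, administered, responses = 0.0, set(), []
    for _ in range(test_length):
        info = [item_information(theta, a[j], b[j]) if j not in administered else -np.inf
                for j in range(len(a))]
        j = int(np.argmax(info))          # maximum-information item selection
        administered.add(j)
        x = answer_fn(j, theta)           # 0/1 response supplied by the caller
        responses.append((x, a[j], b[j]))
        theta = eap_estimate(responses)   # update the interim ability estimate
    return theta
```

In a simulation, `answer_fn` would typically draw a Bernoulli response from `p_2pl(true_theta, a[j], b[j])`; an operational pool would add content and exposure constraints on top of the selection step.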
Kalkan, Ömür Kaya – Measurement: Interdisciplinary Research and Perspectives, 2022
The four-parameter logistic (4PL) Item Response Theory (IRT) model has recently been reconsidered in the literature due to the advances in the statistical modeling software and the recent developments in the estimation of the 4PL IRT model parameters. The current simulation study evaluated the performance of expectation-maximization (EM),…
Descriptors: Comparative Analysis, Sample Size, Test Length, Algorithms
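In standard notation, the 4PL model evaluated in this entry extends the 3PL with an upper asymptote:

```latex
% 4PL IRT model: discrimination a_j, difficulty b_j, lower asymptote c_j
% (pseudo-guessing), upper asymptote d_j (slipping), and ability \theta_i.
\[
P\bigl(X_{ij}=1 \mid \theta_i\bigr)
  = c_j + \frac{d_j - c_j}{1 + \exp\bigl[-a_j(\theta_i - b_j)\bigr]}
\]
```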
Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
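For reference, the first four information criteria listed above are penalized versions of the maximized log-likelihood log L, with k estimated parameters and sample size N (standard definitions, not quoted from the article):

```latex
\begin{align*}
\mathrm{AIC}  &= -2\log L + 2k, \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{N - k - 1}, \\
\mathrm{BIC}  &= -2\log L + k\log N, \\
\mathrm{CAIC} &= -2\log L + k(\log N + 1).
\end{align*}
```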
Baris Pekmezci, Fulya; Sengul Avsar, Asiye – International Journal of Assessment Tools in Education, 2021
There is a great deal of item response theory (IRT) research conducted through simulation. Item and ability parameters are estimated with varying numbers of replications under different test conditions. However, it is not clear what the appropriate number of replications should be. The aim of the current study is to develop guidelines for the…
Descriptors: Item Response Theory, Computation, Accuracy, Monte Carlo Methods
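Parameter-recovery studies of this kind usually judge whether the number of replications R is adequate by the stability of recovery statistics such as bias and RMSE; the standard definitions (assumed here) are:

```latex
% Recovery of a generating parameter \gamma over R replications, with estimate
% \hat\gamma_r in replication r.
\[
\mathrm{Bias}(\hat\gamma) = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat\gamma_r - \gamma\bigr),
\qquad
\mathrm{RMSE}(\hat\gamma) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat\gamma_r - \gamma\bigr)^{2}}
\]
```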